

Understand that basically ANYTHING that “uses AI” is using you for training data.
At its simplest, it is old-fashioned A/B testing where you are used as part of a reinforcement/labeling pipeline. Sometimes it gets considerably more bullshit, as your queries themselves, and whatever prompted you to make them, are used to “give you a better experience” and so forth.
And if you read any of the EULAs (for the stuff that Google opted users into…) you’ll see verbiage along those lines.
Of course, the reality is that Google is going to train on our data regardless. But that is why it is a good idea to decouple your life from Google as much as possible. It takes a long-ass time but… no better time than today.








Yes, they are. Not sure why you are bringing that up.
For those wondering what the actual difference is (possibly because they don’t seem to know):
At a high level, training is when you ingest data to create a model based on characteristics of that data. Inference is when you then apply a model to (preferably new) data. So think of training as “teaching” a model what a cat is, and inference as having that model scan through images for cats.
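To make the split concrete, here is a toy sketch in Python. Everything in it is made up for illustration (the two-number “features”, the labels, the nearest-centroid approach); real models are vastly bigger, but the two phases look the same.

```python
# Toy nearest-centroid classifier. All names and numbers are hypothetical.

def train(examples):
    """Training: ingest labeled data and build a model (one centroid per label)."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def infer(model, features):
    """Inference: apply the already-built model to (preferably new) data."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    # Pick the label whose centroid is closest to the new data point.
    return min(model, key=lambda label: sq_dist(model[label]))


# Labeled data somebody had to produce ahead of time.
labeled = [
    ([0.9, 0.8], "cat"),
    ([0.8, 0.9], "cat"),
    ([0.1, 0.2], "not cat"),
    ([0.2, 0.1], "not cat"),
]

model = train(labeled)                   # "teaching" the model what a cat is
prediction = infer(model, [0.85, 0.7])   # scanning a new image for cats
```

The model never changes during `infer`. It only changes if somebody feeds more labeled data into `train`, which is where your data comes in.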
And a huge part of making a good model is providing good data. That is, generally speaking, done by labeling things ahead of time. Back in the day it was paying people to take an Amazon survey where they said “hot dog or no hot dog”. These days… it is “anti-bot” technology that gets those labels for free (think about WHY every single website cares whether you can spot a fire hydrant or a bicycle…)
But labeling is ALSO just simple metrics like “Did the user use what we suggested?”. Instead of saying “not hot dog” it is “good reply” or “no reply” or “still read email” or “ignored email” and so forth.
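A rough sketch of what that kind of implicit labeling could look like. The event names and item IDs are hypothetical; any real product logs its own, and I am not claiming this is anybody’s actual pipeline.

```python
# Hypothetical event names standing in for whatever a product actually logs.
POSITIVE = {"used_suggestion", "read_email"}
NEGATIVE = {"dismissed_suggestion", "ignored_email"}


def label_events(events):
    """Turn raw (item_id, event) pairs into (item_id, label) training rows."""
    rows = []
    for item_id, event in events:
        if event in POSITIVE:
            rows.append((item_id, 1))   # the "hot dog" case
        elif event in NEGATIVE:
            rows.append((item_id, 0))   # the "not hot dog" case
        # Any other event carries no signal in this sketch and is skipped.
    return rows


events = [
    ("reply_a", "used_suggestion"),
    ("reply_b", "dismissed_suggestion"),
    ("mail_c", "ignored_email"),
    ("mail_d", "scrolled"),
]

rows = label_events(events)
```

Nobody got asked to label anything here. Just using the product did the work.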
And once you know what your pain points are from TOTALLY anonymized user data, you can then “reproduce” said user data to add to your training set. That is the kind of bullshit Facebook has, allegedly, done for years: they’ll GLADLY delete your data if you request it… but not that picture of you at the McDonald’s down the street, because that belongs to Ronjon Buck, who worked there one summer. And they’ll gladly anonymize your user data, so the picture of you actually just corresponds to “User 25156161616”, who happens to be the sibling of your sister, and so forth…
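For why that kind of “anonymization” does not buy you much, here is a minimal sketch. The names, the salt, and the relationship labels are all invented, and this is not a claim about how any particular company does it. It only shows that swapping a name for an ID leaves the relationships intact.

```python
import hashlib


def pseudonymize(user_id, salt="hypothetical-salt"):
    """Replace a name with a stable, opaque ID (same input gives the same ID)."""
    digest = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    return "User " + digest[:11]


def anonymize_edges(edges):
    """Swap out the names but keep every relationship exactly as it was."""
    return [(pseudonymize(a), rel, pseudonymize(b)) for a, rel, b in edges]


# Hypothetical social graph.
edges = [
    ("you", "sibling_of", "your_sister"),
    ("you", "appears_in_photo_by", "ronjon"),
]

anon = anonymize_edges(edges)

# The names are gone, but the same opaque ID still sits at the center of
# both relationships, so the graph still describes one specific person.
same_person = anon[0][0] == anon[1][0]
```

Anyone who can pin down one node in that graph (say, your sister) gets the rest of it back for free.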
That is literally just a feedback loop and is core to pretty much any “agentic” network/graph.
There also tend to be laws about opting in and about forced EULAs. It is almost like the megacorps have acknowledged that they’ll just do whatever and MAYBE pay a fine after they have already made far more money.