| The impudence of Microsoft has reached new (criminal?) heights |
| NiHaoMike:
--- Quote from: Siwastaja on June 21, 2020, 02:29:07 pm ---It won't - web browsing is slow enough as it is, in terms of both communication bandwidth and browser CPU/memory resources. You can't afford to generate realistic-looking fake internet use at a 100x rate unless you have a 10GB/s fiber optic connection and a massively powerful 32-core machine with 64GB of RAM. Even then, I doubt a realistic 100:1 would be possible. --- End quote --- It takes very little bandwidth and client resources to do a Web search. And how often do you use search anyway? Generating 100x as many searches per day as you normally do is not hard at all. Now if you want to generate 100x fake video streaming sessions, that would indeed be harder. Even then, as long as you're not looking for full HD on every stream, it's quite possible to do 100 streams in less than a gigabit per second. Skip the rendering and you'll save a lot on compute power. But there's a much better way to make fake YouTube views that are indistinguishable from your real ones: use youtube-dl to download videos to a temp directory, keep the ones you're really interested in, and discard the rest. --- Quote ---Note that normal internet usage is far from noise; it's a very specific "signal", depending on who's using it. And detecting a signal on top of noise is easy-peasy. The key would be not to generate truly random noise, but noise that seems similar to that signal. --- End quote --- So then make the "noise" a bit less random. Or indeed, base it on real traffic. For example, if I make a lot of fake searches similar to a real search about topic A, it would appear that I'm a lot more interested in topic A than topic B. |
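The "100 streams in less than a gigabit" figure above can be sanity-checked with simple arithmetic. The sketch below is only a back-of-envelope calculation: the 5 Mbit/s per-stream bitrate is an illustrative assumption, not a measured value for any service, and the function names are made up for this example.

```python
# Back-of-envelope bandwidth check for many concurrent video streams.
# ASSUMED_STREAM_MBPS is an illustrative assumption, not a measurement.
ASSUMED_STREAM_MBPS = 5.0   # hypothetical bitrate of one sub-full-HD stream
LINK_MBPS = 1000.0          # a gigabit link, as mentioned in the post


def per_stream_budget_mbps(link_mbps, streams):
    """Bandwidth available to each stream if the link is shared evenly."""
    return link_mbps / streams


def streams_that_fit(link_mbps, bitrate_mbps):
    """Number of whole streams of a given bitrate that fit in the link."""
    return int(link_mbps // bitrate_mbps)


if __name__ == "__main__":
    print("Budget per stream for 100 streams:",
          per_stream_budget_mbps(LINK_MBPS, 100), "Mbit/s")
    print("Streams that fit at the assumed bitrate:",
          streams_that_fit(LINK_MBPS, ASSUMED_STREAM_MBPS))
```

With 100 streams on a gigabit link, each stream gets a 10 Mbit/s budget, so whether the claim holds depends on the bitrate you actually pull per stream.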
| Mr. Scram:
--- Quote from: NiHaoMike on June 21, 2020, 02:14:57 pm ---What if the fake traffic outnumbers real traffic by a wide margin, say 100 to 1 or even more? --- End quote --- If I'm completely honest, asking these questions is admitting you don't understand the subject matter well enough. People in general seem surprisingly naive about what data collection and the subsequent processing entails. This is a wonderful quality, but also means companies can collect with impunity. This is an industry built on deducing marketable conclusions from huge amounts of raw and noisy data. It's far beyond some guy or basic algorithm looking at your browsing and going "you looked at a backpack, you must want a backpack". That happens too, but is an incredibly crude and clunky approach compared to the profiles built of people and their behaviour. What's done is preferably collecting large amounts of data and matching patterns to other known patterns. Facebook can say with an uncomfortable degree of accuracy whether you're about to engage in or end a relationship. They don't do this by looking at the content of your messages, but at the metadata, like how many messages you send and when, and similar patterns. Note that Facebook also collects large amounts of data about non-members. This is also why large scale data collection is a dangerous game, as you can make accurate predictions about things people aren't even aware of themselves. This includes illnesses, pregnancies and all kinds of medical data but also political matters. Remember Cambridge Analytica? It's industrialized manipulation. Other than on the micro level, people are ridiculously predictable. To create meaningful noise means creating an adversarial model. "The other side" not only has a massive head start but is also ridiculously well funded as it's literally an industry worth many billions of dollars, so there's not really a chance of even putting up a fight. 
Note that I'm simplifying a couple of things here as it's a massive, sprawling and quickly evolving industry. |
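As a toy illustration of the point about timing metadata surviving a flood of noise, the sketch below simulates a user whose real activity clusters in two hours of the day, buried under 100 times as many fake events spread uniformly over 24 hours. Everything here is an assumption made for illustration (the active hours, the event counts, the uniform noise model, and the function names); real profiling systems are far more sophisticated than counting events per hour.

```python
import random

HOURS = 24
REAL_HOURS = (8, 21)    # hypothetical hours in which the real user is active
REAL_PER_DAY = 10       # real events per day, split evenly across REAL_HOURS
NOISE_PER_DAY = 1000    # fake events per day, uniform over the day (100:1)
DAYS = 365              # length of the observation window


def simulate(seed=0):
    """Return event counts per hour of day, real and fake combined."""
    rng = random.Random(seed)
    counts = [0] * HOURS
    for _ in range(DAYS):
        # Real activity: always lands in the user's habitual hours.
        for i in range(REAL_PER_DAY):
            counts[REAL_HOURS[i % len(REAL_HOURS)]] += 1
        # Fake activity: uniformly random hour, independent of the user.
        for _ in range(NOISE_PER_DAY):
            counts[int(rng.random() * HOURS)] += 1
    return counts


def busiest_hours(counts, k=2):
    """Return the k hours with the highest counts, in ascending order."""
    ranked = sorted(range(HOURS), key=lambda h: counts[h], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    print("Busiest hours:", busiest_hours(simulate()))
```

Under these assumptions the uniform noise only raises the baseline of every hour, so the habitual hours still stand out once enough days are observed. Noise that mimics the real pattern would not be separable this way, but it would also reinforce the pattern rather than hide it.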
| NiHaoMike:
So in the end, the first thing to do to fight back is to use adblocking to cut down on the source of revenue? The noise generation, even if not effective at obscuring the real data, still makes for a good protest that increases their costs. If the end goal is to discourage data collection, I wonder if generating a ridiculous number of ad "views" (using AdNauseam) in the hopes that the views will be marked as invalid (and won't give them any revenue) would actually be a good idea. Or just use a regular adblocker and take away the revenue. If you're not actually using the service you're feeding fake data to, the question of whether the noise is effective at obscuring real data is irrelevant, since there is no real data to find. Would there be a chance that when presented with 100% noise (especially if not completely random), the algorithms would find a false pattern? |
| SilverSolder:
--- Quote from: Mr. Scram on June 21, 2020, 06:28:08 pm --- --- Quote from: NiHaoMike on June 21, 2020, 02:14:57 pm ---What if the fake traffic outnumbers real traffic by a wide margin, say 100 to 1 or even more? --- End quote --- If I'm completely honest, asking these questions is admitting you don't understand the subject matter well enough. People in general seem surprisingly naive about what data collection and the subsequent processing entails. This is a wonderful quality, but also means companies can collect with impunity. This is an industry built on deducing marketable conclusions from huge amounts of raw and noisy data. It's far beyond some guy or basic algorithm looking at your browsing and going "you looked at a backpack, you must want a backpack". That happens too, but is an incredibly crude and clunky approach compared to the profiles built of people and their behaviour. What's done is preferably collecting large amounts of data and matching patterns to other known patterns. Facebook can say with an uncomfortable degree of accuracy whether you're about to engage in or end a relationship. They don't do this by looking at the content of your messages, but at the metadata, like how many messages you send and when, and similar patterns. Note that Facebook also collects large amounts of data about non-members. This is also why large scale data collection is a dangerous game, as you can make accurate predictions about things people aren't even aware of themselves. This includes illnesses, pregnancies and all kinds of medical data but also political matters. Remember Cambridge Analytica? It's industrialized manipulation. Other than on the micro level, people are ridiculously predictable. To create meaningful noise means creating an adversarial model. "The other side" not only has a massive head start but is also ridiculously well funded as it's literally an industry worth many billions of dollars, so there's not really a chance of even putting up a fight. 
Note that I'm simplifying a couple of things here as it's a massive, sprawling and quickly evolving industry. --- End quote --- So... What can we do, if we intensely dislike the way the Internet has ended up? ... At least ad blocking means we don't have to look at the intrusion, even if it is still there... |
| Mr. Scram:
--- Quote from: SilverSolder on June 21, 2020, 09:04:23 pm ---So... What can we do, if we intensely dislike the way the Internet has ended up? ... At least ad blocking means we don't have to look at the intrusion, even if it is still there... --- End quote --- You roll up your sleeves and do what you can. Proper ad blocking doesn't just hide the ads, and it is an important first step. Reduce your fingerprint. In the case of telemetry you'll want an independent firewall or a Pi-hole type deal. It's definitely an uphill battle, but I consider not doing anything at least as unattractive. |
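For readers unfamiliar with the "Pi-hole type deal": the general idea is a DNS sinkhole that answers lookups for unwanted hostnames with an unroutable address, so the telemetry or ad request never leaves the network. The sketch below shows only that idea. It is not Pi-hole's actual implementation; the blocklist entries, the parent-domain matching rule, and the `resolve` helper are all hypothetical.

```python
# Minimal sketch of DNS-sinkhole style blocking.
# The blocklist entries are placeholders, not a real list.
BLOCKLIST = {"telemetry.example.com", "ads.example.net"}

SINKHOLE_ADDRESS = "0.0.0.0"  # unroutable answer returned for blocked names


def is_blocked(hostname, blocklist=BLOCKLIST):
    """True if the hostname or any of its parent domains is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" against the blocklist.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def resolve(hostname, upstream):
    """Answer blocked names with the sinkhole address, else ask upstream.

    `upstream` is any callable that maps a hostname to an address; it is a
    stand-in for a real resolver and is an assumption of this sketch.
    """
    if is_blocked(hostname):
        return SINKHOLE_ADDRESS
    return upstream(hostname)


if __name__ == "__main__":
    fake_upstream = lambda name: "192.0.2.1"  # documentation-range address
    for name in ("telemetry.example.com", "example.org"):
        print(name, "->", resolve(name, fake_upstream))
```

Because the blocking happens at name resolution on a separate device, it does not depend on the cooperation of the operating system or application doing the phoning home, which is the point of keeping it independent.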