My header ended up looking like a permuted version of this:
en-US,en;q=0.9,zh-CN;q=0.8,de;q=0.7,ja;q=0.6
I never manually configured any of those extra languages in the browser settings. All I had done was tell Chrome not to translate a few pages on some foreign news sites. Chrome then turned those one-off choices into persistent signals attached to every request. I'd be surprised if anyone in my vicinity shares my exact combination of languages in that exact order, so this seems like a pretty strong fingerprinting vector.
There was even a proposal to reduce this surface area, but it wasn't adopted:
https://github.com/explainers-by-googlers/reduce-accept-lang...
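To make the point about ordering concrete, here is a minimal sketch of how a server could turn the header quoted above into an ordered signature. The function names are made up for illustration, and the handling of a malformed weight is an assumption rather than anything the thread specifies.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # assumption: treat a malformed weight as lowest priority
        entries.append((tag.strip(), q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def language_signature(header: str) -> tuple[str, ...]:
    """The ordered tag list; two users match only if the order matches too."""
    return tuple(tag for tag, _ in parse_accept_language(header))


header = "en-US,en;q=0.9,zh-CN;q=0.8,de;q=0.7,ja;q=0.6"
print(language_signature(header))
# -> ('en-US', 'en', 'zh-CN', 'de', 'ja')
```

Swapping just two of the extra languages yields a different tuple, which is why the order matters as much as the set.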
I imagine Chrome is really adding the language to your browser preferences when you choose not to translate a page, and the HTTP client in the browser is generating request headers based on your preferred languages. A small (and largely unimportant) semantic point, but it's possible that the Google translate team weren't aware of how adding a preferred language might impact user privacy. That isn't to excuse the behaviour; they should have checked.
The DOJ is totally spineless and refuses to squash Google's absurd monopoly on the internet. We are literally the last line of defense, even though we really don't amount to much.
Perhaps we could start a grassroots movement.
Mozilla Foundation is rudderless. I'm convinced the leadership are all Google plants who are keeping the "antitrust litigation sponge" from doing anything damaging to Chrome.
A clueless person might not know any better, but you clearly do, and also you seemingly care. So why do you use Google all the same?
If the browser header says windows but the fonts available says linux, that's a very distinctive signal.
And if the UA says Chrome but some other signal says not-chrome, that's very distinctive as well.
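A rough sketch of that kind of cross-check, assuming a detector that already has the User-Agent string and a list of detected fonts. The font lists are illustrative placeholders, not a real detection database, and the substring matching on the User-Agent is deliberately naive (for example, Android User-Agents also contain "Linux").

```python
from typing import Iterable, Optional

# Illustrative marker fonts only; a real detector would use much larger lists.
OS_MARKER_FONTS = {
    "windows": {"Segoe UI", "Calibri", "Consolas"},
    "macos": {"Helvetica Neue", "Menlo", "Lucida Grande"},
    "linux": {"DejaVu Sans", "Liberation Sans", "Noto Sans"},
}


def os_from_user_agent(user_agent: str) -> Optional[str]:
    """Guess the OS the User-Agent header claims (naive substring match)."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "windows"
    if "mac os x" in ua:
        return "macos"
    if "linux" in ua:
        return "linux"
    return None


def os_from_fonts(fonts: Iterable[str]) -> Optional[str]:
    """Guess the OS from which marker fonts are present."""
    available = set(fonts)
    scores = {name: len(markers & available) for name, markers in OS_MARKER_FONTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def is_inconsistent(user_agent: str, fonts: Iterable[str]) -> bool:
    """True when both signals give an answer and the answers disagree."""
    claimed = os_from_user_agent(user_agent)
    implied = os_from_fonts(fonts)
    return claimed is not None and implied is not None and claimed != implied
```

A spoofed Windows User-Agent combined with DejaVu Sans and Liberation Sans would be flagged here, which is exactly the "very distinctive signal" described above.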
I think you are falling for a technical fallacy. It's not costing them any more time.
AFAIK some of these options are there to be used by the Tor browser, which comes with strict configuration assumptions, and it doesn't translate well to normal Firefox usage. Especially if you change the window size on a non-standardized device. Mind you, the goal is not to block fingerprinting, but to not stand out. Safari on a macbook is probably harder to fingerprint than Firefox on your soldering iron.
However, judging by the fact that every data hungry website seemingly has a huge problem with VPN usage, I'd presume they are pretty effective and fingerprinting is not.
I will say one downside to using it is far more bot detection websites freaking out over generic information being returned to them, causing some sites to break (some of their settings breaking webgl games too due to low values). Using a different profile avoids this, or explicitly whitelisting certain sites in privacy.resistFingerprinting.exemptedDomains - obviously if a site is using a generic tracking service for bot detection, that kills a fair amount of the benefit of the flag, so a separate profile might be best. I wish firefox had a container option for this.
... and I'm not too sure what you mean by changing window size on a non-standardised device. They do try to ensure the window sizes are at standard intervals, as if they were fullscreened at typical widths to reduce fingerprinting, but surely that applies to using Tor too? I mean, people don't use Tor on dedicated monitors at standard sizes.
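As I understand the "standard intervals" idea, the reported window size is snapped to a coarse grid so that many different real sizes collapse into one value. A minimal sketch, assuming a hypothetical step of 100 pixels (the actual step sizes Firefox and Tor Browser use are not stated in this thread):

```python
def bucket_dimension(pixels: int, step: int = 100) -> int:
    """Round a dimension down to a multiple of step, never below one step."""
    return max(step, (pixels // step) * step)


def bucket_window(width: int, height: int, step: int = 100) -> tuple[int, int]:
    """Report a coarse window size instead of the exact one."""
    return bucket_dimension(width, step), bucket_dimension(height, step)


# Two slightly different real window sizes become indistinguishable.
print(bucket_window(1366, 741))  # -> (1300, 700)
print(bucket_window(1371, 768))  # -> (1300, 700)
```

The rounding works regardless of the monitor, so it does not depend on people running at "standard" screen sizes.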
For bonus points, is there no way to strip all headers on Chrome or control it better?
I sometimes do use Safari which is a more convenient browser - it would be ironic if DDG browser is less private than Safari.
All "fingerprint" tests I've run have returned good results.
Clearly it thinks you prefer Chinese to German. Was that correlated with the frequency of your requests on Google Translate? With your browsing history? With your shopping history?
>> Instead of sending a full list of the users' preferred languages from browsers and letting sites figure out which language to use, we propose a language negotiation process in the browser, which means in addition to the Content-Language header, the site also needs to respond with a header indicating all languages it supports
Who thought that made sense? Show me the website that (1) is available in multiple languages, and also (2) can't display a list of languages to the user for manual selection.
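For reference, the browser-side negotiation described in the quoted proposal could look roughly like the sketch below. The matching rule (exact tag first, then the base language) and the function name are my assumptions; the thread does not spell out the proposal's actual algorithm.

```python
def negotiate_language(preferred: list[str], supported: list[str], default: str = "en") -> str:
    """Pick one language in the browser so the site never sees the full list.

    preferred: the user's languages, most preferred first.
    supported: the languages the site advertises in its response header.
    """
    supported_by_lower = {tag.lower(): tag for tag in supported}
    for tag in preferred:
        lowered = tag.lower()
        if lowered in supported_by_lower:
            return supported_by_lower[lowered]
        # Assumed fallback: "pt-BR" may be served by a site offering plain "pt".
        base = lowered.split("-")[0]
        if base in supported_by_lower:
            return supported_by_lower[base]
    return default


print(negotiate_language(["en-US", "en", "zh-CN", "de", "ja"], ["de", "ja"]))
# -> de
```

Under this scheme the site learns only the single chosen language, at the cost of having to advertise everything it supports.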
I used to do some work in this area. The first question is difficult and the second is no. We had the best results when we used various methods to detect the preferred language and then put up a language selector with a welcome message in that language. After they made a selection, it would stick on return visits.
Also, when using VPN, Google typically uses a language based on IP address, not my language header. I assume the header is only useful for fingerprinting today.
I live in Hyderabad, Telangana, India. I do not yet speak enough Telugu or Hindi or Urdu to be useful, and cannot read Hindi or Urdu at all; but I’m a foreigner who grew up on English only, rather rare around here, so let’s consider native Indians instead. Many can speak these languages but not read them in their native scripts, only romanised (in which case they can probably speak English tolerably). And many (many) come from other parts of India (or even Nepal) and can’t speak Telugu. Or are Muslim and at least prefer to deal in Hindi, often not having very good Telugu. And so on. It’s messy.
Some IP geolocation doesn’t even get the city right—I’ve seen Noida suggested, which is up north in Hindi territory.
Judging by... a large number of websites, you make the list available in a topbar, and each language is named in itself. You don't apply one language to the entire list.
Here's the first page that popped into my head as one that would probably offer multiple languages (and it does!):
They've got the list in a page footer instead of a header, but otherwise it's an absolutely standard language selector. It does technically identify countries rather than languages. The options range from Azərbaycan to Україна. They are -- of course -- displayed to every visitor.
Why would you want to force someone to consume your website in the wrong language?
And why would the list be in a single language, again?
> Most non-native speakers have a hard time finding the link
You might notice the colorful flag right next to it.
Why not have this negotiation implemented at the browser level?
The page defaults to the locale that you request in the URL. https://www.apple.com/ shows up in English, regardless of your country;† https://www.apple.com/bg/ shows up in Bulgarian. Switching your preferred location simply takes you to the page for that location. (Dyson does the same thing.) Some locations support more than one language; there's https://www.apple.com/lae/ for Latin America (English) and https://www.apple.com/la/ for Latin America (Spanish). If you're on the page for a location like this, a language selector (with language names) displays next to the location selector. In the case of Latin America, only two languages are supported, and the language selector automatically displays "Español" if you're on the English site and "English" if you're on the Spanish site, which makes sense but won't generalize.
Apple's selector is inconspicuous because it refuses to display flags, which I would guess is due to much higher political exposure than Dyson. So it's lower-quality in two ways, but fundamentally the same approach. The user asks for a language, and the site honors that.
Given that I presented Dyson as an example of doing language selection correctly, I'm confused about what you wanted me to see on apple.com. They're trying to do the right thing, but less effectively.
† I tested this by accessing the site(s) from Mongolia, Vietnam, and Morocco using ExpressVPN.
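The URL-driven behaviour described in this comment can be sketched as a lookup on the first path segment. The table below contains only the prefixes mentioned above; the default label and everything else about the mapping is an assumption for illustration, not Apple's actual routing.

```python
from urllib.parse import urlparse

# Only the path prefixes mentioned in the comment; (location, language) pairs.
LOCALE_PREFIXES = {
    "bg": ("Bulgaria", "Bulgarian"),
    "lae": ("Latin America", "English"),
    "la": ("Latin America", "Spanish"),
}
# Assumption: a URL with no known prefix falls back to the English default page.
DEFAULT_LOCALE = ("default", "English")


def locale_from_url(url: str) -> tuple[str, str]:
    """Choose the locale from the URL alone, ignoring IP and request headers."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if segments and segments[0] in LOCALE_PREFIXES:
        return LOCALE_PREFIXES[segments[0]]
    return DEFAULT_LOCALE


print(locale_from_url("https://www.apple.com/bg/"))  # -> ('Bulgaria', 'Bulgarian')
print(locale_from_url("https://www.apple.com/"))     # -> ('default', 'English')
```

Nothing in this lookup needs the Accept-Language header or the visitor's location, which is the point: the user asks for a language, and the site honors that.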
If this was designed and implemented as a standard at the browser level, we would get something better in the end, rather than re-implementations on each and every website.
From a pragmatic perspective, we are forcing two very different networks to run on the same protocols:
The Business Internet: Banking, SaaS, and VC-funded content (Meta/Google).
The Fun Internet: Hobby blogs, Lego fan sites, and the "GeoCities" spirit.
You cannot have a functioning "Business Internet" without identity verification. If you try to perform a transaction (or even just use a subsidized "free" tool like Gmail) while hiding behind a generic, non-unique fingerprint, you look indistinguishable from a bot or a fraudster.
Fingerprinting is often just the immune system of the commercial web trying to verify you are human.
The friction arises because we expect the "Fun Internet" to play by different rules. A Lego fan site shouldn't need to know who I am. But because we access both the Lego site and our Bank using the same browser, the same IP, and the same free tools (Chrome/Search), the "Fun Internet" becomes collateral damage of the "Business Internet's" need for security and monetization.
We can't have it both ways. We accepted the SLA for the "Business Internet" in exchange for free, billion-dollar tools. If you want 100% anonymity, you are effectively asking to use the commercial web's infrastructure without providing the identity signal it runs on.
As the OP notes, mitigation is hard. But that’s not just because advertisers are "evil"—it's because on the modern web, anonymity looks exactly like a security threat.
Yes, you can. Just like you can have a functioning grocery store without checking the identity of each shopper that walks through the door.
What you cannot have is a free and democratic society or an efficient free market without robust protections for individual privacy. Privacy is the best shield the less powerful have from being abused and exploited by the more powerful.
> We accepted the SLA for the "Business Internet" in exchange for free, billion-dollar tools.
No, we did not accept. There was no informed consent. The full consequences of our use of these services were and still are kept hidden from us. Tracking happens invisibly, without our knowledge or consent. This deprives us of the opportunity to express our true preference and opt out and choose an alternative. It's employing deception in order to subvert the consumer's ability to make a rational choice that represents their best interests.
> on the modern web, anonymity looks exactly like a security threat
An anonymous user who just uses the service normally and does not attempt to access sensitive information without authorization does not look like a security threat.
"But won't you miss XYZ?" Nope, don't care, want it gone. If you can't be bothered to go to the store and get it then it probably didn't matter very much.
There would have to be laws which require site owners to answer that question honestly, so that users have a choice and such a search engine can be built. But states are interested in fingerprinting too, so I guess such will never happen.
literally everything useful works on the business internet. Also how do you think local businesses near you operate? They don’t call each other like the 1900s lol. They order stock from distributors, some local and some overseas. Often they are doing this on the business internet. Today’s global supply chain makes this a non-starter.
It’s OK if YOU personally prefer doing everything slowly and in person and don’t value the convenience of the business internet. No judgment. But don’t pretend this would be an easy transition at all. Or that most people would prefer to live that way.
IMO it would make _way_ more sense to introduce reasonable privacy regulations that are better thought out than GDPR and have proper enforcement.
Maybe a formal “community” version of the internet would be appropriate as well.
Yes, and I really dislike traveling when I have to. I personally wish that air travel would become unaffordable for most people, including myself.
>don’t have mobility issues (ordering groceries / household essentials online)
No, but mobility issues existed before the modern internet.
>don’t participate in online banking (do you write checks? carry cash with you all the time? go to an ATM weekly?)
I do online banking, but I also write checks and use cash. I don't use Venmo or similar services. Once I can get ahead of my chores & projects I'm thinking about getting a local branch at a credit union and totally avoiding a banking app on my smart phone.
>you don’t stream movies or tv shows
Sometimes, but I'm getting away from it. Interestingly I'm getting a blank screen (but there's still audio) when attempting to stream on Linux. I haven't fully researched it, but some preliminary research suggests that it's anti-Linux blocking. (at least one user reported the problem went away -- in a repeatable fashion -- when switching their user agent to Windows) So although this is not confirmed, I'm preparing for a time when this is unavoidable and I won't stream movies or TV whatsoever at that point.
> and you enjoy looking for apartments to rent in your local newspaper listing
Don't see anything wrong with this. I would actually argue that you don't need authentication in the way discussed in this conversation for this, though -- all you need is the listing, which can be totally anonymous. The actual application for the apartment can happen in person, and that's when verification needs to occur.
>you enjoy using paper maps when traveling around the city and world.
I use an old-fashioned GPS in my car that I paid well over $100 for a number of years ago. There's no tracking whatsoever, unlike the GPS used in a smart phone.
>literally everything useful works on the business internet. >Also how do you think local businesses near you operate? They don’t call each other like the 1900s lol. They order stock from distributors, some local and some overseas
Except we did fine for most of human history before all this. And I'm sure there were no such thing as stock orders or warehouses or supply chains before the modern internet. They cropped up overnight the moment the first tracking cookie existed. /s
Also recommended to have a separate sandbox for "projects" - basically things that you do that each might require their own research, toolchain, files you create, etc. I'd highly recommend doing this in a virtual machine though - oftentimes you need to install apps to do your project work, and that presents its own attack vector. Plus if it's all in a VM you can just backup the VM and start fresh on new hardware without having to install all the dependencies, while if you're just saving random files and backing them up they probably won't work as software gets updated and dependencies get out-of-date.
That some aspects may be used to push bots away is a minor effect.
An encrypted cookie from a company such as Cloudflare encapsulates a multi-dimensional datum, such that generating one in a legitimate browser and then letting a botnet use it will get detected and blocked.
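I don't know how Cloudflare actually builds its cookies, so treat the following purely as an illustration of the binding idea. It swaps encryption for a plain HMAC signature over a few client attributes; the key, the attribute names, and the functions are all hypothetical.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-secret"  # hypothetical server-side key, never sent to clients


def _canonical(attributes: dict) -> bytes:
    """Serialize attributes deterministically so equal dicts sign identically."""
    return json.dumps(attributes, sort_keys=True, separators=(",", ":")).encode()


def issue_token(attributes: dict) -> str:
    """Sign the observed client attributes; the result is stored as the cookie."""
    return hmac.new(SECRET_KEY, _canonical(attributes), hashlib.sha256).hexdigest()


def token_matches(token: str, attributes: dict) -> bool:
    """Check that the presenting client looks like the one the token was issued to."""
    return hmac.compare_digest(token, issue_token(attributes))


browser = {"user_agent": "ExampleBrowser/1.0", "tls": "profile-a", "ip_prefix": "203.0.113"}
bot = {"user_agent": "ExampleBrowser/1.0", "tls": "profile-b", "ip_prefix": "198.51.100"}

cookie = issue_token(browser)
print(token_matches(cookie, browser))  # -> True
print(token_matches(cookie, bot))      # -> False
```

A cookie harvested from a real browser fails as soon as the replaying client differs in any signed dimension, which is the effect described above.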
I think there is still some hope that technical solutions could be developed so that only the "Business Internet" gets access to verified identity, with the user somehow understanding this, while the "Fun Internet" doesn't have such capabilities. This is what stood behind, e.g., Google's proposed WEI [1] that got such huge backlash, or Apple's Private Access Tokens [2] which are essentially the same thing but quietly slipped under the community radar.
Other proposals are Google's in-limbo Private State Tokens [3], or the various digital-wallet/age verification proposals (I think Apple and Google both have stuff in that space).
But even basic stuff, like IP protection, can really throw off the anti-fraud and anti-botnet mechanisms. Your Lego fan site wants to be behind a CDN for speed and protection from DDOS? Well, people using VPNs or in Incognito mode might end up inconvenienced, because the CDN thinks it's dealing with bots. Rough stuff.
[1]: https://en.wikipedia.org/wiki/Web_Environment_Integrity
[2]: https://developer.apple.com/news/?id=huqjyh7k
[3]: https://privacysandbox.google.com/protections/private-state-...
I don't think so.
Yes, it is being used for such purposes, but the older reason for tracking users was the hunger of the ad networks to serve ads with higher impact, and I think 'personalization' is still the big driver here.
Re insurers knowing you've been browsing heart disease etc, I have sometimes had issues like that, more you get a cheap initial price from an insurer/airline/car hire and then they jack it up when you visit again. You can sometimes do better by having a go from a different browser. I regard that more as me trying a hack to get a discounted price than a privacy nightmare but whatever I guess.
Everything is obfuscated. And this is not the situation on iOS and Android.
I am working on multiple products which use webassembly and cameras on mobile devices. It's impossible to reliably know how many workers to spin up, what's the safe memory limit and how much memory the device has, which compile-time optimized bundle to load, which camera to select for ideal focus lengths...
Especially on iOS.
And I often get customer complaints that the product is crashing. Eventually it ends up being a single (iPhone) that needs a restart to stop it from aggressively managing memory in Safari. Fingerprinting a device would solve SO MANY issues. And this is, again, possible on native apps.
In addition, I block most known advertizing/tracking domains at the DNS level (I run my own server, and use Hagezi's blacklists).
Finally, another suggestion would be to block all third party content by default using uBlock Origin and/or uMatrix. This will break a lot of websites, but automatically rules out most forms of tracking through things such as fonts hosted by Google, Adobe and others. I manually whitelist required third party domains (CDNs) for websites I frequently visit.
That's also why it's indeed useful when using Tor, because you're not identified by your base IP.
Unless we make this part of the culture, you have basically 0 recourse to browser fingerprinting except using Tor. Which can itself still be a useful fingerprint depending on the context.
EDIT: I'll add that using these tools outside of normal browsing use can be useful for obfuscating who's doing specific browsing, but it should be emphasized that using fingerprinting masking in isolation all the time is nearly as useful as not using them at all.
That's what Mullvad Browser attempts to solve, I guess:
I’d like a feature in my HN reader that sticks a red button at the bottom anytime XKCD has already made the points I’m reading.
I’m not kidding at all, that my guess is he was doing drugs and stopped.
Must be an interesting place, that originates these "arguments".
I’m not saying I agree or that I even think his takes have gotten worse, just clarifying what the other poster said.
(Bonus points for the alt-text argument being isomorphic to nothing-to-hide.)
It's fine to like the comics before around 2016, and dislike the ones afterwards, but there's nothing objective about that. Various people have put forward various thresholds for when xkcd "stopped being good", but ultimately it boils down to a combination of what TV Tropes would call "Tone Shift" and "They Changed It, Now It Sucks!".
Specifically it was at this point in 2016: https://xkcd.com/1756/
> I’m not kidding at all, that my guess is he was doing drugs and stopped.
I don’t know if he stopped or started, but something changed.
In the USA, 2016 and onwards wasn't "just an election". It was something between a mildly harmful establishment candidate or a useless new face on one side, and "holy fucking shit are we actually letting this deranged wannabe monarch run for office?!?" on the other.
Give the man a break, it was (the start of) a crazy time, I'm actually surprised more creators didn't do something like this. If anything, it was barely even a political statement, more of a "hey fellow dems, go vote!" type thing.
> block all third party content
It's not going to work, because the fingerprinting script can be (and often is) served from a first-party domain.
Also imagine if the browser didn't provide a drawing API for canvas (if you had to ship your own wasm rendering library). Canvas would become useless for fingerprinting and its usage would drop manyfold. And the browser would have less code and a smaller attack surface.
My GPU is reported as simply "Mozilla" by https://abrahamjuliot.github.io/creepjs/.
The number of cores is also set to 4 for everyone using this config and/or Tor.
> It's not going to work, because the fingerprinting script can be (and often is) served from a first-party domain.
This may be true, but allowed third party content makes it trivially easy for Google and others to follow people around the Internet through fonts delivery systems among others.
I think it is Ghostery that is faking the responses but I still have a pretty unique fingerprint according to https://coveryourtracks.eff.org/kcarter?aat=1
All that said, the main project looks to be open sourced under a GPL3 license, so distrust and verify: https://github.com/ghostery
I have had it installed so long I don't even remember when I did it.
I'll look into it more and perhaps re-evaluate.
How do prosecutors in any modern country/state not charge this behavior when done by a website owner?
Inshallah
.. would lead to all modern electronics being illegal, not just web pages with javascript.
In Europe we have the GDPR which does exactly this
Consider the swaths of dark patterns surrounding cookie terror banners. The GDPR language is extremely clear that none of them are legal, but virtually nobody is ever punished.
While the GDPR does not directly prescribe prison sentences, it absolutely enables countries to establish criminal offences for severe data protection violations, and they will clearly extradite!
https://ico.org.uk/about-the-ico/media-centre/news-and-blogs...
https://ico.org.uk/about-the-ico/media-centre/news-and-blogs...
> But ignoring that,
No don't ignore that. When you're so completely wrong about the first thing you say, everything that follows is going to be even more wrong.
> Consider ... cookie ... banners. The GDPR language is extremely clear that none of them are legal
You are confusing the ePrivacy directive (2002/58/EC) with the GDPR (2016/679).
Is it based on the Tor browser?
Some solutions, like Tor browser or GrapheneOS, are engineered for the purpose.
Some free online tools are an aggregation of ideas from social media and someone's personal understanding. These solutions can have limited benefits or be worse than the problem. Many settings don't work as expected, there are unintended consequences (such as making the browser more unique and easier to fingerprint), unusual combinations of settings can have unintended consequences or break things (Mozilla can't test every combination of about:config settings).
https://help.kagi.com/orion/privacy-and-security/preventing-...
sounds like they block "known" fingerprinting scripts and call it a day.
I love Kagi, but that is a laughable statement. Brave has been offering ad and fingerprint blocking for years now. The reason why they don't have full first party blocking ("aggressive" mode blocking) on by default is because it tends to break things.
(Also, what is a 'fingerprinter'? Isn't it something that runs server-side, out of reach of the browser, based on data collected?)
https://www.ghacks.net/2018/03/01/a-history-of-fingerprintin...
The flag was in fact designed to control the activation of the Tor browser uplift features, and reduce maintenance issues. That way the Tor browser could pretty much just be Firefox with certain flags turned on.
Firefox exposes a massive amount of identifiable information via canvas, audio device and feature detection methods. There are also active methods to detect private windows, use of the developer console and more.
- Make the client load something
- Client doesn't load it
- add.fingerprint.point(client, 'doesntloadthings', 1)
- Detect if the client does something only a certain browser does
- Client does it
- add.fingerprint.point(client, 'doesthisbrowserthing', 1)
- Window was resized/moved: send a websocket snitch to the backend
- Keep a persistent websocket open, or poll a backend API for updates on X events. More calls being made means the user is probably scrolling, so inject more things or different things.
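The steps above can be sketched as a small server-side tally. This is only a toy model of the idea; the class, the signal names, and the `observe` helper are hypothetical and not taken from any real fingerprinting library.

```python
from collections import defaultdict


class FingerprintScore:
    """Accumulates weighted behavioural signals per client."""

    def __init__(self) -> None:
        self._points = defaultdict(lambda: defaultdict(int))

    def add_point(self, client_id: str, signal: str, weight: int = 1) -> None:
        self._points[client_id][signal] += weight

    def signals(self, client_id: str) -> dict:
        """Return a plain-dict snapshot of the signals seen for this client."""
        return dict(self._points[client_id])


def observe(score: FingerprintScore, client_id: str,
            loaded_probe: bool, did_browser_quirk: bool) -> None:
    """Record what one page view revealed about the client."""
    if not loaded_probe:
        # The page asked the client to load something and it never did.
        score.add_point(client_id, "doesnt_load_things")
    if did_browser_quirk:
        # The client did something only a certain browser does.
        score.add_point(client_id, "does_this_browser_thing")


score = FingerprintScore()
observe(score, "client-1", loaded_probe=False, did_browser_quirk=True)
print(score.signals("client-1"))
# -> {'doesnt_load_things': 1, 'does_this_browser_thing': 1}
```

Note that blocking the probe is itself recorded as a signal, so not loading things still adds to the profile.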
I see some js obfuscators out there where I look at the js file and it's all mumbo jumbo.
It is indeed a privacy nightmare, where whatever we do feeds the algorithms to aid in making other people do things.
But it's also used in network security, organizations etc. Staff/employees will use the system a certain way, if something enters it without the behaviors, it's detectable. I assume that's what you mean in anti-fraud.
Sad part is we don't know what the data is ever used for, and it's often bought and sold and the cycle repeats.
Whether one breaks a lot of websites or not depends on the type of user one is. People who regularly use the Google ecosystem, Amazon and Social Media etc. cannot afford to break sites for obvious reasons, they too are those that websites are most interested in tracking and fingerprinting.
Those who use the web in the way advertisers and Big Tech intend users to use it are the most vulnerable, they're the ones who most need protection.
I break websites regularly but it doesn't worry me, I browse with the premise that there are more websites on the internet than I'll ever be able to visit and if I break sites or am blocked by paywalls then there are usually alternatives and workarounds.
But then I'm not a typical user, I block ads, I usually browse with JS off, kill cookies, use block lists, use multiple browsers (there are six on this deGoogled, rooted phone), browse from multiple machines—Windows, Linux and use multiple ISPs. Also, I've no Social media or Google accounts and rarely ever purchase stuff online. Internet access is via dynamic IP addresses and routers are rebooted often. There's more but you get the picture.
I assume browsing sans JS makes me a first-class target for fingerprinting and that websites know about me but it doesn't matter. Whatever I'm doing seems to work, over the years I've had very little trouble doing everything on the web that I want to do. Clearly I'm of little interest to advertisers and I never see ads let alone targeted ones. I used to use uBlock Origin but I don't bother now as browsing sans JS is just so effective at blocking ads.
I'm lucky in the fact that I use no service that would benefit from fingerprinting me. Whilst my web browsing is atypical of most users I reckon many could benefit by being more proactive—using multiple machines, browsers, ISPs etc.—to disrupt the outflow of personal data. For example, this is being written on a rooted Android using Privacy Browser from F-Droid sans JS and with block lists. If I really need to go to a site where JS is required, I can simply hit a toggle and turn on JS or alternatively use another browser.
No. It's LARP. You either don't care or go with Tor Browser and/or commercial antidetect browsers.
But you shouldn't care, this issue of fingerprinting is overblown. (really reminds me of AI)
In theory you could use Tor browser with Tor stripped (I heard this is what mullvad browser is?) or go tor-then-proxy (this is what I often do, because I sometimes use whonix at work). I don't know about libgen or Anna's archive, I don't use them.
So what do we care about? If you care about being untrackable, then you have a couple of options, rotate VPNs, or cycle your public facing IP often. Additionally, every request you make MUST change up the request headers. You could cycle between 50 different sets of headers. Combine these two and you will likely be very hard to fingerprint.
If you only care about not being identified, use Tor + the Tor browser which makes A LOT of traffic look identical.
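A minimal sketch of the header rotation suggested above, assuming a small pool of made-up profiles. The User-Agent strings are placeholders, not real browser strings, and the function name is hypothetical.

```python
import random

# Hypothetical, internally consistent profiles; a real pool would be much larger
# (the comment above suggests around 50) and would need to be kept current.
HEADER_PROFILES = [
    {"User-Agent": "ExampleBrowser/1.0 (Windows)", "Accept-Language": "en-US,en;q=0.9"},
    {"User-Agent": "ExampleBrowser/1.0 (Macintosh)", "Accept-Language": "en-GB,en;q=0.9"},
    {"User-Agent": "ExampleBrowser/2.0 (Linux)", "Accept-Language": "de-DE,de;q=0.9,en;q=0.8"},
]


def pick_headers(rng=random) -> dict:
    """Return a copy of one whole profile, chosen at random for this request."""
    return dict(rng.choice(HEADER_PROFILES))


headers = pick_headers()
print(headers["User-Agent"])
```

The sketch rotates whole profiles rather than individual header values because, as noted elsewhere in this thread, signals that contradict each other are themselves very distinctive. Headers are also only one input; anything collected by scripts in the page is untouched by this.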
I have a gut feeling that we've been tricked (by ad companies) into thinking that this is somehow realistic and that casual "content creators" can get meaningful money from us reading their articles.
Realistically, while professional content creators can make a living, writing a blog post every once in a while will not provide meaningful income. Instead of trying to "monetize" everything, we would be better off with free content like on the internet of old. There are other means of making money.
It seems that the current situation means that the "content creators" earn insignificant money, while ad companies earn huge money because of scale, and we all somehow keep believing that this is necessary for content to appear.
Should people receive meaningful income for writing a blog post every once in a while?
I feel like that's the real question and not everyone agrees on the answer
> we would be better off with free content like on the internet of old
Well as someone who was there you used to need meaningful income to use the Internet of old. Nowadays everyone needs the Internet and it's a pretty big expense in most peoples' budget, and I think that's why so many people are willing to try something at it.
I figure if you just gave everyone meaningful income we could have that again
Nor, generally, should it. Sitting down one or two Saturday afternoons a month to write a blog post shouldn't be generating the income of a FTE.
What if it could? Or should (be able to produce FTE or close income)?
In that world, the amount of pointless shite - questing to “go viral” - would be reduced to near zero. That is, if the incentive were more quality, and less quantity, we’d be better off, yes?
Metrics are hard. Just making sure they reward one particular desired outcome doesn't mean you'll escape the unintended consequences.
Also, note that we are past the point of being able to reasonably manage any of this. Today, you'd need to come up with a reward function that cannot be maximized by AI. (And lest you think you can fix that by using site visitors to evaluate, most of them will be bots too.)
... but that's also not, nor should it be the median. I'm not sure how the economy functions if, say 8h/mo effort generates a median living wage.
It's very simple, it's what they've been doing in print media for centuries: contextual advertising.
Print media was also trying to guarantee their audience was an actual person by charging nominal fees; the difference was how much info was required to do so.
I'd venture to say contextual advertising would be more effective than whatever we've been trying to squeeze out of fingerprinting etc. All this supposed "data" they are gathering feels like a scam perpetuated by ad companies about how important it is to the people who buy ads. It's not.
Even Facebook and Instagram, which pretty much should know you to a tee, are completely ineffectual at advertising to me - like at all.
Later on in life I got pissed at cable-TV advertisers shoved into my favorite movies every 5-10 minutes ... ruining any ambience or artistic merit in them ... so I got rid of cable TV. By the time analog TV went away, I'd got rid of my television set. No return address on an envelope? junk mail, into the garbage unopened.
Now the pollution's ruined the 'net ... it's YouTube (re-routed) and some websites (blocked). So long, boing-boing and wired and your 'native ads'. Sites demand subscription? blocked. How much longer before advertisers realize how much they're getting ripped off?
Odd. In the midst of a (well-deserved) anti-ad rant, you throw in the primary non-ad alternative and discard it.
> How much longer before advertisers realize how much they're getting ripped off?
A while longer, if the same people who reject ads are also the people who reject alternatives to ads. The advertisers can safely ignore those people's opinions.
(I'm not saying subscriptions are the answer. I don't have an answer. I'm just saying that companies wanting subscription money is not part of the problem where companies want to shove ads in our faces 24/7.)
Targeted ads concentrate control over the market into a few players, which can do things like acquire competitors or run them out of business with loss leaders.
With AI, the supply of ad real estate will go to infinity, so the only thing that will matter is the quality of the places the ads run.
This would be a good time to ban targeted advertising, or for the content producers to form a cartel that only purchases contextual ads.
That cartel will probably be even worse than what we have now, since it’s going to be 2-3 mega conglomerates like Disney, and they already have handed editorial control over to the White House.
Hopefully the invisible hand of capitalism will somehow fix this.
How does tracking me and invading my privacy make ads perform better? In my case it does not: the tracked ads are usually worse, as they keep advertising things I no longer need. Context-based ads worked fine in the past and I don't really see why they can't now.
Also why does every web store need to show me ads? Don't they make money out of selling things? If they really have to, do they have to invade privacy? This is like walking into a physical store and them doing facial recognition, then showing you tailored ads/inventory. That feels creepy to me.
If you don’t want to be tracked, you shouldn’t be, but how could it not? At a very simple level, an ad targeted towards a 50 year old woman isn’t going to be the same ad to show a 14 year old boy. Different people like different things and ads targeting you as an advertising profile are going to be better than ones that aren’t. You may not like the targeting and think it's invasive, because it is, but let's not pretend the tracking doesn't do something.
BTW, targeted ads need to be 100% to 700% more efficient than regular ads to be as profitable: https://www.sciencedirect.com/science/article/pii/S016781162...
Would you pay per view? Most people (me included) would probably hesitate to say yes, because we’re used to not paying for that. But what if it meant that ad based model is gone and everything you buy is cheaper because the price does not include the cost of running ads?
Except in practice we see the opposite.
There's something interesting going on with companies when they want to get paid directly versus by ads: they demand 3x - 4x or more for subscriptions or pay per view versus what they make from ads.
Easiest place to see this is ad-supported non-linear TV in the years when you could get it with or without ads. You pay significantly more to not see the ads than they make from the ads.
Perhaps this is justified because ad-free subscriptions reduce the audience size for ad buys, but when you look at the numbers watching with ads versus paying, it wouldn't seem like the "no ads" buyers make a dent in whatever pricing tier.
In the 90s when we were young and naive, we imagined a library card model, with a library fee and then you have fractions of a cent cost to read a post, and using (hand waving) technology to uncouple viewing history from payables to content creators. That, or the British TV license model, an Internet license of some kind.
It's curious to me the ad networks haven't gotten together to preemptively offer this. Arguably Brave tried, but from an adversarial (to the ad companies) stance. It would work better from the inside with a simple regulation: if you serve ads for ad-supported content, you have to participate in the library card system at CPM rates no greater than you receive for ads to skip the ads for card holders.
The only companies that we directly allow to do this are schools, but having a premium version lets you approximate this.
it takes a lot of $0.10-$0.25 views to make up for the loss of a $5/month recurring revenue stream that might last for years.
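As a rough back-of-the-envelope check of those numbers (a sketch only: the function name is mine, and it ignores payment fees, churn, and how long a subscriber sticks around), the break-even view count per month works out like this:

```python
def views_to_match(subscription_cents: int, price_per_view_cents: int) -> int:
    """Smallest number of paid views per month that earns at least the subscription price.

    Amounts are in whole cents to avoid floating-point rounding surprises.
    """
    # Ceiling division using integers only.
    return -(-subscription_cents // price_per_view_cents)


# $5/month subscription versus per-view prices of $0.25 and $0.10.
print(views_to_match(500, 25))  # 20
print(views_to_match(500, 10))  # 50
```

So a single reader would need to pay for 20 to 50 articles every month, for as long as the subscription would have lasted, just to break even.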
The problem is skewed incentives, of course. Advertising is acceptable to most users and easy to integrate, so why should website authors go out of their way to please a minority of their users who object to it?
you're describing the model of a product called blendle, a service which i loved but which totally failed. they failed to attract users, and they failed to attract publishers. this isn't some new idea that nobody had tried. it's been done. and it failed, not just for blendle. people have tried micropayments, they've tried subscriptions, if you can imagine a PPV model, it's probably been tried. readers and publishers both hate it.
Advertising is ubiquitous on the web. Integrating it into web sites is simple, it works well for generating revenue at scale, and users have been conditioned by every other media industry to access content for "free". There is practically no friction for users, save for the degraded user experience, which most people have learned to live with or ignore.
So right off the bat, anyone trying to deploy alternative business models is going against the current of a trillion-dollar industry, and well established consumer expectations.
> readers and publishers both hate it.
Why do you think that is? Is it because the micropayment model is inherently bad, or because implementing it is difficult for website owners, it is annoying to use for users, and ultimately brings little revenue?
What if implementing it were as easy and convenient as advertising is today? What if users had an easy and convenient way to link their payment method into their browser, and from then on it required no maintenance? What if they understood that the web is not "free", but someone on the other end should be paid for their work if they find it valuable? What if this model actually generated significant revenue for publishers? What if all this was simply the way the web operated from the start?
Clearly this model hinges on a bunch of hypotheticals, but hopefully you get the point. There's nothing fundamentally wrong about users paying for consuming content. This is the way business transactions work in most respectable industries. You want something, you pay for it directly. You don't ask a third party to step in between you and the seller, to show you manipulative content that directly benefits them and their associates, while indirectly paying for the thing you actually want to buy. The fact we've accepted this corrupt business model as normal in many facets of life is absolutely insane. Never mind the fact that it's being used to manipulate us into thinking and acting in ways which corrupt democratic processes and cause sociopolitical instability, or that it's abusing our right to privacy and exploiting our data. To hell with all of that.
The PPV model, like Ads, works well for websites that you're not well associated with. Random blogs and websites that you otherwise wouldn't be willing to share your credit card info with.
I can't speak for all web sites, but I reckon a combination of factors could explain why such a solution hasn't been deployed:
1. Advertising is ubiquitous, easy to integrate, and provides a safe revenue stream.
2. There is little to no infrastructure for the PPV model. Whoever builds it would need to maintain their own version of it.
3. People expect the web to be "free". This is even true within technical crowds who understand that it's really not free. And a large part of that group doesn't mind advertising.
So, really, it would require a substantial amount of effort to implement, it would add additional friction to users, and ultimately only a minority would appreciate it.
Had this model been in place from the beginning of the web, things might be different today. Alas, if my grandma had wheels...
A critical mass of publishers would need to team up and form a cooperative/etc where a user could register once, deposit some money, and then that money would be spent every time they view an article. But that requires cooperation between competitors, which is already hard enough, and the cancer that is the advertising industry wouldn't like this potential existential threat and would be more than happy to pour fuel onto the fire to ensure it never succeeds.
What's surprising is why the card networks themselves don't get in on it. They could do so in a completely backwards-compatible manner, introducing a new card number range that only works with transactions under a certain amount and have different fraud protection/chargeback rules.
i don't believe NYT has ever tried a pay-per-view model.
Spotify's model is more that your monthly amount gets disproportionately redistributed to the artists that bring more interest and listens to Spotify, regardless of whether you were one of those listeners. Smaller and niche artists suffer under Spotify's model.
That's what people are bemoaning the loss of: the before times, when people did interesting stuff without regard for whether it could be monetized or not.
Yes, but only after viewing, or else I'd pay for "editorial" or AI generated slop which would be generated like link farms pointing to Amazon etc.
And that's the chicken-and-egg problem ...
In theory that could be resolved by registering for free at reputable sites and then paying per view with micropayments. Or by a scheme where I would register and only pay when I actually read something, instead of the currently en vogue monthly fee for each and every site.
It is a shame that this feature gets lumped together with claims of crypto scams, and similar nonsense. Yet this is precisely the right model that could work at scale to eliminate the advertising middleman, and make the web a safer and more enjoyable experience for everyone.
From my perspective I couldn't care less if one bad guy is stealing from another bad guy.
Why would you assume ads are worth $5 a month? It's more like paying 10 cents to read the blog.
I wrote a longer post on this[0] but to save you the click I will state the biggest problem from a privacy point of view - if you think privacy is bad now with ads imagine how much worse it would be with a payment processor knowing your every click.
Yes, I know about certain cryptocurrencies that maintain privacy, they are a non-starter for micropayments for different reasons.
Even if a magical technical solution to privacy were to emerge, there is nothing more valuable than information about paying customers, and sites would use browser fingerprinting anyway.
The article you linked assumes a "conventional" credit card system with chargebacks, massive fees, etc., which makes a micropayments ecosystem impractical in the first place. Proposals for micropayment systems usually describe a way to enable low-fee payments.
The author doesn't take into account modern cryptocurrency tech like payment channels. I really doubt that payments have a natural fixed floor of 10s of cents - Payment providers charge these fees simply because they are in a natural monopoly position, thanks to lock-in and regulation. The need to control fraud is caused by regulatory requirements, which are in turn caused by monopolization.
Despite being technologically less efficient, even traditional cryptocurrency payments are cheaper than bank transfer fees due to competition and low regulation.
Secondly, you assume that no one wants to do micropayments. The infrastructure doesn't exist for it yet. If you don't build it, they will not come.
As for browser fingerprinting, it can be solved on the client side with enough effort. Look at Tor Browser. Just have a system where cookies, WebGL, etc. are opt-in at the browser level in the same way that WebUSB is. Artificially limit the performance of JavaScript to prevent benchmarking. I think it is possible to solve this architecturally.
Check it out!
https://en.bitcoin.it/wiki/Payment_channels
https://lightning.network/lightning-network-paper.pdf
Also, there are GNU Taler/Chaumian cash type systems that inherit the efficiency of centralized systems with an added privacy benefit.
That “if” is doing a lot of heavy lifting there.
But my point is that even if a magical technical solution existed tomorrow then the same sites that collect data for ads would continue to do so for the much more valuable data on paying users.
There have been a number of proposals, I think the oldest is DLSAG: https://eprint.iacr.org/2019/595.pdf There are other ones based on time-lock puzzles, but those have always been kinda crappy.
It may be possible with some ZK magic I'm unfamiliar with. But the core of the problem is that we need to find a way to make a transaction valid but only after a certain block height, and make it so that validators can't learn any specific heuristics about the transaction (like what the block height is exactly).
>But my point is that even if a magical technical solution existed tomorrow then the same sites that collect data for ads would continue to do so for the much more valuable data on paying users.
Sure, but after the micropayments revolution there will also be a change in the types of sites people use, enabled by the new form of monetization. You could rely more on people posting things like videos to their personal blogs and interlinking them instead of having to shack up with one of the few sites large enough to support ad-funded monetization. The internet would have a basic spam-resistance function, so it would be less reliant on the existing players to gatekeep (for example, email, forum moderators, etc).
I think it would be more competitive. Let's say you have a site like twitter that says "now that there are micropayments, we will charge you 1 cent per pageview AND force you to login and collect your data", well then you will have a competitor like xcancel.com which can charge 2 cents per pageview and not require login. The market would decide what the best model is. Right now proxy sites like xcancel have to do it for free. Even if they wanted to run ads, the ad market isn't competitive in the same sense because it is more profitable for larger players.
I think you mention in your blogpost that no one would want to support micropayments because of piracy. I consider this a massive advantage of the micropayment system. It's pro-piracy by default. If you look at the origins of ad-funded sites like youtube, they started out as hubs of (light) piracy. The content of social media sites should be pirated and mirrored: they are just getting rich off of network effects in the first place. If you combine micropayments with some sort of bittorrent-like system, this could be very powerful. Imagine a decentralized archive site, where you take advantage of TLS to archive a verifiably timestamped version of a page, and anyone else can send you money that is conditional on you providing them a copy of that archive in return.
Micropayments don't fund the development of new intellectual work, but they let you recoup the cost of bandwidth. He who does not host, also does not earn. If you want to fund the development of new work, I think you need patronage. We are already seeing this with a lot of videographers from youtube depending mostly on sites like patreon and donations from dedicated fans. In a micropayments world, you wouldn't have sites like patreon taking a cut. Aside from just having ~0.1c micropayments-per-pageview, you could have very easy p2p "mini-payments" on the order of ~$1 in exchange for donation rewards.
With less money in the annoying ads economy, google and others would have less power to alter the web standards to their whim, and we could claw back features that enable fingerprinting. I don't know, that is just my dream.
Sending emails costs $0.50.
Also wonder if it will really work out: I open too many articles that turn out to be pretty bad once you start reading them, so I quit after 1 or 2 paragraphs.
Now if you get the first 2 paragraphs for free, content writers will start to optimize for good first 2 paragraphs, and afterwards quality will drop. Also, many blog posts or news articles don't have more than 2 paragraphs of good content.
But yes, I always thought some form of network syndication would emerge on the Web, where creators could register for their share of aggregated periodic payments made by users.
Still not sure why that's not a thing. I would pay $50/month to a syndicate in return for never having to deal with paywalls on any sites affiliated with them. But only as long as the vast majority of sites participated, and that is probably the showstopper, I guess. We'd end up paying 20 different 'syndicates' for absolutely no good reason, just as we now have to deal with 20 different streaming services.
One option: a fund where you buy tokens, that you can spend reading an article. That will, however, lead to more clickbait and AI slop and snowing under serious blogs with low volume.
Aggregation of tips and payouts would help, but that requires network effects (achievable only at scale) to be viable. I believe this approach has been tried in recent years, but I am not sure where those efforts went.
The point of paying creators is so that they can focus on creating content instead of making other things. Giving money to a creator is basically saying "you're so good at what you do, and it has so much cultural/intellectual value, I'd rather have you make content instead of stocking shelves or making food". But this should be reserved for people that publish good content because they can and are passionate about it, not just anyone putting out slop with the instrumental goal of paying their bills. If the friction of clicking a button and filling in payment details is enough to deter people from paying them, then maybe their content isn't worth paying for and they should find some other way to make a living instead.
This is false: We're the ones who pay the creator, because:
> I'm not going to pay $5/month for every blog that I occasionally read
If that upsets you, please understand it upsets me too, because
> but at the unacceptable price of privacy
I want you to consider a different toothbrush brand, or maybe a hot location for a holiday, and the idea that I am "invading" your privacy in trying to do this is disconcerting.
I understand there are actors who want to use your private personalising data to harm you. I think that is bad, but I am telling you friend, that isn't me.
> I'm not quite sure what the answer is.
Listen, as an insider I am not quite sure what the answer is either, but I'm telling you that content creators need to eat because you have threatened them with capitalism which murders you if you don't participate, and I am the one feeding them and not you.
I think though, it probably takes the form of better laws that prevent people from using personalised data to harm you without public (judicial) review, and I think that is going to require people like you thinking of the outcome that you want, instead of foolishly trying the impossible to conserve your personal privacy.
With like 12 students, that's 4 bits, and it often ends up with 2-3 questions. It starts off with the obvious ones - man/woman/diverse, but then a realization comes in: An answer usually contains more information than just that one bit. If you have long hair, you're most likely a woman and/or a metalhead for example. That part will get shaken out later on.
And those thoughts make these browser fingerprinting techniques all the more scary: They contain a lot of information and that quickly cuts the possible number of people down. Like, I'm a Linux Firefox user with a screen on the left. I wouldn't be surprised if that put me in a 5-6 digit bucket of people already.
That means there is less information in the question "do they have long hair?", not more. Asking "long hair?" and then "woman?" is probably, in most groups, roughly the same as just the first or second question alone. So the second question added much less than one bit of information because the answer is probably "yes". "Long hair" and then "metalhead" is the same, except that the answer to the second question is probably "no".
Yes/no questions on average contain the most information each when they partition the remaining possibilities 50:50. Then each answer gives you exactly one more bit. The closer you get to either a 100:0 or 0:100 yes:no split, the smaller the fraction of a bit you encode in the answer.
"Metalhead?" usually gives you lots of bits of information (probably 4 in an "average" group of 16 containing at least one metalhead) if the answer is "yes", but on average that's outweighed by the very high chance that the answer will be "no". If there are no metalheads or only metalheads, it gives you zero information.
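To make that concrete, here is a small Python sketch of the standard Shannon formulas behind this. The function names are mine, and it assumes exactly one metalhead in a group of 16, so p(yes) = 1/16:

```python
import math


def surprisal_bits(p: float) -> float:
    """Bits of information gained when an outcome of probability p actually occurs."""
    return -math.log2(p)


def expected_bits(p_yes: float) -> float:
    """Average bits gained from a yes/no question answered 'yes' with probability p_yes."""
    if p_yes in (0.0, 1.0):
        return 0.0  # The answer is already certain, so asking teaches you nothing.
    p_no = 1.0 - p_yes
    return p_yes * surprisal_bits(p_yes) + p_no * surprisal_bits(p_no)


print(surprisal_bits(1 / 16))  # 4.0 bits if the answer is "yes"
print(expected_bits(1 / 16))   # roughly 0.34 bits on average
print(expected_bits(0.5))      # 1.0 bit for a perfect 50:50 split
```

A "yes" is worth 4 bits, but it only happens one time in 16, and the far more likely "no" is worth less than a tenth of a bit, which is why the average lands at about a third of a bit.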
In this case, it was often an interesting exercise in bias as well. "Woman?" would usually single out 1-2 persons out of the 15, so it was a terrible question. It was CompSci after all. "Long hair?", lumping women and metal heads into one group would often split it into half and half. That was much better, and then spurred creative thoughts like travel distance, or bus stations.
Isn't the point to ask yes or no questions?
You can think of all sorts of questions and answers like this, and when you combine them with the assumptions and answers from previous questions you can make even more assumptions. They won't always be correct, but you don't have to be "perfect", depending on your use-case. For example, for advertising purposes, assumptions (even if incorrect) can still go a long way.
There is a reason Target got sooo good at identifying pregnant women[0] before the women knew they were pregnant that they creeped out women, and had to pull back what they did with that information. This was like a decade or more ago. It's only gotten more accurate since then.
0: one example from 2012: https://techland.time.com/2012/02/17/how-target-knew-a-high-...
https://www.predictiveanalyticsworld.com/machinelearningtime...
Why would they do that, if they didn't think their system was that good?
Target isn’t going to do something that scares away consumers, like say “our ad tracking is TOO good”, unless there’s another benefit that makes it net positive for them.
That's why I pay with cash and do not have a loyalty card (other customers often offer theirs at cash register anyway). And of course I don't even go to Target.
The goal of these decision trees is to ask as few questions as possible, each one dividing the remaining group into two balanced halves (and so on recursively).
Imagine a binary tree with a question in each internal node and a person in each leaf. You want the height of the tree to be minimized.
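A quick sketch of what the minimum tree height looks like, assuming every question could split the remaining group perfectly in half (which real questions rarely manage). The function name is mine:

```python
def questions_needed(group_size: int) -> int:
    """Height of a perfectly balanced yes/no tree with one person per leaf.

    Equivalent to ceil(log2(group_size)), computed with integers only.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return (group_size - 1).bit_length()


print(questions_needed(12))             # 4
print(questions_needed(16))             # 4
print(questions_needed(8_000_000_000))  # 33
```

So 4 well-chosen questions are enough for a class of 12, and about 33 perfectly balanced, independent bits would in principle be enough to single out one person among 8 billion.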
It is indeed not possible for it to give more, because it only has a single bit answer, which by the pigeonhole principle can't give you more than one bit.
The best yes/no questions are the ones which are independent of each other and bisect the group evenly. "Are you female" is typically good because it will be approximately half the population. Then you want independent questions that bisect the population again, like "does your first name have more than the median number of letters" which should be mostly independent of the first question. Another good one is conditional questions like "are you taller than the median for your sex" since a pure height question wouldn't be independent of sex but that one is.
Whereas bad questions would be ones with highly disproportionate responses, like "do you have pink hair with black and green highlights" which might be true for someone somewhere but is going to have >99% of people answering no, or "were you born on the planet Mercury" which will be 100% no and provide zero bits of information.
When I was young I used to think of him as that eccentric pedantic MIT guy but now I see him as a true warrior for freedom.
Imagine if you said: I'm going to undermine facebook by building another social network which will be Free software, and will be compatible with facebook. I'll federate facebook whether they like it or not, and I'll do that by reverse engineering how facebook servers talk to each other. That wouldn't work because it takes you huge effort to pull off, and it takes facebook zero effort to change the interface in a tiny way that breaks everything for you. (Ok the analogy isn't perfect, but hopefully you get the idea of diminishing something's value by forcefully opening it up)
But he hugely contributed to win a battle like this in the late 80s, then Linus Torvalds came in and finished the job in 1991 or so. RMS doesn't get the credit or even appreciation he deserves. I think he's one of the most tragic figures in the history of computers.
It really isn't, because there's plenty of fingerprinting scripts that run on the same domain, especially fingerprinters from security providers like cloudflare or akamai.
Let's see what he says on the subject.
And neither does the page on LibreJS, which is the tool he created to attempt to address the problem[1]
> Did you? The whole thing is about how JavaScript allows running nonfree (that means not GPL to him) software on your computer without realising it.
No, Free is not the same as GPL you ignoramus.
I do block ads on the web with uBlock Origin because there’s no pay option to opt-out of it and ads ruin the experience. But I don’t give a fig about tracking. Change my mind. Why would the average person enjoy a better life if they became untrackable on the Web?
However, just because the average person doesn't notice and doesn't care, it doesn't mean that their life can't be ruined at some point because of these things. You never know when you're suddenly going to be targeted for something you may or may not have done.
I think it’s not possible for me to say if my life is really better, because it’s the whole road not taken thing. It’s not possible to know and so it’s not worth agonising over, but I’m choosing to live according to my values at least, and that seems valuable.
Though, from what I understand, overall the fingerprinting success rate is only about 30%.
This reminds me of the anti-vax logic a tad in that they lack the imagination on the seemingly obvious effects of their ideal world.
Being indifferent to companies and political parties (which become your Gov't when voted into power) indirectly states that you are indifferent to others' attempts to influence you and/or foolhardy enough to believe that all of your beliefs consistently originate from objective personal experience.
Another way is the security and peace of mind it gives me while living in a country that has a behemoth population of bad actors online. Everyone I know has fallen for at least one targeted cyber-scam or another. I haven't.
I'm 100% in the need for personal privacy camp, but mention this only because without addressing the underlying issues, it's hard to come up with larger solutions.
And the big issues really come down to fraud and cyber attacks:
- Years ago, the NYTimes was found to be doing some kind of homegrown fingerprinting with canvas. They have plenty of ways of doing analytics and tracking subscribers, they were trying to root out ad fraud.
- When Spotify has people every day putting up AI-generated streams and attempting to "listen" to them with bot networks, they start to look at things like fingerprinting.
- Massive credential stuffing attacks on sites are often thwarted at the technical level by fingerprinting.
- Bot traffic (and in particular AI indexing bots not respecting robots.txt) has shot up dramatically in the last year, and fingerprinting is one of the strongest ways of bot identification.
Again, want the personal privacy, but think we need to fix the professional problems to get there.
> By following users over time, as their fingerprints changed, they could guess when a fingerprint was an ‘upgraded’ version of a previously observed browser’s fingerprint, with 99.1% of guesses correct.
https://coveryourtracks.eff.org/static/browser-uniqueness.pd...
Technically for fonts, there’s no API for listing installed fonts, so trackers have to check each font by name. Likely they won’t be checking super obscure font names.
That method might help for other signals though.
Services with a large enough fingerprinting database can filter out implausible values and flag you as faking your fingerprint, which is itself fingerprintable.
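A toy Python sketch of that kind of plausibility check, to show the idea. Everything here is hypothetical: the font lists are a tiny illustrative sample I picked, not anyone's real ruleset, and a real service would compare against a large database of observed fingerprints rather than a hand-written table:

```python
# Hypothetical, illustrative hints only: a few fonts commonly shipped with each platform.
PLATFORM_FONT_HINTS: dict[str, set[str]] = {
    "Windows": {"Segoe UI", "Calibri"},
    "macOS": {"Helvetica Neue", "Menlo"},
    "Linux": {"DejaVu Sans", "Liberation Sans"},
}


def looks_spoofed(claimed_platform: str, detected_fonts: set[str]) -> bool:
    """Flag a fingerprint whose detected fonts contradict the platform it claims.

    Returns True when none of the claimed platform's typical fonts are present
    but fonts typical of some other platform are.
    """
    own_fonts = PLATFORM_FONT_HINTS.get(claimed_platform, set())
    has_own = bool(own_fonts & detected_fonts)
    has_other = any(
        fonts & detected_fonts
        for platform, fonts in PLATFORM_FONT_HINTS.items()
        if platform != claimed_platform
    )
    return not has_own and has_other


# A user agent claiming Windows while only Linux-typical fonts are detectable.
print(looks_spoofed("Windows", {"DejaVu Sans", "Liberation Sans"}))  # True
print(looks_spoofed("Linux", {"DejaVu Sans"}))                       # False
```

The "spoofed" flag is itself one more bit attached to you, which is the whole problem with faking values that don't hang together.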
The point is that a sufficiently motivated actor could use a very broad array of tactics, some automated and some manual, to identify, observe, track, and/or locate a target. Maybe they can’t pin you down with your browser fingerprints because you’ve been smart enough to use tools that obfuscate it, but that’s not happening in a vacuum. Correlating one otherwise useless datapoint that happens to persist long enough to tie things together at even low-ish confidence is still a hugely worthwhile sieve with which to filter people out of the possibility pool.
The problem isn’t that it doesn’t affect most average people, or that it’s terribly imprecise. The problem is that it’s even a little effective, while being nearly impossible to completely avoid. It’s also a problem if that’s used by a malicious state actor against a journalist, to pick a rather obvious example. Because even in isolation, this kind of violation of civil liberties necessarily impacts all of society.
The public should be given more information and control, broadly speaking, for when they are asked to trade their rights for convenience, security, and/or commerce. In particular, I think the United States has allowed bad faith arguments against regulatory actions and basic consumer rights so corporate lobbyists can steamroll any chance of even baseline protections. It would behoove all of us to be more distrustful of companies and moneyed interests, while being more engaged with, and demanding of, our governments.
The ironic thing is that because of GDPR and CCPA, ad tech companies got really good at "anonymizing" your data. So even if you were to somehow not have an alias linking your various anonymous profiles, you will still end up quickly bucketed into a persona (and multiple audiences) that resemble you quite well. And it's not multiple days of data we're talking about (although it could be), it's minutes, and in the case of contextual multi-armed bandits, your persona is likely updated "within" a single page load and you are targeted in ~5ms within the request/response lifecycle of that page load.
The good news is that most data platforms don't keep data around for more than 90 days because then they are automatically compliant with "right to be forgotten" without having to service requests for removal of personal data.
Basically they are used as spy-tools. Many anti-features are pushed into them - a simple example is the disable-right-click functionality. I do understand that some of these have useful functionality (for instance during an exam on-campus-site, to restrict what the students may do), but I always hate that I need e. g. a browser extension just to disable this antifeature. That's a super-simple example; there are many more severe examples such as fingerprint spy-sniffing here.
This is a part of a W3C browser spec and every web browser has to implement it. But you're right that people writing the spec work for companies selling web browsers.
I am concerned about the detail here: does this mean per hardware class (e.g. same model of GPU), or per each individual device?
Is the implication that there are certain graphical operations that - perhaps unintentionally - end up becoming akin to a physically unclonable function in hardware?
Per combination of hardware (GPU, display resolution) and software (exact drivers).
We've made our world a scary place.
This feels like a regulatory question, not a technical one. We've repeatedly proven that with math and code alone, we can fingerprint and identify almost every unique person on the planet, given enough data points. The long-term solution seems like it should be severe consequences for data breaches (as in, corporation-destroying penalties for disclosure of PII, including fingerprint data) such that everyone only collects the data they need to provide the service in question and not a single bit more, deleting it as soon as it's no longer necessary. Right now there's no consequence if Google or Meta disclose huge swaths of user data, and thus no disincentive to collecting as much as they possibly can.
Punish the leaking of data, and suddenly you've raised its cost to the point that casual players will nope out entirely. From there, it's the eternal back and forth of governments waffling between business and electorate interests.
I'm very skeptical of this claim, especially in practice. Contrary to what many fingerprinting sites claim ("you're unique among everyone we fingerprinted!!"), browser fingerprinting can't possibly uniquely identify someone. Smartphones are pretty locked down and there are very few customization options that allow for fingerprinting. In the US, Apple has around 50% market share, and there are 30 iPhone models that are still in support. That means if you're an iPhone user in a city of 1 million, there are, on average, approximately 16.6k (500k / 30) other people with the same exact model of iPhone (and therefore fingerprint) as you. As long as you don't do anything to stick out (e.g. living in the US but setting Denmark as your locale), you'll be reasonably anonymous.
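A quick sanity check of that arithmetic, as a rough sketch. It assumes users are spread evenly across the 30 models (which real sales figures certainly are not), and the function names are made up for illustration. The second function converts the result into the "bits of identifying information" measure that fingerprinting discussions often use.

```python
import math

def anonymity_set(population, market_share, model_count):
    """Average number of people sharing one device model.

    Assumes an even split across models, which is a simplification.
    """
    return population * market_share / model_count

def identifying_bits(population, set_size):
    """Bits of information revealed by knowing someone is in a set of this size."""
    return math.log2(population / set_size)

# Figures from the comment above: city of 1 million, ~50% iPhone share, 30 models.
same_model = anonymity_set(1_000_000, 0.5, 30)
print(round(same_model))                                   # people sharing your model
print(round(identifying_bits(1_000_000, same_model), 2))   # bits the model alone reveals
```

On these assumptions the device model alone narrows a million people down to a crowd of several thousand, so any further signal (locale, fonts, language list) has to do the rest of the work.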
Often had the same thought, if not the same opinion. On the other hand, stiffer penalties have the trade-off of incentivizing cover-ups, i.e. disincentivizing honest disclosure.
The best browser for protection is https://mullvad.net/en/browser because it makes the connection uniform, to better blend in.
I guess that really depends on how you classify "best"
Tor is pretty good for protection. Then there's always i2P as well…
Saying one browser can protect the best is pretty hard to prove.
I wouldn't say Tor Browser is the best because it requires custom configuration to be usable conveniently, which will make the connection non-uniform (and the user will stand out).
>Tor is pretty good for protection. Then there's always i2P as well…
Tor and i2P do nothing for (anti)fingerprinting; the program which renders the web pages does.
>Saying one browser can protect the best is pretty hard to prove.
Not a proof but things to consider: https://privacytests.org/
Among all the available browsers, Mullvad Browser/Tor Browser is the best we have in terms of fingerprinting resistance.
It's been obvious for a decade and a half that technical solutions won't be practical to implement.
I don't understand how temporary containers are still not a built-in Firefox feature, it seems like such a no-brainer solution for privacy.
If you're on a VPN and using Firefox containers, is the only way to identify me to look at my mouse movement and correlate it?
I know one particular online car store that shares user data with insurance companies, and they use that in their models to compute a "willingness" to pay more for insurance as well as to establish a user profile. Let's say you look at a sports car but you end up buying a family van; they charge you more for that.
The very interesting part is that they create a "customer profile score" that is just a number and sell that number to other companies. So, by piping your habits into a single number, they aggregate the data and technically do not violate some local laws.
Firstly, on the counter argument side, when you visit a website, you are using their hardware, they have every right to make any requirements they want to use their hardware, they are not public spaces.
But more importantly, the fix is actually easy: use more than one browser, use private browsing sessions, use more than one device, only log in to services you don't mind tracking you, and use ad blocking everywhere. Don't use sites that don't behave. All things you should be doing anyway.
However, I also think the whole concept of browser fingerprinting is exaggerated. None of the things that can be used for fingerprinting are long-lived, meaning any fingerprint probably has a shorter life span than the average cookie, and is also far less reliable than, say, an IP address, which absolutely doesn't personally identify you.
Meanwhile, it is quite ridiculous to log in to all these services with 2FA, then expect any kind of technical or legal measure to prevent them from knowing exactly who you are with 100% accuracy.
Mostly thinking out loud, truly anonymous browsing is a tor node away, but a long time since I used that, there wasn't anything there I was interested in after intel exchange went down.
Any particularly interesting angles to this that you wished there was research on?
I'm sick to death of companies thinking they have any right to keep tabs on me because they think it'll make them a buck.
Basically, we can identify browsers based on the supported ciphers in the TLS handshake (order matters too AFAIK). Then when your declared identity doesn't match the JA3 hash, you're automatically suspicious, if not blocked right away. I think that's the reason for so many CAPTCHAs.
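For anyone wondering why the order matters: as far as I understand the JA3 scheme, the fingerprint is an MD5 over five ClientHello fields written as decimal numbers, with values inside a field joined by "-" and fields joined by ",". Below is a minimal sketch of that. The sample values are made up for illustration, and it skips details a real implementation handles, such as ignoring GREASE values and actually parsing the handshake.

```python
import hashlib

def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build the comma-separated JA3 input string from ClientHello fields."""
    fields = [
        str(tls_version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
        "-".join(str(c) for c in curves),
        "-".join(str(p) for p in point_formats),
    ]
    return ",".join(fields)

def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string (used as an identifier, not for security)."""
    raw = ja3_string(tls_version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()

# Illustrative values only, not taken from any real browser.
client_a = ja3_hash(771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
# Same cipher suites offered in a different order.
client_b = ja3_hash(771, [4866, 4865, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
print(client_a != client_b)
```

Because the lists are hashed in the order the client sent them, two clients offering the same ciphers in a different order get different hashes, which is how a mismatch with the declared User-Agent gets noticed.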
At best they identify the family of browser, and spoofing it is table stakes for bad actors. https://github.com/lwthiker/curl-impersonate
These will still help against the masses of dumb actors flooding your stuff.
It's a better explanation than I can provide.
The last time I looked at this seriously I was trying to find out how much fidelity (if it was possible at all) was necessary to identify someone by their mouse and keyboard input.
It's not just what you do but how you do it.
Now shameless advertising: of course I present the solution: https://counter.dev
I switched to the Mullvad browser. The other recommendation, LibreWolf, provides the following warning on install which scared me away: "Warning: librewolf has been deprecated because it does not pass the macOS Gatekeeper check! It will be disabled on 2026-09-01."
That’s not to say you shouldn’t use a browser that blocks ads etc but I don’t think people should immediately think that they’re not fingerprintable because they’re running these. There definitely needs to be more discussion on the reality of how much these browsers can “protect” you.
Sure even with the gatekeeper test you can’t be sure it’s built against only the claimed code but it does guarantee:
1) the binary hasn’t been modified since it was signed
2) the binary was signed by somebody in possession of the private key
3) there is some measure of identification via Apple on who or what signed the binary
4) somebody was willing to fork over $99 to sign the binary
It’s not perfect security by any means but it is something. Otherwise the binary you are running might as well have come from some sketchy email attachment. And fuck that. Why would I want that on my machine?
I get that the $99 might be a hurdle for “non-organized open source” (ie most open source… doesn’t have a non-profit entity to take up the expense and credential management, etc…)… and there are probably ways apple could make it easier for such “collectives”… but ultimately I’d argue that signed binaries are good for everybody. While imperfect, they provide some form of traceability and accountability.
Obviously it’s not a 100% guarantee of being fuckery-free. The private key might have been compromised, the Apple ID might have been hijacked and the developer program might have been enrolled with stolen credit cards… but it’s still a hurdle to filter out a large swath of low effort nonsense.
This isn’t an easy problem! I’d argue signed binaries are good for everybody… They are good for the end user because it provides some assurance the thing hasn’t been tampered with and provides at least some form of audit history. It’s good for the developers too! It ensures that users are running the binaries the dev intended them to run! It’s good for the platform maker as it reduces the attack surface…
The problem is… getting the keys to sign binaries requires getting a private key! And not just any key but one that has been blessed somehow by something that all parties can trust. And trust isn’t a technical problem but a meatspace human one. Apple solves it by requiring the dev to cough up 100 USD and probably some other personal information. I have no idea how Ubuntu does it or Microsoft… But something, somewhere has to bless that signing key.
Edit: Apparently Brew doesn't sign stuff because they don't trust the code they are being asked to sign. Apparently you can just get brew to build the package locally with `brew install --build-from-source librewolf` though which is useful.
On Windows you just need a certificate from a known authority. This will still probably cost you money but you have a lot more options at different price levels. Also that certificate is a widely useful thing rather than an Apple dev account which is only useful in the Apple walled garden.
If you want to avoid being uniquely identifiable stick to Chrome, signed into a Google account, running on a PC from Best Buy.
I'm going to steal this nice analogy, for when I try to explain this point and some related points.
Given the scale of scrapers these days (AI companies with VC money have no problem spinning up thousands of VMs running Chrome), fingerprinting at the browser level is the only realistic option.
(obligatory: my personal opinion, not necessarily my employer's)
There are pros/cons.
It should be obvious by now that any free service of scale is being paid for by your interactions, which are made more valuable through fingerprinting.
Trying to circumvent that just makes it more expensive for the rest of us.
Why use a third party service when cookies can do exactly that? They load their .js from the same domain they set up a cookie and there's no limitation to read that cookie, correct?
Just make sure it’s sufficiently illegal to keep this info. Find and make big visible examples of fining companies that trade in this info. If a company sells a product that fetches ads based on an “identifier” their little js snippet computed then just pay them a visit. Fine both them and their customers to the max extent of the GDPR (or equivalent).
Unfortunately there is no way to tell advertisers, "No, I'm not interested in your product. I never will be. Don't waste your money."
The top offender is Hims. No, I don't have hair loss. I don't want hair loss supplements. I also don't have ED, and I object strongly to ads for that showing up unexpectedly when I'm showing a YouTube video to someone else.
The second top offender is whoever it is (they keep changing their name) who thinks that I need some kind of Christian motivational course to get control of "the P-word". (Their phrase, not mine.) No, I don't have a problem with pornography. I am very rarely interested in it. And when it comes up every few months, I don't feel any guilt about it afterwards. Furthermore I'm an atheist. A Christian motivational course isn't going to work well for me regardless.
Yes, Google does offer a report function, and a block function, for ads. The report function seems to have gotten rid of the unwanted ED ads. The block really doesn't work when the ads are all very similar AI slop that is rotated frequently. Block this ad, and then next unwanted ad from the same source will be coming along soon enough. (The reason why I particularly dislike Hims is that they are more aggressively rotating their ads.)
It means that, when you need a new dishwasher, you will never see the actual best dishwasher for you, only dishwashers that are a bit more expensive than you actually need but you will end up buying one of them anyways.
It means that you are more likely to see products you would impulse buy just after you get your paycheck. Or slightly inflated prices on things you usually buy.
It means ads designed to take advantage of addictions to sugar, alcohol, gambling, etc.
Finding stuff you actually want to buy has never been easier, you can find hundreds of reviews and comparisons instantly. People who opt into personalised ads don't end up being more savvy online shoppers, they just end up buying more junk.
I do not have those problem addictions. Of course I am going to comparison shop for any large purchases. I am good enough about controlling spending that excess junk isn't one of my problems.
But what I do have a problem with is coming up with creative ideas for people in my life. So, for example, I would have never thought to look for https://www.zazzle.com/cup_equation_love-168099175298227864. But I'm very glad that someone out there knew enough about me to guess that this might be an item that I'd like. And my wife liked the cup a whole lot.
Does this happen often? No. But I'm perfectly happy to pay a premium for a product when an advertiser gets it right.
Maybe you truly are above the influence of advertising. However, almost no one believes that they are affected by advertising yet clearly almost all of those people are wrong.
I find it safer to assume I am part of the vast majority of people who would be influenced by personalised advertising. Given that online advertising is basically the biggest business in the world, I assume that it would find a way to get money from me.
But it would be nice if you worked on your listening skills as well.
You gave a list of major evils that consuming advertising leads to. I don't suffer from those evils. Or at least if I do, then I must also be in serious denial to be unaware of it.
You also seem to think that I said that I am unaffected by advertising, and it doesn't lead to me spending money. This is a bizarre conclusion given that I said that I am affected by advertising, and I gave an example of where it did lead to me spending money.
But the critical difference is this. You treat advertising as an assault on your mind, whose job is to enable evil corporations to steal your money. I view advertising as a discovery method. The world is full of innovators coming up with things that they think others may want. They then use advertising as a way to let people know that there is a thing that they may want. I rarely want it. But I'm willing to waste a bit of time on the pitch.
And on the rare occasions that I do get something, I actually enjoy it regularly. That cup I mentioned? I just made tea for my wife, and served it to her in that cup.
We are different people. I have a very different relationship to advertising than you do. The fact that it is different, doesn't mean that I'm wrong to be me.
[1] https://mullvad.net/en/help/dns-over-https-and-dns-over-tls#...
[3] https://revanced.app/patches?pkg=com.google.android.youtube
A general "show me no ads" solution is not my preference.
Back in the early days of Privacy Sandbox, before that crashed and burned against the UK CMA not even letting Google remove third-party cookie support [0], there was a lot of optimism about how we were going to completely solve cross-site tracking, even in the face of determined adversaries. This had several ingredients; the biggest ones I can remember are:
1. Remove third-party cookie support
2. Remove unpartitioned storage support
3. IP protection at scale
4. Solving fingerprinting
In the end, well... at least we got 2, which has some security benefits, even if Chrome gave up on 1, 3, and 4, and thus on privacy. Anyway, everyone could tell that 4 was going to be the hardest.
The closest I saw to an overarching plan was the "privacy budget" proposal [1], which would catalogue all the APIs that could be used for fingerprinting, and start breaking them (or hiding them behind a permission prompt, maybe?) if a site used too many of them in a row. I think most people were pretty skeptical of this, and the main person driving it moved off of Chrome in 2022. Mozilla has an analysis suggesting it's impractical at [2]. Some code seems to still exist! [3]
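To make the idea concrete, here is a toy sketch of how such a budget could behave. This is not the proposal's actual design or anything from Chromium: the class, the surface names, the per-surface bit costs, and the 12-bit limit are all hypothetical. It also simply adds the costs together, which treats the surfaces as independent and so overestimates when they are correlated.

```python
class PrivacyBudget:
    """Toy per-site budget: each fingerprinting surface costs some bits."""

    def __init__(self, limit_bits, costs):
        self.limit_bits = limit_bits   # total bits a site may learn
        self.costs = costs             # hypothetical bits revealed per surface
        self.spent = 0.0
        self.seen = set()

    def request(self, surface):
        """Return True if the read is allowed, False if it would exceed the budget."""
        if surface in self.seen:
            return True                # re-reading a value reveals nothing new
        cost = self.costs[surface]
        if self.spent + cost > self.limit_bits:
            return False               # a browser could block or prompt here
        self.spent += cost
        self.seen.add(surface)
        return True

# Hypothetical bit costs, chosen only to illustrate the mechanism.
costs = {"screen_size": 4.0, "timezone": 3.0, "canvas": 8.0, "fonts": 7.0}
budget = PrivacyBudget(limit_bits=12.0, costs=costs)
results = [budget.request(s) for s in ["screen_size", "timezone", "canvas", "screen_size", "fonts"]]
print(results, budget.spent)
```

Even in the toy version the awkward part is visible: which read gets refused depends on the order a page happens to ask in, which is the kind of unpredictability that made people skeptical.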
A key prerequisite of the privacy budget proposal was trying to remove passive fingerprinting surfaces in favor of active ones. That involved removing data that is sent to the server automatically, or freezing APIs like `navigator.userAgent` which are assumed infallible, and then trying to replace them with flows like client hints where the server needed to request data, or promise-based APIs which could more clearly fail or even generate a permissions prompt. This was quite an uphill battle, as web developers (both in ad tech and outside) would fight us every step of the way, because it made various APIs less convenient. Elsewhere people have cited one example, of reducing Accept-Language [4]. The other big one was the user agent client hints headers/API [5], which generated whole new genres of trolls on the W3C forums.
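The passive-versus-active distinction is easier to see in a small simulation. As I understand client hints, a browser sends a few low-entropy hints by default and sends the detailed ones only after the server has listed them in an `Accept-CH` response header. The header names below are real; the function, the sample values, and the simplified matching are assumptions for illustration, not how any browser implements it.

```python
# Hints sent by default without the server asking (low entropy).
LOW_ENTROPY = {"Sec-CH-UA", "Sec-CH-UA-Mobile", "Sec-CH-UA-Platform"}

def hints_to_send(accept_ch, available):
    """Pick which client hint headers go on the next request.

    accept_ch: value of the server's Accept-CH header ("" if none was sent).
    available: every hint the browser could provide, as name -> value.
    """
    requested = {h.strip().lower() for h in accept_ch.split(",") if h.strip()}
    return {
        name: value
        for name, value in available.items()
        if name in LOW_ENTROPY or name.lower() in requested
    }

# Hypothetical values for a desktop Linux browser.
available = {
    "Sec-CH-UA-Mobile": "?0",
    "Sec-CH-UA-Platform": '"Linux"',
    "Sec-CH-UA-Platform-Version": '"6.8.0"',
    "Sec-CH-UA-Model": '""',
}

first_visit = hints_to_send("", available)
after_opt_in = hints_to_send("Sec-CH-UA-Platform-Version", available)
print(sorted(first_visit))
print(sorted(after_opt_in))
```

The point of the design is that the extra detail only flows after an explicit, observable request from the server, so a browser (or a budget like the one described above) has a place to say no.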
As Privacy Sandbox slumped more and more towards its current defeated state, people backed off from the original vision of a brilliant technical solution that worked even in the face of determined adversaries. Instead they retreated to stances like "if we just make it hard enough to fingerprint, it'll be obvious that fingerprinting scripts are doing something wrong, and we can block those scripts"; see e.g. [6]. Maybe that would have worked, I don't know, but it becomes much more of a cat-and-mouse game, e.g. needing to detect bundled or obfuscated scripts.
And now of course it's all over; the ad tech industry, backed by the UK CMA, has won and forced Google to keep third-party cookies forever, and with those in place, there's not really any point in funding the anti-fingerprinting work, so it's getting wound down [7]. The individual engineers and teams are probably still passionate about launching opt-in or Incognito-only privacy protections, but I doubt that aligns with product plans. I'm sure Google doesn't mind the end result all that much either, as having to migrate the world to privacy-preserving ad tech was going to be a big lift. Now all that eng power can focus on AI instead of privacy.
[0]: https://privacysandbox.com/news/privacy-sandbox-next-steps/
[1]: https://github.com/mikewest/privacy-budget
[2]: https://mozilla.github.io/ppa-docs/privacy-budget.pdf
[3]: https://chromium.googlesource.com/chromium/src/+/36dc3642bee...
[4]: https://github.com/explainers-by-googlers/reduce-accept-lang...
[5]: https://developer.mozilla.org/en-US/docs/Web/API/User-Agent_...
[6]: https://privacysandbox.google.com/protections/script-blockin...
[7]: https://privacysandbox.com/news/update-on-plans-for-privacy-...
The whole article never mentions the gold standard of anti-fingerprinting, Tor Browser. It just shows how shallow the article is when it mentions Mullvad Browser, a fork of TBB, instead of TBB itself! There's also no mention of using an up-to-date DNS block list to thwart fingerprinting attempts even more.
I don't use it for daily browsing, but when I want to search for something I don't want associated with me (for example, health concerns) I just use tor browser and don't worry about tracking.
The Tor Browser as a privacy measure is likely no better than a normal browser with uBlock if you’re also using it like a “normal” browser, signing into the same accounts you always use etc. My opinion obviously but I dislike people recommending the Tor Browser as a lot of its primary benefits are lost if you’re just using it as a daily driver browser.
I always point people to https://fingerprint.com/ to see if their browser can defeat it. Most of the time you can’t without clearing cookies, changing device resolution, changing VPN location, etc., something the average person can’t/won’t do. Even JS aside there are a ton of different ways to track people based off even just getting server side data when a site’s stylesheet is fetched.
> Tor Browser as a privacy measure is likely no better than a normal browser with uBlock
Why tell the whole world how fucking dumb you are? Tor browser has been developing anti-fingerprinting techniques long before you even heard the word being thrown around. 2M+ userbase is enough to hide among and stop fucking complaining. It's a non-profit, people run relays voluntarily and that's the current best way to thwart fingerprinting attempts
All of these have limitations and exceptions in a complex legal system. But to issue a blanket statement like the comment above is not really correct - just trying to make a point, I guess.
Ask any celebrity how much privacy they have. They can’t even buy Starbucks without people commenting on how fat their comfy clothes make them look. Because they have no anonymity.
Seems like we all need to come together and use the same technique to "we are borg, we are browsing your internet as one, tracking is futile"
Email validation doesn't work. IP blocking doesn't work. CAPTCHA? Kind of. Fingerprinting? Very efficient.
Because the sites that still offer feeds, at least those for which a feed makes sense, well, you can read them comfortably via RSS.
Yes, I know that's ski-mask bla bla bla, but I still don't want my browser to be doing this nonsense.
When I think of all the tracking that goes on, these are becoming more lucrative.
However, you might also want to access HTTP and HTML, and to do so without needing to load fonts, pictures, etc; you might use a web browser that omits many of these features. However, it also can result in some problems; there are a few ways to work around some of these, such as adding your own scripts to handle some services, adding proxy services for handling some services (although some of these can use other protocols such as Gemini), and/or using the HTML/CSS commands in other ways (e.g. using ARIA to decide the formatting rather than using CSS). However, there are other issues, e.g. if the web page you download includes more junk than the actual main text.
Giving the surveillance economy access to your habits means making them slightly better informed about everyone. That won't directly endanger you; the SE will just become slightly better informed about how people like you function.
This will enable it to increase the amount of risk faced by some other person that you will never hear of (and vice versa) if any of you is even suspected of endangering the SE, in proportion to the risk to the SE which people like you may hypothetically pose, as quantified by the methods of nepotism-powered pseudoscience.
Perhaps what is missing is a criminal law that forbids deliberate non-consensual tracking of a person's activity. Even in public.
Recording someone as you happen to be recording something in public (including CCTV) is not deliberate or targeted towards an individual. But even in public, if someone followed you around tracking what you're doing (even without recording you), that shouldn't be lawful. Public figures and law enforcement activity based on probable cause being the exceptions.
Can anyone think of any reasonable counter-arguments to this?