Understanding The Web Security Model (Outtake): Cookies and Behavioral Advertising
Posted by ekr on 13 Mar 2022
This post was originally part of Post II of my series on the Web Security Model but kind of broke up the flow of that post, so it got pulled out. But a blog means never having to kill your darlings, so here it is. In Post II I wrote about how Web applications use cookies for statekeeping on a single site, but it turns out to be trivial to extend that functionality to provide targeting for behavioral advertising. There's nothing technically new here; it's just a new combination of several existing elements we've already seen.
Ad Networks
Most advertising on the Web is done by ad networks. It's of course technically possible to just sell ads on your own site, but for obvious reasons this doesn't really work unless you're a big prestige site like Google, Facebook, or the New York Times. Instead, the typical thing to do is for the publisher to work with some third party ad provider who places ads on a lot of different sites.
The technical details of the system are unbelievably complicated. It's traditional at this point to show the baffling diagram below, called the "LUMAscape", which maps out the various entities in the ad ecosystem. However, at the level we need to be concerned with, matters are fairly simple.
In order to show advertising from a given ad network, the publisher embeds an element on their site, with the content of the element being loaded from the ad network's server.[1] When the user visits the publisher's site, the browser automatically loads the content from the ad network, which invisibly decides what ad to show. Recall that there's no rule that the content at a given URL has to remain constant, so the server can dynamically select the specific ad based on any information it has.
There are a variety of options for the element type. The simplest thing to do is just to use an image or an IFRAME. A fancier alternative is to first load some JavaScript off the ad network site; that JavaScript can then insert an image or IFRAME into the DOM of the page. Whatever the method, the browser ends up loading some content from the ad network. Note that I'm radically oversimplifying here; describing the ad sales process is out of scope for this post.
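To make the "same URL, different content" point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the ad file names, the `ads.example` hostname, and the round-robin selection, which stands in for whatever decision process a real ad network uses.

```python
import itertools

# Hypothetical ad inventory; a real network chooses ads very differently.
ADS = ["shoes.png", "bikes.png", "tents.png"]
_rotation = itertools.cycle(ADS)


def publisher_page(ad_url: str) -> str:
    """HTML a publisher might serve: the ad slot is just an embedded image."""
    return f'<html><body><h1>Article</h1><img src="{ad_url}"></body></html>'


def serve_ad(path: str) -> str:
    """Ad network side: the requested path never changes, but the ad does.

    The path is deliberately ignored to show that nothing ties a URL
    to a fixed piece of content.
    """
    return next(_rotation)


page = publisher_page("https://ads.example/ad.png")
first = serve_ad("/ad.png")
second = serve_ad("/ad.png")  # same path, different ad
```

The publisher's HTML is static; all of the decision making happens on the ad network's side when the browser fetches the embedded URL.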
Determining Context
There are a variety of potential ways for the ad network to know the context of the page. First, browsers add a header called Referer which indicates the original site (yes, it's spelled "Referer". It's a typo that we're now stuck with). Increasingly, however, browsers are sending less useful Referer headers (for privacy reasons). Another major option is to carry this data in the URL. In the simplest version, the publisher can be given a per-publisher URL. If the ad was inserted by ad network JavaScript, then that JavaScript can insert the page's address into the URL. In any case, the ad network can generally tell what page the ad was on.
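As a rough sketch of how an ad server might combine these two signals, the function below prefers a page address carried in the URL and falls back to the Referer header. The `page` query parameter name and the `ads.example` hostname are hypothetical, and the headers are assumed to arrive as a plain dict keyed by the exact string `Referer`.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def page_context(request_url: str, headers: Dict[str, str]) -> Optional[str]:
    """Best-effort guess at which page the ad is being shown on."""
    # Option 1: the embedding page was written into the ad URL, e.g. by
    # the ad network's JavaScript. "page" is a made-up parameter name.
    query = parse_qs(urlsplit(request_url).query)
    if "page" in query:
        return query["page"][0]

    # Option 2: fall back to the Referer header, which may be trimmed to
    # just the origin or missing entirely.
    return headers.get("Referer")


from_url = page_context(
    "https://ads.example/ad.png?page=https%3A%2F%2Fsneakers.example%2Freviews",
    {},
)
from_header = page_context(
    "https://ads.example/ad.png",
    {"Referer": "https://sneakers.example/"},
)
```

Preferring the URL parameter reflects the point above that Referer is becoming less informative; a network that controls the embedding script does not have to depend on it.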
The question then becomes what ad the network should show. You could obviously show the same ad everywhere, but that's not going to do a very good job of showing interesting ads. The next most interesting thing is to show what's called a "contextual" ad, which is to say an ad that is relevant to the content of the page on which it is being shown. For instance, if you were on Runner's World you might get an ad for running shoes.
However, a lot (most?) of Web advertising isn't contextual but rather "behavioral". What this means is that it's based not just on the page the user is currently on but also on their previous behavior. That behavior is measured using cookies.
Behavioral Tracking with Cookies
If the advertising network has contracts with multiple publishers this allows them to observe the user's behavior across those publishers. The first time that the user goes to a page served by a given ad network, that ad network sets a cookie. From then on, they get to see every site that the user goes to and can link them all up using the cookie. Based on that information, they can build up a profile of the user's behavior and use that to decide which ads to show (recall that the server can serve any image it wants, regardless of the URL). The diagram below shows an example of this process.
The user first visits sneakers.example, which embeds an image from the advertiser's site. The advertiser only knows that the user is on sneakers.example but nothing about the user, so it serves a contextual ad for sneakers. However, when it returns the ad it sends a cookie. Later, the user visits recycling.example, which also embeds an image from the same advertiser. This time, when the browser requests the image from the advertiser, it sends the cookie, so the advertiser knows that (1) the user was on sneakers.example before and (2) they are on recycling.example now, so it shows the user an ad suitable for both interests: recycled sneakers.
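The flow in this example can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the `AdNetwork` class, the `uid` cookie name, and the in-memory profile store are all invented for illustration, and the "ad" is just a string listing the interests the network has inferred.

```python
import uuid
from collections import defaultdict
from http.cookies import SimpleCookie


class AdNetwork:
    """Toy ad network that links visits together with one cookie."""

    def __init__(self):
        # cookie id -> list of sites where this browser loaded an ad
        self.profiles = defaultdict(list)

    def handle_request(self, site, cookie_header=None):
        """Return (ad, Set-Cookie value or None) for one ad request."""
        cookie = SimpleCookie(cookie_header or "")
        set_cookie = None
        if "uid" in cookie:
            uid = cookie["uid"].value  # returning browser
        else:
            uid = uuid.uuid4().hex  # first contact: mint an identifier
            fresh = SimpleCookie()
            fresh["uid"] = uid
            set_cookie = fresh["uid"].OutputString()
        self.profiles[uid].append(site)
        # Deduplicate interests while keeping the order they were seen in.
        interests = list(dict.fromkeys(self.profiles[uid]))
        return "ad for " + " + ".join(interests), set_cookie


network = AdNetwork()
# First visit: no cookie yet, so the ad can only be contextual.
ad1, cookie = network.handle_request("sneakers.example")
# Second visit: the browser sends the cookie back, linking the two sites.
ad2, cookie_again = network.handle_request("recycling.example", cookie)
```

Nothing here needs the user's name or account: a random identifier that the browser faithfully returns is enough to join the two visits into a single profile.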
You can also use this same basic technique for what's called retargeting. Suppose you go to a site and look at some product. If the ad network has a presence on the site (this can be an invisible element) then they can record this event and use it to target ads specifically at people interested in that product.
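Retargeting needs very little beyond the cookie identifier. The sketch below assumes the ad network already has a per-browser identifier (called `uid` here) and that the invisible element on the product page reports which product was viewed; the function names and the "most recent product wins" rule are illustrative choices, not a description of any real system.

```python
from collections import defaultdict

# cookie id -> products this browser has looked at, oldest first
viewed = defaultdict(list)


def record_view(uid: str, product: str) -> None:
    """Called when the invisible element on a product page is loaded."""
    viewed[uid].append(product)


def pick_ad(uid: str, default: str = "generic ad") -> str:
    """Show an ad for the most recently viewed product, if there is one."""
    products = viewed.get(uid)
    return f"ad for {products[-1]}" if products else default


record_view("abc123", "trail running shoes")
retargeted = pick_ad("abc123")
untracked = pick_ad("someone-else")
```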
The Bigger Picture
The use of cookies for behavioral advertising is basically an unintended consequence of the design of cookies, specifically, allowing them to be used in what's often called a "third party" context, in which the site you are sending the cookie to is different from the site you are on. On the one hand, this is an example of the power and extensibility of a few basic primitives: you can build a global ad network based on not much more than the ability to load third party content onto a site and attach cookies to those requests. On the other hand, the result is a system built on ubiquitous surveillance.
At the time cookies were first introduced, people did understand that there were privacy implications. However, a lot of the attention focused on first party tracking (i.e., of your behavior on a single site). The original cookie RFC has a fairly extensive discussion of privacy, but the section that most clearly addresses the third party context is kind of confusing and seems almost to be discussing what is now called cookie syncing:
A user agent should make every attempt to prevent the sharing of session information between hosts that are in different domains. Embedded or inlined objects may cause particularly severe privacy problems if they can be used to share cookies between disparate hosts. For example, a malicious server could embed cookie information for host a.com in a URI for a CGI on host b.com. User agent implementors are strongly encouraged to prevent this sort of exchange whenever possible.
My sense is that people were sort of aware of the problem but just didn't anticipate the scale of tracking that would eventually result. It's also worth noting that early browsers would often prompt users before accepting cookies, thus making this kind of tracking more difficult. Eventually, of course, every site wanted to set a zillion cookies and the permission prompts got too annoying so they were removed, only to be replaced years later by the arguably even more annoying GDPR cookie consent dialogs.
This is a theme we'll be seeing throughout this series: a lot of the early Web features were designed to solve specific problems, without much understanding of the broader implications. It took years for the security and privacy community to catch up and develop a more comprehensive understanding of the security of the Web platform, and, as with advertising, we're still dealing with the implications of those original choices.
[1] Technically, this third party is called a supply-side platform (SSP). There are also demand-side platforms (DSPs), which serve the advertisers, plus a bunch of other stuff.