Ensuring Privacy For Age Verification
Posted by ekr on 11 Feb 2022
The BBC reports that the UK has revived its online safety bill, which was shelved back in 2019. There has been a lot of concern about the policies embodied in this bill from organizations ranging from ISOC to Big Brother Watch, but I want to focus on what's essentially a technical point: it represents a threat to user privacy that we don't really know how to fix.
The bill appears to require adult (i.e., pornography) sites to verify the age of their users. This has been widely interpreted as effectively requiring the use of some kind of age verification system. Regardless of the wisdom of age verification requirements in general (see, for instance, this BBC article), it's going to be difficult to build a system which doesn't run the risk of creating a database of everyone who goes to a porn site. Given that what kind of porn people watch, or whether they watch porn at all, is generally considered private information, this seems fairly undesirable.
Age Verification Providers
The basic problem here is that determining whether someone is over 18 requires learning a fair bit of information about them, generally enough to determine their identity. The UK Age Verification Providers Association lists a variety of different methods for determining age, such as government identity documents, mobile phone records, credit reference agencies, credit cards, etc., most of which are directly tied to your real-world identity.
There are two major ways in which these age verification systems can work, neither of which is great:
- The site itself verifies your age, e.g., by collecting the above information and using some third party service.
- The site somehow bounces/redirects to, or embeds, some third party age verification site.
In both cases, the age verification service learns your identity and the site that you are going to (because the site has an account with the service). In the first case, the site probably also learns your identity and so can associate it with the exact pages you view rather than just the site you visit.
The general assumption by the UK government seems to be that this privacy issue will be dealt with by policy controls, i.e., by restricting use and mandating security measures. In April 2019, the British Board of Film Classification designed an Age-verification Certificate Standard for age verification providers (AVPs) which prescribes a bunch of data retention policies as well as a set of procedures for attempting to ensure that the provider's network is secure (penetration testing, cryptographic key lifetimes, monitoring requirements, etc.). This Twitter thread by well-known security guy Alec Muffett does a good job of analyzing this standard and comes to some pretty negative conclusions. I have a bigger concern, though, which is the disclosure of your identity in the first place: even if you trust that the AVP will follow its own policies, they could still be hacked (see, for instance, this 2017 Equifax breach), or their records could be subpoenaed. The bottom line is that you're placing a lot of trust in someone you have no real relationship with. A better system would be one in which nobody ever got both your identity and the fact that you were on a given site.
Anonymous Age Verification
The good news is that we now have technical mechanisms that enable this kind of anonymous verification of people's ages. The cryptographic details are complicated (see here for a description of one such system), but the basic idea looks like this:
- You go to the age verification provider and prove your age (most likely by proving your identity).
- The AVP issues you an unlinkable, anonymous credential.
- When you go to the porn site you provide the credential as proof of age.
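The three steps above can be sketched with a textbook RSA blind signature. This is a toy illustration under stated assumptions, not the construction used by any particular deployed system: the function names, the fixed small primes, and the use of a bare SHA-256 hash are all my own choices for readability, and none of it is safe for real use.

```python
import hashlib
import math
import secrets
from typing import Tuple

# Toy AVP key (assumption): two known Mersenne primes, far too small for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                      # public modulus
E = 65537                      # public exponent
PHI = (P - 1) * (Q - 1)
assert math.gcd(E, PHI) == 1
D = pow(E, -1, PHI)            # AVP's private signing exponent


def hash_to_int(token: bytes) -> int:
    """Map a token to an integer mod N (hypothetical helper, no padding)."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N


def user_blind(token: bytes) -> Tuple[int, int]:
    """User: hide the token hash behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    blinded = (hash_to_int(token) * pow(r, E, N)) % N
    return blinded, r


def avp_sign(blinded: int) -> int:
    """AVP: called only after the age check; it sees just the blinded value."""
    return pow(blinded, D, N)


def user_unblind(blind_sig: int, r: int) -> int:
    """User: strip the blinding factor to get a signature on the real token."""
    return (blind_sig * pow(r, -1, N)) % N


def site_verify(token: bytes, sig: int) -> bool:
    """Site: check the AVP's signature using only the AVP's public key."""
    return pow(sig, E, N) == hash_to_int(token)


# Step 1-2: the user proves their age out of band, then gets a blinded signature.
token = secrets.token_bytes(32)
blinded, r = user_blind(token)
blind_sig = avp_sign(blinded)

# Step 3: the user presents (token, sig) to the site.
sig = user_unblind(blind_sig, r)
assert site_verify(token, sig)
```

The AVP only ever handles `blinded` and `blind_sig`, while the site only ever handles `token` and `sig`. Because `r` is random and known only to the user, comparing the two sets of values doesn't tell them which issuance goes with which visit.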
This way the site knows you are of the appropriate age but doesn't learn who you are. And because the credential is unlinkable, the porn site and the AVP can't collude to discover which users are which. This is all reasonably well understood cryptographic technology (see, for instance, Privacy Pass) and while it might be a bit challenging to integrate it with the Web, it's far from impossible. Unfortunately, I'm not sure how much this helps.
The problem is that even if the credential which the AVP provides to the user is anonymous, the AVP still sees the user's identity at the time they prove their age to the AVP. If the main reason that people need to do age verification is to watch porn, then simply signing up is a pretty strong signal of the user's behavior, and so they still need to trust the AVP's discretion. Ironically, this is a case where privacy would be better if people had to routinely demonstrate their age. For instance, if you needed to demonstrate you were over 18 every time you bought something on Amazon or read the New York Times, or even used Facebook, then it wouldn't tell the AVP much when you signed up with it. However, if it's mostly just to access porn sites, then users don't really get to hide behind the less embarrassing use cases.
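To make the "hiding behind other use cases" point a bit more concrete, here is a rough Bayesian sketch. The function name and every number in it are hypothetical and chosen only for illustration. The simplifying assumption is that everyone who visits age-gated porn sites registers with the AVP, while a fraction `other_use_rate` of everyone else registers for unrelated reasons.

```python
def porn_posterior_given_signup(prior: float, other_use_rate: float) -> float:
    """P(user visits age-gated porn sites | user registered with the AVP).

    prior:          assumed fraction of the population visiting such sites
    other_use_rate: assumed fraction of everyone else who registers anyway
    """
    signup_if_porn = 1.0  # assumption: such users have no way around the AVP
    evidence = prior * signup_if_porn + (1.0 - prior) * other_use_rate
    if evidence == 0.0:
        raise ValueError("nobody registers under these assumptions")
    return prior * signup_if_porn / evidence


# Illustrative values only: age checks are rare outside porn vs. required everywhere.
rare_elsewhere = porn_posterior_given_signup(prior=0.5, other_use_rate=0.1)
required_everywhere = porn_posterior_given_signup(prior=0.5, other_use_rate=1.0)
print(rare_elsewhere, required_everywhere)
```

With these made-up inputs, registration moves the AVP's estimate from 50% to roughly 91% when few other services require age checks, and leaves it at 50% when everyone has to register anyway.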
Regardless of the wisdom from a policy perspective of this kind of age verification, it seems like a real privacy threat. I'm well aware that the privacy situation on the Web is extremely bad, but that's something that browser makers are hard at work improving, with technologies ranging from cookie restrictions to IP address-hiding proxies, and so we're gradually moving towards a world where you don't have to trust either Web sites or the trackers embedded on them. However, requiring this kind of age verification would effectively require people to trust that the AVPs protect their privacy. This is exactly the kind of trust we usually try to avoid via technical controls, but in this case those don't seem like they will be effective, leaving users with nothing but trust.
There are some AVPs which offer face-based age estimation. While this technically doesn't involve learning your identity, I'm not sure people should be that much happier about having the AVP have their photo, and of course given the capabilities of facial recognition, it will often be possible to determine your identity anyway. In any case, the most common mechanism for providers to offer seems to be based on government documents. ↩︎
This is of course true to some extent with the porn site itself, but they don't necessarily have your name and IP addresses aren't necessarily sufficient to identify you. Plus, you could use a VPN. ↩︎
What unlinkable means in this context is that the credential that the AVP sees is different from and can't be connected to the one that is presented to the porn site. ↩︎