Privacy Preserving Vaccine Credentials
Posted by ekr on 07 Dec 2021
As I noted previously, we're seeing each jurisdiction design their own vaccine passport system (New York, California, EU, New Zealand). While these systems differ in detail, they're conceptually pretty similar: a digital signature over a record consisting of the user's identity and some information about the subject's vaccine status.
This has the obvious privacy problem that the verifier can record the credential (or the information in it) and use it for tracking where someone has proved their vaccination status (and hence visited). It's not really possible to do better with a single static credential printed on a piece of paper. Obviously, the paper isn't going to change and so whatever the contents are they can be used for tracking. Moreover, the credential has to be verified by some kind of software—unless you can do elliptic curve math in your head—and that software can just record the information or transmit it back to some central location. Typically the official apps are supposed to just discard the credential after verifying it, but obviously you're just trusting them to do that.
If we relax the assumption that the credential is a single piece of paper then the design space seems like it opens up a bit, but—as we see below—probably not enough to really provide privacy.
Digression: Anonymous Credentials
Before looking at the vaccine passport problem, it's helpful to look at a somewhat simpler problem: privacy preserving authentication.
Suppose that we want to build a system which gives people access to some resource but that doesn't identify them. As an example, I might want to let people pay road tolls but not be able to track them when they do so. Conventional systems just give each user an account number that they use to authenticate to the toll plaza, but then whoever operates the toll plazas can look at what credential was used and thus build a profile of each user.
There's a straightforward solution to this problem, which is to give each user a large pile of single-use credentials, each of which is good for one transaction. That prevents the toll plaza from connecting visits unless it colludes with whoever issued the token. However, in the real world, the same state agency probably both issues the tokens and runs the toll plaza, so they're automatically colluding. Fortunately, there is a cryptographic solution, called blind signatures. A blind signature is a construction which allows someone to digitally sign a value without seeing it: Alice picks a random value $r$, sends the issuer $Blind(r)$, and gets back $Sign(Blind(r))$.
Alice can then compute $Unblind(Sign(Blind(r))) \rightarrow Sign(r)$ to recover a valid signature over $r$, even though the issuer never saw $r$.
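To make the blind/sign/unblind round trip concrete, here is a minimal sketch using textbook RSA blinding. RSA is just one well-known way to build a blind signature and is an assumption here, not necessarily what any deployed system uses; the key sizes are toy values and the function names (`blind`, `sign`, `unblind`, `verify`) are made up for illustration.

```python
import math
import secrets

# Toy RSA key for the issuer (textbook primes; far too small for real use).
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the issuer


def blind(r):
    """Alice hides r behind a random factor b: returns (r * b^E mod N, b)."""
    while True:
        b = secrets.randbelow(N - 2) + 2
        if math.gcd(b, N) == 1:    # b must be invertible mod N
            return (r * pow(b, E, N)) % N, b


def sign(value):
    """The issuer signs whatever it is handed, without learning r."""
    return pow(value, D, N)


def unblind(blind_sig, b):
    """Alice divides out b to recover an ordinary signature over r."""
    return (blind_sig * pow(b, -1, N)) % N


def verify(r, sig):
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(sig, E, N) == r % N


r = 1234                                # Alice's secret token value
blinded, b = blind(r)                   # the issuer only ever sees `blinded`
signature = unblind(sign(blinded), b)   # Unblind(Sign(Blind(r)))
assert verify(r, signature)
```

The unblinded result is the same value the issuer would have produced by signing $r$ directly, which is why the verifier can't tell the difference.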
It's pretty easy to see how to turn this into an anonymous credential system: Alice generates a pile of random tokens, gets the issuer to sign them, and then redeems them one at a time. The toll plaza just verifies that each one is fresh (i.e., it hasn't been used before) and if so, accepts it.[1] This is what's called a "bearer token" which means that it's secret and just the possession of the token is sufficient to prove your identity, but you can also have a public key in the token so you can authenticate with a digital signature. There's a lot of much fancier stuff you can do here, including rerandomizable credentials that don't require you to get a pile of tokens and credentials which let you prove specific properties (e.g., that you're over 21) but we don't need to worry about that for now.
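The redemption side can be sketched in a few lines. This is an illustration under assumptions, not a real design: the RSA numbers are toy values, `issue` stands in for the blind-signing exchange (here it signs directly to keep the example short), and `TollPlaza` is a hypothetical name.

```python
import random

# Toy issuer key (hypothetical, textbook-sized RSA numbers; not secure).
N, E, D = 3233, 17, 2753


def issue(token):
    """Stand-in for the blind-signing step: returns a signature over token."""
    return pow(token, D, N)


class TollPlaza:
    """Accepts each validly signed token exactly once."""

    def __init__(self):
        self.spent = set()

    def redeem(self, token, sig):
        if pow(sig, E, N) != token:
            return False          # not signed by the issuer
        if token in self.spent:
            return False          # already used: reject the replay
        self.spent.add(token)
        return True


# Alice draws a pile of distinct random tokens and gets them signed.
wallet = [(t, issue(t)) for t in random.sample(range(2, N), 5)]

plaza = TollPlaza()
first = wallet.pop()
assert plaza.redeem(*first)          # fresh token: accepted
assert not plaza.redeem(*first)      # same token again: rejected
assert plaza.redeem(*wallet.pop())   # a different token is still fine
```

Note that the plaza has to remember every token it has ever seen (or at least every unexpired one) in order to check freshness.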
Anonymous Credentials for Vaccine Passports
Naively, it seems pretty obvious how to use this kind of anonymous credential for vaccine passports:
- Replace the signed vaccine passport with an anonymous credential that just says "the holder of this credential is vaccinated", potentially with an expiration date.[2]
- Everyone's app is able to get a pile of these credentials.
- When you need to prove your vaccination status, you show the next credential.
- When your app runs out, it just gets some more.
Unfortunately, this has a number of problems, the most important of which is that the credential isn't bound to the user, which opens up a number of attacks. Perhaps the simplest is that a relying party can replay a credential that is provided to it to another relying party. For instance, suppose that I am the host at a restaurant charged with checking people's vaccine status: I can collect all the credentials people show me and then use them to prove that I—or others—are vaccinated.
This simple version of the attack can be addressed by replacing the bearer token with one which requires the person to authenticate, e.g., via a digital signature over a verifier-provided challenge. However, this leaves open what's called a "relay attack" in which the cheating verifier simultaneously authenticates themselves to another verifier, forwarding that verifier's challenge to the vaccinated person and passing the signed response back.
This isn't that great an attack because the cheating verifier has to be online and authenticating to another verifier at the same time as the vaccinated person (though not in the same place because the challenge and response can just be transmitted from place to place). However, there is a related attack that is worse in which a malicious vaccinated person with a valid credential helps someone else pretend to be vaccinated. This is pretty much the same message flow with different labels:
The practical version of this attack is that someone (or someones) get vaccinated and then get a set of valid credentials. They stand up a server on the Internet which accepts challenges and responds with signed responses, thus enabling arbitrary people to pretend to be vaccinated. And because the system is anonymous, tracking down the operator of the server and revoking their credentials is not easy.
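A rough sketch of why the verifier can't detect this: the check only establishes that someone holding the credential's key signed the challenge, not that they are the person standing at the door. The toy RSA key, the hash-to-integer step, and all function names below are assumptions made for illustration.

```python
import hashlib
import secrets

# Toy key pair bound to one valid credential (textbook-sized RSA; not secure).
N, E, D = 3233, 17, 2753


def digest(challenge):
    """Map a challenge to an integer mod N (illustrative only)."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def credential_server_sign(challenge):
    """What the vaccinated person's app, or their Internet server, computes."""
    return pow(digest(challenge), D, N)


def relay(challenge):
    """The unvaccinated person's app just forwards the challenge and
    returns whatever the remote server signed."""
    return credential_server_sign(challenge)


def verifier_check(challenge, response):
    """The verifier learns only that a key holder answered this challenge."""
    return pow(response, E, N) == digest(challenge)


challenge = secrets.token_bytes(16)   # fresh per check, so old responses can't be replayed
assert verifier_check(challenge, relay(challenge))
```

The fresh challenge stops replay of recorded responses, but it does nothing about a live relay, because the response is exactly what an honest holder would have sent.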
Less Anonymous Credentials
This kind of relay attack is well known in the literature; it's really just the interactive version of giving someone one of your anonymous bearer credentials. The underlying problem is that the verifier isn't actually able to identify the person claiming to be vaccinated: all they have is a message that says "the person transmitting this to you is vaccinated" but that could be the person holding the phone or someone across the world.
The fix, of course, is to have the credential contain some information that lets you identify the person it's describing. There are a number of alternatives here:
- A biometric such as a picture
- The person's name, which can then be used in concert with their photo ID to confirm their identity
The obvious problem here is that this information has to be consistent enough to identify the person and therefore it can be used for tracking. In particular, if the credential contains the person's name and birthday, then you can just record that and use it for tracking.
There are some small things one could imagine doing to improve the situation. For instance, instead of having one photo of the person, you could use a different picture every time so that it wasn't bitwise identical. This can be done trivially by compressing with slightly different parameters or you could do something more complicated like automatically generating lookalike images with some sort of AI system. The problem, of course, is you can run the process in reverse to generate a hash of the image that is resistant to these kinds of manipulation (remember perceptual hashing from my posts on Apple's child sexual abuse material scanning system). Moreover, this kind of hashing is a lot easier because you don't need to conceal the original image so you can ship quite a rich hash that is very accurate.
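As a toy illustration of why bitwise-different copies don't help, here is a sketch of "average hash", one of the simplest perceptual hashes. It is far cruder than anything a serious tracker would use, the 4x4 pixel grid is made up, and a uniform brightness shift is only a stand-in for re-compression.

```python
def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


# A tiny 4x4 grayscale "photo" (hypothetical values in 0..255).
photo = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 180, 190, 30],
    [11, 185, 195, 35],
]

# A re-encoded variant: every pixel differs, so the raw bytes don't match.
variant = [[p + 3 for p in row] for row in photo]

# A genuinely different image (here, the photographic negative).
negative = [[255 - p for p in row] for row in photo]

assert photo != variant
assert hamming(average_hash(photo), average_hash(variant)) == 0
assert hamming(average_hash(photo), average_hash(negative)) == 16
```

A verifier that stores the hash rather than the image bytes links the two variants immediately.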
One approach I've seen proposed for dealing with names and birthdays is to just encode some subset of the letters, e.g., "E... Re.....a"[3] and maybe just the month and day of birth (the Dutch CoronaCheck system encodes initials and birth day/month; I hope to write something about that soon). This doesn't provide great privacy for two reasons. The first is that it's only k-anonymous[4] and k can't be that big; this has to be the case because it has to be sufficiently identifying to prevent me from using your ID to prove my vaccination status. This is already a problem but it's made worse by the fact that people's behavior isn't random. For instance, if we have four authentications for the initials ER within an hour with two at outdoor stores in Mountain View and two in bars in Los Angeles, it's likely that the first two are one person and the second two are another. This kind of constraint solving problem is something computers are very good at; you might not get a complete record of someone's behavior, but you'll learn a lot.[5] The more serious issue is that the initials/birthday need to be used with a photo ID, which of course has the person's full name. This allows the verifier to record that (even assuming that they don't just scan it, which is common in many places), which really reduces the privacy value of having the vaccine credential contain limited information.
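To see what k-anonymity means for a reduced credential, here is a small sketch that groups people by the fields the credential reveals and reports the smallest group. The records are entirely made up, and the (initials, birth day, birth month) tuple is an assumed encoding for illustration.

```python
from collections import Counter

# Hypothetical population, reduced to what the credential reveals:
# (initials, birth day, birth month).
population = [
    ("ER", 14, 3),
    ("ER", 14, 3),
    ("ER", 2, 11),
    ("JS", 14, 3),
    ("JS", 14, 3),
    ("JS", 14, 3),
]


def anonymity_sets(records):
    """Count how many people share each reduced credential."""
    return Counter(records)


def k_anonymity(records):
    """k is the size of the smallest set anyone is hiding in."""
    return min(anonymity_sets(records).values())


sets = anonymity_sets(population)
assert sets[("JS", 14, 3)] == 3        # three people are indistinguishable here
assert sets[("ER", 2, 11)] == 1        # this person is unique
assert k_anonymity(population) == 1    # so the scheme as a whole gives k = 1
```

The guarantee is only as good as the worst-off person: anyone with an uncommon combination gets no protection at all, even before you add location and timing.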
The Bigger Picture
I don't mean to suggest here that anonymous credentials can't work at all. There are plenty of settings where what you're authenticating is just the messages you're sending. For instance Privacy Pass and Private Access Tokens are systems designed to prove that someone is an authorized user (for some meaning of authorized) without revealing anything else about them. These systems can work because the only thing you are trying to authenticate is the person's messages, not the person themself.[6] The reason that these credentials don't work well in the vaccine setting is that you are trying to prove something different, namely that they apply to a particular human. This requires identifying that person, which makes the whole thing non-anonymous. This is a general limitation of anonymity systems: they do well in settings where the actual interaction you are trying to perform is easily anonymizable (e.g., over the Internet) and poorly when it is not (e.g., doing something in the physical world).[7] This is of course bad news for privacy because it's only getting easier to do surveillance in the physical world.
[1] Note that deployed systems usually have license plate cameras which can be used in cases where someone doesn't pay the toll, but of course can also be used for surveillance of every car which goes through, which kind of defeats the whole purpose of this.
[2] Getting the expiration date encoded is a little tricky because in the simple system I showed above, the issuer knows nothing about what it's signing. There are a few alternatives, with perhaps the simplest one being to use a separate signing key for each expiration date.
[3] Kind of like what United does with their upgrade list. I am "RESE".
[4] Which is to say that the credential applies to a k-sized set of people and thus each person is hiding in a set of that size.
[5] You might be able to improve the situation some by revealing a different set of letters in the name each time. This would require some more analysis.
[6] And even with these systems you have to defend against attacks where the person gives others copies of their token.