ELI15: Private Information Retrieval
Let's see if I can explain this with just high school math...
Posted by ekr on 30 Aug 2022
In my post on Safe Browsing I mentioned that one possible solution to the problem of querying the Safe Browsing database is Private Information Retrieval (PIR) and then waved my hands vigorously about it being crypto magic. In this post, I'm going to attempt to explain how PIR works with as simple math as possible. You will, however, want to read the Web version of this post because there is a fair bit of math and I use LaTeX to render it with MathJax, which looks bad in the newsletter version.
The PIR Problem #
The basic version of the PIR problem looks like this:
-
You have a server with some database
consisting of a set of elements -
The client wants to retrieve the
th element but doesn't want the server to know which element it retrieved.
There is an obvious trivial solution[1] in which the server sends the client the entire database and the client just looks up the value it wants to know[2] This provides privacy but at the expense of communication cost because you have to send the entire database. The challenge, then, is to build a system which has involves sending less data, has comparable privacy, and which doesn't chew up too much computational power.
There are two main flavors of PIR:
- Single server schemes
- Multiple server schemes
The single-server schemes are designed under the assumption that the server is malicious and use cryptographic mechanisms to protect against it. The multiple server schemes are designed under the assumption that some subset of the servers is non-malicious and are insecure if all the servers misbehave. In this post, I'll be talking solely about single-server PIR schemes; at some point in the future I might talk about multi-server.[3]
A Simple Insecure Solution #
The first observation to make (due to Beimel, Ishai, and Malkin) is that any single-server system must involve the server computing some function over every element in the database. Otherwise, the server could simply look at which elements were touched and learn something about which were retrieved. This tells us something about how things need to be constructed.
Let's start with a solution that's insecure but can serve as the basis for a secure solution. Take the database and arrange it in a square arrangement (a "matrix") like so:
In order to make a query, the client creates a list of numbers,
that consists of all 0s except for the number corresponding
to the column in the matrix that it wants to read. For instance,
if it wants to read value
The server constructs its response as follows. For each row in the matrix, it then goes column by column multiplying the value in its database times the value in the same row provided by the client and adds up the values for each column in the database. This produces a list that is the same length as the client's input, where each value is constructed by multiplying the elements in the matrix times the elements in the client's input. In this case, we would then get:
As you can see, what's happened here is that the 0s erase the columns we're not interested and we're just left with a list of the column of interest (in this case the rightmost one), shown in read. The client can then just read out the value of interest by looking at the right row.
Those of you who have taken linear algebra will recognize this as conventional matrix multiplication, where we multiply the database times the selection vector. However, you don't need to know that in order to understand what's going on.
It's worthwhile to stop and look at the properties of this
design. In the trivial solution, the server had to send
A More Secure Solution: Homomorphic Encryption #
Partially Homomorphic Encryption #
It's been known for a very long time how to do partially homomorphic encryption. As a concrete example, consider the case where you encrypt some data by XORing it with a key, i.e.,
With this system, you can have the server compute the XOR of two plaintexts
The server returns:
Which the client XORs with
When you cancel out the keys (
The difference between partially and fully homomorphic encryption is that with a partial homomorphic system you can compute some functions on encrypted data but not others. With a fully homomorphic system you can compute any function, whereas this system is homomorphic with respect to XOR but not (say) to multiplication. The problem of fully homomorphic encryption had been open for a long time until Craig Gentry finally showed how to do it in 2009.
The reason that our simple approach was insecure is that the server has to know which values in the client's list are 0 and which are 1, and so can easily determine which column the client wants. But what if the server could perform this computation without determining which of the client's values was 1? Kushilevitz and Ostrovsky figured out how to do this in 1997, using a technique called homomorphic encryption. A homomorphic encryption system is one in which you can operate on encrypted data without seeing the content of the data (see the sidebar for some intuition on this).
Specifically, we want a homomorphic encryption scheme which is
homomorphic with respect to addition. I.e., if we have
two ciphertexts
The way this works is that the client sends:
The server would then compute:
The client receives this value, decrypts, and it's got the result. One thing that might be sort of confusing here is that I'm showing the server both adding, and multiplying, as in:
However, because the server is multiplying the encrypted value by a known value, it can do this just by addition, as in:
So, all you need is an addition operation. There are, of course,
tricks to make this faster. For instance, you can compute powers of
two (1, 2, 4, 8, etc.) and then just build up the final value from
those. If you want to multiply two encrypted values, e.g.,
Of course, finding a suitable homomorphic encryption scheme is tricky because you want something that is cheap to compute and has a small ciphertext. The original K-O scheme used a fairly inefficient homomorphic encryption system and much of the work here has been in finding better systems.
Detail: Homomorphic Encryption using ElGamal #
You don't need to understand how to build a homomorphic encryption algorithm in order to understand PIR, but it's sometimes helpful to see things written out. In this section, I describe a simple well-known scheme based on the ElGamal encryption algorithm.
In the ElGamal encryption system client and server share a known
value
The recipient—who recall has
- Take
from the message and raise it to to get - Divide the second part of the message by
to recover
Note that in ordinary integer math, given
However, it turns out that system is homomorphic with respect to multiplication, not addition. Consider the pair of ciphertexts:
If we multiply the first parts and the second parts together, we get:
You can decrypt this exactly as before to get
However, I said above, that we wanted something that was homomorphic
with respect to addition not multiplication. The trick here is that
instead of encrypting message
And you just need to take the discrete log to recover
To use this system in practice, the client is just going to encrypt to itself by generating a key that it knows, but otherwise we just use this system as-is.
Complexity #
This is all pretty cool and it's better than nothing, but it's
also not very efficient: in order to retrieve a single value
you need to send
In terms of computational complexity, the server has to
compute over each database entry, so that's
Improvements #
As described in the original K-O paper, it's possible to significantly improve the basic scheme by being a little clever.[6]
Reusing the Client's Vector #
The original K-O scheme was even worse than what I have presented above in that you could only extract single-bit values. This meant that if you wanted to extract multiple-bit values, you naively just repeat the protocol for each bit, so both the database size and the PIR protocol scale linearly with entry size.
This leads to an obvious improvement: say that I want to read values from a database where each entry is 8 bits rather than 1. As noted above, the client could just send 8 input vectors, but why? The client's vectors all pick out the same column in the database and they're not specific to anything on the server side.
Instead, the server can just compute its results for each of the 8
bits of the database using the same client input. The server then
sends back
Where
Recursion #
The next optimization requires a little more cleverness. Recall that the
server sends the client values corresponding to each row in the
database but that the client only cares about one of the rows.
Say we have a database that is consists of
The key insight here is that this itself is a PIR problem, with the
database consisting of
But why stop there? We can keep using the same trick!
Imagine we have a really big database of
Further Improvements #
Even with the optimizations above, we're still left with a system
which isn't very efficient, especially for smaller data sets,
where it's quite a bit worse than just transferring the entire
database (the advantage goes as a factor of
This seems to fall into a small number of buckets.
Improving the Inner Loop #
If you step back and look at the basic design of the K-O protocol,
it looks like this (I'm using the linear algebra matrix multiplication
notation here, but really the
In other words, the input vector supplied by the client operates on each row of the database, picking out the column of interest to the client and ignoring the other values (remember that rows in the input vector correspond to columns we want to select). The server sends back each resulting row and the client reads the row of interest, ignoring the others. This basic structure holds whether the operation being performed is simple multiplication (as in our insecure example) or homomorphic encryption.
This means that the cost of the system is determined by the
basic scaling properties of
Reducing the Client's Input Vector #
There is another cute trick we can play, that's a natural extension of the techniques we have already seen. Suppose that we have a homomorphic encryption scheme that lets me:
- Add as many encrypted values as I want
- Do a single multiplication of two encrypted values
In this case, we can reduce the communication cost further, as described by Boneh, Goh, and Nissim. Instead of sending a single list of encrypted values, containing a single (encrypted) 1, the client sends a pair of lists, each containing a single (encrypted) 1. The server then computes the product of each pair of values in each list, e.g.,
We can then lay this out in a deterministic order left to right and top to bottom (though any rule will work) as a single list, like so:
This list can then be used as the input to the standard K-O
protocol, and we've just reduced the total number of values
the client sends from
Precomputation #
One interesting recent development in PIR is the design of systems which use precomputation to make the PIR process cheaper. The basic idea is that with a suitable homomorphic algorithm the server and client can perform some initial exchange, presumably involving some computation and the exchange of some data (a "hint"). Once the hint has been exchanged, the client can make individual queries much more cheaply. This makes sense for applications like Safe Browsing where the client is likely to make a lot of queries and so you can amortize the hint.
The specific precomputation techniques vary. In some designs, the client and server perform some client-specific precomputation and in others like SimplePIR, the server just does the computation itself and distributes the hint to every client.
Other Designs #
I've focused here specifically on designs that follow this K-O model, largely because they are intuitively easy to explain. There are also designs (for instance Cachin, Micali, and Stadler and Gentry and Ramzan) that are based on other structures and involve sending less data but at increased computation cost. The math here is a lot harder—I only somewhat understand it myself—so I'm not going to try to explain them here.
The Big Picture #
In conclusion, I'd like to make two points here. First, this is a really counterintuitive —at least to me—result: we can allow a client to read some fraction of the server's data without the server learning anything about which values the client wants and in a fashion more efficient than just sending the client all the data. Hopefully, this post gives some intuition for why that's possible, thus rendering it less counterintuitive if not precisely obvious.
Second, PIR is an immensely powerful primitive. There are a whole pile of problems which would be much easier if we had efficient PIR, ranging from Safe Browsing, to messaging interoperability, to authentication for phone calls. We're not yet at the point where you can just drop in PIR the way you would drop in TLS, without really thinking about the cost, but we are getting closer to the point where some of these applications are practical. In fact we may already be there in some cases.
Acknowledgement #
Thanks to Henry Corrigan-Gibbs for assistance with this post. All mistakes are of course mine.
This is the standard leadin to this problem, as seen, for instance, in the Wikipedia article. ↩︎
Indeed, the longer hashes version of Safe Browsing is precisely this. ↩︎
As an aside, it's known that it's not possible to have information theoretic security with a single server. You have to depend on some cryptographic assumption. There are information theoretically secure versions of multi-server PIR, as long as some of the servers are not malicious. ↩︎
Yes, yes, or on an elliptic curve or something. ↩︎
Recall that you have to send two values for each ciphertext. ↩︎
Note: I am using a somewhat different presentation order which I think is easier to understand. ↩︎