Improving vaccine registration
Posted by ekr on 17 May 2021
Here in the United States we've rapidly gone from a situation where there was overwhelming demand for the COVID vaccine to one where supply far outstrips demand and the major concern is how to get people to take it. However, until late April and early May, there was a huge amount of contention for vaccination appointments. I think it's clear that this process did not go as smoothly as possible. In particular:
Vaccines were unevenly distributed, with areas where there was extra supply not too far away from areas where there was high demand. For instance, people here in the Bay Area were driving an hour or two to Stockton or Tracy to get vaccinated before they could here.
Every time a new tranche of people became eligible (e.g., 50 or above, 16 or above), there would be a mad rush for appointments, with the Web sites taking time to be updated, people having trouble registering, etc.
Networking people will recognize this as basically a queueing problem: we have a fixed amount of capacity and demand that exceeds that capacity, so we have to find a way of scheduling the rate at which things happen. It's important to recognize that in a situation like this, some people are going to have to wait. There's really nothing to be done about that other than make more capacity or wait for demand to subside. Our scheduling objective is to ensure an efficient process. Specifically, this means:
Make sure you're using all of your available capacity. In this case that means that you're giving out doses as fast as possible and that none go to waste.
Serve the highest priority customers first.
Minimize the amount of overhead in the scheduling system itself. For instance, it's inefficient for people to be constantly trying to reload the Web site, waiting in giant lines, or have to subscribe to some service to find out where they can get doses.
Schedule people at convenient times.
One of the things that has complicated the vaccine rollout has been trying to balance the first and second objectives. It's relatively easy to use up all your capacity if you don't care who you serve first, but that may mean that you're only serving rich people or -- especially important in the case of COVID -- that you're serving young healthy people rather than people who are at high risk. Conversely, you can enforce strict prioritization but if you don't have enough people in a given tranche at a given time -- or can't reach them -- then you aren't going to efficiently use all your capacity. In the worst case, you won't give out all of your doses; there were certainly plenty of reports of this happening in the early days of the US vaccination effort. Which of these to prioritize is a policy judgement, but there's inherently a tradeoff.
I'm most familiar with the system in California, which looked something like this:
Break people up into priority tranches based on -- or at least intended to be based on -- risk level.
Open up registration for the highest priority closed tranche.
Continue at the current level until current demand subsides and you start to have open appointments.
Repeat steps 2-3.
This wasn't terrible but had fairly predictable results: because the tranches were so large compared to the amount of vaccination capacity, as soon as registration opened for a new tranche you would have a period of chaos until demand died down to a sustainable level. The primary cause of this problem is fragmentation, both horizontal and vertical. If there was just one place for people to get vaccinated and all times were equally good, then you could just take people in the order they registered and have a strict queue and life would be simple. However, in reality there are multiple vaccination sites (in Santa Clara County, each with its own registration process) and multiple time slots, and so instead of just taking the first slot, people spend time looking through different location/time possibilities to find one that works. Unfortunately, everyone else is doing the same thing, which means that by the time you have discovered that that 2 AM appointment in Lodi is really the best you can get, someone else has snapped it up. In the worst case, people may even find that appointments have been taken in the time between when they are shown the list and when they pick one.
Of course, people respond to this by holding (or even booking) the first appointment they can get, figuring they will cancel if they find something better, which of course just makes the system more unstable, with slots opening and closing and everyone just getting frustrated. Because everyone is repeatedly trying to book appointments, there is far more load on the system -- and churn in what appointments are available -- than if people just got on, registered, and got off. In some cases, the site itself can be overloaded: I had the Santa Clara site just stall on me when I was trying to book one appointment.
I want to emphasize here that this is just people responding rationally to the situation they find themselves in; the problem here is the design of the system itself. What's going on here is that every time you open up registration to a new tranche you create a temporary state of very high demand, which overloads the registration system. Because there is finite demand, this will eventually fix itself, but the situation can be improved by avoiding the surge in the first place. Consider what happens if instead of allowing everyone over 50 to book at the same time we randomly selected one eligible person, let them book (say, up to a week out) and then moved on to the next person. In this case, each person would fairly quickly pick the appointment that was best for them without worry that they would lose a slot.
While efficient in terms of time spent by customers, this system is impractically slow: even if people choose relatively rapidly, it will still take too long to fill all the slots. However, we can approximate this by letting people book in small batches:
Allow everyone to register with the system for a place in line. This can be done well before you open up a new tranche, because you're just adding them to a list. Some states already did this via systems like MyTurn.
Periodically, randomly select people out of the registered group and offer them the right to actually book an appointment.
Monitor the load on the appointment system and once the rate of appointment requests starts to decline, go back to step 2.
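The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `WaitingList`, `admit_batch`, and `next_batch_size` are hypothetical, and the specific adjustment rule (halve the batch when the booking system is busier than a target rate, grow it by a quarter otherwise) is an arbitrary choice for the example, not something any real registration system is known to use.

```python
import random
from dataclasses import dataclass, field


@dataclass
class WaitingList:
    """Step 1: people who have registered for a place in line."""
    waiting: list[str] = field(default_factory=list)

    def register(self, person_id: str) -> None:
        # Registration only adds a name to a list, so it can happen
        # long before a tranche opens. IDs are assumed to be unique.
        self.waiting.append(person_id)

    def admit_batch(self, batch_size: int, rng: random.Random) -> list[str]:
        """Step 2: randomly pick up to batch_size people and remove them."""
        k = min(batch_size, len(self.waiting))
        chosen = rng.sample(self.waiting, k)
        chosen_set = set(chosen)
        self.waiting = [p for p in self.waiting if p not in chosen_set]
        return chosen


def next_batch_size(current: int, request_rate: float, target_rate: float,
                    min_size: int = 10, max_size: int = 10_000) -> int:
    """Step 3: choose the size of the next batch from the observed load.

    request_rate and target_rate are booking requests per unit time.
    The halve/grow-by-a-quarter rule is an illustrative assumption.
    """
    if request_rate > target_rate:
        proposed = current // 2                   # too busy: admit fewer
    else:
        proposed = current + max(1, current // 4)  # quiet: admit more
    return max(min_size, min(max_size, proposed))
```

In use, an operator would call `admit_batch` to send out booking invitations, watch the request rate on the appointment system, and feed that rate into `next_batch_size` before the next round.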
This system has a number of nice properties. First, it lets you continuously adjust the rate at which you admit people into the booking system: if things get too full and there's too much churn, you just wait a little while and then admit fewer people next time. If things get too slow, you admit some more people. Second, it's a lot less frustrating for users because the set of appointments is reasonably stable; rather than forcing them to take the first appointment they see whether it's one they like or not, they can look at the appointments and take something like the best one for them. Of course, this all works better if you have centralized scheduling for a large region and works badly if you have very decentralized scheduling because it's hard for the individual sites to know who is currently eligible to book. In those cases, one can use a simpler algorithm where you just do a lottery by birthdate. This doesn't give you as fine-grained control of the input rate and still requires some way to announce when each day's tranche becomes available, but would work better than the current system.
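As a rough sketch of the birthdate lottery, the code below shuffles the 366 possible birthdays and assigns each one an opening date. Everything here is an assumption made for illustration: the function name `birthday_lottery`, the idea of publishing a seed so anyone can reproduce the schedule, and the `birthdays_per_day` knob for how many birthdays open on each calendar day.

```python
import random
from datetime import date, timedelta


def birthday_lottery(start: date, seed: int,
                     birthdays_per_day: int = 1) -> dict[tuple[int, int], date]:
    """Map each (month, day) birthday to the date it may start booking."""
    # Enumerate birthdays from a leap year so that February 29 is included.
    first = date(2020, 1, 1)
    birthdays = []
    for i in range(366):
        d = first + timedelta(days=i)
        birthdays.append((d.month, d.day))

    # A seeded shuffle makes the order random but reproducible.
    random.Random(seed).shuffle(birthdays)

    # Open birthdays_per_day birthdays on each successive calendar day.
    return {
        bd: start + timedelta(days=i // birthdays_per_day)
        for i, bd in enumerate(birthdays)
    }


schedule = birthday_lottery(date(2021, 4, 1), seed=42, birthdays_per_day=7)
opens_on = schedule[(7, 4)]  # opening date for people born on July 4
```

The appeal for decentralized sites is that eligibility can be checked from the birthdate on someone's ID and a published table, with no shared database. The only rate control is `birthdays_per_day`, which is much coarser than adjusting batch sizes.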
I want to note that this is fundamentally different from having more fine-grained eligibility criteria, which seems problematic. It's already the case that there was a lot of debate about the specific prioritization and probably some bad behavior around the edges, and making the criteria more fine-grained just makes the situation worse. The idea here isn't to try to prioritize people better, it's just to meter the flow of people into the system so that we don't overload scheduling capacity. With that said, it arguably is fairer because you're allocating appointments randomly rather than based on who presses reload on their browser the fastest.
Of course, we're still left with the problem of people booking appointments and then just not showing up. If you book precisely as many appointments as you have doses available, then some people won't show and you'll have some left over. This isn't that big an issue in a big vaccine site because even the mRNA vaccines can be stored in the refrigerator for a few days and so extra can mostly just be kept around (or alternately, you can overbook a little bit the way that airlines do). For a small site, it's important to have some way of offering leftovers to people who don't have appointments -- or perhaps aren't even eligible -- because it's better to get shots in arms than to have doses go to waste.
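To make the overbooking idea concrete, here is a minimal back-of-the-envelope sketch. It assumes every booked person independently fails to show with the same probability, which is a simplification, and the function name `appointments_to_book` is hypothetical. It picks the largest number of appointments whose expected turnout does not exceed the doses on hand.

```python
import math


def appointments_to_book(doses: int, no_show_rate: float) -> int:
    """Largest booking count whose expected turnout is at most `doses`.

    Assumes each person independently misses their appointment with
    probability no_show_rate (an illustrative simplification).
    """
    if not 0.0 <= no_show_rate < 1.0:
        raise ValueError("no_show_rate must be in [0, 1)")
    # Expected turnout is n * (1 - no_show_rate); solve for n, round down.
    return math.floor(doses / (1.0 - no_show_rate))


# With 1000 doses and a guessed 5% no-show rate:
print(appointments_to_book(1000, 0.05))  # prints 1052
```

Because this only matches the expected turnout, on some days more people will show up than there are doses, so a site using it would still want a small buffer or a way to rebook people.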
I don't mean to sound ungrateful here: turning around a new vaccine in less than a year is nothing short of miraculous and after a bit of a slow start local officials have done an amazing job of quickly getting people vaccinated under very tough conditions. However, once the most serious phase of the emergency is over it's always important to ask what we could do better; this seems like one place for potential improvement.
To be clear, the people writing these bots did us all a service, but it's unfortunate that they had to do it. ↩︎
This precise formulation isn't something you face in networking, but a related one is. Consider what happens if you are trying to use the same network for real-time conferencing and bulk data transfer: You need the channel empty at the right times so that you can send the video and audio frames, otherwise they get queued and you get jitter. You want to schedule the data transfer for periods where the channel would be idle. ↩︎
The cost/benefit calculation here is tricky: obviously it's far better for any unvaccinated individual if they get a given vaccine dose than if someone else does, but it's even worse for the vaccine dose to go to waste, because other people being vaccinated helps you to some extent. This means that we need to optimize the overall system, not just ask if the "wrong person" occasionally gets vaccinated. ↩︎
The analogous situation in networking is called "congestion collapse". If you have a network link which cannot handle all the traffic people want to send through it, it responds by dropping packets. If the endpoints respond by retransmitting, you can get into a state where everyone is trying to send aggressively and the link gets clogged with retransmissions, with the result that the link is full but not carrying much useful data. See Van Jacobson and Karels for a good description of how this happens and how to avoid it. The situation is actually quite a bit more complicated in networking because the endpoints don't know the total capacity of the network. ↩︎
You could of course do first come first served from the line you established in step 1, but then you create a race for initial registration. It's probably better to just randomly pick or to pick out of all the people who registered in a given week, etc. ↩︎