Authors: Nathan Wolfe
In one particularly notable application, he’s taken the tedious and complex work of cell culture, where cells from mammals and other organisms are grown under laboratory conditions, from the bench to the chip. The chips he and his team have created, just a few centimeters long, house ninety-six separate compartments where cells grow for weeks at a time and can be carefully measured and manipulated. Automated, compact cell culture on a chip has many applications, and one of them is the fast, efficient evaluation of new viruses from large numbers of specimens. It’s not difficult to imagine a chip-based system that quickly tells us in what kind of cells a new agent can survive and therefore how it’s most likely to spread (e.g., by sex, blood, sneezes, and so on).
When we see an outbreak, there are a number of questions we’d like to have answered. First, what’s the microbe behind it? Techniques like viral microarrays and high throughput sequencing are increasing the speed at which we can identify new agents and also helping us to find things that we’d have missed through older techniques. But once we’ve identified a microbe, we want to know where it’s going. We’ll return in chapter 12 to a vision of what the ultimate pandemic prevention system will look like, but it would certainly involve approaches like those developed by the Andino lab to assess the potential evolutionary directions that a virus can take. And the tools that Quake’s group has developed might one day form a set of high-speed chips that quickly evaluate how it’s likely to spread.
* * *
Modern information and communication technology provides us with another set of tools that does something distinct and complementary to the biotech advances discussed above. In fact, some of this technology is sitting in your pocket as you read this sentence.
In one of our research sites in southwest Cameroon sits a rubber plantation called Hevecam, where we conducted an experiment. This experiment represents one of the exciting new trends in public health. And it’s all based on simple cell phones.
In Hevecam, a plantation with nearly a hundred thousand inhabitants, when individuals get sick they go to a local clinic. If they’re sufficiently ill, they then move from that local clinic to the referral hospital in the center of the plantation. Yet traditionally there has been no good way for the referral hospital to monitor what’s happening in the local clinics. A few years ago Lucky Gunasekara, who now heads up our program on digital epidemiology, and his partners at the nonprofit FrontlineSMS:Medic that he co-founded, set up a simple system based on text messages to allow the referral hospital to monitor what was occurring in the local clinics. By simply texting a series of preset codes, the vast majority of vital clinical information could be communicated up the medical hierarchy clearly, instantly, and efficiently. Using predetermined codes and simple text message forms, the local clinics could rapidly inform everyone else of how many cases of malaria, diarrhea, and other illnesses they were seeing.
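To make the idea of preset codes concrete, here is a minimal sketch of how such reports might be tallied once they arrive. The message layout (a clinic ID followed by CODE=count pairs), the code table, and the function names are all assumptions made for illustration; the actual format used at Hevecam is not described here.

```python
from collections import Counter

# Hypothetical code table; the real preset codes are not given in the text.
CODES = {"MAL": "malaria", "DIA": "diarrhea"}

def parse_report(text):
    """Split one text message into (clinic_id, {illness: case_count})."""
    parts = text.split()
    if not parts:
        return None, {}
    clinic, fields = parts[0], parts[1:]
    counts = {}
    for field in fields:
        code, _, value = field.partition("=")
        # Ignore unknown codes and malformed counts rather than guessing.
        if code in CODES and value.isdigit():
            counts[CODES[code]] = int(value)
    return clinic, counts

def tally(messages):
    """Add up case counts per illness across all clinic reports."""
    totals = Counter()
    for message in messages:
        _, counts = parse_report(message)
        totals.update(counts)
    return totals

# Made-up example messages from two clinics.
reports = ["C01 MAL=12 DIA=3", "C02 MAL=5 XYZ=9"]
totals = tally(reports)
```

In this sketch the unknown code XYZ is simply dropped. A real system would more likely flag it for follow-up.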
Simple technologies can have dramatic impact. With a few simple techniques, medical conditions at Hevecam could be monitored not only in the referral hospital but also remotely over a web dashboard for anyone with appropriate access. By allowing local clinicians or patients themselves the capacity to communicate, information can be accumulated, organized, and analyzed, leading to a much more rapid and localized sense of what’s going on during a health emergency.
Something just like this occurred during the earthquake in Haiti in 2010. Immediately after the earthquake, organizations like Ushahidi[2] set up short, free codes to which people could text “help” messages. They then turned to the local DJs who, along with popular word of mouth, publicized the numbers. Amazingly, when the dust cleared, the statistical analysis of the text message distributions mapped accurately onto high-resolution aerial imagery of damage. Effectively, people’s text messages gave highly informative clues as to where the greatest damage occurred. More importantly for those in Haiti, the messages saved lives, with the critical information transmitted to the heroic rescue workers on the scene.
Similar systems have been used during outbreaks, such as the cholera outbreak in Haiti in the fall of 2010. The ultimate hope is that outbreak detection can be crowdsourced, with small bits of information provided by sufferers converging into a real-time picture of the beginnings of outbreaks and their subsequent spread. The short codes are only the start. As more and more countries adopt electronic medical records, people around the world will increasingly link to them directly by reporting their health complaints from their phones. This information will not only provide more efficient medical care to individuals who report illness; when analyzed across large numbers of users, it will also allow more rapid and sensitive detection of health anomalies. Eventually, response systems can be developed to recognize unusual clusters of health complaints that signal the beginnings of an epidemic. With that, the age of digital epidemiology will truly have begun.
* * *
One of the critiques of using text messaging as an early indicator of disease spread is that even under the direst circumstances not everyone will text. But there are ways to use cell phones that don’t require their users to do a thing.
At the moment I’m writing this sentence, over 60 percent of the world’s population has been planted with automated locating beacons. These beacons provide constantly updating information on exactly where they are. Within the next five to ten years, virtually everyone on the planet will have one. This is not a government plot. It is the mobile phone in your pocket.
Cell phones constantly communicate with cell towers, providing telecom operators with an incredibly rich amount of data about where their customers are, how the customers are connected to each other, and with a bit of interpretation, the social behaviors of their users. These so-called call data records have provided huge data opportunities for the telecoms to understand their clients and sell them more services. But the massive data sets have much more value than sales. This constant flow of seemingly innocuous information could save your life.
The data collected by cellular telephone companies makes us all potential sensors for rapid detection of important human events. This was shown elegantly by Nathan Eagle, a member of the innovative MIT Media Lab and one of the pioneers in applying call data records to generalized problems. Along with his colleagues, Eagle sought to investigate what could be known about an earthquake by mining call data records.
Eagle and his team studied data on calling patterns in Rwanda for three years, including the critical week of February 3, 2008, when a 5.9 magnitude earthquake occurred in the Lake Kivu region. By establishing a baseline for the frequency of calls, Eagle and his team were able to see telltale clues of unusual calling patterns during the period immediately following the earthquake. They were able to detect the time of the quake through a peak in call numbers. They were also able to establish the epicenter of the quake by using location data from cell towers, placing the epicenter central to the locations of the heaviest call volumes.
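The two steps described above, spotting a spike against a baseline and then locating its center, can be sketched in a few lines. This is only an illustration under simple assumptions: the threshold of three standard deviations, the function names, and all numbers are invented, and it is not a reconstruction of the analysis Eagle and his team actually ran.

```python
from statistics import mean, stdev

def find_anomalous_hours(hourly_calls, baseline_hours, threshold=3.0):
    """Return indices of hours after the baseline window whose call count
    lies more than `threshold` standard deviations above the baseline mean."""
    baseline = hourly_calls[:baseline_hours]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return []  # a perfectly flat baseline gives no usable scale
    return [
        i for i, calls in enumerate(hourly_calls)
        if i >= baseline_hours and (calls - mu) / sigma > threshold
    ]

def weighted_centroid(towers):
    """Estimate a center from (latitude, longitude, excess_calls) tuples,
    weighting each tower by how far its traffic rose above normal."""
    total = sum(weight for _, _, weight in towers)
    lat = sum(la * weight for la, _, weight in towers) / total
    lon = sum(lo * weight for _, lo, weight in towers) / total
    return lat, lon

# Invented data: six ordinary hours, then a sharp jump in calls.
calls = [100, 104, 98, 102, 96, 100, 310, 180]
spike_hours = find_anomalous_hours(calls, baseline_hours=6)

# Invented tower positions and excess call counts.
center = weighted_centroid([(-2.0, 29.0, 10), (-2.4, 28.8, 30)])
```

The weighted average treats latitude and longitude as flat coordinates, which is a reasonable shortcut only over a small region.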
The idea that using data derived from cell phones can detect an earthquake in space and time is amazing. It also suggests a range of different applications. Individuals who are ill may have fundamentally different call patterns than those who are not, and call patterns may also alter as a new outbreak spreads. Analyses of call data records alone might not provide perfect early detection of a new outbreak, but combined with other sources of outbreak data from organizations like ours and other health institutions, they might help us chart early epidemic spread.
* * *
Cell phones are growing more ubiquitous by the day and will likely be critical tools in helping to detect and respond quickly to outbreaks before they become pandemics. Yet they are not the only technology-heavy solutions being used in the growing field of digital surveillance. In 2009 my colleagues at Google[3] published a fascinating paper showing that individuals’ online search patterns also provide a sense of what people are becoming infected with.
With the vast stores of search data kept by Google and US influenza surveillance data collected by the CDC, the team was able to calibrate their system to determine the key search words that sick people or their caregivers used to indicate the presence of illness. The team used searches on words related to influenza and its symptoms and remedies to establish a system that accurately tracked the influenza statistics generated by the CDC. In fact, they did better. Since Google search data is available immediately, and CDC influenza surveillance data lags because of time needed for reporting and posting, Google was able to beat the CDC in providing accurate influenza trends before the traditional surveillance system.
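One simple way to picture the calibration step is to keep only those search terms whose weekly frequency rises and falls with the official figures. The sketch below uses a plain Pearson correlation as a stand-in; it is not the model the Google team published, and the cutoff of 0.9, the function names, and all of the numbers are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no trend to compare
    return cov / sqrt(var_x * var_y)

def select_terms(term_series, official, min_r=0.9):
    """Return, in alphabetical order, the terms whose weekly search
    frequency correlates with the official series at or above min_r."""
    return sorted(
        term for term, series in term_series.items()
        if pearson(series, official) >= min_r
    )

# Invented weekly figures: official illness rate and search counts.
official_rate = [1.0, 2.0, 4.0, 3.0, 1.5]
searches = {
    "flu symptoms": [10, 20, 40, 30, 15],
    "fever remedy": [12, 19, 41, 33, 14],
    "football scores": [30, 10, 20, 25, 35],
}
flu_terms = select_terms(searches, official_rate)
```

Once the informative terms are chosen, their combined frequency can be read the moment the searches happen, which is where the head start over slower official reporting comes from.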
Early data on seasonal influenza, as provided by the Google Flu Trends system, is interesting and potentially important. This early data provides health organizations time to order medications and prepare for different triage needs. But early detection of seasonal influenza is not the Holy Grail. That honor would go to a system that could detect a newly emerging pandemic. Google is now working to extend its influenza findings to other kinds of diseases. As more and more people use search engines like Google, and more and more data is acquired, the hope is that better and better trend analyses will be developed for agents other than influenza. Perhaps at some point a community experiencing the beginning of a pandemic will signal its arrival just by Googling.
* * *
The explosion of online social media provides another set of big data in which weak but potentially valuable signals of a coming plague may be found. Computer scientists, like Vasileios Lampos and Nello Cristianini from the University of Bristol, have taken an approach similar to that of the scientists at Google, sorting through hundreds of millions of Twitter messages. Like their colleagues at Google, Lampos and Cristianini used key words to watch trends in Twitter and find associations with influenza statistics, in this case provided by the UK’s Health Protection Agency.
In 2009 they tracked the frequency of tweets related to influenza during the H1N1 pandemic and found they were able to track the official health data with 97 percent accuracy. As with the findings by the Google Flu Trends team, this work provides a rapid and potentially inexpensive way to supplement traditional epidemiological data gathering. It also has the potential to be extended to more than just influenza.
While online social media can be scanned to see what people are communicating about, online social networking may provide a richer and subtler range of possible uses. In fascinating recent work, two leading social scientists, Nicholas Christakis and James Fowler, have studied how social networks can inform surveillance for infectious diseases.
In a clever experiment, these two scientists followed Harvard students who were divided into two groups. The first group was randomly selected from the Harvard student population. The second group was chosen from individuals that the first group named as friends. Because individuals near the center of a social network are likely to be infected sooner than those on the periphery, Christakis and Fowler hypothesized that during an outbreak the friend group would become infected sooner than the random, and therefore on average less socially central, group. The results were dramatic. During an influenza outbreak in 2009, the friend group became infected on average fourteen days ahead of the randomly chosen group.
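The reasoning behind the friend group, that people named as friends tend to be better connected than people picked at random, can be checked on a toy network. The network, names, and functions below are invented for illustration and have nothing to do with the Harvard data.

```python
def mean_degree(friends):
    """Average number of friends per person, as if sampling at random."""
    return sum(len(named) for named in friends.values()) / len(friends)

def mean_named_friend_degree(friends):
    """Average number of friends held by the people who get named,
    counting a person once for every time someone names them."""
    degrees = [len(friends[f]) for named in friends.values() for f in named]
    return sum(degrees) / len(degrees)

# Invented network: one well-connected person and four who know only them.
network = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
random_group_degree = mean_degree(network)
friend_group_degree = mean_named_friend_degree(network)
```

In this toy case a randomly chosen person has 1.6 friends on average, while a named friend has 2.5, because the well-connected person is named again and again.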
The hope is that social science can identify novel kinds of sentinels to monitor for new outbreaks and catch them early.[4] Determining friends would be time consuming, however; it is something we could accomplish on a single college campus but perhaps not nationally. Now self-identified friends in massive online social networks may make this task much easier. Online social networks like Facebook, while not designed to help monitor for outbreaks, have created relatively easy-to-monitor systems that can be mined to determine the frequency of illness, identify social sentinels, and perhaps eventually provide predictions for spread of a new agent within a community.
* * *
When John Snow created the first Geographic Information System in 1854, he took actions that would seem very logical and straightforward to us today. He took a map, he plotted where sick people were, and he plotted possible sources of contagion. Snow could not have predicted the directions in which his first tentative step would lead or the data that would eventually become available for today’s GIS.
In the end it may be that no single data source reigns supreme. If Snow were alive today and investigating an outbreak, he’d want it all. He’d want to know where the sick people were, and he’d be glad to get the data more quickly and easily through text messages or Internet searches. He’d like to know exactly what cases were infected with, down to the very specific microbial genetic strain. He’d seek to use call data records to monitor people’s movements in order to track the movement of the disease or where it was seeded. He’d like to know how people were connected socially, and he’d certainly follow individuals who were likely to become infected first or show signs earlier than the rest.
You can imagine the ultimate outbreak GIS, or in terms more familiar to Silicon Valley, what Lucky Gunasekara, the head of my data team, calls the ultimate outbreak mash-up: a map with layer after layer of critical information (where people are, what they’re concerned about, what they’re infected with, where they’re moving, and who they’re connected to). Developing and maintaining this combined digital and biological mash-up is the precise objective of Lucky’s team and something to which we’ll return in the final chapter of this book. Ideally, over time the data can be analyzed jointly, the various factors can be trained on actual outbreaks, and all the technology can be weighted optimally to maximize predictive power.