Big Data vs. Disease Epidemics

Can big data be used as a social good in the fight against disease epidemics?

Chris Fabian with Manuel Garcia-Herranz & Alex Rutherford
Children queue to be vaccinated against yellow fever in the northern district of Séguéla
16 September 2016

In a world that is more connected than ever, it is increasingly easy for diseases to spread, not only within countries and regions but also along frequent routes of international travel. The increase in epidemics such as yellow fever, together with experiences from the recent Ebola outbreak in West Africa and the ongoing Zika epidemic, highlights this new global risk. There is also a new expectation of how quickly the international community needs to respond.

Responses to epidemics are predicated on real-time data at a national and global level. Whether that information tells those involved in the response where to focus their limited resources, how the people most at risk are thinking about a threat, or what information to provide to affected populations, international organizations need to be able to know and act more quickly than ever before.

During the Ebola crisis, UNICEF was able to leverage two existing powerful data platforms to inform its work:

  1. U-Report, a system that allows for real-time, two-way communication with young people through SMS, allowed us to quickly understand young people’s needs in Liberia. For instance, young people needed accurate information about signs, symptoms and ways to prevent Ebola to combat the many rumors that were circulating within local communities. Youth also wanted to know what social behaviours were safe, as well as when they could go back to school.
  2. Edutrac, built on the same technology that powers U-Report, allowed us to gather real-time information about the needs in schools in Sierra Leone and helped us ensure that hygiene equipment had been delivered to schools.

These two products, with millions of combined users, are part of UNICEF’s innovative approach to development programming. They allow us to use real-time data to better inform policy and advocacy work with governments as well as to strengthen our ability to deliver key services to the world’s most vulnerable children.

A third platform that emerged during the Ebola crisis has yet to reach the scale of U-Report and Edutrac, but has the potential to help us understand emerging shocks in real-time.  This technology is a prototype in its early stages (and cursed with the name of a technology prototype, the Magic Box).  With the Magic Box, we began to combine data sources from our private sector partners showing where people were moving with UNICEF’s epidemiological (case) data.  This approach was based on existing academic research using mobility data to predict and understand epidemics such as the international spread of Ebola ([1]), the national spread of Malaria in Kenya ([2]), and the Cholera outbreak following Haiti’s earthquake ([3]).  Our prototypes also took into account strong claims from the scientific community on the importance of having better mobility data to respond to the Ebola epidemic ([4]).  It is also based on the pioneering efforts of UN Global Pulse to construct partnerships and collaborations with data providers and give UN agencies access to digital exhaust.
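
To make the idea concrete, here is a minimal sketch in Python of one way mobility and case data could be combined. It is an illustration only, not the Magic Box's actual method: the function name `importation_risk`, the region labels and all of the numbers are made up for the example, and the score (travellers multiplied by registered cases at the origin) is a deliberately crude assumption.

```python
from collections import defaultdict


def importation_risk(flows, cases):
    """Rank destinations by case-weighted inbound travel.

    flows: dict mapping (origin, destination) -> aggregated traveller count
    cases: dict mapping region -> registered case count
    Both inputs are hypothetical, pre-aggregated (non-individual) data.
    """
    risk = defaultdict(float)
    for (origin, destination), travellers in flows.items():
        if origin == destination:
            continue  # movement within a region imports nothing new
        # Crude assumption: risk grows with travel volume and origin case load.
        risk[destination] += travellers * cases.get(origin, 0)
    # Highest score first
    return sorted(risk.items(), key=lambda item: item[1], reverse=True)


# Made-up example data for three regions
flows = {("A", "B"): 100, ("A", "C"): 10, ("B", "C"): 50, ("C", "A"): 5}
cases = {"A": 20, "B": 2, "C": 0}

ranking = importation_risk(flows, cases)
print(ranking)  # [('B', 2000.0), ('C', 300.0), ('A', 0.0)]
```

In this toy example region B ranks first because it receives the most travellers from the region with the most cases. A real analysis would need to account for population size, reporting delays and the privacy constraints on the underlying data.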

The work done with Magic Box v.0.1 during Ebola was crude:

  • Our prototype and the mobility data were not ready until after the outbreak, so we couldn’t work in real-time.
  • We had limited data sources from partners so insights were limited (only mobility data from one major mobile network operator per country in two countries in West Africa).

However, v.0.1 pointed to an interesting possibility. UNICEF, along with private and public sector partners, could take the massive amounts of data from relatively unstructured sources to develop insights about the world around us.

Data growth of unstructured data
Slideshare/ET Center

It was a hypothesis – but data from this early work showed a correlation between people’s mobility and the spread of Ebola.

Mobility data pointing at the south ferry line as the most common route of travel from an infected area in Kaffu (in blue) to Freetown area.
UNICEF Innovation

The Zika epidemic allows us to take this system one step further. Influenced by recent academic research on the spread of Dengue ([5]) and new methods for using social media to assess emergencies ([6]), Magic Box v.0.5 was built as a collaboration between Google engineers and UNICEF’s Office of Innovation. The work around Zika, which is still ongoing and in its very early stages, has allowed us to bring new partners into the effort to combat global risks. Amadeus, which provides more than 40% of global travel bookings, joined as an initial data provider, while also generously offering support from their core business and engineering teams. IBM recently entered the collaboration with support from its weather data teams. In addition to these three incredibly connected and technologically savvy partners, UNICEF continues to work with UN Global Pulse and through the UN Innovations Network, which it co-chairs with UNHCR and WFP, to drive a research agenda that will benefit other parts of the United Nations.

Connecting aggregate data sources from private sector partners is not easy. We have to understand issues of privacy, different corporate languages, and a variety of technology stacks. We also need to link data to action. But this early-stage prototype points to a new set of possibilities: if we can bring more collaborators into the rapidly growing technology partnership, we can also look at data in new ways. Combining weather, population movement, models of mosquito prevalence and registered cases of a disease has potential uses in an epidemic like Zika. But this data can also tell us other stories: working in conjunction with UNICEF’s gold-standard Multiple Indicator Cluster Survey and other household surveys, it can provide real-time visibility into issues that affect children.

New collaborations, like the Magic Box, give UNICEF the capacity to link the needs of the world’s most vulnerable populations to a rapidly expanding set of technology and data-driven solutions and partners.  These networks are in their early stages, but the effort already signals a world where a real-time understanding of risks and global challenges allows us to work better, and faster, for children.

Human mobility estimated from the aggregated movements of Twitter users in Brazil.
UNICEF Innovation




Christopher Fabian (@hichrisfabian on Twitter) is a technologist who leads UNICEF Innovation’s Ventures Unit based in New York. Fabian’s work with UNICEF is focused on finding innovative solutions to some of the world’s most complex problems, particularly those faced by children. Fabian studied philosophy at the American University in Cairo and at Trinity College in Dublin. He also holds a degree in Media Studies from the New School in New York, NY.

Manuel Garcia-Herranz (@Ranzher on Twitter) is the Lead Research Scientist in the UNICEF Office of Innovation. Manuel has a PhD in computer science and is interested in the study of computational social networks, complex systems and behavioral dynamics and in how new types of data and analysis can be used for human development. His work in the last few years has focused on epidemic outbreaks, nowcasting, and estimating socioeconomic indicators from digital exhaust, as well as developing protocols and methodologies for data sharing.

Alex Rutherford (@arutherfordium on Twitter) is a Research Scientist within the UNICEF Office of Innovation, and previously a data scientist with Global Pulse, which is an innovation initiative within the Secretary General’s Office. Alex has a PhD in condensed matter physics and has spent the last 8 years, including several spent in the Middle East, applying the theoretical physics toolkit to a disparate set of problems. These include precursors of ethnic violence, social mobilisation, incentives to corruption, data privacy and constitutional reform. Alex enjoys explaining the thrill of data science to non-experts and explaining the thrill of data science within the UN to data science experts.



[1] Gomes MFC, Pastore y Piontti A, Rossi L, Chao D, Longini I, Halloran ME, Vespignani A. Assessing the International Spreading Risk Associated with the 2014 West African Ebola Outbreak. PLOS Currents Outbreaks. 2014 Sep 2. Edition 1. doi: 10.1371/currents.outbreaks.cd818f63d40e24aef769dda7df9e0da5.

[2] Wesolowski, A., Eagle, N., Tatem, A. J., Smith, D. L., Noor, A. M., Snow, R. W., & Buckee, C. O. (2012). Quantifying the impact of human mobility on malaria. Science, 338(6104), 267-270.

[3] Bengtsson L, Lu X, Thorson A, Garfield R, von Schreeb J (2011) Improved Response to Disasters and Outbreaks by Tracking Population Movements with Mobile Phone Network Data: A Post-Earthquake Geospatial Study in Haiti. PLoS Med 8(8): e1001083. doi: 10.1371/journal.pmed.1001083

[4] Halloran, M. E., Vespignani, A., Bharti, N., Feldstein, L. R., Alexander, K. A., Ferrari, M., … & Del Valle, S. Y. (2014). Ebola: mobility data. Science, 346(6208), 433-433.

[5] Wesolowski, A., Qureshi, T., Boni, M. F., Sundsøy, P. R., Johansson, M. A., Rasheed, S. B., … & Buckee, C. O. (2015). Impact of human mobility on the emergence of dengue epidemics in Pakistan. Proceedings of the National Academy of Sciences, 112(38), 11887-11892.

[6] Kryvasheyeu, Y., Chen, H., Obradovich, N., Moro, E., Van Hentenryck, P., Fowler, J., & Cebrian, M. (2016). Rapid assessment of disaster damage using social media activity. Science Advances, 2(3), e1500779.