Each year, it seems that a select group of topics dominates headlines and retroactively characterizes that year as “the year of X,” with “X” representing the most notable topic. In 2014, several themes dominated the news and trending topics across social media sites: data breaches at major retailers, the World Cup, the Sochi Winter Olympics, the Ice Bucket Challenge and the Ebola virus outbreak. While these are disparate topics, they share a common denominator: big data. Technologies such as electronic health records, mobile applications, social media, wearable devices, the Internet of Things, and sensors are producing a treasure trove of big data. Patients are sharing symptoms and “checking in” at medical clinics on apps and social media sites, while wearables track other health markers. The ability to combine and analyze these myriad sources of data is transforming the practice of epidemic tracking.

Big data plays an increasingly large role across many aspects of business and shows particular promise in public health, specifically in epidemic tracking.

The Ebola virus outbreak offers strong evidence of the important role big data can play in epidemic tracking. When news of the epidemic began breaking in March of 2014, most of the official forecasts on the spread of Ebola came from the US Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO). Initially, both the CDC and WHO relied primarily on conventional epidemiological approaches and measures to estimate how far and how quickly the disease was likely to spread. The CDC recognized that these traditional tools were not adequate and turned to big data to sharpen its insights. It had been piloting BioMosaic, a tool that merges health, population and movement data to predict the spread of disease, for several years, and realized that the tool was remarkably well suited to Ebola. BioMosaic gave the CDC a near real-time view of the global air transportation network, and enabled it to identify at-risk populations and create a mosaic map of the diaspora population, covering both travelers on the move from affected areas and the resident population in the US. As the outbreak threatened to become a global pandemic, other organizations also turned to real-time data to see if they could detect patterns that would help better predict how, and where, the disease might spread next.
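The idea of merging health, population and movement data can be illustrated with a small sketch. This is not BioMosaic's actual method, which the article does not describe in detail; it is a minimal example under simple assumptions. The place codes, case counts, populations and passenger volumes are all hypothetical, and the score used here (passengers multiplied by the origin's case prevalence) is only one plausible way to rank destinations.

```python
from collections import defaultdict

# Hypothetical figures for illustration only; not real surveillance data.
cases_by_origin = {"A": 120, "B": 40}                      # reported cases
population_by_origin = {"A": 1_000_000, "B": 2_000_000}    # residents
passengers = {                                             # (origin, destination) -> monthly air passengers
    ("A", "X"): 5000,
    ("A", "Y"): 1000,
    ("B", "X"): 2000,
    ("B", "Z"): 8000,
}


def rank_destinations(cases, population, passengers):
    """Rank destinations by expected number of infected arrivals.

    Score for a destination = sum over origins of
    (passengers on that route) * (cases / population at the origin).
    Every origin that appears in `passengers` must appear in `population`.
    """
    scores = defaultdict(float)
    for (origin, destination), count in passengers.items():
        prevalence = cases.get(origin, 0) / population[origin]
        scores[destination] += count * prevalence
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_destinations(cases_by_origin, population_by_origin, passengers)
for destination, score in ranked:
    print(f"{destination}: {score:.2f}")
```

With these made-up inputs, destination X ranks first because it receives the most travelers from the origin with the highest prevalence. A real system would also need to account for connecting flights, reporting delays and uncertainty in the case counts.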

HealthMap, a disease-monitoring website maintained by a team of researchers and epidemiologists at Boston Children’s Hospital, is one such source of big data analytics. The site provides early detection and real-time surveillance of emerging health threats by aggregating and analyzing data from multiple sources, including social media streams, online news stories, travel sites, and official reports. The CDC, for its part, unveiled a new software tool, the Epi Info viral hemorrhagic fever (VHF) application, to help identify people exposed to the virus faster than traditional reporting methods allowed. The release of the Epi Info VHF application coincided with the launch of the U.S. government’s Global Health Security Agenda to strengthen national security by helping other nations prevent, detect, and effectively respond to disease outbreaks. Over the next five years, the initiative will strengthen the health infrastructure of at least 30 partner countries with a combined 4 billion citizens.

Ebola isn’t the only health epidemic for which big data is proving to be a useful tool; the flu is another illness that big data is helping to monitor and abate. When a recent flu epidemic in Boston and New York infected hundreds and killed 18, app developers and health officials turned to big data for help.

In both the Ebola and the flu epidemics, social media played a large part in supplying massive data sets that helped identify outbreaks, forecast where the diseases might spread next, and offer clues as to where new outbreaks might be developing. Ming-Hsiang Tsou, a professor at San Diego State University and an author of a recent study examining the complex relationship between disease events and messages on social media sites, believes algorithms that map social media posts and mobile phone data hold enormous potential for helping researchers track epidemics. Given the popularity of social media sites, infectious disease surveillance systems that use data-sharing technologies to track social media data could inform early warning systems about disease outbreaks, as well as facilitate communication between health-care providers and local, national and international health authorities.
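To make the early-warning idea concrete, here is a minimal sketch of how symptom mentions in social media posts might be turned into an alert. It is not the method used in Tsou's study or by any surveillance system named in this article. The keyword list, the seven-day window and the three-standard-deviation threshold are all assumptions chosen for illustration, and the daily counts are invented.

```python
from statistics import mean, stdev

# Hypothetical keyword list; a real system would need far more careful text analysis.
SYMPTOM_TERMS = {"fever", "flu", "cough"}


def mentions_symptom(text):
    """Return True if the post contains any symptom keyword as a whole word."""
    words = {word.strip(".,!?#").lower() for word in text.split()}
    return bool(words & SYMPTOM_TERMS)


def flag_spikes(counts, window=7, k=3.0):
    """Return the indices of days whose count is unusually high.

    Day i is flagged when counts[i] exceeds the mean of the preceding
    `window` days by more than k sample standard deviations.
    """
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        threshold = mean(baseline) + k * stdev(baseline)
        if counts[i] > threshold:
            flagged.append(i)
    return flagged


# Invented daily counts of symptom-related posts for one city.
daily_counts = [10, 12, 11, 9, 10, 11, 12, 40, 11]
print(flag_spikes(daily_counts))
```

In this invented series, only the day with 40 mentions stands out against the preceding week. In practice, keyword counts are noisy (news coverage alone can drive a spike), so such a signal would be one input to an early warning system rather than a diagnosis.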

Beyond data tracking, data storage and data analytics are also central to big data’s role in tracking and predicting health epidemics. Innovative healthcare data solutions that enable secure management of vast amounts of patient data are essential, as is the ability to use and share that data enterprise-wide and to gain efficiencies of scale through cloud solutions and virtualization. Additionally, researchers must be able to quickly and reliably access data and extract insights from extremely large data sets.

According to the McKinsey Global Institute, using data to better predict the healthcare needs of the U.S. population could save between $300 billion and $450 billion. With possible savings of 10% of the entire U.S. medical bill, as well as the potential to predict, track and stop epidemics, insights from big data could be the prescription for better care, lower costs and lives saved. As John McDaniel, practice leader for the U.S. Healthcare Provider Market at NetApp, stated in a recent Forbes article, “There’s no question. This is the future of healthcare.” And, likely, the future of epidemic tracking.