
Improving the CDC’s prediction models using Luminoso

The Centers for Disease Control relied on Luminoso to surface trends and improve their prediction models


As pandemics rise around the world, this government-run public health organization has recognized the need for faster, more accurate analytics. It relies on such analytics, specifically predictive models, to measure the spread of disease and inform how it responds to new and evolving situations.

The CDC was already using sophisticated models built on quantitative data reported after the fact from doctors’ offices, hospital emergency rooms, and urgent care centers. While reliable, that data was outdated by the time it reached the organization. To be truly effective in its educational outreach, the organization needed models that could actually be predictive.

After partnering with Luminoso, the CDC launched a far-reaching project to integrate data, including social media and free text, to more accurately predict trends in the spread of the flu. The overall objective was to prove the value of real-time analysis of unstructured data in understanding the spread and severity of a wide range of health challenges, from MERS to Ebola, and even viruses and diseases that have not yet become known.


The initial focus of the project was to marry unstructured data with existing metrics to enhance predictive models. The Luminoso team put together a framework that allowed them to monitor core concepts and expressions that would distinguish actual flu cases from other diseases. It would also correlate mentions of symptoms with how those symptoms changed over time, to track progression and severity.
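To make the idea of tracking symptom mentions over time concrete, here is a minimal sketch. It is not Luminoso's framework: it swaps in plain substring matching for concept-level analysis, and the symptom list, the function name symptom_counts_by_day, and the sample posts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical symptom vocabulary; the source does not list the tracked terms.
SYMPTOM_TERMS = ("fever", "cough", "chills")


def symptom_counts_by_day(posts):
    """Count how many posts mention each symptom term on each day.

    posts: iterable of (date, text) pairs.
    Returns a dict mapping each date to a Counter of symptom -> post count.
    """
    counts = defaultdict(Counter)
    for day, text in posts:
        lowered = text.lower()
        for term in SYMPTOM_TERMS:
            # Simple substring match, standing in for real concept detection.
            if term in lowered:
                counts[day][term] += 1
    return dict(counts)


# Made-up sample posts, for illustration only.
sample_posts = [
    (date(2020, 1, 1), "Fever and a cough all night"),
    (date(2020, 1, 1), "This cough will not quit"),
    (date(2020, 1, 2), "Chills today, staying in bed"),
]
daily = symptom_counts_by_day(sample_posts)
```

Comparing the per-day counters shows which symptoms are rising or fading, which is the kind of change over time the paragraph above describes.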

• Luminoso tracked how many people had the flu – even in cases where the afflicted only tweeted an emoji – and improved the organization’s prediction model

• Amid the Ebola outbreak, they used Luminoso to identify misinformation on social media and publicly address it


Twitter was one data source used to monitor mentions of the flu. By using Compass, Luminoso’s solution for analyzing streaming data, the organization could process and categorize more than 8,000 tweets per minute.

As more unstructured data was collected and combined with their existing models and algorithms, the Situational Awareness team modified their framework to further increase the clarity and accuracy of their predictive model. 

This helped them determine:

• How many people had the flu

• How effective the flu vaccine was

• The severity and duration of flu cases, and the virulence of the current flu strain

Social posts and free text references aligned with their reporting

Luminoso’s application of artificial intelligence and natural language processing enabled the organization to identify new cases of the flu and other diseases, even when the diseases weren’t explicitly mentioned.

For example, people on social media who used the phrase “shopping for Nyquil” were more likely to be suffering from the flu than people who simply said that they were “home sick from work.” The team also found that the pill emoji 💊 was a strong indicator of a flu case, even if the person tweeting provided no other information.
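The idea that some expressions signal a flu case more strongly than others can be sketched as a weighted lookup. This is a simplified illustration, not Luminoso's method, which relies on natural language processing rather than fixed phrases. The weights below are invented (the source gives no numeric values), and INDICATOR_WEIGHTS and flu_signal_score are hypothetical names.

```python
# Hypothetical weights: the source says only that "shopping for Nyquil" and the
# pill emoji were stronger indicators than "home sick from work".
INDICATOR_WEIGHTS = {
    "shopping for nyquil": 0.8,
    "💊": 0.7,
    "home sick from work": 0.3,
}


def flu_signal_score(text: str) -> float:
    """Return the weight of the strongest indicator found in text, or 0.0."""
    lowered = text.lower()
    return max(
        (weight for phrase, weight in INDICATOR_WEIGHTS.items() if phrase in lowered),
        default=0.0,
    )
```

With these assumed weights, a tweet containing only 💊 scores higher than one saying “home sick from work,” mirroring the ordering described above.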

Tracking and managing public perception of Ebola

Less than halfway into the flu season, the Situational Awareness team got the chance to test Luminoso Compass when the Ebola crisis struck. The team found itself waging a battle against misinformation and misperception that required fast, accurate, and actionable analytics.

It was critical for the team to quickly identify and track public concerns and conspiracy theories so that it could respond with accurate and relevant information. Given the slower processing time of its traditional models, the team found itself relying almost exclusively on Luminoso’s Compass tool to keep up with the thousands of social media messages streaming in by the minute.

Over the course of six weeks, the public health organization used Compass to collect and analyze Twitter data about Ebola. 

They found that:

• There were three major themes: the Ebola outbreak in Africa, Ebola patients outside of Africa, and the Ebola zeitgeist or topical mentions of Ebola

• There was also a major concern that Ebola might become airborne

• Conspiracy theories credited a government or shadow organization with the development and spread of Ebola. Because Ebola emerged in Africa, conspiracy theories abounded about Ebola being developed specifically to kill people of certain races.

Ultimately, the real-time nature of Luminoso Compass allowed the CDC to quickly surface these concerns and plan an effective response to educate and reassure the public.

For more examples of Luminoso in action, check out our resources section, and be sure to follow us on Twitter @LuminosoInsight

