NOTE: Forecasting for the 2016/17 season has concluded. Forecasting will resume in November 2017.
Influenza (flu) is a respiratory virus that can result in illness ranging from mild to severe. Each year, millions of people get sick with influenza, hundreds of thousands are hospitalized and thousands of people die from flu. Tracking flu activity to inform prevention measures is an important public health function that is currently performed by CDC’s flu surveillance system, which can lag behind real-time flu activity. But what if it were possible to predict flu activity accurately weeks or months in advance for multiple locations? While this is not currently possible, the goal of flu forecasting is to provide a more-timely and forward-looking tool that health officials can use to target medical interventions, inform earlier public health actions, and allocate resources for communications, disease prevention and control. The potential benefits of flu forecasting are significant.
Since 2013, the Influenza Division at the Centers for Disease Control and Prevention has worked with external researchers to improve the science and usability of influenza forecasts by coordinating seasonal influenza prediction challenges. This work includes defining prediction targets, facilitating data access, establishing evaluation metrics to assess accuracy, and developing forecast visualizations.
Twenty-one research teams have developed different flu forecasting models and are providing flu activity forecasts to CDC for the 2016/17 influenza season. This beta website houses the weekly influenza activity forecasts provided by the various research teams. It’s important to note that these are not CDC forecasts and that the forecasts on this website are not endorsed by CDC. These forecasts are based on different models, can vary significantly, and may be inaccurate.
Interested in participating in the challenge? Please email flucontest@cdc.gov for more information
Use the tool below to explore user-submitted forecasts for the 2016/17 influenza season.
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including New Jersey, New York, Puerto Rico, and the U.S. Virgin Islands
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17*
Including Delaware, District of Columbia, Maryland, Pennsylvania, Virginia, and West Virginia
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, and Tennessee
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Illinois, Indiana, Michigan, Minnesota, Ohio, and Wisconsin
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Arkansas, Louisiana, New Mexico, Oklahoma, and Texas
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Iowa, Kansas, Missouri, and Nebraska
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Colorado, Montana, North Dakota, South Dakota, Utah, and Wyoming
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Arizona, California, Hawaii, and Nevada
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
Including Alaska, Idaho, Oregon, and Washington
The "Average" prediction is an unweighted mean of all submitted forecasts, while the "Hist-Avg" prediction is based on the past 20 years of observed ILI activity.
NOTE: Forecasting for the 2016/17 season has concluded and predictions will not be updated further.
Four week ahead predicted ILI levels - updated 5/17/17
Predicted probabilties of ILINet values 1-4 weeks ahead - updated 5/17/17
Predicted probabilties for week of season onset and season peak - updated 5/17/17
Predicted probabilties for season peak ILI% - updated 5/17/17
2016-17 Participating Teams
Carnegie Mellon University's Delphi group
Team FORSEA
Group Health Research Institute
Harvard University
Institute of Cognitive Science
Iowa State University
Knowledge Based Systems, Inc.
Los Alamos National Laboratory
Northeastern University's MOBS-Lab
University of Massachusetts' Kernel of Truth
University of Melbourne - Defence Science and Technology Group
University of Minnesota's HumNat Lab
Virginia Tech's 4Sight
For each week during the season, participants will be asked to provide national and regional probabilistic forecasts for the entire influenza season (seasonal targets) and for the next four weeks (four-week ahead targets). The seasonal targets are the onset week, the peak week, and the peak intensity of the 2016-2017 influenza season. The four-week ahead targets are the percent of outpatient visits experiencing influenza-like illness (ILI) one week, two weeks, three weeks, and four weeks ahead from date of the forecast.
Onset Week
Definition The onset of the season is defined as the MMWR surveillance week when the percentage of visits for influenza-like illness (ILI) reported through ILINet reaches or exceeds the baseline value for three consecutive weeks (updated 2016-2017 ILINet baseline values for the US and each HHS region will be available the week of October 10, 2016). Forecasted "onset" week values should be for the first week of that three week period.
Motivation Accurate and timely forecasts for the start of the season can be useful in planning for influenza prevention and control activities. For the general public, the start of the season offers an important opportunity to take preventive measures, such as getting vaccinated, before flu becomes widespread. For clinicians and public health authorities, the start of the season indicates that influenza should be high on their list of possible diagnoses for patients with respiratory illness. This is particularly important for the management of hospitalized patients and high-risk patients with suspected influenza when early treatment with influenza antivirals can be critical.
Seasonal Peak Week
Definition The peak week will be defined as the MMWR surveillance week that the weighted ILINet percentage is the highest for the 2016-2017 influenza season.
Motivation Accurate and timely forecasts for the peak week can be useful for planning and promoting activities to increase influenza vaccination prior to the bulk of influenza illness. For healthcare, pharmacy, and public health authorities, a forecast for the peak week can guide efficient staff and resource allocation.
Seasonal Peak Intensity
Definition The intensity will be defined as the highest numeric value, rounded to one decimal place, that the weighted ILINet percentage reaches during the 2016-2017 influenza season.
Motivation Accurate and timely forecasts for the peak week and intensity of the influenza season can be useful for influenza prevention and control, including the planning and promotion of activities to increase influenza vaccination prior to the bulk of influenza illness. For healthcare, pharmacy, and public health authorities, a forecast for the peak week and intensity can help with appropriate staff and resource allocation since a surge of patients with influenza illness can be expected to seek care and receive treatment in the weeks surrounding the peak.
Short Term Forecasts
Definition One- to four-week ahead forecasts will be defined as the weighted ILINet percentage for the target week, rounded to one decimal place.
Motivation Forecasts capable of providing reliable estimates of influenza activity over the next month are critical because they allow healthcare and public health officials to prepare for and respond to near-term changes in influenza activity and bridge the gap between reported incidence data and long-term seasonal forecasts.
ILINet Data
Data on the weekly proportion of people seeing their health-care provider for influenza-like illness (ILI) is reported through the ILINet System for the United States as a whole and for each HHS health region. These data can be accessed directly from CDC. Alternatively, the R package cdcfluview
(available from CRAN or GitHub) can be used to access the data as shown in the following example
# Option 1: Install from CRAN
install.packages("cdcfluview")
# Option 2: Install from GitHub (most up-to-date version)
devtools::install_github("hrbrmstr/cdcfluview")
library(cdcfluview)
# National ILINet data for 1997/98 - 2015/16 seasons
usflu <- get_flu_data(region = "national", data_source = "ilinet", years = 1997:2015)
# HHS Regional ILINet data for 1997/98 - 2015/16 seasons
regionflu <- get_flu_data(region = "HHS", sub_region = 1:10, data_source = "ilinet", years = 1997:2015)
Please note that while cdcfluview
accesses publically available CDC data, it is not produced, maintained, or endorsed by the CDC.
Additional Data
Teams are welcome to use data sources for model development beyond ILINet - possible additional data sources include but are not limited to:
- Carnegie Mellon University's Delphi group's Epidata API
- Health Tweets
All forecasts will be evaluated using the weighted observations pulled from the ILINet system in week 28, and the logarithmic score will be used to measure the accuracy of the probability distribution of a forecast. Logarithmic scores will be averaged across different time periods, the seasonal targets, the four-week ahead targets, and locations to provide both specific and generalized measures of model accuracy. Forecast accuracy will be measured by log score only. Nonetheless, forecasters are requested to continue to submit point predictions, which should aim to minimize the absolute error (AE).
Logarithmic Score
If ;;\mathbf{p};; is the set of probabilities for a given forecast, and ;;\mathbf{p_i};; is the probability assigned to the observed outcome ;;i;;, the logarithmic score is:
$$S(\mathbf{p},i) = \text{ln}(p_i)$$
For onset week and peak week, the probability assigned to that correct bin (based on the weighted ILINet value) plus the probability assigned to the preceding and proceeding bins will be summed to determine the probability assigned to the observed outcome. In the case of multiple peak weeks, the probability assigned to the bins containing the peak weeks and the preceding and proceeding bins will be summed. For peak percentage and 4-weeks-ahead forecasts, the probability assigned to the correct bin plus the probability assigned to the five preceding and five proceeding bins will be summed to determine the probability assigned to the observed outcome. For example, if the correct peak ILINet value is 6.5%, the probabilities assigned to all bins ranging from 6.0% to 7.0% will be summed to determine the probability assigned to the observed outcome.
For all targets, if the correct bin is near the first or last bin, the number of bins summed will be reduced accordingly. No bin farther than one bin (onset and peak week) or five bins away (percentage forecasts) from the correct bin will contribute to the score. For example, if the correct ILINet percentage for a given week is 0.3%, probabilities assigned to bins ranging from 0% to 0.8% will be summed. Undefined natural logs (which occur when the probability assigned to the observed outcome was 0) will be assigned a value of -10. Forecasts which are not submitted (e.g. if a week is missed) or that are incomplete (e.g. sum of probabilities greater than 1.1) will also be assigned a value of -10.
Example: A forecast predicts there is a probability of 0.2 (i.e. a 20% chance) that the flu season starts on week 44, a 0.3 probability that it starts on week 45, and a 0.1 probability that it starts on week 46 with the other 0.4 (40%) distributed across other weeks according to the forecast. Once the flu season has started, the prediction can be evaluated, and the ILINet data show that the flu season started on week 45. The probabilities for week 44, 45, and 46 would be summed, and the forecast would receive a score of ;;ln(0.6) = -0.51;;. If the season started on another week, the score would be calculated on the probability assigned to that week plus the values assigned to the preceding and proceeding week.
References
Gneiting T and AE Raftery. (2007) Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association. 102(477):359-378. Available at: https://www.stat.washington.edu/raftery/Research/PDF/Gneiting2007jasa.pdf
Rosenfeld R, J Grefenstette, and D Burke. (2012) A Proposal for Standardized Evaluation of Epidemiological Models. Available at: http://delphi.midas.cs.cmu.edu/files/StandardizedEvaluationRevised12-11-09.pdf.
Participating teams can submit your team's weekly submission here. Forecasts must be submitted by 23:59 EST (UTC -5) on the forecast date. Your local time/date will be displayed on your system.