Infectious Disease Forecast

Accurate infectious disease forecasts can aid public health responses, e.g., by informing risk communication, supporting logistical planning, and guiding mitigation strategies. Given these promising applications, reliably predicting the most likely future outcomes (e.g., the number of cases and hospitalizations) during an epidemic has been a long-term goal of public health. In collaboration with Dr. Jeffrey Shaman and other colleagues, we have worked extensively on infectious disease forecasting since ~2012 and have developed various forecast systems, including for seasonal influenza (e.g., summary here) and the 2014-2015 West African Ebola outbreak (e.g., Shaman et al. 2014 PLoS Currents Outbreaks).

To provide additional lead time (e.g., 6 months in advance to inform vaccine and antiviral manufacturing), our recent work has focused on developing accurate, well-calibrated long-lead infectious disease forecasts. For instance, in a 2023 study (Yang & Shaman 2023 PLoS CB), we developed a long-lead COVID-19 forecast system with lead times of up to 6 months by incorporating three strategies to address three major technical challenges (i.e., system chaos, error growth, and viral mutations). Tallied over >25,000 retrospective predictions through September 2022, the forecast approach using all three strategies improved probabilistic forecast accuracy by ~50% and point prediction accuracy by ~100% compared to a baseline approach without those strategies. Promisingly, real-time forecasts generated in early October 2022 for the 2022-2023 respiratory virus season were also generally accurate. Building on this proof-of-concept work, we are currently developing real-time long-lead forecasts for both COVID-19 and influenza.

Figure. Real-time forecasts for the 2022-2023 respiratory virus season. The states are arranged by the accuracy of historical forecasts (higher accuracy for those in the left panel and those toward the top). In each panel, each row shows estimates and forecasts of weekly numbers of infections (1st column), cases (2nd column), or deaths (3rd column) for one state. Vertical dashed lines indicate the week of forecast initiation (i.e., October 2, 2022). Dots show reported weekly cases or deaths, including for the forecast period. Blue lines and blue areas (line = median; darker blue = 50% CI; lighter blue = 95% CI) show model training estimates. Red lines and red areas (line = median; darker red = 50% predictive interval; lighter red = 95% predictive interval) show model forecasts using the best-performing approach.
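
For readers unfamiliar with the summaries shown in the figure, the following is a minimal sketch of how a median and 50%/95% predictive intervals could be derived from an ensemble of forecast trajectories, and how interval coverage (one simple check of calibration) could be tallied against later observations. It is illustrative only and is not the forecast system described above: the function names (`quantile`, `summarize_ensemble`, `coverage`), the linear-interpolation quantile rule, and the synthetic ensemble are all assumptions made for this example.

```python
import random


def quantile(sorted_vals, p):
    """Linear-interpolation quantile of an already sorted list, 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def summarize_week(members):
    """Median and 50%/95% predictive intervals for one forecast week."""
    vals = sorted(members)
    return {
        "median": quantile(vals, 0.5),
        "pi50": (quantile(vals, 0.25), quantile(vals, 0.75)),
        "pi95": (quantile(vals, 0.025), quantile(vals, 0.975)),
    }


def summarize_ensemble(ensemble):
    """ensemble[m][t] is ensemble member m's prediction for week t (hypothetical layout)."""
    n_weeks = len(ensemble[0])
    return [summarize_week([member[t] for member in ensemble]) for t in range(n_weeks)]


def coverage(summaries, observed, key):
    """Fraction of observed values that fall inside the chosen interval ('pi50' or 'pi95')."""
    hits = sum(s[key][0] <= y <= s[key][1] for s, y in zip(summaries, observed))
    return hits / len(observed)


if __name__ == "__main__":
    # Synthetic 300-member, 4-week ensemble; the numbers carry no epidemiological meaning.
    random.seed(0)
    weekly_means = [100, 120, 150, 130]
    ensemble = [[random.gauss(mu, 0.1 * mu) for mu in weekly_means] for _ in range(300)]
    for week, s in enumerate(summarize_ensemble(ensemble), start=1):
        print(week, round(s["median"], 1), s["pi50"], s["pi95"])
```

For a well-calibrated forecast, roughly 50% of observations would be expected to fall within the 50% predictive intervals and roughly 95% within the 95% intervals when tallied over many predictions.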