AI Model Offers New Approach to Infectious Disease Forecasting
First use of large language modelling enhances outbreak prediction accuracy.

A new artificial intelligence (AI) model developed by researchers at Johns Hopkins University and Duke University has demonstrated improved performance in predicting the spread of infectious diseases.
The model, called PandemicLLM, is the first to apply large language models (LLMs) to the task of infectious disease forecasting. According to a retrospective evaluation of COVID-19 trends across the United States, PandemicLLM outperforms current state-of-the-art models, particularly during periods of dynamic change.
The research, published in Nature Computational Science, highlights how AI tools that incorporate real-time data and reasoning can support more effective public health responses to emerging disease threats.
Limitations of traditional forecasting approaches
Forecasting the spread of infectious diseases such as influenza or COVID-19 is a complex challenge. Traditional statistical models tend to perform well when epidemiological conditions are stable. However, these models struggle to adapt when new viral variants arise, public health policies shift or population behaviors change.
The COVID-19 pandemic exposed these limitations. When the virus evolved or when mitigation strategies such as mask mandates were introduced or lifted, forecasts often failed to accurately predict the resulting changes in transmission patterns.
PandemicLLM aims to overcome these challenges by using LLMs to incorporate a wider range of information, and to reason about how this information may interact to affect disease dynamics.
Integrating diverse data streams
Unlike earlier forecasting tools that primarily relied on epidemiological trends, PandemicLLM processes four distinct types of input data:
- State-level spatial data: Demographic information, healthcare system capacity and political factors at the state level.
- Epidemiological time series data: Trends in reported infections, hospitalizations and vaccination rates.
- Public health policy data: The type and strictness of interventions such as stay-at-home orders, mask mandates and vaccination campaigns.
- Genomic surveillance data: The prevalence and characteristics of circulating virus variants.
By combining these diverse streams of information, PandemicLLM is able to reason about how shifts in one area (for example, the emergence of a new variant) might influence trends in another (such as hospital admissions).
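To make the idea of combining data streams concrete, the following is a minimal sketch of how the four input types listed above might be rendered as a single text prompt for a language model. It is not the researchers' implementation: the `StateWeekInputs` class, the `build_prompt` function, the field names, the prompt wording and all values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StateWeekInputs:
    """Hypothetical container for one state's inputs in a given week."""
    state: str
    spatial: dict[str, str]              # state-level spatial data
    weekly_hospitalizations: list[int]   # epidemiological time series
    policies: dict[str, str]             # intervention -> current status
    variants: dict[str, float]           # variant -> share of sequenced samples


def build_prompt(inputs: StateWeekInputs) -> str:
    """Render the four data streams as one text prompt (illustrative format)."""
    spatial = "; ".join(f"{k}={v}" for k, v in sorted(inputs.spatial.items()))
    series = ", ".join(str(n) for n in inputs.weekly_hospitalizations)
    policies = "; ".join(f"{k}={v}" for k, v in sorted(inputs.policies.items()))
    variants = "; ".join(
        f"{name}={share:.0%}" for name, share in sorted(inputs.variants.items())
    )
    return "\n".join([
        f"State: {inputs.state}",
        f"Spatial: {spatial}",
        f"Weekly hospitalizations (oldest to newest): {series}",
        f"Policies: {policies}",
        f"Variants: {variants}",
        "Question: How are hospitalizations likely to change next week?",
    ])


# Made-up example values, not real surveillance data.
example = StateWeekInputs(
    state="Example State",
    spatial={"population_density": "high", "hospital_beds_per_1000": "2.4"},
    weekly_hospitalizations=[120, 135, 160, 210],
    policies={"mask_mandate": "lifted"},
    variants={"variant_A": 0.7, "variant_B": 0.3},
)
prompt = build_prompt(example)
print(prompt)
```

Placing all four streams in one prompt is what would let a model relate, for example, a rising variant share to the recent hospitalization trend. How PandemicLLM actually encodes its inputs is described in the paper, not in this sketch.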
Testing the model
To evaluate the model’s performance, the research team applied it retrospectively to the COVID-19 pandemic, generating weekly forecasts for individual US states over a 19-month period.
They compared the results to those of models submitted to the Centers for Disease Control and Prevention’s CovidHub, a platform that compiles and evaluates infectious disease forecasts from multiple research groups.
PandemicLLM demonstrated greater accuracy than the other models in the comparison, particularly during periods when the outbreak was in flux. The model accurately predicted infection patterns and hospitalization trends one to three weeks in advance.
By incorporating real-time policy and genomic data, PandemicLLM moved beyond the limitations of models that rely mainly on historical case data. This ability to reason about emerging information proved especially valuable when conditions changed rapidly.
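A retrospective comparison of this kind can be pictured with a short sketch that scores each model's forecasts against what was later observed, grouped by how many weeks ahead the forecast was made. The example below assumes mean absolute error as the metric, which the article does not specify. The model names, states and numbers are made up for illustration and are not results from the study.

```python
from collections import defaultdict
from statistics import mean

# Each row: (model, state, horizon_weeks, predicted, observed).
# All values are invented placeholders, not real forecasts or outcomes.
forecasts = [
    ("model_a", "State X", 1, 100, 110),
    ("model_a", "State X", 2, 120, 150),
    ("model_a", "State Y", 1, 80, 70),
    ("model_b", "State X", 1, 105, 110),
    ("model_b", "State X", 2, 140, 150),
    ("model_b", "State Y", 1, 72, 70),
]


def mae_by_model_and_horizon(rows):
    """Mean absolute error for every (model, horizon) pair."""
    errors = defaultdict(list)
    for model, _state, horizon, predicted, observed in rows:
        errors[(model, horizon)].append(abs(predicted - observed))
    return {key: mean(values) for key, values in errors.items()}


scores = mae_by_model_and_horizon(forecasts)
for (model, horizon), error in sorted(scores.items()):
    print(f"{model}, {horizon} week(s) ahead: mean absolute error {error}")
```

Scoring by horizon keeps one-week and three-week forecasts separate, since errors typically grow the further ahead a model is asked to predict.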
Potential applications
While PandemicLLM was tested using COVID-19 data, the model can be adapted to forecast other infectious diseases, provided that relevant input data are available. The researchers suggest potential applications for diseases such as avian influenza, monkeypox and respiratory syncytial virus (RSV).
The team is also investigating whether LLMs could be used to model individual-level health decisions, such as whether to seek vaccination or comply with public health guidance. Incorporating such behavioral modelling could further enhance forecasting tools and inform more effective public health interventions.
Supporting future public health response
The COVID-19 pandemic has highlighted the need for more adaptive and informative disease forecasting tools. PandemicLLM represents an important step in this direction by demonstrating how AI models that reason across multiple data streams can improve forecasting accuracy.
As future outbreaks are inevitable, tools such as PandemicLLM could provide valuable support for public health officials tasked with predicting, tracking and managing the spread of infectious diseases.
Reference: Du H, Zhao Y, Zhao J, et al. Advancing real-time infectious disease forecasting using large language models. Nat Comput Sci. 2025. doi:10.1038/s43588-025-00798-6
This content includes text that has been generated with the assistance of AI.