Journal of Medical Internet Research


Key Takeaways

  • Potential mutations in avian flu strains pose a credible future pandemic risk.
  • A recent study developed a novel machine learning model to identify genomic features with potential for spillover to human hosts.
  • The approach may have additional applications, including guiding the development of targeted influenza vaccines.

A new viral pandemic looks increasingly likely in the coming decades [], and while we have some guesses, we ultimately don’t know where or when to expect it []. What the world needs is a NATO (North Atlantic Treaty Organization)–style early warning radar to alert us to these incoming threats. Ideally, the radar should distinguish between the viral majority that pose no risk to human health and the very few that do. And one more thing: it should warn us of this potential pandemic pest before it spills over into humans.

Is that too much to ask? Recent research suggests it might not be.

A team of virologists and computational biologists led by Liam Brierley, PhD, while at the University of Liverpool (now at the University of Glasgow), may have come up with a solution to the problem of preparing for future pandemics. The group developed a machine learning (artificial intelligence [AI]) model capable of predicting which strains of avian influenza circulating in animals have the potential to jump into humans. Their research paper [] is currently available as a preprint and is undergoing peer review—but the results at this early stage look promising.

Traditional phylogenetics is a retrospective tool, analyzing sequences from existing databases to flag new variants of potential pandemic concern [,]. In contrast, the Brierley et al [] paper introduces a predictive biophysical information layer to detect viruses with host-jumping potential. The approach was designed to identify protein and nucleic acid sequences that share a functional similarity with existing zoonotic influenza strains. As Brierley explained via email, “The idea behind the machine learning approach is, if we can identify what makes a zoonotic virus from these ‘first principles’ of protein function and nucleotide motifs, we can apply this to highly divergent sequences that a phylogeny would struggle to make inference about, because it would be equally distant from all sequences we currently know.” In other words, by considering factors like protein motif functionality, we might be able to identify signatures of host-jumping potential between distantly related viruses—something traditional phylogenetic analyses would miss.
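To make the contrast with phylogenetics concrete, here is a minimal sketch of an alignment-free feature: the dinucleotide composition of a single sequence. It is not the feature set used in the preprint; the function name `kmer_frequencies`, the choice of k = 2, and the toy sequence are all assumptions made for illustration. The point is only that such a description can be computed for any sequence, however distant it is from known strains.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=2, alphabet="ACGU"):
    """Return the relative frequency of every possible k-mer in one sequence.

    Hypothetical helper for illustration; not taken from the study.
    """
    # Accept DNA-style input by mapping T to U.
    sequence = sequence.upper().replace("T", "U")
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all possible k-mers so every sequence yields the same features.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows containing ambiguous characters (such as N) are left out of the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


# Toy sequence, invented for this example.
features = kmer_frequencies("AUGGCAUUGGCA")
```

Because every sequence maps to the same 16 numbers, two viruses can be compared on these features even when a tree would place them equally far from everything else.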

Avian flu has been around for centuries but is a relatively new pathogen in humans []. In 1997, authorities confirmed the first case of a virus carrying the signature H5 and N1 bird flu proteins in a human []. This new human-compatible virus has since spread rapidly through the world’s bird population. So far, there have been numerous confirmed human fatalities but no proof of human-to-human transmission [,]. H5N1 has also affected other mammals, decimating the sea lion and elephant seal populations in Peru in 2023 []. Farmworkers exposed to their livestock account for most of the more than 600 human cases reported to date [].

Brierley chose the H5N1 influenza virus as a use case for his analysis because it is a credible future pandemic threat, but he trained the model on all known subtypes of avian flu. He writes, “It’s important to capture that whole diversity if we want to be able to make predictions for new strains in future.”

An advantage for the number crunchers was the extensive avian influenza genetic database at their disposal. Currently, nearly 19,000 distinct viral sequences are available, covering 120 different viral subtypes. What is more, 618 samples came from humans. The research team trained its AI model on a subset of the database and then tested it against the remaining data. They were looking for reliable, genome-wide signals associated with human infection, alongside the physical and chemical properties of short protein regions identified by their model, with the samples taken from humans serving as the examples of spillover.
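The train-then-test step can be sketched in a few lines. The version below is an assumption-laden illustration, not the study's procedure: the function name `stratified_holdout`, the 25% test fraction, and the toy labels are all invented here. It splits the records label by label so that the rare human-derived samples appear in both the training and the held-out set.

```python
import random


def stratified_holdout(labels, test_fraction=0.25, seed=0):
    """Split record indices into train and test sets, label by label.

    Hypothetical helper; the split used in the study is not described here.
    """
    rng = random.Random(seed)
    # Group record indices by their label.
    by_label = {}
    for index, label in enumerate(labels):
        by_label.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Hold out the same share of each label (at least one record).
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels only: 1 = sampled from a human, 0 = sampled from a bird.
labels = [1] * 8 + [0] * 92
train_idx, test_idx = stratified_holdout(labels)
```

Splitting within each label matters when one class is rare, as it is here (618 human samples among nearly 19,000 sequences). A purely random split could leave too few human-derived sequences in the held-out set to measure anything.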

The results showed the model had a 91.9% chance of correctly identifying a virus at risk of spillover into humans. Notably, first-generation protein modeling software achieved this level of performance on a database that will likely grow and improve as surveillance programs add sequences from new outbreaks. While more data is needed, Brierley writes, “The leaps in performance will come from more advanced models, which are not decades away but rather, years or even months.”
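The text above does not say which metric lies behind the 91.9% figure. One widely used measure that is naturally read as a "chance" is the area under the ROC curve (AUROC): the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. Treating the reported figure as an AUROC is an assumption, and the scores and labels are invented toy values.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Hypothetical helper for illustration.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    # Compare every positive score with every negative score.
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy held-out predictions: model scores and true labels (1 = human-derived).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0]
result = auroc(scores, labels)
```

In this toy case, 11 of the 12 positive-negative pairs are ranked correctly. A value of 0.5 would correspond to guessing and 1.0 to a perfect ranking.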

While many regions of the influenza virus genome have similarities with each other, adding the protein motif predictions narrowed the spillover regions down to just a few key areas. These regions were small, often just two or three base pairs long, but mutations in them may be critical for allowing a bird virus to propagate within human cells.

The AI model could cut through the noise generated from thousands of viral base pairs and highlight just a few protein motifs or patterns that mattered. They were:

  • RNA polymerase complex: 9 motifs across the PA (polymerase acidic protein), PB1 (polymerase basic protein 1), and PB2 (polymerase basic protein 2) subunit genes, whose proteins are essential for the virus to replicate, were top of the list.
  • Virus binding: 1 motif within the hemagglutinin (HA) gene was included. The “H” in the H5N1 virus family is involved in binding to the host cells. Mutations here could allow the virus to recognize and infect a new host.
  • Replication: Motifs of the nucleoprotein (NP), a chaperone responsible for ensuring the host replicates the viral genes correctly and packages them for export, were included.
  • Immune evasion: Motifs within a nonstructural gene (NS1), the gene involved in dampening the host cell’s immune response after infection, were also included.

The model has already shown potential for use with other influenza viruses. It flagged rare influenza viruses like H10N8, which has been detected in humans [], as well as the H4 subtypes, which have not. Brierley notes that he can “highlight a couple of H4 sequences from H4N8 and H4N6 subtypes that seemed to have elevated zoonotic potential, and the next step would be laboratory studies to understand the exact mechanisms.”

Future priorities for this approach would include improved modeling and expanding the database, especially with viruses that cause asymptomatic infections and could therefore spread undetected more easily. Global health authorities can assist by monitoring for new emerging viruses and adding their genetic sequences to the database before they become more virulent.

AI will become an important tool in the public health armory. But despite the benefits reported in the study, machine learning cannot offer a complete solution. Brierley acknowledged the limitations of the AI approach, noting that “it can’t tell you exactly WHY a certain feature matters but it can tell you where we should start looking in deeper study,” and adding that “it can only do the job it’s trained for. We can’t, for example, make any predictions about spillover into other mammals from this model.” Furthermore, AI can only analyze samples taken at a single point in time; it cannot gauge the direction of viral spread. And the influenza virus is subject to modification by its hosts, which may affect its virulence and transmissibility. For example, the addition of sugar and phosphate molecules to the mature virus can alter its pathogenicity, a feature that AI could miss.

This technology could be more than a defense against future viral threats; it could also be used proactively during influenza season. The annual flu shot could benefit if AI proves to have more predictive power than the traditional genetic approach to developing targeted vaccines. Beyond influenza, the approach might be adapted to other respiratory pathogens, including coronaviruses and the viruses that cause the common cold.

In Brierley’s words, “One thing is clear: as computing power grows and our model complexity starts to better capture biological realities,” these models could become central in understanding and monitoring viruses with zoonotic potential, helping us anticipate and potentially mitigate the impact of the next pandemic.

Conflicts of interest: None declared.

© JMIR Publications. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 3.Mar.2026.