Mind your Ps: A probabilistic model to aid the interpretation of molecular epidemiology data

Ana Raquel Penedos*, Aurora Fernández-García, Mihaela Lazar, Kajal Ralh, David Williams, Kevin E. Brown

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Assessing relatedness of pathogen sequences in clinical samples is a core goal in molecular epidemiology. Tools for Bayesian analysis of phylogeny, such as the BEAST software package, have typically been used to analyse sequence/time data in public health. However, they are computationally-, time-, and knowledge-intensive, demanding resources that many laboratories do not have available or cannot allocate frequently.

Methods: To evaluate a faster and simpler alternative method to support the routine interpretation of sequence data for epidemiology, we obtained sequences for two regions in the measles virus genome, N-450 and MF-NCR, from patient samples of genotypes B3, D4 and D8 taken between 2011 and 2017 in the UK and Romania. A mathematical model incorporating time, possible shared ancestry and the Poisson distribution describing the number of expected substitutions at a given time point was developed to exclude epidemiological relatedness between pairs of sequences. The model was validated against the commonly used Bayesian phylogenetic method using an independent dataset collected in 2017–19.

Findings: We demonstrate that our model, using time and sequence information to predict whether two samples may be related within a given time frame, minimises the risk of erroneous exclusion of relatedness. An easy-to-use implementation in the form of a guide and spreadsheet is provided for convenient application.

Interpretation: The proposed model only requires a previously calculated substitution rate for the locus and pathogen of interest. It allows for an informed but quick decision on the likelihood of relatedness between two samples within a time frame, without the need for phylogenetic reconstruction, thus facilitating rapid epidemiological interpretation of sequence data.

Funding: This work was funded by the United Kingdom Health Security Agency (UKHSA). The World Health Organization European Regional Office funded the training visits of Aurora Fernández-García and Mihaela Lazar to UKHSA.
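The kind of calculation outlined in the Methods can be illustrated with a minimal Python sketch. It is not the authors' model or spreadsheet: the function names, the way shared ancestry is handled (adding twice an assumed time back to a common ancestor), the substitution rate and the 0.05 threshold are all assumptions chosen for illustration. The sketch asks how probable it is, under a Poisson distribution, to see at least the observed number of differences between two sequences in the time available.

```python
import math


def poisson_sf(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Sum the probability mass for 0..k-1 and take the complement.
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def prob_at_least_observed(
    n_diffs: int,
    rate_per_site_per_year: float,
    n_sites: int,
    years_apart: float,
    years_to_ancestor: float = 0.0,
) -> float:
    """Probability of at least n_diffs substitutions between two sequences.

    Assumption (not taken from the paper): the evolutionary time separating
    the two sequences is the sampling interval plus twice the time from the
    earlier sample back to a possible shared ancestor.
    """
    total_time = years_apart + 2.0 * years_to_ancestor
    expected_substitutions = rate_per_site_per_year * n_sites * total_time
    return poisson_sf(n_diffs, expected_substitutions)


def can_exclude_relatedness(p_value: float, alpha: float = 0.05) -> bool:
    """Hypothetical decision rule: exclude relatedness if p is below alpha."""
    return p_value < alpha


# Illustrative values only: a 450-nucleotide window, a hypothetical rate of
# 1e-3 substitutions/site/year, samples taken six months apart, 3 differences.
p = prob_at_least_observed(
    n_diffs=3,
    rate_per_site_per_year=1e-3,
    n_sites=450,
    years_apart=0.5,
)
print(f"P(at least 3 differences) = {p:.4f}")
print("Exclude relatedness:", can_exclude_relatedness(p))
```

Allowing for a longer time back to a shared ancestor raises the expected number of substitutions, so the same number of differences becomes less surprising and relatedness is harder to exclude. A real application would use the published substitution rate for the locus and pathogen of interest.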

Original language: English
Article number: 103989
Journal: EBioMedicine
Volume: 79
Publication status: Published - May 2022

Bibliographical note

Funding Information:
This work was funded by the UKHSA, which was responsible for all aspects of the study and the manuscript. The WHO European Regional Office funded the training visits of Aurora Fernández-García and Mihaela Lazar to the UKHSA, during which some of the sequences used in the study were acquired; it was not involved in any aspect of the work.

Funding Information:
Our thanks to all staff in the Immunisation and Diagnosis Unit of the United Kingdom Health Security Agency, and in the Viral Respiratory Infections Laboratory from Cantacuzino Institute, Romania, who produced the N-450 sequences used in this study.

Publisher Copyright:
© 2022

Keywords

  • Clinical virology
  • Elimination
  • Epidemiology
  • Measles
  • Molecular epidemiology
  • Outbreak
