Patterns of within-host genetic diversity in SARS-CoV-2

Gerry Tonkin-Hill*, Inigo Martincorena*, Roberto Amato, Andrew R. J. Lawson, Moritz Gerstung, Ian Johnston, David K. Jackson, Naomi Park, Stefanie V. Lensing, Michael A. Quail, Sonia Gonsalves, Cristina Ariani, Michael Spencer Chapman, William L. Hamilton, Luke W. Meredith, Grant Hall, Aminu S. Jahun, Yasmin Chaudhry, Myra Hosmillo, Malte L. Pinckert, Iliana Georgana, Anna Yakovleva, Laura G. Caller, Sarah L. Caddy, Theresa Feltwell, Fahad A. Khokhar, Charlotte J. Houldcroft, Martin D. Curran, Surendra Parmar, Rachel Nelson, Ewan M. Harrison, John Sillitoe, Stephen D. Bentley, Jeffrey C. Barrett, M. Estee Torok, Ian G. Goodfellow, Cordelia Langford, Dominic Kwiatkowski*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review



Monitoring the spread of SARS-CoV-2 and reconstructing transmission chains have become a major public health focus for many governments around the world. The modest mutation rate and rapid transmission of SARS-CoV-2 prevent the reconstruction of transmission chains from consensus genome sequences, but within-host genetic diversity could theoretically help identify close contacts. Here we describe the patterns of within-host diversity in 1181 SARS-CoV-2 samples sequenced to high depth in duplicate. Of these samples, 95.1% show within-host mutations at detectable allele frequencies. Analyses of the mutational spectra revealed strong strand asymmetries, suggesting that damage or RNA editing of the plus strand, rather than replication errors, dominates the accumulation of mutations during the SARS-CoV-2 pandemic. Within- and between-host diversity show strong purifying selection, particularly against nonsense mutations. Recurrent within-host mutations, many of which coincide with known phylogenetic homoplasies, display a spectrum and patterns of purifying selection more suggestive of mutational hotspots than of recombination or convergent evolution. While allele frequencies suggest that most samples result from infection by a single lineage, we identify multiple putative examples of co-infection. Integrating these results into an epidemiological inference framework, we find that while sharing of within-host variants between samples could help the reconstruction of transmission chains, mutational hotspots and rare cases of superinfection can confound these analyses.

Original language: English
Article number: 66857
Number of pages: 25
Publication status: Published - 13 Aug 2021

Bibliographical note

Funding Information: This work was funded by COG-UK, supported by funding from the Medical Research Council (MRC) part of UK Research and Innovation (UKRI), the National Institute of Health Research (NIHR) and Genome Research Limited, operating as the Wellcome Sanger Institute; the Wellcome Trust (Senior Fellowship to IG ref: 207498/Z/17/Z and PhD Scholarship to GTH ref: 204016/Z/16/Z); the Academy of Medical Sciences and the Health Foundation (Clinician Scientist Fellowship to MET); and the Cambridge NIHR Biomedical Research Centre (MET).

Open Access: This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Publisher Copyright: © Tonkin-Hill et al.

Citation: Tonkin-Hill, Gerry, et al. "Patterns of within-host genetic diversity in SARS-CoV-2." eLife 10 (2021).

DOI: 10.7554/eLife.66857

