Control of artifactual variation in reported intersample relatedness during clinical use of a Mycobacterium tuberculosis sequencing pipeline

David Wyllie*, Nicholas Sanderson, Richard Myers, Tim Peto, Esther Robinson, Derrick W. Crook, E. Grace Smith, A. Sarah Walker

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Contact tracing requires reliable identification of closely related bacterial isolates. When we noticed the reporting of artifactual variation between Mycobacterium tuberculosis isolates during routine next-generation sequencing of Mycobacterium spp., we investigated its basis in 2,018 consecutive M. tuberculosis isolates. In the routine process used, clinical samples were decontaminated and inoculated into broth cultures; from positive broth cultures DNA was extracted and sequenced, reads were mapped, and consensus sequences were determined. We investigated the process of consensus sequence determination, which selects the most common nucleotide at each position. Having determined the high-quality read depth and depth of minor variants across 8,006 M. tuberculosis genomic regions, we quantified the relationship between the minor variant depth and the amount of nonmycobacterial bacterial DNA, which originates from commensal microbes killed during sample decontamination. In the presence of nonmycobacterial bacterial DNA, we found significant increases in minor variant frequencies, of more than 1.5-fold, in 242 regions covering 5.1% of the M. tuberculosis genome. Included within these were four high-variation regions strongly influenced by the amount of nonmycobacterial bacterial DNA. Excluding these four regions from pairwise distance comparisons reduced biologically implausible variation from 5.2% to 0% in an independent validation set derived from 226 individuals. Thus, we demonstrated an approach identifying critical genomic regions contributing to clinically relevant artifactual variation in bacterial similarity searches. The approach described monitors the outputs of the complex multistep laboratory and bioinformatics process, allows periodic process adjustments, and will have application to quality control of routine bacterial genomics.
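The consensus step and the masked distance comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the per-position pileup representation (one string of high-quality base calls per genome position), and the use of "N" for uncalled positions are all assumptions made for illustration.

```python
from collections import Counter


def consensus_and_minor_depth(pileup):
    """Call a consensus base per position and report minor-variant depth.

    pileup is a hypothetical input: a list of strings, one per genome
    position, each holding the high-quality base calls at that position.
    Returns (consensus_sequence, minor_depths).
    """
    consensus = []
    minor_depths = []
    for bases in pileup:
        counts = Counter(bases)
        if not counts:
            # No reads cover this position: leave it uncalled.
            consensus.append("N")
            minor_depths.append(0)
            continue
        # The consensus is the most common nucleotide at the position.
        base, major_depth = counts.most_common(1)[0]
        consensus.append(base)
        # Minor-variant depth: reads that disagree with the consensus base.
        minor_depths.append(len(bases) - major_depth)
    return "".join(consensus), minor_depths


def snv_distance(seq_a, seq_b, masked=frozenset()):
    """Count differing positions, skipping masked and uncalled positions."""
    return sum(
        1
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if i not in masked and a != b and "N" not in (a, b)
    )


# Toy data: position 1 carries minor variants in both samples.
seq_a, minor_a = consensus_and_minor_depth(["AAAA", "CCCT", "GGGG", "TTTT"])
seq_b, minor_b = consensus_and_minor_depth(["AAAA", "TTTC", "GGGG", "TTTA"])
unmasked = snv_distance(seq_a, seq_b)
masked = snv_distance(seq_a, seq_b, masked={1})
```

In this toy case the two consensus sequences differ only at position 1, so masking that position reduces the reported distance from 1 to 0. The abstract applies the same idea at the scale of genomic regions, excluding four high-variation regions from pairwise comparisons.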

Original language: English
Article number: e00104-18
Journal: Journal of Clinical Microbiology
Volume: 56
Issue number: 8
Publication status: Published - Aug 2018

Bibliographical note

Funding Information:
This study was supported by the Health Innovation Challenge Fund (a parallel funding partnership between the Wellcome Trust [WT098615/Z/12/Z] and the Department of Health [grant HICF-T5-358]) and the NIHR Oxford Biomedical Research Centre. Derrick Crook is affiliated with the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Healthcare Associated Infections and Antimicrobial Resistance at the University of Oxford in partnership with Public Health England. Crook is based at the University of Oxford.

Publisher Copyright:
© 2018 Wyllie et al.

Keywords

  • Artifact
  • Mycobacterium tuberculosis
  • Reference mapping
  • Relatedness
  • Single nucleotide variation
