Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): A genetic association study in UK Biobank

OxGSK Consortium, UK Brain Expression Consortium (UKBEC)

Research output: Contribution to journal; Article; peer-review

209 Citations (Scopus)


Background: Understanding the genetic basis of airflow obstruction and smoking behaviour is key to determining the pathophysiology of chronic obstructive pulmonary disease (COPD). We used UK Biobank data to study the genetic causes of smoking behaviour and lung health.

Methods: We sampled individuals of European ancestry from UK Biobank, from the middle and extremes of the forced expiratory volume in 1 s (FEV1) distribution among heavy smokers (mean 35 pack-years) and never smokers. We developed a custom array for UK Biobank to provide optimum genome-wide coverage of common and low-frequency variants, dense coverage of genomic regions already implicated in lung health and disease, and to assay rare coding variants relevant to the UK population. We investigated whether there were shared genetic causes between different phenotypes defined by extremes of FEV1. We also looked for novel variants associated with extremes of FEV1 and smoking behaviour and assessed regions of the genome that had already shown evidence for a role in lung health and disease. We set genome-wide significance at p<5 × 10⁻⁸.

Findings: UK Biobank participants were recruited from March 15, 2006, to July 7, 2010. Sample selection for the UK BiLEVE study started on Nov 22, 2012, and was completed on Dec 20, 2012. We selected 50 008 unique samples: 10 002 individuals with low FEV1, 10 000 with average FEV1, and 5002 with high FEV1 from each of the heavy smoker and never smoker groups. We noted a substantial sharing of genetic causes of low FEV1 between heavy smokers and never smokers (p=2·29 × 10⁻¹⁶) and between individuals with and without doctor-diagnosed asthma (p=6·06 × 10⁻¹¹). We discovered six novel genome-wide significant signals of association with extremes of FEV1, including signals at four novel loci (KANSL1, TSEN54, TET2, and RBM19/TBX5) and independent signals at two previously reported loci (NPNT and HLA-DQB1/HLA-DQA2). These variants also showed association with COPD, including in individuals with no history of smoking. The number of copies of a 150 kb region containing the 5' end of KANSL1, a gene that is important for epigenetic gene regulation, was associated with extremes of FEV1. We also discovered five new genome-wide significant signals for smoking behaviour, including a variant in NCAM1 (chromosome 11) and a variant on chromosome 2 (between TEX41 and PABPC1P2) that has a trans effect on expression of NCAM1 in brain tissue.

Interpretation: By sampling from the extremes of the lung function distribution in UK Biobank, we identified novel genetic causes of lung function and smoking behaviour. These results provide new insight into the specific mechanisms underlying airflow obstruction, COPD, and tobacco addiction, and show substantial shared genetic architecture underlying airflow obstruction across individuals, irrespective of smoking behaviour and other airway disease.

Funding: Medical Research Council.

Original language: English
Pages (from-to): 769-781
Number of pages: 13
Journal: The Lancet Respiratory Medicine
Issue number: 10
Publication status: Published - Oct 2015
Externally published: Yes

Bibliographical note

Funding Information:
This work was funded by a Medical Research Council (MRC) strategic award to MDT, IPH, DPS, and LVW (MC_PC_12010). This research was done using the UK Biobank resource. MDT was supported by MRC fellowships G0501942 and G0902313. IPH is supported by an MRC programme grant (G1000861). This Article presents independent research funded partially by the National Institute for Health Research (NIHR). The views expressed are those of the authors and not necessarily those of the National Health Service, the NIHR, or the Department of Health. We thank Affymetrix for their role in array design and for undertaking genotyping and genotype calling. We thank all members of the UK Biobank Array Design Group: Peter Donnelly (chair), Jose Bras, Adam Butterworth, Richard Durbin, Paul Elliott, Ian Hall, John Hardy, Mark McCarthy, Gil McVean, Tim Peakman, Nazneen Rahman, Nilesh Samani, Martin Tobin, and Hugh Watkins. This study makes use of data generated by the UK10K Consortium, derived from samples from TwinsUK and ALSPAC. A full list of the investigators who contributed to the generation of the data is available from the UK10K website. Funding for UK10K was provided by the Wellcome Trust under award WT091310. JM is funded by an ERC Consolidator Grant (617306). APM is a Wellcome Trust Senior Research Fellow in Basic Biomedical Science (grant number WT098017). The lung eQTL study at Laval University was supported by the Chaire de pneumologie de la Fondation JD Bégin de l'Université Laval, the Fondation de l'Institut universitaire de cardiologie et de pneumologie de Québec, the Respiratory Health Network of the FRQS, the Canadian Institutes of Health Research (MOP-123369), and the Cancer Research Society and Read for the Cure. YB is the recipient of a Junior 2 Research Scholar award from the Fonds de recherche Québec – Santé (FRQS). EGCUT received financing by FP7 grants (278913, 306031, 313010), Center of Excellence in Genomics (EXCEGEN), and University of Tartu (SP1GVARENG). We thank EGCUT technical personnel, especially V Soo and S Smit. Data analyses were done in part in the High Performance Computing Center of University of Tartu. We thank Paul Jennings for assistance with generation of code to extract expression data. MO is a Postdoctoral Fellow of the Michael Smith Foundation for Health Research and the Canadian Institute for Health Research Integrated and Mentored Pulmonary and Cardiovascular Training program. This research used the ALICE and SPECTRE High Performance Computing Facilities at the University of Leicester.

Publisher Copyright:
© 2015 Wain et al. Open Access article distributed under the terms of CC BY.

