Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk

Ellie Paige, Jessica Barrett, David Stevens, Ruth H. Keogh, Michael J. Sweeting, Irwin Nazareth, Irene Petersen, Angela M. Wood*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)


The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizing historical repeat measures of multiple risk factors recorded in EHRs to systematically identify patients at high risk of future disease. The approach is principally based on a 2-stage dynamic landmark model. The first stage estimates current risk factor values from all available historical repeat risk factor measurements via landmark-age-specific multivariate linear mixed-effects models with correlated random intercepts, which account for sporadically recorded repeat measures, unobserved data, and measurement errors. The second stage predicts future disease risk from a sex-stratified Cox proportional hazards model, with estimated current risk factor values from the first stage. We exemplify these methods by developing and validating a dynamic 10-year cardiovascular disease risk prediction model using primary-care EHRs for age, diabetes status, hypertension treatment, smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol in 41,373 persons from 10 primary-care practices in England and Wales contributing to The Health Improvement Network (1997-2016). Using cross-validation, the model was well-calibrated (Brier score = 0.041, 95% confidence interval: 0.039, 0.042) and had good discrimination (C-index = 0.768, 95% confidence interval: 0.759, 0.777).
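As a rough illustration of two ideas in the abstract, the sketch below shows (a) how a random-intercept model shrinks a person's noisy, sporadically recorded measurements toward the population mean, and (b) how the two reported validation metrics are defined. It is a minimal sketch under strong assumptions, not the authors' implementation: the first function is a univariate simplification with variance components assumed known, whereas the paper uses landmark-age-specific multivariate models with correlated random intercepts; the Brier score ignores censoring; and every function name, parameter name, and number is hypothetical.

```python
from itertools import combinations


def shrunken_current_value(measurements, pop_mean, between_var, within_var):
    """Best linear unbiased prediction of a person's underlying level under a
    univariate random-intercept model (variance components assumed known)."""
    n = len(measurements)
    if n == 0:
        # No recorded history: fall back to the population mean.
        return pop_mean
    person_mean = sum(measurements) / n
    # More measurements, or less measurement error, means less shrinkage.
    weight = n * between_var / (n * between_var + within_var)
    return pop_mean + weight * (person_mean - pop_mean)


def brier_score(predicted_risks, outcomes):
    """Mean squared difference between predicted risk and a 0/1 outcome.
    Simplification: no adjustment for censored follow-up."""
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_risks, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def harrell_c_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the person with the
    earlier event has the higher predicted risk (risk ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this sketch
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical systolic blood pressure history (mm Hg) for one person.
    print(shrunken_current_value([150.0, 160.0], pop_mean=130.0,
                                 between_var=200.0, within_var=100.0))
    print(brier_score([0.1, 0.8], [0, 1]))
    print(harrell_c_index([2, 5, 7, 9], [1, 0, 1, 0], [0.9, 0.4, 0.1, 0.6]))
```

In the first function, the shrinkage weight rises toward 1 as a person accumulates measurements, which is one way to see why a mixed-effects stage can cope with patients who have very different numbers of recorded values.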

Original language: English
Pages (from-to): 1530-1538
Number of pages: 9
Journal: American Journal of Epidemiology
Issue number: 7
Publication status: Published - 1 Jul 2018
Externally published: Yes

Bibliographical note

Funding Information:
Author affiliations: Department of Public Health and Primary Care, School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom (Ellie Paige, Jessica Barrett, David Stevens, Michael J. Sweeting, Angela M. Wood); National Centre for Epidemiology and Population Health, Research School of Population, The Australian National University, Canberra, Australia (Ellie Paige); MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom (Jessica Barrett); Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, United Kingdom (Ruth H. Keogh); and Institute of Epidemiology and Health, Research Department of Primary Care and Population Health, Institute of Epidemiology and Health Care, University College London, London, United Kingdom (Irwin Nazareth, Irene Petersen). This work was funded by the Medical Research Council (MRC) (grant MR/K014811/1). J.B. was supported by an MRC fellowship (grant G0902100) and the MRC Unit Program (grant MC_UU_00002/5). R.H.K. was supported by an MRC Methodology Fellowship (grant MR/M014827/1).

Publisher Copyright:
© The Author(s) 2018.


  • cardiovascular disease
  • dynamic risk prediction
  • electronic health records
  • landmarking
  • mixed-effects models
  • primary care records

