Development and validation of a primary care electronic health record phenotype to study migration and health in the UK

Neha Pathak, Claire X. Zhang, Yamina Boukari, Rachel Burns, Rohini Mathur, Arturo Gonzalez-Izquierdo, Spiros Denaxas, Pam Sonnenberg, Andrew Hayward, Robert W. Aldridge*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)


International migrants comprised 14% of the UK’s population in 2020; however, their health is rarely studied at a population level using primary care electronic health records because they are difficult to identify. We developed a migration phenotype using country of birth, visa status, non-English main/first language and non-UK-origin codes and applied it to the Clinical Practice Research Datalink (CPRD) GOLD database of 16,071,111 primary care patients between 1997 and 2018. We compared the completeness and representativeness of the identified migrant population to Office for National Statistics (ONS) country-of-birth and 2011 census data by year, age, sex, geographic region of birth and ethnicity. Between 1997 and 2018, 403,768 migrants (2.51% of the CPRD GOLD population) were identified: 178,749 (1.11%) had foreign-country-of-birth or visa-status codes, 216,731 (1.35%) had non-English-main/first-language codes, and 8,288 (0.05%) had non-UK-origin codes. The cohort was similarly distributed to ONS data by sex and region of birth. Migration recording improved over time, and younger migrants were better represented than those aged ≥50. The validated phenotype identified a large migrant cohort for use in migration health research in CPRD GOLD to inform healthcare policy and practice. The under-recording of migration status in earlier years and at older ages necessitates cautious interpretation of future studies in these groups.
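The grouping and percentages reported above can be illustrated with a minimal sketch. It assumes each patient is assigned to the first matching code category in the order the abstract lists them, which is consistent with the three counts summing to 403,768 but is not confirmed by the abstract itself. The category labels, the `classify` and `share` helpers, and the toy patients are hypothetical and are not taken from the authors' code lists.

```python
from collections import Counter

# Hypothetical category labels, checked in the order the abstract lists them.
# The priority order is an assumption, not the authors' documented rule.
PRIORITY = ("birth_or_visa", "language", "non_uk_origin")


def classify(patient_code_categories):
    """Return the first matching category for one patient, or None."""
    for category in PRIORITY:
        if category in patient_code_categories:
            return category
    return None


def share(count, denominator):
    """Percentage of the denominator, rounded to two decimals."""
    return round(100 * count / denominator, 2)


# Counts reported in the abstract.
CPRD_GOLD_TOTAL = 16_071_111
REPORTED = {
    "birth_or_visa": 178_749,
    "language": 216_731,
    "non_uk_origin": 8_288,
}

# Toy patients, invented purely for illustration.
patients = {
    "p1": {"language", "birth_or_visa"},
    "p2": {"language"},
    "p3": {"non_uk_origin"},
    "p4": set(),
}

labels = {pid: classify(cats) for pid, cats in patients.items()}
tally = Counter(label for label in labels.values() if label is not None)
shares = {name: share(n, CPRD_GOLD_TOTAL) for name, n in REPORTED.items()}

print(labels)
print(dict(tally))
print(shares, share(sum(REPORTED.values()), CPRD_GOLD_TOTAL))
```

Under this assumed ordering, a patient with both a language code and a country-of-birth code is counted once, in the country-of-birth or visa group, so the group counts can be summed without double counting.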

Original language: English
Article number: 13304
Journal: International Journal of Environmental Research and Public Health
Issue number: 24
Publication status: Published - 1 Dec 2021
Externally published: Yes

Bibliographical note

Funding Information:
Funding: Wellcome Trust Clinical Research Career Development Fellowship [206602] and Clinical Research Training Fellowship [211162].

Conflicts of Interest: The authors declare no conflict of interest. N.P. and R.W.A. receive funding from the Wellcome Trust. R.W.A. has undertaken paid research consulting work on migration and health for Doctors of World and International Labor Organization in the last five years. C.X.Z. is employed by Public Health England and contributes to the development of national guidance and policy in migrant health. C.X.Z. is also a Trustee for the charity Art Refuge. The views expressed are those of the authors and not necessarily those of the Wellcome Trust, UCL, London School of Hygiene and Tropical Medicine, Public Health England, Guy’s & St Thomas’ NHS Foundation Trust, and Health Data Research UK.

Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.


  • Algorithm
  • Clinical practice research datalink
  • Migration
  • Phenotype
  • Primary care
  • Validation

