Assessing the impact of a health intervention via user-generated Internet content

  • Vasileios Lampos*
  • Elad Yom-Tov
  • Richard Pebody
  • Ingemar J. Cox
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)

Abstract

Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the prevalence of a health event in a population from Internet data. This model is applied to identify control location groups that correlate historically with the areas where a specific intervention campaign has taken place. We then determine the impact of the intervention by inferring a projection of the disease rates that could have emerged in the absence of a campaign. Our case study focuses on the influenza vaccination program that was launched in England during the 2013/14 season, and our observations consist of millions of geo-located search queries to the Bing search engine and posts on Twitter. The impact estimates derived from the application of the proposed statistical framework support conventional assessments of the campaign.
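
The control-versus-target comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that disease rates for one control group and one target area have already been estimated, swaps in an ordinary least-squares line for the mapping between them, and runs on synthetic data. The function name `estimate_impact`, the variable names, and all numbers are hypothetical.

```python
import numpy as np

def estimate_impact(control_pre, target_pre, control_post, target_post):
    """Project target-area rates from control-area rates and compare with observations."""
    # Fit target = slope * control + intercept on the pre-intervention period
    # (an assumed linear mapping, chosen here only for illustration).
    slope, intercept = np.polyfit(control_pre, target_pre, deg=1)
    # Projection: rates the target area might have shown without the campaign.
    projected = slope * control_post + intercept
    # Absolute and relative difference of mean rates (negative means lower rates).
    delta = target_post.mean() - projected.mean()
    theta = delta / projected.mean()
    return delta, theta

# Synthetic, hypothetical rate series; no real surveillance data is used.
rng = np.random.default_rng(0)
control_pre = rng.uniform(5.0, 30.0, size=100)
target_pre = 1.2 * control_pre + 2.0 + rng.normal(0.0, 0.5, size=100)
control_post = rng.uniform(5.0, 30.0, size=50)
# Simulate a campaign that lowers target-area rates by 20 percent.
target_post = 0.8 * (1.2 * control_post + 2.0) + rng.normal(0.0, 0.5, size=50)

# Historical correlation, used to check that the control group tracks the target.
r = np.corrcoef(control_pre, target_pre)[0, 1]
delta, theta = estimate_impact(control_pre, target_pre, control_post, target_post)
print(f"historical correlation r = {r:.3f}")
print(f"absolute impact = {delta:.2f}, relative impact = {theta:.1%}")
```

In this toy setting the relative impact should land close to the simulated 20 percent reduction. With real data the estimate would depend on how well the control locations track the target area before the campaign.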

Original language: English
Pages (from-to): 1434-1457
Number of pages: 24
Journal: Data Mining and Knowledge Discovery
Volume: 29
Issue number: 5
Publication status: Published - 22 Sept 2015

Bibliographical note

Publisher Copyright:
© 2015, The Author(s).

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Gaussian Process
  • Infectious diseases
  • Intervention
  • Search query logs
  • Social media
  • Supervised learning
  • User-generated content
