Abstract
This paper investigates whether infectious intestinal diseases (IIDs) can be detected and quantified using social media content. Experiments are conducted on user-generated data from the microblogging service Twitter. Evaluation is based on comparison with the number of IID cases reported by traditional health surveillance methods. We employ a deep learning approach to create a topical vocabulary, and then apply a regularised linear (Elastic Net) as well as a nonlinear (Gaussian Process) regression function for inference. We show that, as in previous text regression tasks, the nonlinear approach performs better. In general, our experimental results, both in terms of predictive performance and semantic interpretation, indicate that Twitter data contain a signal that could be strong enough to complement conventional methods for IID surveillance.
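To make the two regression families named in the abstract concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It fits an Elastic Net by coordinate descent and a Gaussian Process posterior mean on synthetic data. Everything here is an assumption made for illustration: the function names, the RBF kernel, the hyperparameter values, and the two synthetic "term frequency" features standing in for a topical vocabulary.

```python
import math
import random

def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2, without an intercept."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    resid = list(y)  # residual y - Xw, starting from w = 0
    for _ in range(n_iter):
        for j in range(d):
            col = [row[j] for row in X]
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / n
            z = sum(c * c for c in col) / n + alpha * (1 - l1_ratio)
            # Soft-thresholding gives L1 sparsity; z carries the L2 shrinkage.
            new_w = math.copysign(max(abs(rho) - alpha * l1_ratio, 0.0), rho) / z
            resid = [r - c * (new_w - w[j]) for c, r in zip(col, resid)]
            w[j] = new_w
    return w

def rbf(a, b, length=1.0):
    """Squared-exponential kernel (an assumed choice, not taken from the paper)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(X_train, y_train, X_test, length=1.0, noise=0.1):
    """Zero-mean GP posterior mean: k(x*, X) (K + noise * I)^-1 y."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)] for i, a in enumerate(X_train)]
    coef = solve(K, y_train)
    return [sum(rbf(x, t, length) * c for t, c in zip(X_train, coef)) for x in X_test]

# Synthetic stand-in data: two hypothetical weekly term frequencies per row and
# a made-up nonlinear "case count" target. Nothing here comes from the paper.
rng = random.Random(0)

def make_week():
    x = [rng.random(), rng.random()]
    return x, 10 * x[0] ** 2 + 5 * math.sin(3 * x[1]) + rng.gauss(0, 0.3)

train_weeks = [make_week() for _ in range(40)]
test_weeks = [make_week() for _ in range(20)]
X_tr, y_tr = [x for x, _ in train_weeks], [y for _, y in train_weeks]
X_te, y_te = [x for x, _ in test_weeks], [y for _, y in test_weeks]

# Centre target and features so that neither model needs an intercept term.
mean_y = sum(y_tr) / len(y_tr)
y_c = [v - mean_y for v in y_tr]
mu = [sum(x[j] for x in X_tr) / len(X_tr) for j in range(2)]

def centre(x):
    return [xi - m for xi, m in zip(x, mu)]

w = elastic_net([centre(x) for x in X_tr], y_c, alpha=0.01)
enet_pred = [mean_y + sum(wi * xi for wi, xi in zip(w, centre(x))) for x in X_te]
gp_pred = [mean_y + p for p in gp_predict(X_tr, y_c, X_te, length=0.5, noise=0.1)]

def mae(pred, truth):
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

print("Elastic Net MAE:", mae(enet_pred, y_te))
print("Gaussian Process MAE:", mae(gp_pred, y_te))
```

The synthetic target is deliberately nonlinear, so the comparison it prints says nothing about real IID data; it only shows how the two model families would be fitted and scored against a reference series.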
Original language | English |
---|---|
Title of host publication | DH 2016 - Proceedings of the 2016 Digital Health Conference |
Publisher | Association for Computing Machinery, Inc |
Pages | 157-161 |
Number of pages | 5 |
ISBN (Electronic) | 9781450342247 |
Publication status | Published - 11 Apr 2016 |
Event | 6th International Conference on Digital Health, DH 2016 - Montreal, Canada. Duration: 11 Apr 2016 → 13 Apr 2016 |
Publication series
Name | DH 2016 - Proceedings of the 2016 Digital Health Conference |
---|---|
Conference
Conference | 6th International Conference on Digital Health, DH 2016 |
---|---|
Country/Territory | Canada |
City | Montreal |
Period | 11/04/16 → 13/04/16 |
Bibliographical note
Funding Information: This research has been supported by the EPSRC IRC grant EP/K031953/1.
Keywords
- Disease surveillance
- IID
- Infectious intestinal disease
- Social media
- User-generated content
- Word embeddings