Background: Worldwide, syndromic surveillance is increasingly used for improved and timely situational awareness and early identification of public health threats. Syndromic data streams are fed into detection algorithms, which produce statistical alarms highlighting potential activity of public health importance. All alarms must be assessed to confirm whether they are of public health importance. In England, approximately 100 alarms are generated daily and, although their analysis is formalised through a risk assessment process, the process requires considerable time, training, and maintenance of an expertise base to determine which alarms are of public health importance. The process is further complicated by the fact that only 0.1% of statistical alarms are deemed to be of public health importance. Therefore, the aim of this study was to evaluate machine learning as a tool for computer-assisted human decision-making when assessing statistical alarms.
Methods: A record of the risk assessment process was obtained from Public Health England for all 67,505 statistical alarms between August 2013 and October 2015. This record contained information on the characteristics of the alarm (e.g. size, location). We used three Bayesian classifiers (naïve Bayes, tree-augmented naïve Bayes and Multinets) to examine the risk assessment record in England with respect to the final 'Decision' outcome made by an epidemiologist: 'Alert', 'Monitor' or 'No-action'. Two further classifications based upon tree-augmented naïve Bayes and Multinets were implemented to account for the predominance of 'No-action' outcomes.
Results: The attributes of each individual risk assessment were linked to the final decision made by an epidemiologist, providing confidence in the current process. The naïve Bayes classifier performed best, correctly classifying 51.5% of 'Alert' outcomes. When the 'Alert' and 'Monitor' outcomes were combined, performance increased to 82.6% correctly classified. We demonstrate how a decision support system based upon a naïve Bayes classifier could be implemented within an operational syndromic surveillance system.
Conclusions: Within syndromic surveillance systems, machine learning techniques have the potential to make risk assessment following statistical alarms more automated, robust, and rigorous. However, our results also highlight the importance of specialist human input to the process.
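To make the classification idea in the abstract concrete, the following is a minimal sketch of a categorical naïve Bayes classifier with add-one (Laplace) smoothing, applied to a handful of invented alarm records. It is not the authors' implementation: the attribute names (`size`, `repeated`), the toy records and the class `CategoricalNaiveBayes` are all assumptions made for illustration, and only the three outcome labels come from the abstract.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical attributes with add-one smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        # value_counts[(attribute, outcome)][value] -> number of records
        self.value_counts = defaultdict(Counter)
        # values[attribute] -> set of values seen during training
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for attr, value in row.items():
                self.value_counts[(attr, label)][value] += 1
                self.values[attr].add(value)
        return self

    def predict_proba(self, row):
        # Work in log space, then normalise so the scores sum to 1.
        log_scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.n)
            for attr, value in row.items():
                count = self.value_counts[(attr, c)][value]
                denom = self.class_counts[c] + len(self.values[attr])
                score += math.log((count + 1) / denom)
            log_scores[c] = score
        top = max(log_scores.values())
        unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
        total = sum(unnorm.values())
        return {c: v / total for c, v in unnorm.items()}


# Invented toy records; real risk assessments carry more attributes.
rows = [
    {"size": "large", "repeated": "yes"},
    {"size": "large", "repeated": "yes"},
    {"size": "large", "repeated": "no"},
    {"size": "small", "repeated": "yes"},
    {"size": "small", "repeated": "no"},
    {"size": "small", "repeated": "no"},
    {"size": "small", "repeated": "no"},
    {"size": "large", "repeated": "no"},
]
labels = ["Alert", "Alert", "Monitor", "Monitor",
          "No-action", "No-action", "No-action", "No-action"]

model = CategoricalNaiveBayes().fit(rows, labels)

new_alarm = {"size": "large", "repeated": "yes"}
probs = model.predict_proba(new_alarm)

# Combined score mirroring the 'Alert' plus 'Monitor' grouping in the Results.
needs_review = probs["Alert"] + probs["Monitor"]
print(probs, needs_review)
```

In a decision support setting, a combined score such as `needs_review` could be used to rank alarms for an epidemiologist's attention rather than to replace the risk assessment, which is consistent with the abstract's emphasis on specialist human input.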
Bibliographical note
Funding Information:
This research was jointly funded by the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emergency Preparedness and Response at King’s College London and the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Gastrointestinal Infections at University of Liverpool in partnership with Public Health England (PHE). The funders approved the research proposal but had no input into the collection, analysis, and interpretation of data or the writing of the manuscript. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR, the Department of Health or Public Health England.
We acknowledge: NHS 111 and NHS Digital, participating EDSSS emergency departments and the Royal College of Emergency Medicine, Advanced and participating Out of Hours service providers, TPP and participating SystmOne practices, QSurveillance, The University of Nottingham, EMIS/EMIS practices and ClinRisk, NHS24 Information Services Team and NHS Scotland. The authors also acknowledge support from all the primary data providers and the PHE Real-time Syndromic Surveillance Team.
© 2019 The Author(s).
Keywords
- Artificial intelligence
- Bayes' theorem
- Decision making
- Machine learning
- Public health
- Syndromic surveillance