# Multiple Document Representations from News Alerts for Automated Bio-surveillance Event Detection

**Authors:** Aaron Tuor, Fnu Anubhav, Lauren Charles

arXiv: 1902.06231 · 2019-02-19

## TL;DR

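This paper presents methods for automatically retrieving and classifying news documents related to active public health events, comparing several document representations. Its best model, logistic regression on the combined output of a pair of bidirectional recurrent neural networks, reaches 97% recall and 93.3% accuracy.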

## Contribution

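It builds an auto-generated corpus of health-related documents and compares three document representations for bio-surveillance event classification: recurrent neural network, TF-IDF, and Naive Bayes log count ratio. The title and description of each document are modeled jointly.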

## Key findings

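- The best model achieves 97% recall and 93.3% accuracy on bio-surveillance event classification
- That result comes from jointly modeling each document's title and description
- The best model is logistic regression on the combined output of a pair of bidirectional recurrent neural networks; TF-IDF and Naive Bayes log count ratio representations were also evaluated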
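
For readers less familiar with the two reported metrics, the following minimal sketch shows how recall and accuracy are computed from confusion-matrix counts. The counts are hypothetical illustration values and are not taken from the paper.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of true event documents that the classifier flags."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all documents that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for illustration only (not the paper's data).
tp, fn, fp, tn = 8, 2, 1, 9

example_recall = recall(tp, fn)
example_accuracy = accuracy(tp, tn, fp, fn)
print(f"recall={example_recall:.2f} accuracy={example_accuracy:.2f}")
```

In a surveillance setting, recall is often the metric watched most closely, since a missed event document (a false negative) is typically costlier than an extra document for an analyst to review.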

## Abstract

Due to globalization, geographic boundaries no longer serve as effective shields for the spread of infectious diseases. In order to aid bio-surveillance analysts in disease tracking, recent research has been devoted to developing information retrieval and analysis methods utilizing the vast corpora of publicly available documents on the internet. In this work, we present methods for the automated retrieval and classification of documents related to active public health events. We demonstrate classification performance on an auto-generated corpus, using recurrent neural network, TF-IDF, and Naive Bayes log count ratio document representations. By jointly modeling the title and description of a document, we achieve 97% recall and 93.3% accuracy with our best performing bio-surveillance event classification model: logistic regression on the combined output from a pair of bidirectional recurrent neural networks.
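
As a rough illustration of one representation named in the abstract, the sketch below computes Naive Bayes log count ratio features in the commonly used form r = log((p/‖p‖₁)/(q/‖q‖₁)), then concatenates separate title and description vectors as one simple way to model the two fields jointly. The whitespace tokenizer, the smoothing value `alpha`, the toy documents, and the function names are all assumptions made for illustration; the paper's exact formulation and its recurrent neural network models may differ.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def log_count_ratio(docs: list[str], labels: list[int], alpha: float = 1.0) -> dict[str, float]:
    """Return {token: r}, where r = log((p / |p|_1) / (q / |q|_1))."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        # Binarized counts: each token is counted once per document.
        (pos if label == 1 else neg).update(set(tokenize(doc)))
    p = {t: alpha + pos[t] for t in vocab}  # smoothed positive-class counts
    q = {t: alpha + neg[t] for t in vocab}  # smoothed negative-class counts
    p_norm, q_norm = sum(p.values()), sum(q.values())
    return {t: math.log((p[t] / p_norm) / (q[t] / q_norm)) for t in vocab}


def represent(text: str, ratios: dict[str, float]) -> list[float]:
    """Binary presence vector scaled elementwise by the log count ratios."""
    present = set(tokenize(text))
    return [ratios[t] if t in present else 0.0 for t in sorted(ratios)]


def joint_representation(title: str, description: str,
                         title_ratios: dict[str, float],
                         desc_ratios: dict[str, float]) -> list[float]:
    # Joint modeling here means concatenating the two field vectors.
    return represent(title, title_ratios) + represent(description, desc_ratios)


# Hypothetical toy corpus: label 1 = health event, 0 = unrelated news.
titles = ["cholera outbreak reported", "team wins final",
          "measles outbreak spreads", "market closes higher"]
descriptions = ["officials confirm new cholera cases", "fans celebrate the win",
                "health officials confirm measles cases", "stocks rise on earnings"]
labels = [1, 0, 1, 0]

title_ratios = log_count_ratio(titles, labels)
desc_ratios = log_count_ratio(descriptions, labels)
features = joint_representation("new outbreak reported", "officials confirm cases",
                                title_ratios, desc_ratios)
```

The resulting `features` vector could then be passed to any linear classifier, such as logistic regression. Tokens seen mostly in event documents receive positive ratios and tokens seen mostly in unrelated documents receive negative ones.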

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.06231/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1902.06231/full.md

## References

19 references, with the full list in the complete paper: https://tomesphere.com/paper/1902.06231/full.md

---
Source: https://tomesphere.com/paper/1902.06231