Automatic Detection of Text Genre
Brett Kessler, Geoffrey Nunberg, Hinrich Schuetze (Xerox PARC and Stanford University)

TL;DR
This paper introduces a theory of text genres as bundles of facets that correlate with surface cues, and argues that genre detection based on surface cues is as successful as detection based on deeper structural properties.
Contribution
It proposes a theory of genres as bundles of facets and argues that surface cues are sufficient for genre detection, which simplifies the classification process.
Findings
Genre detection based on surface cues is as successful as detection based on deeper structural properties.
Genres can be modeled as bundles of facets correlating with surface cues.
Surface cue-based detection simplifies genre classification.
Abstract
As the text databases available to users become larger and more heterogeneous, genre becomes increasingly important for computational linguistics as a complement to topical and structural principles of classification. We propose a theory of genres as bundles of facets, which correlate with various surface cues, and argue that genre detection based on surface cues is as successful as detection based on deeper structural properties.
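To make the idea of surface cues concrete, here is a minimal sketch of how a few cheap, text-level measurements might be computed without any parsing or tagging. The specific cues below (sentence length, word length, type/token ratio, punctuation rates) and the function name `surface_cues` are illustrative assumptions, not the feature set described by the authors.

```python
import re
from typing import Dict


def surface_cues(text: str) -> Dict[str, float]:
    """Compute a few illustrative surface cues for one text.

    These cues are assumptions chosen for the example; they require
    only tokenization on letters and splitting on sentence punctuation.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard the denominators so that empty input yields zeros, not errors.
    n_tokens = max(len(tokens), 1)
    n_sents = max(len(sentences), 1)
    n_chars = max(len(text), 1)

    return {
        "words_per_sentence": len(tokens) / n_sents,
        "chars_per_word": sum(len(t) for t in tokens) / n_tokens,
        "type_token_ratio": len({t.lower() for t in tokens}) / n_tokens,
        "question_marks_per_char": text.count("?") / n_chars,
        "exclamations_per_char": text.count("!") / n_chars,
    }


if __name__ == "__main__":
    for name, value in surface_cues("Is it so? Yes! It is.").items():
        print(f"{name}: {value:.3f}")
```

Under this sketch, each text becomes a small numeric vector, and a separate standard classifier could then be trained per facet on those vectors; which classifier and which facets to use is left open here.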
Taxonomy
Topics: Authorship Attribution and Profiling · Natural Language Processing Techniques · Topic Modeling
