Authors Should Label Their Own Documents
Marcus Ma, Cole Johnson, Nolan Bridges, Jackson Trager, Georgios Chochlakis, Shrikanth Narayanan

TL;DR
This paper introduces author labeling, a real-time annotation method in which authors label their own data as they create it. The authors report higher-quality, faster, and cheaper annotations than third-party labeling, especially for subjective content, and demonstrate the approach through a commercial chatbot deployment and improved product recommendation.
Contribution
The paper presents a novel author labeling technique and a real-time annotation system, and demonstrates that they improve model performance and annotation quality over traditional methods.
Findings
537% improvement in click-through rate over an industry baseline
Higher-quality, faster, and cheaper annotations compared to traditional third-party methods
Successful deployment in a commercial chatbot with 20,000+ users
Abstract
Third-party annotation is the status quo for labeling text, but egocentric information such as sentiment and belief can at best only be approximated by a third-person proxy. We introduce author labeling, an annotation technique where the writer of the document itself annotates the data at the moment of creation. We collaborate with a commercial chatbot with over 20,000 users to deploy an author labeling annotation system. This system identifies task-relevant queries, generates on-the-fly labeling questions, and records authors' answers in real time. We train and deploy an online-learning model architecture for product recommendation with author-labeled data to improve performance. We train our model to minimize the prediction error on questions generated for a set of predetermined subjective beliefs using author-labeled responses. Our model achieves a 537% improvement in click-through…
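The pipeline the abstract describes (identify a task-relevant query, generate a labeling question, record the author's answer, update a model online) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword filter, the question template, the bag-of-words features, the belief name "price_sensitive", and the online logistic-regression update are all assumptions chosen only to make the loop concrete.

```python
import math
from dataclasses import dataclass, field


def is_task_relevant(query: str) -> bool:
    """Toy keyword filter standing in for the query-identification step (assumption)."""
    keywords = ("buy", "recommend", "price", "best")
    return any(k in query.lower() for k in keywords)


def make_question(belief: str) -> str:
    """Hypothetical on-the-fly labeling question for one predetermined belief."""
    readable = belief.replace("_", " ")
    return f"Quick check: would you describe yourself as {readable}? (yes/no)"


def featurize(query: str) -> dict:
    """Bag-of-words features plus a bias term (assumption, not from the paper)."""
    features = {f"w={w}": 1.0 for w in query.lower().split()}
    features["bias"] = 1.0
    return features


@dataclass
class OnlineBeliefModel:
    """Logistic regression updated one author-labeled example at a time."""
    lr: float = 0.5
    weights: dict = field(default_factory=dict)

    def predict(self, features: dict) -> float:
        z = sum(self.weights.get(k, 0.0) * v for k, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features: dict, label: float) -> None:
        # Gradient step on log loss: error = prediction - label.
        error = self.predict(features) - label
        for k, v in features.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.lr * error * v


def handle_query(query, belief, model, ask_author):
    """Ask the author to label their own query, then learn from the answer.

    Returns (prediction_before, prediction_after), or None if the query
    is not task-relevant and no question is asked.
    """
    if not is_task_relevant(query):
        return None
    answer = ask_author(make_question(belief))  # author's own label, in real time
    label = 1.0 if answer else 0.0
    features = featurize(query)
    before = model.predict(features)
    model.update(features, label)
    return before, model.predict(features)


if __name__ == "__main__":
    model = OnlineBeliefModel()
    # The lambda simulates an author who answers "yes" to the question.
    print(handle_query("recommend a cheap laptop", "price_sensitive", model, lambda q: True))
    print(handle_query("hello there", "price_sensitive", model, lambda q: True))
```

In this sketch the label comes from the person who wrote the query rather than from a third-party annotator, and each answer moves the model's prediction for that belief toward the author's stated view. A deployed system would replace the simulated `ask_author` callback with a real user prompt.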
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · AI in Service Interactions
