NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg
Joshua Bambrick, Minjie Xu, Andy Almonte, Igor Malioutov, Guim Perarnau, Vittorio Selo, Iat Chong Chan

TL;DR
NSTM is a real-time, query-driven news summarization system at Bloomberg that filters, clusters, and summarizes large volumes of news to provide concise, relevant updates with sub-second latency.
Contribution
The paper introduces NSTM, a system that combines semantic clustering with summarization to generate concise, query-specific news overviews at scale.
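To make the filter, cluster, and summarize flow concrete, here is a minimal sketch in Python. It is not the NSTM implementation: the `Article` type, the `overview` function, the Jaccard word-overlap similarity, the greedy clustering, and the 0.3 threshold are all assumptions chosen for illustration, standing in for the semantic clustering and summarization methods the paper describes.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Article:
    """Hypothetical article record; not a Bloomberg data model."""
    headline: str
    body: str


def tokens(text: str) -> FrozenSet[str]:
    """Lowercase alphanumeric word set of a text."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Word-overlap similarity, a crude stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overview(articles: List[Article], query: str,
             threshold: float = 0.3, max_themes: int = 3) -> List[Tuple[str, int]]:
    """Return (representative headline, cluster size) pairs for a query."""
    # 1. Filter: keep articles that mention every query term.
    query_terms = tokens(query)
    matched = [a for a in articles
               if query_terms <= tokens(a.headline + " " + a.body)]

    # 2. Deduplicate: drop articles whose headline word set was already seen.
    seen = set()
    unique = []
    for a in matched:
        key = tokens(a.headline)
        if key not in seen:
            seen.add(key)
            unique.append(a)

    # 3. Cluster greedily: join the first cluster whose seed headline is
    #    similar enough, otherwise start a new cluster.
    clusters: List[List[Article]] = []
    for a in unique:
        for cluster in clusters:
            if jaccard(tokens(a.headline), tokens(cluster[0].headline)) >= threshold:
                cluster.append(a)
                break
        else:
            clusters.append([a])

    # 4. Rank clusters by size and summarize each with its medoid headline,
    #    the headline most similar on average to the rest of its cluster.
    clusters.sort(key=len, reverse=True)
    themes = []
    for cluster in clusters[:max_themes]:
        medoid = max(cluster, key=lambda a: sum(
            jaccard(tokens(a.headline), tokens(o.headline)) for o in cluster))
        themes.append((medoid.headline, len(cluster)))
    return themes


sample = [
    Article("Apple unveils new iPhone lineup", "Apple announced new phones."),
    Article("Apple unveils new iPhone lineup", "Syndicated copy of the story."),
    Article("Apple unveils new iPhone models", "Apple showed several devices."),
    Article("Apple faces EU antitrust fine", "Regulators may fine Apple."),
    Article("Oil prices rise on supply fears", "Crude futures climbed."),
]
themes = overview(sample, "Apple")
for headline, size in themes:
    print(size, headline)
```

A production system would replace the word-overlap similarity with learned semantic representations and would need indexing or caching to serve requests with sub-second latency; the sketch only shows how the stages feed into one another.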
Findings
Operates with sub-second latency for thousands of requests daily.
Effectively filters noise and duplicates to identify key news.
Provides comprehensive, concise summaries tailored to user queries.
Abstract
Millions of news articles from hundreds of thousands of sources around the globe appear in news aggregators every day. Consuming such a volume of news presents an almost insurmountable challenge. For example, a reader searching on Bloomberg's system for news about the U.K. would find 10,000 articles on a typical day. Apple Inc., the world's most journalistically covered company, garners around 1,800 news articles a day. We realized that a new kind of summarization engine was needed, one that would condense large volumes of news into short, easy-to-absorb points. The system would filter out noise and duplicates to identify and summarize key news about companies, countries or markets. When given a user query, Bloomberg's solution, Key News Themes (or NSTM), leverages state-of-the-art semantic clustering techniques and novel summarization methods to produce comprehensive, yet concise,…
