Giving Text Analytics a Boost
Raphael Polig, Kubilay Atasu, Laura Chiticariu, Christoph Hagleitner, H. Peter Hofstee, Frederick R. Reiss, Eva Sitaridi, Huaiyu Zhu

TL;DR
This paper pairs IBM's SystemT text analytics software with a streaming hardware accelerator implemented in reconfigurable logic, improving the throughput of information extraction queries by an order of magnitude.
Contribution
It introduces a novel hardware-accelerated system for text analytics that extends SystemT's capabilities to efficiently handle Big Data workloads.
Findings
Throughput improved by an order of magnitude
Effective deployment via extended compilation flow
Efficient multi-threaded communication interface
Abstract
The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text analytics system, which offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing the so-called "Big Data" in an efficient way, despite the high memory bandwidth that is available. We show that by using a streaming hardware accelerator implemented in reconfigurable logic, the throughput rates of SystemT's information extraction queries can be improved by an order of magnitude. We present how such a system can be deployed by extending SystemT's existing compilation flow and by using a multi-threaded communication interface that can efficiently use the bandwidth of the accelerator.
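The abstract does not describe the communication interface in detail. As a rough illustration of the general idea only, the sketch below shows several software threads sharing one work queue so that the extraction back end is kept busy. Everything in it is an assumption made for illustration: `fake_accelerator` is a hypothetical software stand-in for the hardware (it merely counts capitalized tokens), and `run_pipeline` and `num_workers` are invented names, not part of SystemT or the paper's interface.

```python
import queue
import threading


def fake_accelerator(doc: str) -> int:
    """Hypothetical stand-in for the accelerator: counts capitalized tokens."""
    return sum(1 for tok in doc.split() if tok[:1].isupper())


def run_pipeline(docs, num_workers=4):
    """Process documents with several threads that share one work queue."""
    work = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = work.get()
            if item is None:  # sentinel: no more documents for this worker
                return
            idx, doc = item
            count = fake_accelerator(doc)
            with lock:  # protect the shared results dictionary
                results[idx] = count

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in enumerate(docs):
        work.put(item)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Return results in the original document order.
    return [results[i] for i in range(len(docs))]
```

Calling `run_pipeline(["IBM built SystemT", "no capitals here"], num_workers=2)` returns one count per document in input order. A real deployment would replace `fake_accelerator` with calls into the accelerator driver, whose details are not given here.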
