Mining Data from the Congressional Record
Zhengyu Ma, Tianjiao Qi, James Route, Amir Ziai

TL;DR
This paper presents a method for storing and analyzing the US Congressional record using AWS and Solr, revealing potential links between tax language frequency and economic indicators.
Contribution
It introduces a data storage and analysis approach for the Congressional Record, enabling policy-related linguistic analysis of records from 1789 to the present.
Findings
Potential relationships between tax discussion frequency and economic indicators
Effective use of cloud services for large-scale legislative data analysis
Preliminary evidence of policy language trends correlating with economic factors
Abstract
We propose a data storage and analysis method for using the US Congressional record as a policy analysis tool. We use Amazon Web Services and the Solr search engine to store and process Congressional record data from 1789 to the present, and then query Solr to find how frequently language related to tax increases and decreases appears. This frequency data is compared to six economic indicators. Our preliminary results indicate potential relationships between incidence of tax discussion and multiple indicators. We present our data storage and analysis procedures, as well as results from comparisons to all six indicators.
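The frequency-versus-indicator comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the Solr field names (`text`, `year`), the example phrase, the sample numbers, and the choice of Pearson correlation as the comparison statistic are all assumptions made here for the sake of the example. The query uses `rows=0`, which asks Solr's standard select handler for a match count without returning documents.

```python
from math import sqrt
from urllib.parse import urlencode


def build_solr_query(phrase, year, text_field="text", year_field="year"):
    """Build a query string for Solr's standard /select handler.

    The field names are hypothetical; a real schema may differ.
    rows=0 requests only the match count (numFound), not the documents.
    """
    params = {
        "q": f'{text_field}:"{phrase}"',   # exact-phrase match
        "fq": f"{year_field}:{year}",      # restrict to a single year
        "rows": 0,
        "wt": "json",
    }
    return urlencode(params)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative numbers, NOT results from the paper.
yearly_hits = {2000: 12, 2001: 15, 2002: 9, 2003: 20}      # phrase matches per year
indicator = {2000: 1.0, 2001: 1.5, 2002: 0.5, 2003: 2.0}   # some economic indicator

# Compare only the years present in both series.
years = sorted(yearly_hits.keys() & indicator.keys())
r = pearson([yearly_hits[y] for y in years], [indicator[y] for y in years])
```

In practice the per-year counts would come from the `numFound` value in each Solr response, and it may be worth normalizing them by the volume of text recorded in each year before comparing against an indicator.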
Taxonomy
Topics: Media Influence and Politics
