# Mining Data from the Congressional Record

**Authors:** Zhengyu Ma, Tianjiao Qi, James Route, Amir Ziai

arXiv: 1906.00529 · 2019-06-04

## TL;DR

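This paper presents a method for storing and querying the US Congressional record (1789 to the present) using Amazon Web Services and Solr. Preliminary results suggest potential relationships between the frequency of tax-increase and tax-decrease language and several of six economic indicators.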

## Contribution

It proposes a data storage and analysis approach that makes the Congressional record usable as a policy analysis tool, supporting queries on policy-related language from 1789 to the present.

## Key findings

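- Preliminary results indicate potential relationships between the incidence of tax discussion and multiple economic indicators
- Amazon Web Services and Solr are used to store and process Congressional record data from 1789 to the present
- Frequency of tax-increase and tax-decrease language is compared against six economic indicators, with results reported for all six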

## Abstract

We propose a data storage and analysis method for using the US Congressional record as a policy analysis tool. We use Amazon Web Services and the Solr search engine to store and process Congressional record data from 1789 to the present, and then query Solr to find how frequently language related to tax increases and decreases appears. This frequency data is compared to six economic indicators. Our preliminary results indicate potential relationships between incidence of tax discussion and multiple indicators. We present our data storage and analysis procedures, as well as results from comparisons to all six indicators.
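
The workflow described in the abstract (count tax-related language per year in Solr, then compare that series with an economic indicator) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the field names `text` and `date`, the helper names, and the use of Pearson correlation as the comparison are all assumptions, and the numbers in the usage block are placeholders rather than real data.

```python
from math import sqrt
from urllib.parse import urlencode


def solr_count_params(phrase, year, text_field="text", date_field="date"):
    """Build Solr select parameters that count documents containing `phrase` in one year.

    `text_field` and `date_field` are hypothetical schema names.
    rows=0 asks Solr to return only the hit count, not the documents.
    """
    return {
        "q": f'{text_field}:"{phrase}"',
        "fq": f"{date_field}:[{year}-01-01T00:00:00Z TO {year}-12-31T23:59:59Z]",
        "rows": 0,
        "wt": "json",
    }


def relative_frequency(hits, totals):
    """Normalize yearly hit counts by the number of documents indexed for that year."""
    return {year: hits[year] / totals[year] for year in hits if totals.get(year)}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Query string for one year; the host and core name would depend on the deployment.
    print(urlencode(solr_count_params("tax increase", 1950)))

    # Placeholder values only, standing in for Solr hit counts and an indicator series.
    hits = {2001: 10, 2002: 20, 2003: 30}
    totals = {2001: 100, 2002: 100, 2003: 100}
    indicator = {2001: 1.0, 2002: 2.0, 2003: 3.0}

    freq = relative_frequency(hits, totals)
    years = sorted(freq)
    print(pearson([freq[y] for y in years], [indicator[y] for y in years]))
```

Normalizing by the yearly document total is one way to keep growth in the size of the record from dominating the frequency series; whether the paper normalizes this way is not stated in the abstract.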

---
Source: https://tomesphere.com/paper/1906.00529