TL;DR
OpenEDGAR is an open-source Python framework that simplifies collecting, parsing, and analyzing SEC EDGAR filings for research and industrial use. It supports distributed computing across one or more servers and extracts both content and metadata from filing documents.
Contribution
The paper introduces an open-source tool, built on the Django application framework, for retrieving and parsing EDGAR index and filing data and for building research databases from it, aimed at both academic and industrial users.
Findings
Supports distributed compute across one or more servers.
Enables rapid construction of research databases from EDGAR.
Extracts content and metadata from filing documents and supports searching filing document contents.
Abstract
OpenEDGAR is an open source Python framework designed to rapidly construct research databases based on the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system operated by the US Securities and Exchange Commission (SEC). OpenEDGAR is built on the Django application framework, supports distributed compute across one or more servers, and includes functionality to (i) retrieve and parse index and filing data from EDGAR, (ii) build tables for key metadata like form type and filer, (iii) retrieve, parse, and update CIK-to-ticker and industry mappings, (iv) extract content and metadata from filing documents, and (v) search filing document contents. OpenEDGAR is designed for use in both academic research and industrial applications, and is distributed under the MIT License at https://github.com/LexPredict/openedgar.
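To make steps (i) and (ii) of the abstract concrete, here is a minimal sketch of parsing an EDGAR index and tallying filings by form type. It is not OpenEDGAR's API. The function name `parse_master_index`, the field names, and the sample rows are hypothetical and invented for illustration. The sketch assumes the pipe-delimited layout used by EDGAR's quarterly master index files (CIK, company name, form type, date filed, filename) and works on an in-memory string, so it makes no network requests.

```python
from collections import Counter
from typing import Dict, Iterator

# Made-up sample rows in the pipe-delimited layout of an EDGAR master index.
# The companies, CIKs, and accession numbers below are NOT real filings.
SAMPLE_INDEX = """\
Description:           Master Index of EDGAR Dissemination Feed

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1000001|EXAMPLE CORP|10-K|2018-02-01|edgar/data/1000001/0000000000-18-000001.txt
1000002|SAMPLE HOLDINGS INC|8-K|2018-02-02|edgar/data/1000002/0000000000-18-000002.txt
1000001|EXAMPLE CORP|8-K|2018-03-15|edgar/data/1000001/0000000000-18-000003.txt
"""

# Hypothetical field names for the five index columns.
FIELDS = ("cik", "company_name", "form_type", "date_filed", "filename")


def parse_master_index(text: str) -> Iterator[Dict[str, object]]:
    """Yield one record per filing row, skipping the free-text header."""
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Data rows begin after the dashed separator line.
            if line.startswith("-----"):
                in_body = True
            continue
        parts = line.strip().split("|")
        if len(parts) != len(FIELDS):
            continue  # skip blank or malformed rows
        record: Dict[str, object] = dict(zip(FIELDS, parts))
        record["cik"] = int(parts[0])  # CIK is a numeric filer identifier
        yield record


records = list(parse_master_index(SAMPLE_INDEX))

# Step (ii) in miniature: summarize key metadata such as form type and filer.
form_counts = Counter(r["form_type"] for r in records)
filers = {r["cik"]: r["company_name"] for r in records}

print(form_counts)
print(filers)
```

In a Django-based design such as the one the abstract describes, records like these would presumably be stored as database rows rather than held in memory; the dictionaries here only stand in for that step.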
