A Real-Time System to Populate FRA Form 57 from News
Chansong Lim, Haz Sameen Shahgir, Yue Dong, Jia Chen, and Evangelos E. Papalexakis

TL;DR
This paper introduces a real-time system that extracts incident details from news articles and uses them to populate FRA Form 57, providing quick situational awareness after railway incidents despite noisy news data and a complex form structure.
Contribution
The paper presents a novel pipeline that uses a vision language model to convert the visually irregular Form 57 into a JSON schema, then applies grouped question answering to extract the corresponding fields from noisy news sources.
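As a rough illustration of grouped question answering, the sketch below builds one prompt per group of related form fields instead of asking about each field in isolation. It is not the authors' implementation: the function `build_group_prompts`, the group names, the field names, and the prompt wording are all hypothetical, and the call to a language model is left out.

```python
def build_group_prompts(schema_groups, article):
    """Build one prompt per field group so related fields are answered together.

    schema_groups maps a group name to {field_name: description}.
    All group and field names used here are hypothetical, not real Form 57 items.
    """
    prompts = {}
    for group, fields in schema_groups.items():
        questions = "\n".join(f"- {name}: {desc}" for name, desc in fields.items())
        prompts[group] = (
            f"Article:\n{article}\n\n"
            f"Answer the related fields of the '{group}' section together. "
            "Reply with a JSON object and use null for anything the article "
            "does not state.\n"
            f"{questions}"
        )
    return prompts


# Hypothetical groups that mimic how a form clusters related boxes.
schema_groups = {
    "train": {
        "train_speed": "Speed of the train at impact, in mph",
        "train_type": "Freight, passenger, or other",
    },
    "environment": {
        "weather": "Weather conditions at the time of the incident",
    },
}
article = "A freight train struck a pickup truck at a crossing on Tuesday in light rain."
prompts = build_group_prompts(schema_groups, article)
```

Each prompt would then be sent to a language model once, so fields that share context (for example, everything about the train) are resolved in a single answer.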
Findings
Achieved high accuracy in retrieving form information from news articles.
Successfully aligned scraped news with official FRA records.
Demonstrated real-time population of FRA Form 57 from unstructured news data.
Abstract
Local railway committees need timely situational awareness after highway-rail grade crossing incidents, yet official Federal Railroad Administration (FRA) investigations can take days to weeks. We present a demo system that populates Highway-Rail Grade Crossing Incident Data (Form 57) from news in real time. Our approach addresses two core challenges: the form is visually irregular and semantically dense, and news is noisy. To solve these problems, we design a pipeline that first converts Form 57 into a JSON schema using a vision language model with sample aggregation, and then performs grouped question answering following the intent of the form layout to reduce ambiguity. In addition, we build an evaluation dataset by aligning scraped news articles with official FRA records and annotating retrievable information. We then assess our system against various alternatives in terms of…
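The abstract does not say how sample aggregation works. One plausible reading, sketched here purely as an assumption, is to sample the vision language model several times and keep each field by majority vote. The function `aggregate_schemas`, the `min_support` threshold, and the field names are hypothetical, and the model calls are replaced by hand-written sample outputs.

```python
from collections import Counter


def aggregate_schemas(samples, min_support=0.5):
    """Merge several sampled schemas by majority vote (an assumed strategy).

    Each sample maps field name -> type string. A field is kept when its most
    common type appears in at least min_support of all samples.
    """
    votes = {}
    for schema in samples:
        for field, ftype in schema.items():
            votes.setdefault(field, Counter())[ftype] += 1

    merged = {}
    for field, counter in votes.items():
        ftype, count = counter.most_common(1)[0]
        if count / len(samples) >= min_support:
            merged[field] = ftype
    return merged


# Stand-ins for three model samples of the same form; field names are hypothetical.
samples = [
    {"incident_date": "date", "railroad_name": "string", "train_speed": "integer"},
    {"incident_date": "date", "railroad_name": "string", "train_speed": "string"},
    {"incident_date": "date", "train_speed": "integer", "weather": "string"},
]
schema = aggregate_schemas(samples)
```

In this toy input, `weather` appears in only one of three samples and is dropped, while the conflicting type for `train_speed` is resolved in favor of the majority.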
Taxonomy
Topics: Topic Modeling · Public Relations and Crisis Communication · Data Visualization and Analytics
