MetRoBERTa: Leveraging Traditional Customer Relationship Management Data to Develop a Transit-Topic-Aware Language Model
Michael Leong, Awad Abdelhalim, Jude Ha, Dianne Patterson, Gabriel L. Pincus, Anthony B. Harris, Michael Eichler, Jinhua Zhao

TL;DR
This paper introduces MetRoBERTa, a large language model trained on transit CRM data to classify open-ended rider feedback into relevant topics, enhancing transit agencies' ability to analyze customer experiences at scale.
Contribution
The paper presents a novel semi-supervised training approach for a transit-topic-aware language model based on RoBERTa, trained on 6 years of CRM feedback, outperforming classical methods.
Findings
Achieved 90% accuracy in transit topic classification.
Demonstrated the model's applicability to social media data like Twitter.
Provided a scalable framework for analyzing rider feedback.
Abstract
Transit riders' feedback provided in ridership surveys, customer relationship management (CRM) channels, and in more recent times, through social media is key for transit agencies to better gauge the efficacy of their services and initiatives. Getting a holistic understanding of riders' experience through the feedback shared in those instruments is often challenging, mostly due to the open-ended, unstructured nature of text feedback. In this paper, we propose leveraging traditional transit CRM feedback to develop and deploy a transit-topic-aware large language model (LLM) capable of classifying open-ended text feedback to relevant transit-specific topics. First, we utilize semi-supervised learning to engineer a training dataset of 11 broad transit topics detected in a corpus of 6 years of customer feedback provided to the Washington Metropolitan Area Transit Authority (WMATA). We then…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Human Mobility and Location-Based Analysis · Transportation Planning and Optimization
Methods: Multi-Head Attention · Linear Layer · Softmax · Layer Normalization · Adam · Residual Connection · Dense Connections · Dropout · WordPiece
