Hierarchical Delay Attribution Classification using Unstructured Text in Train Management Systems

Anton Borg; Per Lingvall; Martin Svensson

arXiv:2402.04108 · cs.LG · February 7, 2024 · 1 citation

Open Access

TL;DR

This paper explores machine learning methods to automate delay attribution coding in train management systems using unstructured text, comparing hierarchical and flat classification approaches.

Contribution

It introduces a hierarchical classification approach for delay attribution and evaluates its performance against flat models and manual coding.

Findings

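01. The hierarchical approach outperforms the flat classification approach.

02. The machine learning models (Random Forest and Support Vector Machine) outperform the random uniform classifier.

03. Classification performance remains below that of the manual classification by the Swedish Transport Administration.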

Abstract

EU directives stipulate a systematic follow-up of train delays. In Sweden, the Swedish Transport Administration registers delays and assigns each an appropriate delay attribution code. However, this code is assigned manually, which is a complex task. In this paper, machine learning-based decision support for assigning delay attribution codes based on event descriptions is investigated. The text is transformed using TF-IDF, and two models, Random Forest and Support Vector Machine, are evaluated against a random uniform classifier and the classification performance of the Swedish Transport Administration. Further, the problem is modeled using both a hierarchical and a flat approach. The results indicate that the hierarchical approach performs better than the flat approach. Both approaches perform better than the random uniform classifier but worse than the manual classification.
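The difference between the flat and hierarchical setups described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The event descriptions, the codes (`INFRA`, `VEHICLE-DOOR`, and so on), and all function names are hypothetical. A simple nearest-centroid classifier stands in for Random Forest and Support Vector Machine to keep the example short, and a two-level code hierarchy is assumed.

```python
import math
import random
from collections import Counter

# Hypothetical toy data: (event description, (parent code, leaf code)).
# The codes are invented for illustration; they are not real attribution codes.
TRAIN = [
    ("signal failure at station", ("INFRA", "INFRA-SIGNAL")),
    ("signal fault on main line", ("INFRA", "INFRA-SIGNAL")),
    ("track damage after frost", ("INFRA", "INFRA-TRACK")),
    ("broken rail on track", ("INFRA", "INFRA-TRACK")),
    ("locomotive engine failure", ("VEHICLE", "VEHICLE-ENGINE")),
    ("engine overheated on locomotive", ("VEHICLE", "VEHICLE-ENGINE")),
    ("door fault on passenger car", ("VEHICLE", "VEHICLE-DOOR")),
    ("doors stuck at platform", ("VEHICLE", "VEHICLE-DOOR")),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency per token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1 for tok, c in df.items()}

def tfidf(text, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    tf = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

class CentroidClassifier:
    """Stand-in classifier: picks the label whose mean vector scores highest."""

    def fit(self, vectors, labels):
        counts = Counter(labels)
        sums = {label: Counter() for label in counts}
        for vec, label in zip(vectors, labels):
            for tok, v in vec.items():
                sums[label][tok] += v
        self.centroids = {
            label: {tok: v / counts[label] for tok, v in total.items()}
            for label, total in sums.items()
        }
        return self

    def predict(self, vec):
        def score(label):
            centroid = self.centroids[label]
            return sum(v * centroid.get(tok, 0.0) for tok, v in vec.items())
        # sorted() makes tie-breaking deterministic.
        return max(sorted(self.centroids), key=score)

def fit_flat(data, idf):
    """Flat setup: one classifier over all leaf codes."""
    vectors = [tfidf(text, idf) for text, _ in data]
    return CentroidClassifier().fit(vectors, [leaf for _, (_, leaf) in data])

def fit_hierarchical(data, idf):
    """Hierarchical setup: one parent classifier plus one leaf classifier per parent."""
    vectors = [tfidf(text, idf) for text, _ in data]
    parents = [parent for _, (parent, _) in data]
    leaves = [leaf for _, (_, leaf) in data]
    top = CentroidClassifier().fit(vectors, parents)
    children = {}
    for parent in set(parents):
        idx = [i for i, p in enumerate(parents) if p == parent]
        children[parent] = CentroidClassifier().fit(
            [vectors[i] for i in idx], [leaves[i] for i in idx]
        )
    return top, children

def predict_hierarchical(text, idf, top, children):
    vec = tfidf(text, idf)
    parent = top.predict(vec)  # step 1: choose the parent code
    return parent, children[parent].predict(vec)  # step 2: choose among its leaves

def predict_uniform(labels, rng):
    """Random uniform baseline: every leaf code is equally likely."""
    return rng.choice(sorted(set(labels)))

idf = fit_idf([text for text, _ in TRAIN])
flat = fit_flat(TRAIN, idf)
top, children = fit_hierarchical(TRAIN, idf)
query = "signal failure near station"
print(flat.predict(tfidf(query, idf)))
print(predict_hierarchical(query, idf, top, children))
print(predict_uniform([leaf for _, (_, leaf) in TRAIN], random.Random(0)))
```

In this sketch the hierarchical route only has to separate the leaf codes under the chosen parent, which is a smaller problem than the flat one. The trade-off is that a wrong parent prediction cannot be recovered at the leaf level.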

Taxonomy

Topics: Data Quality and Management · Natural Language Processing Techniques