VTBench: A Multimodal Framework for Time-Series Classification with Chart-Based Representations
Madhumitha Venkatesan, Xuyang Chen, Dongyu Liu

TL;DR
VTBench introduces a systematic framework combining raw time-series data with interpretable chart-based visualizations to enhance classification performance and interpretability across diverse datasets.
Contribution
VTBench presents a modular, extensible approach for multimodal fusion of raw signals and chart visualizations, with comprehensive evaluation and practical guidelines.
Findings
Chart-only models perform well on smaller datasets.
Combining multiple chart types improves accuracy.
Multimodal models match or outperform raw-only models, depending on how redundant the chart input is with the raw signal.
Abstract
Time-series classification (TSC) has advanced significantly with deep learning, yet most models rely solely on raw numerical inputs, overlooking alternative representations. While texture-based encodings such as Gramian Angular Fields (GAF) and Recurrence Plots (RP) convert time series into 2D images, they often require heavy preprocessing and yield less intuitive representations. In contrast, chart-based visualizations offer more interpretable alternatives and show promise in specific domains; however, their effectiveness remains underexplored, with limited systematic evaluation across chart types, visual encoding choices, and datasets. In this work, we introduce VTBench, a systematic and extensible framework that re-examines TSC through multimodal fusion of raw sequences and chart-based visualizations. VTBench generates lightweight, human-interpretable plots -- line, area, bar, and…
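To make the idea of a chart-based representation concrete, the following is a minimal sketch of how a 1-D series could be rasterized into small binary "area" and "line" chart images that an image model might consume. It is not VTBench's renderer: the function names, the binary pixel encoding, the default image height, and the nearest-sample column mapping are all assumptions chosen for illustration, and the line variant marks only one pixel per column without interpolating between points.

```python
def render_area_chart(series, height=8, width=None):
    """Rasterize a 1-D series into a binary area-chart image (list of rows).

    Hypothetical helper for illustration; not part of VTBench.
    Row 0 is the top of the image, row height-1 is the baseline.
    """
    if not series:
        raise ValueError("series must be non-empty")
    width = width or len(series)
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    image = [[0] * width for _ in range(height)]
    for col in range(width):
        # Nearest-sample lookup: map each pixel column to a time step.
        idx = min(len(series) - 1, col * len(series) // width)
        # Scale the value to a fill level between 1 and `height` pixels.
        level = 1 + round((series[idx] - lo) / span * (height - 1))
        for row in range(height - level, height):
            image[row][col] = 1  # fill from the baseline up to the value
    return image


def render_line_chart(series, height=8, width=None):
    """Keep only the topmost filled pixel of each column of the area chart."""
    area = render_area_chart(series, height, width)
    n_cols = len(area[0])
    line = [[0] * n_cols for _ in range(height)]
    for col in range(n_cols):
        for row in range(height):
            if area[row][col]:
                line[row][col] = 1  # first filled pixel from the top
                break
    return line


if __name__ == "__main__":
    demo = [0, 1, 2, 3]
    for row in render_line_chart(demo, height=4):
        print("".join("#" if px else "." for px in row))
```

In a multimodal setup like the one the abstract describes, images of this kind would be fed to a vision branch alongside the raw sequence; how the two branches are fused is left out here because the truncated abstract does not specify it.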
