TUM-VIE: The TUM Stereo Visual-Inertial Event Dataset
Simon Klenk, Jason Chui, Nikolaus Demmel, Daniel Cremers

TL;DR
The TUM-VIE dataset provides stereo event camera data, hardware-synchronized IMU measurements and stereo grayscale frames, and ground truth poses to advance 3D perception and navigation research under challenging conditions such as rapid motion and high dynamic range.
Contribution
It introduces a large, diverse stereo event dataset recorded with high-resolution (1280x720 pixel) event cameras, hardware-synchronized multimodal data, and ground truth for evaluating visual-inertial perception algorithms.
Findings
Includes challenging sequences where current SLAM algorithms struggle.
Provides high-resolution stereo event data with synchronized IMU and frames.
Enables benchmarking and development of robust event-based perception methods.
Abstract
Event cameras are bio-inspired vision sensors which measure per-pixel brightness changes. They offer numerous benefits over traditional, frame-based cameras, including low latency, high dynamic range, high temporal resolution and low power consumption. Thus, these sensors are suited for robotics and virtual reality applications. To foster the development of 3D perception and navigation algorithms with event cameras, we present the TUM-VIE dataset. It consists of a large variety of handheld and head-mounted sequences in indoor and outdoor environments, including rapid motion during sports and high dynamic range scenarios. The dataset contains stereo event data, stereo grayscale frames at 20 Hz, as well as IMU data at 200 Hz. Timestamps between all sensors are synchronized in hardware. The event cameras contain a large sensor of 1280x720 pixels, which is significantly larger than the sensors…
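Because the abstract states that frames arrive at 20 Hz, IMU data at 200 Hz, and that all timestamps are synchronized in hardware, measurements from different sensors can in principle be associated by timestamp alone. The following is a minimal sketch of that idea, not the dataset's loader or file format: the timestamps are synthetic, microsecond units are an assumption, and the helper names `synthetic_timestamps_us` and `imu_ranges_between_frames` are hypothetical.

```python
import numpy as np

FRAME_RATE_HZ = 20   # grayscale frame rate stated in the abstract
IMU_RATE_HZ = 200    # IMU rate stated in the abstract


def synthetic_timestamps_us(rate_hz, duration_s):
    """Evenly spaced integer timestamps in microseconds (synthetic, not real data)."""
    period_us = 1_000_000 // rate_hz
    return np.arange(rate_hz * duration_s, dtype=np.int64) * period_us


def imu_ranges_between_frames(frame_ts, imu_ts):
    """For each pair of consecutive frames, return the half-open index range
    [start, end) of IMU samples with frame_ts[i] <= t < frame_ts[i + 1].
    Both arrays must be sorted and share one clock."""
    bounds = np.searchsorted(imu_ts, frame_ts, side="left")
    return [(int(s), int(e)) for s, e in zip(bounds[:-1], bounds[1:])]


frame_ts = synthetic_timestamps_us(FRAME_RATE_HZ, duration_s=1)
imu_ts = synthetic_timestamps_us(IMU_RATE_HZ, duration_s=1)

ranges = imu_ranges_between_frames(frame_ts, imu_ts)
counts = [end - start for start, end in ranges]
print(len(ranges), "frame intervals;", counts[0], "IMU samples in the first one")
```

With ideal clocks, the rate ratio 200 / 20 gives ten IMU samples per frame interval. Real recordings may show jitter or dropped samples, so the per-interval count should be checked rather than assumed.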
