TeleEgo: Benchmarking Egocentric AI Assistants in the Wild

Jiaqi Yan; Ruilong Ren; Jingren Liu; Shuning Xu; Ling Wang; Yiheng Wang; Xinlin Zhong; Yun Wang; Long Zhang; Xiangyu Chen; Changzhi Sun; Jixiang Luo; Dell Zhang; Hao Sun; Chi Zhang; Xuelong Li

arXiv:2510.23981 · cs.CV · December 11, 2025

TL;DR

TeleEgo is a comprehensive benchmark for evaluating egocentric AI assistants in real-world, streaming scenarios, focusing on multi-modal understanding, memory, and responsiveness over long durations.

Contribution

The paper introduces TeleEgo, a long-duration, multi-modal benchmark with new metrics for assessing real-time accuracy and long-term memory in egocentric AI assistants.

Findings

01. Current models show limited real-time accuracy in streaming settings.

02. The benchmark reveals challenges in long-term memory retention.

03. TeleEgo provides a platform for systematic evaluation of egocentric AI capabilities.

Abstract

Egocentric AI assistants in real-world settings must process multi-modal inputs (video, audio, text), respond in real time, and retain evolving long-term memory. However, existing benchmarks typically evaluate these abilities in isolation, lack realistic streaming scenarios, or support only short-term tasks. We introduce TeleEgo, a long-duration, streaming, omni-modal benchmark for evaluating egocentric AI assistants in realistic daily contexts. The dataset features over 14 hours per participant of synchronized egocentric video, audio, and text across four domains: work & study, lifestyle & routines, social activities, and outings & culture. All data is aligned on a unified global timeline and includes high-quality visual narrations and speech transcripts, curated through human refinement. TeleEgo defines 12 diagnostic subtasks across three core capabilities: Memory…

Code & Models

Datasets

David0219/TeleEgo
dataset· 1.0k dl
