Artic: AI-oriented Real-time Communication for MLLM Video Assistant

Jiangkai Wu; Zhiyuan Ren; Junquan Zhong; Liming Liu; Xinggong Zhang

arXiv:2602.12641 · cs.NI · February 16, 2026

TL;DR

Artic is an AI-oriented RTC framework for MLLM Video Assistants that improves response accuracy and reduces latency through adaptive bitrate management and targeted streaming, addressing the limitations of current RTC systems.

Contribution

The paper presents Artic, a novel RTC framework tailored for MLLM Video Assistants, with new adaptive bitrate and streaming techniques optimized for AI understanding of video content.

Findings

01 Improves MLLM accuracy by 15.12%

02 Reduces latency by 135.31 ms

03 Introduces the first degraded video understanding benchmark

Abstract

AI Video Assistant emerges as a new paradigm for Real-time Communication (RTC), where one peer is a Multimodal Large Language Model (MLLM) deployed in the cloud. This makes interaction between humans and AI more intuitive, akin to chatting with a real person. However, a fundamental mismatch exists between current RTC frameworks and AI Video Assistants, stemming from the drastic shift in Quality of Experience (QoE) and more challenging networks. Measurements on our production prototype also confirm that current RTC fails, causing latency spikes and accuracy drops. To address these challenges, we propose Artic, an AI-oriented RTC framework for MLLM Video Assistants, exploring the shift from "humans watching video" to "AI understanding video." Specifically, Artic proposes: (1) Response Capability-aware Adaptive Bitrate, which utilizes MLLM accuracy saturation to proactively cap bitrate,…
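
The bitrate cap in item (1) can be illustrated with a minimal sketch. The abstract does not give the actual algorithm, so everything here is an assumption made for illustration: the accuracy-versus-bitrate curve values, the `tolerance` threshold, and the function names `saturation_bitrate` and `capped_target_bitrate` are hypothetical and not taken from the paper. The idea shown is only that once accuracy stops improving with bitrate, the sender has no reason to exceed that point even when the network allows it.

```python
from typing import Sequence, Tuple


def saturation_bitrate(
    curve: Sequence[Tuple[float, float]], tolerance: float = 0.01
) -> float:
    """Return the lowest bitrate (kbps) whose accuracy is within
    `tolerance` of the best accuracy on the curve.

    `curve` is a sequence of (bitrate_kbps, accuracy) pairs; the values
    used below are made up for illustration, not measured results.
    """
    if not curve:
        raise ValueError("curve must contain at least one point")
    best = max(acc for _, acc in curve)
    # Scan from the lowest bitrate upward and stop at the first point
    # that is already "good enough".
    for bitrate, acc in sorted(curve):
        if acc >= best - tolerance:
            return bitrate
    raise AssertionError("unreachable: the best point always qualifies")


def capped_target_bitrate(bandwidth_estimate_kbps: float, cap_kbps: float) -> float:
    """Send at the network estimate, but never above the saturation cap."""
    return min(bandwidth_estimate_kbps, cap_kbps)


# Hypothetical accuracy-vs-bitrate curve (illustrative numbers only).
example_curve = [
    (200, 0.50),
    (400, 0.62),
    (800, 0.715),
    (1600, 0.718),
    (3200, 0.72),
]

cap = saturation_bitrate(example_curve)
ample_network = capped_target_bitrate(2500, cap)    # cap applies
limited_network = capped_target_bitrate(500, cap)   # network is the limit
```

With this made-up curve, accuracy is effectively flat from 800 kbps upward, so the cap sits at 800 kbps: a sender on a 2500 kbps link would still send at 800 kbps, while a sender on a 500 kbps link is limited by the network rather than by the cap.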

Taxonomy

Topics: Image and Video Quality Assessment · Advanced Data and IoT Technologies · Advanced Neural Network Applications