ELLMPEG: An Edge-based Agentic LLM Video Processing Tool
Zoha Azimi, Reza Farahani, Radu Prodan, Christian Timmerer

TL;DR
ELLMPEG is a framework that enables edge-based, agentic LLMs to generate and verify video processing commands locally, reducing reliance on cloud APIs and improving efficiency and privacy.
Contribution
This paper introduces ELLMPEG, a novel edge-enabled agentic LLM system that generates and verifies video processing commands locally, integrating tool-aware RAG and self-reflection.
Findings
Qwen2.5 with ELLMPEG achieves 78% command accuracy.
ELLMPEG reduces API costs by enabling local command generation.
The framework improves efficiency and privacy in video processing tasks.
Abstract
Large language models (LLMs), the foundation of generative AI systems like ChatGPT, are transforming many fields and applications, including multimedia, enabling more advanced content generation, analysis, and interaction. However, cloud-based LLM deployments face three key limitations: high computational and energy demands, privacy and reliability risks from remote processing, and recurring API costs. Recent advances in agentic AI, especially in structured reasoning and tool use, offer a better way to exploit open and locally deployed tools and LLMs. This paper presents ELLMPEG, an edge-enabled agentic LLM framework for the automated generation of video-processing commands. ELLMPEG integrates tool-aware Retrieval-Augmented Generation (RAG) with iterative self-reflection to produce and locally verify executable FFmpeg and VVenC commands directly at the edge, eliminating reliance on…
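The generate, verify, and reflect loop that the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `ALLOWED_TOOLS` allow-list, the `vvencapp` executable name, and the purely syntactic verification step are assumptions made for illustration, and the LLM call is replaced by a pluggable `generate` callable. A real system would presumably verify by executing the command locally and would feed retrieved tool documentation into the prompt.

```python
import shlex
from typing import Callable, Optional, Tuple

# Hypothetical allow-list of tool executables (an assumption for this sketch).
ALLOWED_TOOLS = {"ffmpeg", "vvencapp"}


def verify_command(command: str) -> Tuple[bool, str]:
    """Cheap local check: the command must parse and start with an allowed tool.

    Returns (ok, feedback); feedback is empty when the check passes.
    """
    try:
        tokens = shlex.split(command)
    except ValueError as exc:
        return False, f"unparsable command: {exc}"
    if not tokens:
        return False, "empty command"
    if tokens[0] not in ALLOWED_TOOLS:
        return False, f"unknown tool: {tokens[0]}"
    return True, ""


def generate_with_reflection(
    task: str,
    generate: Callable[[str, str], str],
    verify: Callable[[str], Tuple[bool, str]] = verify_command,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask `generate` for a command, verify it, and retry with the feedback.

    `generate(task, feedback)` stands in for the LLM call; `feedback` is the
    verifier's message from the previous round (empty on the first round).
    Returns the first command that passes verification, or None.
    """
    feedback = ""
    for _ in range(max_rounds):
        command = generate(task, feedback)
        ok, feedback = verify(command)
        if ok:
            return command
    return None


def stub_generate(task: str, feedback: str) -> str:
    """Stand-in for an LLM: misspells the tool first, corrects it on feedback."""
    if feedback:
        return "ffmpeg -i in.mp4 out.mkv"
    return "ffmpg -i in.mp4 out.mkv"


if __name__ == "__main__":
    print(generate_with_reflection("convert in.mp4 to MKV", stub_generate))
```

Keeping the verifier a separate callable means the syntactic check used here can be swapped for one that actually runs the command and returns its error output as feedback.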
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Generative Adversarial Networks and Image Synthesis
