Multi-Agent Object Detection Framework Based on Raspberry Pi YOLO Detector and Slack-Ollama Natural Language Interface
Vladimir Kalušev, Branko Brkljač, Milan Brkljač

TL;DR
This paper introduces a multi-agent object detection system on Raspberry Pi that integrates LLM-based natural language control with real-time vision, demonstrating a low-cost, resource-constrained AI platform.
Contribution
It presents a novel multi-agent framework combining local LLM interfaces with vision agents on Raspberry Pi, emphasizing rapid prototyping and resource-efficient design.
Findings
Successful integration of LLM chatbots and object detection on Raspberry Pi
Insights into limitations of low-cost hardware for multi-agent AI systems
Comparison with cloud-based solutions highlighting local processing advantages
Abstract
The paper presents the design and prototype implementation of an edge-based object detection system within the new paradigm of AI agent orchestration. It goes beyond traditional design approaches by leveraging an LLM-based natural language interface for system control and communication, and practically demonstrates integration of all system components on a single resource-constrained hardware platform. The method is based on the proposed multi-agent object detection framework, which tightly integrates different AI agents within the same task of providing object detection and tracking capabilities. The proposed design principles highlight the fast prototyping approach that is characteristic of the transformational potential of generative AI systems, which are applied during both the development and implementation stages. Instead of a specialized communication and control interface, the system is…
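The idea of a chat-facing agent that interprets a natural language request and delegates it to a vision agent can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`Detection`, `DetectorAgent`, `ChatAgent`, `keyword_intent`) is hypothetical, the keyword matcher is only a placeholder for the LLM, and the frame source is a stub standing in for a YOLO detector running on camera frames.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    """One detected object (hypothetical structure, not from the paper)."""
    label: str
    confidence: float


class DetectorAgent:
    """Stand-in for the vision agent. A real agent would run a YOLO model
    on camera frames; here the detections come from an injected callable."""

    def __init__(self, frame_source: Callable[[], List[Detection]]):
        self._frame_source = frame_source

    def detect(self, min_confidence: float = 0.5) -> List[Detection]:
        # Keep only detections at or above the confidence threshold.
        return [d for d in self._frame_source() if d.confidence >= min_confidence]


def keyword_intent(message: str) -> str:
    """Placeholder for the LLM step: maps free text to an intent name.
    Assumption: a deployed system would query a local language model instead."""
    text = message.lower()
    if "how many" in text or "count" in text:
        return "count"
    if "what" in text or "see" in text or "detect" in text:
        return "list"
    return "unknown"


class ChatAgent:
    """Receives chat messages and delegates recognized requests to the detector."""

    def __init__(self, detector: DetectorAgent,
                 interpret: Callable[[str], str] = keyword_intent):
        self._detector = detector
        self._interpret = interpret

    def handle(self, message: str) -> str:
        intent = self._interpret(message)
        if intent == "count":
            return f"I see {len(self._detector.detect())} object(s)."
        if intent == "list":
            labels = sorted({d.label for d in self._detector.detect()})
            return "Detected: " + ", ".join(labels) if labels else "Nothing detected."
        return "Sorry, I did not understand that request."
```

Because the interpreter and the frame source are both injected, the keyword matcher could be swapped for a call to a locally hosted model, and the stub for a real detector, without changing `ChatAgent`.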
