Multi-Agent Object Detection Framework Based on Raspberry Pi YOLO Detector and Slack-Ollama Natural Language Interface

Vladimir Kalušev; Branko Brkljač; Milan Brkljač

arXiv:2604.13345·cs.CV·April 16, 2026

TL;DR

This paper introduces a multi-agent object detection system on Raspberry Pi that integrates LLM-based natural language control with real-time vision, demonstrating a low-cost, resource-constrained AI platform.

Contribution

It presents a novel multi-agent framework combining local LLM interfaces with vision agents on Raspberry Pi, emphasizing rapid prototyping and resource-efficient design.

Findings

01

Successful integration of LLM chatbots and object detection on Raspberry Pi

02

Insights into limitations of low-cost hardware for multi-agent AI systems

03

Comparison with cloud-based solutions highlighting local processing advantages

Abstract

The paper presents the design and prototype implementation of an edge-based object detection system within the new paradigm of AI agent orchestration. It goes beyond traditional design approaches by leveraging an LLM-based natural language interface for system control and communication, and practically demonstrates the integration of all system components into a single resource-constrained hardware platform. The method is based on the proposed multi-agent object detection framework, which tightly integrates different AI agents within the same task of providing object detection and tracking capabilities. The proposed design principles highlight the fast prototyping approach that is characteristic of the transformational potential of generative AI systems, which are applied during both development and implementation stages. Instead of a specialized communication and control interface, the system is…
