From Skeletons to Semantics: Design and Deployment of a Hybrid Edge-Based Action Detection System for Public Safety

Ganen Sethupathy; Lalit Dumka; Jan Schagen

arXiv:2603.29777 · cs.CV · April 1, 2026

TL;DR

This paper develops a hybrid edge-based action detection system combining skeleton analysis and vision-language models to improve real-time public safety monitoring under resource constraints.

Contribution

It presents a system-level comparison of motion-based and semantic approaches, demonstrating a hybrid architecture's effectiveness on edge devices.

Findings

01 Skeleton-based processing offers low latency and privacy benefits.

02 Vision-language models enable contextual understanding and zero-shot reasoning.

03 The hybrid system balances speed and semantic depth for public safety applications.
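
The summary does not say how the two paths are combined. One plausible arrangement, assumed here purely for illustration, is a cascade: a cheap skeleton-based motion cue runs on every frame, and the costlier vision-language model is called only when that cue crosses a threshold. The sketch below is not the authors' implementation. `motion_score`, `run_cascade`, `Decision`, the `describe_scene` callback, and the threshold value are all hypothetical names and choices, and the mean joint displacement stands in for whatever skeleton-based classifier a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Keypoint = Tuple[float, float]   # (x, y) of one joint, in arbitrary units
Pose = Sequence[Keypoint]        # all joints of one person in one frame


def motion_score(prev: Pose, curr: Pose) -> float:
    """Mean per-joint displacement between two poses (hypothetical cue)."""
    if len(prev) != len(curr) or not curr:
        raise ValueError("poses must be non-empty and of equal length")
    total = 0.0
    for (x0, y0), (x1, y1) in zip(prev, curr):
        total += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    return total / len(curr)


@dataclass
class Decision:
    frame_index: int
    score: float
    escalated: bool
    description: Optional[str]   # filled only when the semantic path ran


def run_cascade(
    poses: Sequence[Pose],
    describe_scene: Callable[[int], str],
    threshold: float = 0.5,      # assumed value, would need tuning
) -> List[Decision]:
    """Score every frame cheaply; call the semantic model only on escalation."""
    decisions: List[Decision] = []
    for i in range(1, len(poses)):
        score = motion_score(poses[i - 1], poses[i])
        escalated = score > threshold
        # describe_scene is a placeholder for a vision-language model query.
        description = describe_scene(i) if escalated else None
        decisions.append(Decision(i, score, escalated, description))
    return decisions
```

Under this assumption the expensive path runs only on frames the skeleton path flags, which is one way a system could trade speed against semantic depth on constrained hardware.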

Abstract

Public spaces such as transport hubs, city centres, and event venues require timely and reliable detection of potentially violent behaviour to support public safety. While automated video analysis has made significant progress, practical deployment remains constrained by latency, privacy, and resource limitations, particularly under edge-computing conditions. This paper presents the design and demonstrator-based deployment of a hybrid edge-based action detection system that combines skeleton-based motion analysis with vision-language models for semantic scene interpretation. Skeleton-based processing enables continuous, privacy-aware monitoring with low computational overhead, while vision-language models provide contextual understanding and zero-shot reasoning capabilities for complex and previously unseen situations. Rather than proposing new recognition models, the contribution…
