# PMRVT: Parallel Attention Multilayer Perceptron Recurrent Vision Transformer for Object Detection with Event Cameras

**Authors:** Zishi Song, Jianming Wang, Yongxin Su, Yukuan Sun, Xiaojie Duan

PMC · DOI: 10.3390/s25216580 · Sensors (Basel, Switzerland) · 2025-10-25

## TL;DR

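This paper introduces PMRVT, a recurrent vision transformer for object detection with event cameras. Event cameras record brightness changes asynchronously at microsecond resolution, so they avoid the motion blur and high latency that affect conventional frame-based cameras in fast, dynamic scenes.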

## Contribution

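PMRVT is a unified framework for real-time event-based object detection that balances early-stage efficiency, spatial expressiveness, and long-horizon temporal consistency. It combines a hybrid hierarchical backbone, a Parallel Attention Feature Fusion (PAFF) mechanism with a coordinated dual-path design, and a temporal integration strategy.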

## Key findings

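- PMRVT achieves 48.7% mAP with 7.72 ms inference latency on the Gen1 dataset.
- On the 1 Mpx dataset it achieves 48.6% mAP with 19.94 ms inference latency.
- Compared with state-of-the-art methods, it improves accuracy by 1.5 percentage points (pp) and reduces latency by 8%.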

## Abstract

Object detection in high-speed and dynamic environments remains a core challenge in computer vision. Conventional frame-based cameras often suffer from motion blur and high latency, while event cameras capture brightness changes asynchronously with microsecond resolution, high dynamic range, and ultra-low latency, offering a promising alternative. Despite these advantages, existing event-based detection methods still suffer from high computational cost, limited temporal modeling, and unsatisfactory real-time performance. We present PMRVT (Parallel Attention Multilayer Perceptron Recurrent Vision Transformer), a unified framework that systematically balances early-stage efficiency, enriched spatial expressiveness, and long-horizon temporal consistency. This balance is achieved through a hybrid hierarchical backbone, a Parallel Attention Feature Fusion (PAFF) mechanism with coordinated dual-path design, and a temporal integration strategy, jointly ensuring strong accuracy and real-time performance. Extensive experiments on Gen1 and 1 Mpx datasets show that PMRVT achieves 48.7% and 48.6% mAP with inference latencies of 7.72 ms and 19.94 ms, respectively. Compared with state-of-the-art methods, PMRVT improves accuracy by 1.5 percentage points (pp) and reduces latency by 8%, striking a favorable balance between accuracy and speed and offering a reliable solution for real-time event-based vision applications.
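To put the reported latencies in more familiar terms, the short Python sketch below converts them into approximate throughput. It assumes one inference at a time with no batching or pipelining, which the abstract does not state, and the names `latency_ms_to_fps`, `REPORTED`, and `THROUGHPUT` are illustrative rather than taken from the paper.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-inference latency in milliseconds to inferences per second.

    Assumes sequential, single-sample inference (no batching or pipelining).
    """
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# (mAP in percent, latency in ms) as reported in the abstract.
REPORTED = {
    "Gen1": (48.7, 7.72),
    "1 Mpx": (48.6, 19.94),
}

# Approximate throughput per dataset, rounded to one decimal place.
THROUGHPUT = {
    name: round(latency_ms_to_fps(latency), 1)
    for name, (_map, latency) in REPORTED.items()
}

if __name__ == "__main__":
    for name, fps in THROUGHPUT.items():
        print(f"{name}: about {fps} inferences per second")
```

Under that assumption, the Gen1 latency corresponds to roughly 130 inferences per second and the 1 Mpx latency to roughly 50. Measured throughput depends on hardware and batching, so these figures are only a rough reading of the reported numbers.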

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12610684/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12610684/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12610684/full.md

---
Source: https://tomesphere.com/paper/PMC12610684