# Going beyond still images to improve input variance resilience in multi-stream vision understanding models

**Authors:** Amir Hosein Fadaei, Mohammad-Reza A. Dehaqani

PMC · DOI: 10.1038/s41598-024-66346-w · Scientific Reports · 2024-07-04

## TL;DR

This paper introduces a brain-inspired vision model trained with videos to improve resilience to input changes.

## Contribution

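The paper trains a brain-inspired, multi-stream vision-understanding model on videos rather than still images, so that it learns spatiotemporal features in addition to spatial ones, with the aim of making the model more resilient to changes in its input.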

## Key findings

- Models trained on videos show greater resilience to input media alterations.
- Incorporating temporal features improves robustness compared to static image training.
- Brain-inspired video training aligns with natural vision processing.
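
To make the distinction between spatial and temporal features in the findings above concrete, here is a minimal sketch. This summary does not say how the paper computes its temporal features, so the sketch assumes plain frame differencing as a stand-in. The function name `split_streams` and the toy clip are hypothetical and are not taken from the paper.

```python
import numpy as np


def split_streams(clip):
    """Split a grayscale clip of shape (T, H, W) into two illustrative inputs.

    Assumption: frame differencing stands in for "temporal features";
    the paper's actual method is not described in this summary.
    """
    clip = np.asarray(clip, dtype=np.float32)
    if clip.ndim != 3 or clip.shape[0] < 2:
        raise ValueError("expected an array of shape (T, H, W) with T >= 2")
    # Spatial input: a single still frame (here, the middle one).
    spatial = clip[clip.shape[0] // 2]
    # Temporal input: change between consecutive frames, shape (T-1, H, W).
    temporal = np.diff(clip, axis=0)
    return spatial, temporal


# Toy clip: one bright pixel moves one column to the right per frame.
clip = np.zeros((3, 4, 4), dtype=np.float32)
for t in range(3):
    clip[t, 1, t] = 1.0

spatial, temporal = split_streams(clip)

# A clip with no motion carries no temporal signal under this definition.
static_clip = np.ones((3, 4, 4), dtype=np.float32)
_, static_temporal = split_streams(static_clip)
```

A still image supplies only the `spatial` array, whereas a video also supplies the `temporal` one. Under this simplified definition, the temporal signal is non-zero only where something moves.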

## Abstract

Traditionally, vision models have predominantly relied on spatial features extracted from static images, deviating from the continuous stream of spatiotemporal features processed by the brain in natural vision. While numerous video-understanding models have emerged, incorporating videos into image-understanding models with spatiotemporal features has been limited. Drawing inspiration from natural vision, which exhibits remarkable resilience to input changes, our research focuses on the development of a brain-inspired model for vision understanding trained with videos. Our findings demonstrate that models that are trained on videos instead of still images and that include temporal features become more resilient to various alterations of the input media.

## Full-text entities

- **Diseases:** myopia (MESH:D009216)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11224316/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11224316/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC11224316/full.md

---
Source: https://tomesphere.com/paper/PMC11224316