# Brainsourcing for temporal visual attention estimation

**Authors:** Yoelvis Moreno-Alcayde, Tuukka Ruotsalo, Luis A. Leiva, V. Javier Traver

PMC · DOI: 10.1007/s13534-024-00449-1 · 2025-01-11

## TL;DR

This paper explores how brain signals can reveal when people pay attention to different parts of a video over time, using EEG data from multiple observers.

## Contribution

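The study proposes an EEG-only estimate of temporal visual attention, computed as a consistency score across multiple observers rather than from gaze data or video content, and validates it against a temporal salience score derived from per-frame salience maps.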

## Key findings

- EEG-based inter-subject consistency correlates with temporal visual attention in videos.
- Effect sizes (Cohen's d, relative to a random baseline) ranged from very small to large in one dataset and from medium to very large in the other.
- Temporal brain signals can reveal attention patterns useful for human-computer interaction and medical applications.
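
To make the effect-size finding concrete, here is a minimal sketch of Cohen's d computed between a set of observed correlations and a random baseline. The pooled-standard-deviation formula is the standard one. The baseline construction (shuffling the brain score in time) and the function names `cohens_d` and `shuffled_baseline` are assumptions for illustration, since this summary does not say how the paper builds its baseline.

```python
import numpy as np


def cohens_d(x, y):
    """Cohen's d between two samples, using the pooled standard deviation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return float((x.mean() - y.mean()) / np.sqrt(pooled_var))


def shuffled_baseline(visual, brain, n_perm=1000, seed=0):
    """Correlations obtained after randomly permuting the brain score in time.

    Assumption: a permutation baseline; the paper's random baseline may be
    constructed differently.
    """
    rng = np.random.default_rng(seed)
    visual = np.asarray(visual, dtype=float)
    brain = np.asarray(brain, dtype=float)
    return np.array(
        [np.corrcoef(visual, rng.permutation(brain))[0, 1] for _ in range(n_perm)]
    )
```

In use, `cohens_d(observed, baseline)` would compare the correlations measured on the real data (for example, one per video) with those from `shuffled_baseline`; a larger d means the observed correlations sit further from chance.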

## Abstract

The concept of temporal visual attention in dynamic contents, such as videos, has been much less studied than its spatial counterpart, i.e., visual salience. Yet, temporal visual attention is useful for many downstream tasks, such as video compression and summarisation, or monitoring users’ engagement with visual information. Previous work has considered quantifying a temporal salience score from spatio-temporal user agreements from gaze data. Instead of gaze-based or content-based approaches, we explore to what extent only brain signals can reveal temporal visual attention. We propose methods for (1) computing a temporal visual salience score from salience maps of video frames; (2) quantifying the temporal brain salience score as a cognitive consistency score from the brain signals from multiple observers; and (3) assessing the correlation between both temporal salience scores, and computing its relevance. Two public EEG datasets (DEAP and MAHNOB) are used for experimental validation. Relevant correlations between temporal visual attention and EEG-based inter-subject consistency were found, as compared with a random baseline. In particular, effect sizes, measured with Cohen’s d, ranged from very small to large in one dataset, and from medium to very large in another dataset. Brain consistency among subjects watching videos unveils temporal visual attention cues. This has relevant practical implications for analysing attention for visual design in human-computer interaction, in the medical domain, and in brain-computer interfaces at large.
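
The three steps listed in the abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the per-frame summary (peak-to-mean ratio of each salience map), the consistency measure (mean pairwise Pearson correlation across subjects in non-overlapping windows), and all function names are hypothetical choices made here.

```python
import numpy as np


def temporal_visual_salience(salience_maps):
    """Step 1: one score per frame from salience maps of shape (T, H, W).

    Assumption: peak-to-mean ratio as a stand-in summary of how concentrated
    each map is; the paper's definition may differ.
    """
    maps = np.asarray(salience_maps, dtype=float)
    flat = maps.reshape(maps.shape[0], -1)
    return flat.max(axis=1) / (flat.mean(axis=1) + 1e-12)


def brain_consistency(eeg, win):
    """Step 2: inter-subject consistency per non-overlapping window.

    eeg has shape (subjects, samples), one signal or feature per subject.
    Assumption: consistency is the mean pairwise Pearson correlation.
    """
    eeg = np.asarray(eeg, dtype=float)
    n_subj, n_samp = eeg.shape
    n_win = n_samp // win
    upper = np.triu_indices(n_subj, k=1)  # each subject pair counted once
    scores = np.empty(n_win)
    for w in range(n_win):
        segment = eeg[:, w * win:(w + 1) * win]
        scores[w] = np.corrcoef(segment)[upper].mean()
    return scores


def score_correlation(visual_score, brain_score):
    """Step 3: Pearson correlation between the two temporal scores.

    Both inputs must already be aligned to the same temporal resolution.
    """
    return float(np.corrcoef(visual_score, brain_score)[0, 1])
```

The two scores have to be resampled to a common time base before step 3, for instance by averaging the per-frame visual score within each EEG window.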

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11871278/full.md

---
Source: https://tomesphere.com/paper/PMC11871278