TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos

Leonardo Plini; Luca Scofano; Edoardo De Matteis; Guido Maria D'Amely di Melendugno; Alessandro Flaborea; Andrea Sanchietti; Giovanni Maria Farinella; Fabio Galasso; Antonino Furnari

arXiv:2411.02570 · cs.CV · January 6, 2026

TL;DR

This paper introduces TI-PREGO, a dual-branch system that uses LLMs to detect procedural mistakes online in egocentric videos. It combines step recognition with future step anticipation and flags a mistake when the two disagree, which lets it handle open-set errors without prior examples of failure.

Contribution

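The paper proposes a dual-branch architecture that uses LLMs for online mistake detection in egocentric videos: one branch recognizes the current step, and the other anticipates future steps from the recognition output. The design targets open-set mistake detection in procedural tasks.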

Findings

01. Effective detection of procedural mistakes in real time

02. Robustness demonstrated across multiple datasets

03. Outperforms existing state-of-the-art models

Abstract

Identifying procedural errors online from egocentric videos is a critical yet challenging task across various domains, including manufacturing, healthcare, and skill-based training. The nature of such mistakes is inherently open-set, as unforeseen or novel errors may occur, necessitating robust detection systems that do not rely on prior examples of failure. Currently, however, no technique effectively detects open-set procedural mistakes online. We propose a dual branch architecture to address this problem in an online fashion: one branch continuously performs step recognition from the input egocentric video, while the other anticipates future steps based on the recognition module's output. Mistakes are detected as mismatches between the currently recognized action and the action predicted by the anticipation module. The recognition branch takes input frames, predicts the current…
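
The mismatch rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `detect_mistakes`, `Anticipator`, and `toy_anticipator` are hypothetical names, the step labels are invented, and the anticipation branch is replaced by a toy function that follows a fixed reference procedure. The sketch also assumes the first step is never checked, since there is no history to anticipate from.

```python
from typing import Callable, List, Sequence

# Hypothetical type: maps the history of recognized steps to the expected next step.
# In the paper this role is played by the anticipation branch; here it is any callable.
Anticipator = Callable[[Sequence[str]], str]


def detect_mistakes(recognized: Sequence[str], anticipate: Anticipator) -> List[int]:
    """Return the indices at which the recognized step differs from the anticipated one."""
    mistakes: List[int] = []
    history: List[str] = []
    for i, step in enumerate(recognized):
        if history:  # assumption: the first step has no history, so it is not checked
            expected = anticipate(history)
            if step != expected:
                mistakes.append(i)
        history.append(step)
    return mistakes


def toy_anticipator(history: Sequence[str]) -> str:
    """Toy stand-in for the anticipation branch: follows a fixed, invented procedure."""
    procedure = ["take bolt", "insert bolt", "tighten bolt", "attach wheel"]
    return procedure[min(len(history), len(procedure) - 1)]


# Simulated output of the recognition branch, one label per completed step.
stream = ["take bolt", "insert bolt", "attach wheel", "attach wheel"]
flagged = detect_mistakes(stream, toy_anticipator)
```

Because each step is compared against a prediction made only from the steps seen so far, the check can run as the video streams in, and it needs no examples of failure: any step that departs from the anticipated one is flagged, whatever the error is.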

Code & Models

Repositories

aleflabo/prego (PyTorch, official)
