Parameter-Efficient Multi-View Proficiency Estimation: From Discriminative Classification to Generative Feedback

Edoardo Bianchi; Antonio Liotta

arXiv:2605.03848 · cs.CV · May 6, 2026


TL;DR

This paper discusses three methods for multi-view proficiency estimation on Ego-Exo4D that target accuracy, parameter efficiency, and interpretability, with applications in coaching and rehabilitation.

Contribution

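It presents SkillFormer (a parameter-efficient discriminative architecture for selective multi-view fusion), PATS (a temporal sampling strategy), and ProfVLM (a generative model that outputs a proficiency label together with expert-style feedback), reaching state-of-the-art proficiency estimation with fewer trainable parameters.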

Findings

01. Achieved state-of-the-art accuracy on the Ego-Exo4D dataset.

02. Reduced trainable parameters by up to 20x.

03. Generated interpretable feedback alongside proficiency labels.

Abstract

Estimating how well a person performs an action, rather than which action is performed, is central to coaching, rehabilitation, and talent identification. This task is challenging because proficiency is encoded in subtle differences in timing, balance, body mechanics, and execution, often distributed across multiple views and short temporal events. We discuss three recent contributions to multi-view proficiency estimation on Ego-Exo4D. SkillFormer introduces a parameter-efficient discriminative architecture for selective multi-view fusion; PATS improves temporal sampling by preserving locally dense excerpts of fundamental movements; and ProfVLM reformulates proficiency estimation as conditional language generation, producing both a proficiency label and expert-style feedback through a gated cross-view projector and a compact language backbone. Together, these methods achieve…
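
The abstract says that PATS preserves "locally dense excerpts" but does not describe how they are chosen. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below contrasts uniform frame sampling with sampling a few contiguous runs of frames. The function names, the equal-length windows, and the centring rule are all assumptions made for this example.

```python
from typing import List


def uniform_indices(num_frames: int, num_samples: int) -> List[int]:
    """Evenly spaced frame indices across the whole video."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    step = num_frames / num_samples
    # Take the frame nearest the middle of each equal-length slice.
    return [min(num_frames - 1, int(step * i + step / 2)) for i in range(num_samples)]


def dense_segment_indices(
    num_frames: int, num_segments: int, frames_per_segment: int
) -> List[int]:
    """Contiguous runs of frames, one run centred in each equal-length window.

    Hypothetical stand-in for excerpt-preserving sampling: each run keeps
    adjacent frames, so short, fast movements are not skipped over.
    """
    if num_frames <= 0 or num_segments <= 0 or frames_per_segment <= 0:
        return []
    window = num_frames / num_segments
    indices: List[int] = []
    for s in range(num_segments):
        centre = int(window * s + window / 2)
        start = centre - frames_per_segment // 2
        # Clamp the run so it stays inside the video.
        start = max(0, min(start, num_frames - frames_per_segment))
        for offset in range(frames_per_segment):
            idx = start + offset
            if idx < num_frames:
                indices.append(idx)
    return indices


if __name__ == "__main__":
    # Same budget of 8 frames from a 100-frame clip, two different layouts.
    print(uniform_indices(100, 8))
    print(dense_segment_indices(100, 2, 4))
```

With the same frame budget, the uniform layout spreads single frames across the clip, while the segment layout spends the budget on a few short stretches at full temporal resolution. Runs can overlap when the windows are shorter than the run length, so a real pipeline would likely deduplicate or resize the windows.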
