Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study
Tamim Ahmed, Thanassis Rikakis

TL;DR
This paper presents an automated system for scoring the Action Research Arm Test (ARAT) in stroke rehabilitation. It combines multimodal video analysis, multi-view fusion, and hierarchical Bayesian models, and was reviewed by five clinicians for accuracy and usability.
Contribution
The paper introduces an approach to automated ARAT scoring that combines multimodal video models, multi-view fusion, and Hierarchical Bayesian Models, with the aim of improving interpretability and clinical relevance.
Findings
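The framework achieves 89.0% validation accuracy with late fusion.
Movement quality components inferred by the Hierarchical Bayesian Models closely match manual assessments.
Five clinicians reviewed 500 system-generated video ratings and gave feedback on accuracy and usability.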
Abstract
Manual scoring of the Action Research Arm Test (ARAT) for upper extremity assessment in stroke rehabilitation is time-intensive and variable. We propose an automated ARAT scoring system integrating multimodal video analysis with SlowFast, I3D, and Transformer-based models using OpenPose keypoints and object locations. Our approach employs multi-view data (ipsilateral, contralateral, and top perspectives), applying early and late fusion to combine features across views and models. Hierarchical Bayesian Models (HBMs) infer movement quality components, enhancing interpretability. A clinician dashboard displays task scores, execution times, and quality assessments. We conducted a study with five clinicians who reviewed 500 video ratings generated by our system, providing feedback on its accuracy and usability. Evaluated on a stroke rehabilitation dataset, our framework achieves 89.0%…
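The abstract does not say how early and late fusion are computed, so the following is only a minimal sketch of the general idea, under stated assumptions. Early fusion is shown as concatenating per-view feature vectors before a single classifier, and late fusion as a weighted average of per-view class probabilities. The function names, the equal default weights, and the toy logits are all hypothetical and not taken from the paper. The only domain fact used is that ARAT items are rated on a 0 to 3 scale.

```python
import numpy as np

# The three camera views named in the abstract.
VIEWS = ("ipsilateral", "contralateral", "top")


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def early_fusion(features_by_view):
    """Concatenate per-view feature vectors into one vector.

    Assumption: a single downstream classifier would consume this vector.
    """
    return np.concatenate([features_by_view[v] for v in VIEWS], axis=-1)


def late_fusion(logits_by_view, weights=None):
    """Average per-view class probabilities and pick the most likely score.

    Assumption: each view has its own model emitting logits over the four
    ARAT item scores (0-3). Equal weights are used unless others are given.
    """
    probs = np.stack(
        [softmax(np.asarray(logits_by_view[v], dtype=float)) for v in VIEWS]
    )
    if weights is None:
        weights = np.ones(len(VIEWS))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    fused = np.tensordot(weights, probs, axes=1)  # shape: (num_scores,)
    return int(np.argmax(fused)), fused


# Toy logits (made up for illustration): two views favor score 2,
# one view slightly favors score 1.
toy_logits = {
    "ipsilateral": [0.1, 0.2, 2.0, 0.3],
    "contralateral": [0.0, 1.5, 1.4, 0.2],
    "top": [0.2, 0.1, 1.8, 0.9],
}
score, fused_probs = late_fusion(toy_logits)
```

In this toy case the fused prediction follows the two views that agree, which is the usual motivation for late fusion when a single view is occluded or ambiguous. A learned weighting or a small fusion network could replace the plain average; which variant the paper uses is not stated in the text above.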
Taxonomy
Topics: Medical Image Segmentation Techniques
Methods: OpenPose
