Transforming Gait: Video-Based Spatiotemporal Gait Analysis
R. James Cotton, Emoonah McClerklin, Anthony Cimorelli, Ankit Patel, Tasos Karakostas

TL;DR
This paper presents a neural network that analyzes monocular videos to produce accurate, clinically meaningful gait parameters, enabling remote gait assessment with high precision.
Contribution
It introduces a neural network, trained on more than 7000 monocular videos from an instrumented gait analysis lab, that maps 3D joint trajectories and a person's height onto interpretable gait metrics, bridging the gap between lab-based and video-based gait analysis.
Findings
Accurately estimates gait cycle timing and joint kinematics from video.
Provides cycle-by-cycle gait parameters such as cadence and step length.
Achieves high correlation with traditional lab-based gait measurements.
Abstract
Human pose estimation from monocular video is a rapidly advancing field that offers great promise to human movement science and rehabilitation. This potential is tempered by the smaller body of work ensuring the outputs are clinically meaningful and properly calibrated. Gait analysis, typically performed in a dedicated lab, produces precise measurements including kinematics and step timing. Using over 7000 monocular videos from an instrumented gait analysis lab, we trained a neural network to map 3D joint trajectories and the height of individuals onto interpretable biomechanical outputs including gait cycle timing, sagittal plane joint kinematics, and spatiotemporal trajectories. This task-specific layer produces accurate estimates of the timing of foot contact and foot off events. After parsing the kinematic outputs into individual gait cycles, it also enables accurate cycle-by-cycle…
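The abstract describes parsing outputs into individual gait cycles once foot contact events are known. As a minimal sketch of that idea, and not the authors' implementation, the Python below assumes a list of contact times (in seconds) for one foot is already available. It treats each pair of consecutive same-foot contacts as one gait cycle and derives cadence under the standard assumption of two steps per cycle. The function names `parse_gait_cycles` and `cadence_per_cycle` are hypothetical.

```python
from typing import List, Tuple


def parse_gait_cycles(contact_times: List[float]) -> List[Tuple[float, float]]:
    """Pair consecutive same-foot contact times (seconds) into (start, end) cycles."""
    ordered = sorted(contact_times)
    return list(zip(ordered[:-1], ordered[1:]))


def cadence_per_cycle(cycles: List[Tuple[float, float]]) -> List[float]:
    """Cadence in steps per minute for each cycle.

    Assumes one gait cycle (same-foot contact to contact) contains two steps,
    so cadence = 2 steps / cycle duration in seconds * 60 s/min.
    Cycles with non-positive duration are skipped.
    """
    return [120.0 / (end - start) for start, end in cycles if end > start]


# Hypothetical contact times for one foot, not data from the paper.
contacts = [0.0, 1.0, 2.0, 3.25]
cycles = parse_gait_cycles(contacts)
print(cadence_per_cycle(cycles))  # [120.0, 120.0, 96.0]
```

Reporting one value per cycle, rather than a single average, matches the cycle-by-cycle framing of the abstract and keeps stride-to-stride variability visible.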
Taxonomy
Topics: Diabetic Foot Ulcer Assessment and Management · Gait Recognition and Analysis · Human Pose and Action Recognition
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
