GaitSnippet: Gait Recognition Beyond Unordered Sets and Ordered Sequences
Saihui Hou, Chenye Wang, Wenpeng Lang, Zhengxiang Lan, Yongzhen Huang

TL;DR
This paper introduces GaitSnippet, a gait recognition approach that models gait as a composition of snippets to capture multi-scale temporal context, outperforming existing set-based and sequence-based methods.
Contribution
The paper proposes a new snippet-based gait recognition framework that effectively captures multi-scale temporal information, addressing limitations of prior set and sequence approaches.
Findings
Achieves 77.5% rank-1 accuracy on the Gait3D dataset.
Achieves 81.7% rank-1 accuracy on the GREW dataset.
Demonstrates effectiveness across four widely-used gait datasets.
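For readers unfamiliar with the metric, rank-1 accuracy is the fraction of probe samples whose nearest gallery sample has the same identity. The sketch below is illustrative only: the function name `rank1_accuracy`, the plain-list feature format, and the use of squared Euclidean distance are assumptions, and the evaluation protocols of Gait3D and GREW may differ in detail.

```python
def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose closest gallery feature shares their identity.

    Hypothetical helper: features are plain lists of floats, and closeness is
    squared Euclidean distance (an assumption, not the datasets' protocol).
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Index of the gallery entry closest to this probe.
        nearest = min(
            range(len(gallery_feats)),
            key=lambda j: sq_dist(feat, gallery_feats[j]),
        )
        if gallery_ids[nearest] == pid:
            hits += 1
    return hits / len(probe_feats)


# Toy usage: two gallery identities, three probes (the last one is mislabeled
# relative to its nearest neighbor, so it counts as a miss).
gallery = [[0.0, 0.0], [1.0, 1.0]]
gallery_labels = ["a", "b"]
probes = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]]
probe_labels = ["a", "b", "b"]
score = rank1_accuracy(probes, probe_labels, gallery, gallery_labels)
```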
Abstract
Recent advancements in gait recognition have significantly enhanced performance by treating silhouettes as either an unordered set or an ordered sequence. However, both set-based and sequence-based approaches exhibit notable limitations. Specifically, set-based methods tend to overlook short-range temporal context for individual frames, while sequence-based methods struggle to capture long-range temporal dependencies effectively. To address these challenges, we draw inspiration from human identification and propose a new perspective that conceptualizes human gait as a composition of individualized actions. Each action is represented by a series of frames, randomly selected from a continuous segment of the sequence, which we term a snippet. Fundamentally, the collection of snippets for a given sequence enables the incorporation of multi-scale temporal context, facilitating more…
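The snippet idea in the abstract (frames randomly selected from a continuous segment of the sequence) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_snippets`, the equal-length non-overlapping segments, the parameters `segment_len`, `frames_per_snippet`, and `num_snippets`, the fallback to sampling with replacement, and the sorting of frames within a snippet are all assumptions.

```python
import random


def sample_snippets(num_frames, segment_len, frames_per_snippet,
                    num_snippets, rng=None):
    """Return `num_snippets` snippets of frame indices.

    Each snippet holds `frames_per_snippet` indices drawn at random from one
    contiguous segment of a sequence with `num_frames` frames.
    All names and defaults here are hypothetical.
    """
    if num_frames <= 0 or segment_len <= 0:
        raise ValueError("num_frames and segment_len must be positive")
    rng = rng or random.Random()

    # Split the sequence into contiguous, non-overlapping segments
    # (assumption: the last segment may be shorter than the others).
    segments = [
        list(range(start, min(start + segment_len, num_frames)))
        for start in range(0, num_frames, segment_len)
    ]

    # Choose which segments to draw from; reuse segments if there are too few.
    if len(segments) >= num_snippets:
        chosen = rng.sample(segments, num_snippets)
    else:
        chosen = [rng.choice(segments) for _ in range(num_snippets)]

    snippets = []
    for seg in chosen:
        if len(seg) >= frames_per_snippet:
            idx = rng.sample(seg, frames_per_snippet)  # without replacement
        else:
            idx = [rng.choice(seg) for _ in range(frames_per_snippet)]
        # Assumption: keep the frames of a snippet in temporal order.
        snippets.append(sorted(idx))
    return snippets


# Example: a 100-frame sequence, 16-frame segments, 8 snippets of 4 frames.
snippets = sample_snippets(100, 16, 4, 8, rng=random.Random(0))
```

Under these assumptions, frames inside a snippet stay within a short window (short-range context), while the set of snippets spans the whole sequence (long-range context).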
Peer Reviews
Decision: ICLR 2026 Poster
1. The proposed method achieves improvements in both retrieval performance and inference speed. 2. This paper introduces a simple approach to periodic modeling in gait recognition, which only requires an average estimation of frame length to guide snippet sampling. This is practical for gait recognition. 3. Detailed experiments demonstrate the effectiveness of the sampling strategy and modeling design.
1. Regarding the framework design, the approach appears incremental and lacks clear differentiation from previous methods. The snippet sampling essentially extends uniform sampling, while the modeling constrains the receptive field through various blocks. 2. Unlike TSN, which targets general video understanding, gait data exhibits inherent periodicity. Is there any analysis examining how this characteristic influences the snippet design? 3. Is the 32-frame setting standard in gait recognition ta
S1) The paper introduces a seemingly new paradigm that redefines how temporal context is encoded in gait recognition. By treating gait as a composition of snippets rather than full sequences or unordered sets, the method unifies short- and long-range temporal modeling. S2) The pipeline is coherently structured—sampling, modeling, and supervision are all systematically justified and experimentally validated. S3) The experiments are extensive, spanning multiple datasets and including ablation s
O1) While snippets are conceptually appealing, the paper provides limited theoretical justification for why random snippets generalize better than full sequences or unordered sets. The argument about mimicking human perception (recognition from partial cycles) is plausible but not formalized or empirically isolated. It remains unclear whether the improvement stems from better regularization, temporal diversity, or implicit data augmentation. O2) Although the results are strong, the paper reads
+ The paper is well motivated and easy to follow. The introduced gait snippet is able to combine both sequential information and set-level unordered information for understanding. + The authors provide useful information in Table 3 on different sampling policies for set, sequence, and snippet. It is interesting to see that the snippet yields the best performance across three modalities. + The authors also compared the gait snippet on multiple datasets, which
- Some of the latest results are not included in the paper, e.g., in [1], the authors report 77.6/70.3 for R1 and mAP on Gait3D, while the gait snippet shows 77.5 and 69.4. The numbers on GREW are also 85.8/92.6 for R1 and R5 in [1], while they are 81.7/90.9 here. [1] introduces a skeleton map for gait recognition, which is a similar new representation that can be compared with the snippet, and frankly speaking, I think the snippet idea can also be used on the skeleton map. It would be interesting for autho
Taxonomy
Topics: Gait Recognition and Analysis · Balance, Gait, and Falls Prevention · Human Pose and Action Recognition
