Micro-Expression-Aware Avatar Fingerprinting via Inter-Frame Feature Differencing
Masoumeh Chapariniya, Jean-Marc Odobez, Volker Dellwo, Teodora Vuković

TL;DR
This paper introduces a novel avatar fingerprinting method that leverages inter-frame feature differencing on raw video frames with a micro-expression-aware backbone, enabling end-to-end optimization and improved identity verification.
Contribution
The proposed system eliminates external preprocessing, subtracts consecutive feature maps in the learned deep feature space of a micro-expression-aware backbone, and outperforms a landmark-based baseline on most cross-generator pairs when verifying who drives an avatar.
Findings
Achieves an AUC of 0.877 on the NVFAIR dataset.
Outperforms the landmark-based baseline on most cross-generator pairs.
Temporal motion features are the primary discriminative factor.
Abstract
Avatar fingerprinting, i.e., verifying who drives a synthetic talking-head video rather than whether it is real, is a critical safeguard for authorized use of face-reenactment technology. Existing methods rely on a fixed, non-differentiable landmark extraction stage that prevents the fingerprinting model from being optimized end-to-end from raw pixels. We propose a preprocessing-free system built on a micro-expression-aware backbone operating on raw video frames, with inter-frame feature differencing as the core design principle: consecutive feature maps are subtracted in the learned deep feature space, so that temporally stable appearance dimensions contribute zero to the output while driver-specific motion dynamics are preserved. A controlled ablation on NVFAIR confirms that temporal motion accounts for the large majority of discriminative performance, and that raw appearance features…
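The differencing idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `inter_frame_feature_difference`, the `(T, C, H, W)` tensor layout, and the toy "appearance plus motion" decomposition are all assumptions made for illustration, and random arrays stand in for the backbone's feature maps. The sketch only shows the arithmetic property the abstract relies on: a component that is identical in every frame cancels when consecutive feature maps are subtracted, while frame-to-frame changes survive.

```python
import numpy as np


def inter_frame_feature_difference(features: np.ndarray) -> np.ndarray:
    """Subtract consecutive per-frame feature maps.

    features: array of shape (T, C, H, W), one feature map per frame
              (hypothetical layout, assumed for this sketch).
    returns:  array of shape (T - 1, C, H, W) holding f[t + 1] - f[t].
    """
    if features.shape[0] < 2:
        raise ValueError("need at least two frames to form a difference")
    return features[1:] - features[:-1]


# Toy stand-in for backbone outputs (assumption: features split additively
# into a temporally stable part and a time-varying part).
rng = np.random.default_rng(0)
T, C, H, W = 5, 4, 3, 3
appearance = rng.normal(size=(1, C, H, W))   # identical in every frame
motion = rng.normal(size=(T, C, H, W))       # changes from frame to frame
features = appearance + motion               # broadcasts appearance over T

# Differences of the full features versus differences of the motion part alone.
diff = inter_frame_feature_difference(features)
motion_only_diff = inter_frame_feature_difference(motion)

# A purely static clip: every frame carries the same appearance features.
static_clip = np.broadcast_to(appearance, (T, C, H, W))
static_diff = inter_frame_feature_difference(static_clip)
```

Under the additive assumption above, `diff` matches `motion_only_diff` up to floating-point error and `static_diff` is all zeros, which is the sense in which temporally stable dimensions "contribute zero". Real backbone features need not decompose this cleanly, so the sketch should be read as intuition rather than as a description of the trained model.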
