VidFuncta: Towards Generalizable Neural Representations for Ultrasound Videos
Julia Wolleb, Florentin Bieder, Paul Friedrich, Hemant D. Tagare, Xenophon Papademetris

TL;DR
VidFuncta introduces a novel neural representation framework for ultrasound videos that captures temporal dynamics and dataset redundancies, improving reconstruction and downstream task performance across multiple medical applications.
Contribution
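VidFuncta extends Functa, a framework based on implicit neural representations (INRs), to variable-length ultrasound videos. It disentangles each video into static, video-specific features and dynamic, time-dependent features, with the aim of better generalization and efficiency.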
Findings
Outperforms 2D and 3D baselines in video reconstruction.
Enables direct operation on learned modulation vectors for downstream tasks.
Validated on three public ultrasound datasets for various clinical tasks.
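To illustrate what operating directly on modulation vectors could look like, the sketch below turns a variable-length sequence of per-frame vectors plus one static vector into a fixed-size feature that any standard classifier or regressor could consume. The pooling scheme (mean and standard deviation over time) and the function name `video_feature` are assumptions made for illustration; this summary does not say which downstream model the authors use.

```python
import numpy as np

def video_feature(static_vec, dynamic_seq):
    """Build a fixed-size feature from one video's learned vectors.

    static_vec:  (S,) video-specific vector.
    dynamic_seq: (T, D) per-frame modulation vectors; T may differ per video.
    Returns a (S + 2 * D,) vector. Pooling choice is a hypothetical example.
    """
    static_vec = np.asarray(static_vec, dtype=float)
    dynamic_seq = np.asarray(dynamic_seq, dtype=float)
    # Summarize the time axis so videos of any length map to the same size.
    pooled = [dynamic_seq.mean(axis=0), dynamic_seq.std(axis=0)]
    return np.concatenate([static_vec, *pooled])
```

Because the time axis is pooled away, videos with 3 frames and with 300 frames yield features of the same length. A sequence model over `dynamic_seq` would be an alternative that keeps the frame order.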
Abstract
Ultrasound is widely used in clinical care, yet standard deep learning methods often struggle with full video analysis due to non-standardized acquisition and operator bias. We offer a new perspective on ultrasound video analysis through implicit neural representations (INRs). We build on Functa, an INR framework in which each image is represented by a modulation vector that conditions a shared neural network. However, its extension to the temporal domain of medical videos remains unexplored. To address this gap, we propose VidFuncta, a novel framework that leverages Functa to encode variable-length ultrasound videos into compact, time-resolved representations. VidFuncta disentangles each video into a static video-specific vector and a sequence of time-dependent modulation vectors, capturing both temporal dynamics and dataset-level redundancies. Our method outperforms 2D and 3D…
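The idea of a shared network conditioned by a static vector and per-frame modulation vectors can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the sine activations and additive shift modulations follow the general Functa recipe, while the layer sizes, the concatenation of the static and time-dependent vectors, the frequency scale `w0`, and all function names are assumptions. Fitting the vectors to real frames is omitted.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
HIDDEN, STATIC_DIM, DYNAMIC_DIM, N_LAYERS = 32, 16, 8, 3

def init_shared_network(seed=0):
    """Create weights that are shared across all videos and frames."""
    rng = np.random.default_rng(seed)
    dims = [2] + [HIDDEN] * N_LAYERS  # input is an (x, y) pixel coordinate
    layers = [(rng.normal(0, 1 / np.sqrt(i), (i, o)), np.zeros(o))
              for i, o in zip(dims[:-1], dims[1:])]
    # Linear map from the latent to one shift per hidden unit per layer.
    to_shifts = rng.normal(0, 0.1, (STATIC_DIM + DYNAMIC_DIM, N_LAYERS * HIDDEN))
    head = rng.normal(0, 1 / np.sqrt(HIDDEN), (HIDDEN, 1))
    return layers, to_shifts, head

def render_frame(params, static_vec, dynamic_vec, coords, w0=30.0):
    """Predict intensities at coords (P, 2) for one frame."""
    layers, to_shifts, head = params
    # Assumption: static and time-dependent vectors are simply concatenated.
    latent = np.concatenate([static_vec, dynamic_vec])
    shifts = (latent @ to_shifts).reshape(N_LAYERS, HIDDEN)
    h = coords
    for (W, b), s in zip(layers, shifts):
        h = np.sin(w0 * (h @ W + b + s))  # shift-modulated sine layer
    return h @ head  # (P, 1)

def render_video(params, static_vec, dynamic_seq, coords):
    """Render T frames: one static vector, T time-dependent vectors."""
    return np.stack([render_frame(params, static_vec, d, coords)
                     for d in dynamic_seq])
```

In this sketch a video of any length T is stored as one `(STATIC_DIM,)` vector plus a `(T, DYNAMIC_DIM)` array, while the network weights are shared across the whole dataset.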
