Deep kernel video approximation for unsupervised action segmentation

Silvia L. Pintea; Jouke Dijkstra

arXiv:2604.21572·cs.CV·April 24, 2026


TL;DR

This paper introduces a novel unsupervised video action segmentation method using deep kernel space approximation with neural tangent kernels and maximum mean discrepancy, achieving competitive results on standard benchmarks.

Contribution

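The paper proposes learning in deep kernel space for unsupervised action segmentation, using NTKs to define the kernel in which MMD operates. The authors describe MMD as giving more reliable estimates, and as easier and faster to optimize, than the commonly used optimal transport metric.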

Findings

01

Achieves competitive results on six standard benchmarks.

02

Outperforms prior agglomerative methods when segment count is unknown.

03

Uses MMD with NTKs for more reliable and faster distribution approximation.

Abstract

This work focuses on per-video unsupervised action segmentation, which is of interest to applications where storing large datasets is either not possible or not permitted. We propose to segment videos by learning in deep kernel space to approximate the underlying frame distribution as closely as possible. To define this closeness between the original video distribution and its approximation, we rely on maximum mean discrepancy (MMD), which is a geometry-preserving metric in distribution space and thus gives more reliable estimates. Moreover, unlike the commonly used optimal transport metric, MMD is both easier to optimize and faster. We choose neural tangent kernels (NTKs) to define the kernel space where MMD operates because of their improved descriptive power compared with fixed kernels, and also because NTKs sidestep the trivial solution, when jointly learning…
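
To make the MMD idea in the abstract concrete, here is a minimal sketch of the standard biased estimate of squared MMD between two sample sets. It is not the authors' method: a fixed Gaussian (RBF) kernel is swapped in for the NTK, random 2-D points stand in for frame features, and every name (`rbf_kernel`, `mean_kernel`, `mmd_squared`, `bandwidth`) is hypothetical.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two feature vectors (assumed stand-in for an NTK)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two sample sets."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


def sample_gaussian(rng, mean, count):
    """Draw `count` 2-D points with unit variance around `mean` (toy frame features)."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(count)]


rng = random.Random(0)
frames = sample_gaussian(rng, 0.0, 60)   # toy "video" frame features
close = sample_gaussian(rng, 0.0, 60)    # approximation from the same distribution
far = sample_gaussian(rng, 3.0, 60)      # approximation from a shifted distribution

mmd_same = mmd_squared(frames, frames)
mmd_close = mmd_squared(frames, close)
mmd_far = mmd_squared(frames, far)
```

In this toy setting the estimate is zero when a sample set is compared with itself, and it is larger for the shifted set than for the set drawn from the same distribution, which is the sense in which MMD measures how closely an approximation matches the original frame distribution.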
