SERFN: Sample-Efficient Real-World Dexterous Policy Fine-Tuning via Action-Chunked Critics and Normalizing Flows

Chenyu Yang; Denis Tarasov; Davide Liconti; Hehui Zheng; Robert K. Katzschmann

arXiv:2602.09580·cs.RO·April 7, 2026

TL;DR

SERFN is a sample-efficient, likelihood-based fine-tuning framework for dexterous manipulation. It pairs a normalizing flow policy, which models multimodal action chunks, with action-chunked critics to fine-tune policies on real robots.

Contribution

The paper presents an off-policy fine-tuning method that combines a normalizing flow policy with action-chunked critics, enabling stable, sample-efficient adaptation of dexterous policies on real robots.

Findings

01. SERFN achieves stable, sample-efficient fine-tuning on real robotic tasks.

02. Normalizing flows enable exact likelihood computation for multimodal action chunks.

03. Chunked critics improve long-horizon credit assignment in dexterous manipulation.
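
To make the third finding concrete, here is a minimal sketch of a bootstrapped target for a critic that scores a whole action chunk rather than a single step. It is a generic H-step return and an assumption for illustration, not necessarily the exact target used in the paper; the function name `chunk_td_target` and its parameters are hypothetical.

```python
import numpy as np

def chunk_td_target(rewards, next_q, gamma=0.99, done=False):
    """H-step target for a critic Q(s_t, a_{t:t+H}) over one executed chunk.

    rewards: the H rewards collected while the chunk was executed.
    next_q:  critic estimate at the state reached after the chunk
             (assumed to come from a target network, not shown here).
    """
    rewards = np.asarray(rewards, dtype=float)
    horizon = len(rewards)
    # Discount each reward by its offset inside the chunk.
    discounts = gamma ** np.arange(horizon)
    chunk_return = float(np.dot(discounts, rewards))
    # Bootstrap once, H steps ahead, unless the episode ended in the chunk.
    bootstrap = 0.0 if done else (gamma ** horizon) * next_q
    return chunk_return + bootstrap

# Usage: a 3-step chunk with unit rewards.
target = chunk_td_target([1.0, 1.0, 1.0], next_q=10.0, gamma=0.5)
```

Because the bootstrap happens once per chunk instead of once per step, the target lines up with how the policy actually acts, which is the mismatch the abstract attributes to per-step critics.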

Abstract

Real-world fine-tuning of dexterous manipulation policies remains challenging due to limited real-world interaction budgets and highly multimodal action distributions. Diffusion-based policies, while expressive, do not permit conservative likelihood-based updates during fine-tuning because action probabilities are intractable. In contrast, conventional Gaussian policies collapse under multimodality, particularly when actions are executed in chunks, and standard per-step critics fail to align with chunked execution, leading to poor credit assignment. We present SERFN, a sample-efficient off-policy fine-tuning framework with a normalizing flow (NF) policy to address these challenges. The normalizing flow policy yields exact likelihoods for multimodal action chunks, allowing conservative, stable policy updates through likelihood regularization and thereby improving sample efficiency. An…
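
The claim that a normalizing flow yields exact likelihoods follows from the change-of-variables formula: the log-density of an action is the base log-density of its inverse image minus the log-determinant of the flow's Jacobian. The sketch below illustrates this with a single affine coupling layer over a flattened action chunk. The layer type, the toy linear conditioner, and all names (`conditioner`, `forward`, `log_prob`, `w_s`, `w_t`) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def conditioner(a1, w_s, w_t):
    """Toy conditioner: log-scale and shift for the second half of the vector."""
    s = np.tanh(a1 @ w_s)  # bounded log-scale
    t = a1 @ w_t           # shift
    return s, t

def forward(z, w_s, w_t):
    """Map base noise z to an action chunk a (sampling direction)."""
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    s, t = conditioner(z1, w_s, w_t)
    return np.concatenate([z1, z2 * np.exp(s) + t], axis=-1)

def log_prob(a, w_s, w_t):
    """Exact log-likelihood of a via the change-of-variables formula."""
    d = a.shape[-1] // 2
    a1, a2 = a[..., :d], a[..., d:]
    s, t = conditioner(a1, w_s, w_t)
    # Invert the coupling layer to recover the base sample.
    z = np.concatenate([a1, (a2 - t) * np.exp(-s)], axis=-1)
    # Standard normal base log-density.
    base = -0.5 * np.sum(z ** 2, axis=-1) - 0.5 * z.shape[-1] * np.log(2 * np.pi)
    # The Jacobian is triangular, so its log-determinant is sum(s).
    return base - np.sum(s, axis=-1)

# Usage: sample five 4-dimensional "chunks" and score them exactly.
rng = np.random.default_rng(0)
w_s = rng.normal(size=(2, 2))
w_t = rng.normal(size=(2, 2))
actions = forward(rng.normal(size=(5, 4)), w_s, w_t)
scores = log_prob(actions, w_s, w_t)
```

A practical flow would stack several such layers with learned neural conditioners and permutations between them; the likelihood stays exact because each layer contributes a closed-form log-determinant. This is what makes a likelihood-based regularizer computable, in contrast to the diffusion policies the abstract describes.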
