FineXtrol: Controllable Motion Generation via Fine-Grained Text

Keming Shen, Bizhu Wu, Junliang Chen, Xiaoqin Wang, and Linlin Shen

arXiv:2511.18927 · cs.CV · November 25, 2025

TL;DR

FineXtrol introduces a new framework for precise, controllable motion generation driven by detailed, temporally-aware text descriptions, overcoming limitations of previous methods in detail alignment and computational efficiency.

Contribution

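The paper presents FineXtrol, a control framework for motion generation guided by fine-grained, temporally-aware text describing specific body part movements over time. It is supported by a hierarchical contrastive learning module that encourages the text encoder to produce more discriminative embeddings.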

Findings

01. Achieves strong controllability in motion generation.

02. Demonstrates flexibility in directing body part movements.

03. Outperforms previous methods in quantitative metrics.

Abstract

Recent works have sought to enhance the controllability and precision of text-driven motion generation. Some approaches leverage large language models (LLMs) to produce more detailed texts, while others incorporate global 3D coordinate sequences as additional control signals. However, the former often introduces misaligned details and lacks explicit temporal cues, and the latter incurs significant computational cost when converting coordinates to standard motion representations. To address these issues, we propose FineXtrol, a novel control framework for efficient motion generation guided by temporally-aware, precise, user-friendly, and fine-grained textual control signals that describe specific body part movements over time. In support of this framework, we design a hierarchical contrastive learning module that encourages the text encoder to produce more discriminative embeddings for…
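
As a rough illustration of the contrastive idea mentioned in the abstract, the sketch below computes a generic single-level InfoNCE loss between paired text and motion embeddings. It is not the paper's hierarchical module, whose details are not given here; the function name `info_nce`, the `temperature` value, and the toy embeddings are all assumptions made for this example.

```python
import math


def info_nce(text_emb, motion_emb, temperature=0.07):
    """Generic InfoNCE loss (illustrative only, not FineXtrol's module).

    Row i of text_emb is treated as the positive match for row i of
    motion_emb; every other row in the batch acts as a negative.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    texts = [normalize(v) for v in text_emb]
    motions = [normalize(v) for v in motion_emb]

    losses = []
    for i, t in enumerate(texts):
        # Cosine similarity of text i to every motion, scaled by temperature.
        logits = [sum(a * b for a, b in zip(t, m)) / temperature for m in motions]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index as the target.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): two text/motion pairs in a 2-D space.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce(aligned, aligned)
loss_mismatched = info_nce(aligned, swapped)
```

With these toy inputs, the loss for correctly paired embeddings is far smaller than the loss for swapped pairs, which is the pressure that pushes a text encoder toward more discriminative embeddings.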

Taxonomy

Topics: Human Motion and Animation · Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis