DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision

Xiandong Zou; Ruihao Xia; Hongsong Wang; Pan Zhou

arXiv:2506.09814·cs.CV·March 23, 2026

Open Access · 3 Reviews

TL;DR

DreamCS introduces a novel framework for text-to-3D generation that leverages unpaired 3D preference data and a new reward model to produce geometrically accurate and human-aligned 3D assets.

Contribution

It develops the first large-scale unpaired 3D preference dataset and a reward model trained directly on this data, improving human preference alignment in 3D generation.

Findings

01

Outperforms prior methods in producing human-preferred 3D assets.

02

Effectively learns human-aligned 3D geometric preferences without paired comparisons.

03

Enhances both implicit and explicit 3D generation quality.

Abstract

While text-to-3D generation has attracted growing interest, existing methods often struggle to produce 3D assets that align well with human preferences. Current preference alignment techniques for 3D content typically rely on hard-to-collect, preference-paired multi-view 2D images to train 2D reward models, which then guide 3D generation -- leading to geometric artifacts due to their inherent 2D bias. To address these limitations, we construct 3D-MeshPref, the first large-scale unpaired 3D preference dataset, featuring diverse 3D meshes annotated by a large language model and refined by human evaluators. We then develop RewardCS, the first reward model trained directly on unpaired 3D-MeshPref data using a novel Cauchy-Schwarz divergence objective, enabling effective learning of human-aligned 3D geometric preferences without requiring paired comparisons. Building on this, we propose…
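To make the Cauchy-Schwarz (CS) divergence mentioned in the abstract concrete, the sketch below computes the standard kernel-based empirical estimate of the CS divergence between two unpaired sample sets. It is an illustration of the general quantity only, not the RewardCS training objective, which this page does not spell out. The function names (`gaussian_gram`, `cs_divergence`), the Gaussian kernel, the bandwidth `sigma`, and the random 4-dimensional "embeddings" standing in for preferred and dispreferred meshes are all assumptions made for the example.

```python
import numpy as np


def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def cs_divergence(x, y, sigma=1.0):
    """Empirical CS divergence between two sample sets.

    Uses D_CS = log(mean k(x, x')) + log(mean k(y, y')) - 2 log(mean k(x, y)),
    which is non-negative by the Cauchy-Schwarz inequality and needs no
    pairing between the samples in x and the samples in y.
    """
    kxx = gaussian_gram(x, x, sigma).mean()
    kyy = gaussian_gram(y, y, sigma).mean()
    kxy = gaussian_gram(x, y, sigma).mean()
    return np.log(kxx) + np.log(kyy) - 2.0 * np.log(kxy)


# Hypothetical stand-ins for embeddings of preferred / dispreferred meshes.
rng = np.random.default_rng(0)
preferred = rng.normal(0.0, 1.0, size=(200, 4))
dispreferred = rng.normal(3.0, 1.0, size=(200, 4))

same = cs_divergence(preferred, preferred)      # identical sets
apart = cs_divergence(preferred, dispreferred)  # well-separated sets
```

The estimate is zero (up to round-off) when the two sets coincide and grows as they separate, and the two sets may have different sizes. That property is what makes a divergence of this kind usable as a training signal when preferred and dispreferred examples are not matched into pairs.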

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

- The Cauchy-Schwarz divergence training approach for unpaired preference data offers a potentially generalizable framework applicable beyond 3D generation tasks.
- 3D-MeshPref provides a human-verified dataset of diverse unpaired 3D meshes, which may be a good community resource that helps reduce dependence on expensive paired annotations.
- The differentiable meshization pipeline enables end-to-end optimization with geometry-aware supervision, demonstrating technical soundness in integrating

Weaknesses

- The CS divergence receives extensive theoretical treatment (Appendix B) but lacks empirical validation. No ablations demonstrate performance degradation without this loss, no clustering baselines justify its necessity, and Table 4's λ variations don't compare against removing the term entirely. The mathematical formalism appears to add complexity without proven practical benefit. I ask the authors to further elaborate upon this point.
- Table 1 exposes a critical flaw: RewardCS underperforms R

Reviewer 02 · Rating 4 · Confidence 4

Strengths

* The paper correctly diagnoses a key failure mode of 2D preference signals in 3D generation and proposes a principled 3D reward to address it.
* The paper provides a formal mathematical justification, establishing the asymptotic equivalence between the proposed unpaired objective (using CS divergence) and traditional paired supervision, which instills high confidence in the method's soundness.
* Building 3D‑MeshPref at 30k+ meshes with human‑verified thresholds is a non‑trivial engineering con

Weaknesses

* A primary concern is the use of the "GA" (3D Geometry-Asset Alignment Reward) metric. The authors state (Section 4, Appendix F.1) that this metric is "based on RewardCS" and "derived from RewardCS." Using a variant of their own proposed model as a key evaluation metric creates a significant risk of "metric-method coupling," where the metric may be inherently biased to favor the architecture and training objective of the method being tested. This potential bias makes the "GA" scores in Table 1

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The proposed RewardCS model directly tackles the fundamental issue of 2D bias in existing text-to-3D preference alignment methods, which leads to geometric artifacts like the Janus problem. The 3D geometry-aware model also bypasses the need for hard-to-collect paired preference data.
2. The use of Cauchy-Schwarz divergence for unpaired preference learning is both effective in practice and supported by a solid theoretical proof of its equivalence to paired learning.
3. The method is shown to

Weaknesses

1. The entire DreamCS framework is built upon the increasingly dated Score Distillation Sampling (SDS) paradigm, which is slow, optimization-based, and prone to artifacts. The field is rapidly moving towards fast, feed-forward text-to-3D generators (e.g., Trellis). A more forward-looking and potentially more effective approach for preference alignment would be to directly fine-tune these feed-forward models using human preferences, rather than adding a complex reward guidance mechanism to a slow

Taxonomy

Topics: 3D Shape Modeling and Analysis · Human Motion and Animation · Interactive and Immersive Displays

Methods: ALIGN