Meltdown: Circuits and Bifurcations in Point-Cloud-Conditioned 3D Diffusion Transformers

Maximilian Plattner; Fabian Paischer; Johannes Brandstetter; Arturs Berzins

arXiv:2602.11130·cs.LG·May 19, 2026

TL;DR

This paper investigates a failure mode called Meltdown in 3D diffusion transformers conditioned on point clouds, revealing how small perturbations can cause catastrophic reconstruction failures and proposing PowerRemap as a mitigation.

Contribution

The study identifies the mechanism behind Meltdown failures, linking circuit-level attention issues to trajectory bifurcations, and introduces PowerRemap to effectively mitigate these failures.

Findings

1. Meltdown occurs in 89.9-100% of tested shapes across architectures.

2. PowerRemap rescues 98.3% of shapes on WaLa and 84.6% on Make-a-Shape.

3. Failure is linked to low-rank, directional perturbations in the diffusion process.

Abstract

Sparse point clouds are a common input modality for 3D surface reconstruction, including in safety-critical settings such as surgical navigation and autonomous perception. Recent point-cloud-conditioned 3D diffusion transformers achieve state-of-the-art results in this regime by leveraging learned priors. We show that these models can fail catastrophically under realistic input variation, and present a mechanistic case study of why. We identify a failure mode we call Meltdown: tiny on-surface perturbations to a sparse input point cloud can fracture the reconstructed output into hundreds of disconnected pieces. Adversarial search recovers Meltdown in 89.9-100% of shapes across the two open-weight state-of-the-art architectures we study (WaLa, Make-a-Shape) on real-world datasets (GSO, SimJEB) and under both DDPM and DDIM sampling. We trace Meltdown along the forward pass: it is governed…
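The fracturing into disconnected pieces described in the abstract can be quantified by counting the connected components of a reconstructed triangle mesh. The following is a minimal sketch, not the authors' evaluation code: the function name `count_components` and the face format (triples of vertex indices) are assumptions made for illustration. It uses union-find and treats two vertices as connected when they share a face.

```python
def count_components(num_vertices, faces):
    """Count connected components of a triangle mesh.

    faces: iterable of (a, b, c) vertex-index triples (assumed format).
    Vertices that appear in no face are ignored.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    used = set()
    for a, b, c in faces:
        # Merging a-b and b-c puts all three vertices in one set.
        union(a, b)
        union(b, c)
        used.update((a, b, c))

    return len({find(v) for v in used})


# Two triangles sharing an edge form one piece; a third, separate
# triangle adds a second piece.
joined = [(0, 1, 2), (1, 2, 3)]
split = joined + [(4, 5, 6)]
print(count_components(7, joined), count_components(7, split))
```

A count in the hundreds for an object that should be a single surface would correspond to the failure described above. A legitimately multi-part object also yields a count above one, so the raw number needs a per-shape baseline to be interpreted.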

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 2·Confidence 4

Strengths

1. The paper presents an interesting application of the existing activation patching method to identify geometry-related representations within 3D latent diffusion models.
2. The proposed meltdown phenomenon is novel and well characterized, although it remains unclear whether similar behavior would be observed on surface reconstruction datasets other than Google Scanned Objects (GSO).
3. The proposed PowerRemap intervention is simple yet effective, demonstrating strong recovery performance on WaLa

Weaknesses

1. The generalizability of this finding is very limited. The experiments focus on two models (WaLa and Make-a-Shape) and evaluate meltdown on only one dataset (Google Scanned Objects). It is unknown whether the meltdown phenomenon is unique to the GSO dataset, and whether the cross-attention head that controls meltdown can be found in latent 3D diffusion transformers other than WaLa and Make-a-Shape.
2. As shown in Tables 2 and 3 in Appendix B.3 (p. 21), the effectiveness of PowerRemap di

Reviewer 02·Rating 6·Confidence 1

Strengths

A clean activation-patching grid over depth×time pinpoints a single early cross-attention write controlling meltdown; the procedure and repair map are explicit. PowerRemap is model-agnostic, test-time only, and provably reduces spectral entropy without changing singular vectors. On GSO, meltdown occurs widely, and PowerRemap rescues 98.3% of WaLa failures.
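The claim that a remap of singular values can lower spectral entropy while leaving singular vectors untouched can be illustrated on the singular values alone. The sketch below rests on two assumptions that are not confirmed by this page: spectral entropy is taken as the Shannon entropy of the sum-normalized singular values, and the remap is taken as the hypothetical form σ_i → σ_max · (σ_i / σ_max)^γ. The paper's exact definitions may differ.

```python
import math


def spectral_entropy(singular_values):
    """Shannon entropy (in nats) of sum-normalized singular values (assumed definition)."""
    total = sum(singular_values)
    probs = [s / total for s in singular_values if s > 0]
    return -sum(p * math.log(p) for p in probs)


def power_remap(singular_values, gamma):
    """Hypothetical power remap: sigma -> sigma_max * (sigma / sigma_max) ** gamma.

    Only the singular values change; in a full implementation the singular
    vectors U and V would be reused as-is when rebuilding the matrix.
    """
    top = max(singular_values)
    return [top * (s / top) ** gamma for s in singular_values]


sigma = [4.0, 2.0, 1.0, 1.0]
before = spectral_entropy(sigma)
after_mild = spectral_entropy(power_remap(sigma, 2.0))
after_strong = spectral_entropy(power_remap(sigma, 100.0))
print(before, after_mild, after_strong)
```

For any γ > 1 and a non-uniform spectrum, the remapped distribution concentrates on the largest singular value, so the entropy falls. With a large exponent such as the γ=100 mentioned in the reviews, the spectrum becomes effectively rank one.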

Weaknesses

For Make-a-Shape, the reported rescue rate is only 10.1% with the same 𝛾, suggesting sensitivity to architecture and hyperparameters and limiting generality. Spectral entropy is the only diagnostic evaluated; there is no comparison to effective rank, top-k energy, condition number, per-head concentration, or Jacobian norms. “Connected components” may conflate legitimate multi-part objects with failures; precision/recall vs. human labels is not reported. 𝛾 selection is ad hoc (global 𝛾=100); the paper itself note
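Three of the alternative diagnostics named in this review (effective rank, top-k energy, condition number) can be computed from the same singular values as spectral entropy. The sketch below uses common textbook definitions, which are assumptions here rather than anything specified by the paper or the reviewer: effective rank as the exponential of the entropy of sum-normalized singular values, top-k energy as the share of squared singular values held by the k largest, and condition number as the ratio of the largest to the smallest singular value.

```python
import math


def effective_rank(singular_values):
    """exp of the Shannon entropy of sum-normalized singular values (one common definition)."""
    total = sum(singular_values)
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


def top_k_energy(singular_values, k):
    """Fraction of total squared singular-value mass held by the k largest values."""
    squares = sorted((s * s for s in singular_values), reverse=True)
    return sum(squares[:k]) / sum(squares)


def condition_number(singular_values):
    """Ratio of largest to smallest singular value; infinite if the smallest is zero."""
    largest, smallest = max(singular_values), min(singular_values)
    return math.inf if smallest == 0 else largest / smallest


sigma = [4.0, 2.0, 1.0, 1.0]
print(effective_rank(sigma), top_k_energy(sigma, 1), condition_number(sigma))
```

Effective rank is a monotone function of spectral entropy under these definitions, so it would carry the same information on a different scale; top-k energy and condition number respond to different aspects of the spectrum.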

Reviewer 03·Rating 6·Confidence 3

Strengths

1. The meltdown phenomenon is a common issue in 3D diffusion models for shape completion and worth investigating.
2. The finding that a single cross-attention module is primarily responsible for the observed failure is particularly interesting and provides useful insight into the model’s internal behavior.
3. The discussion on diffusion dynamics is interesting and contributes to a better conceptual understanding of diffusion behavior.

Weaknesses

1. The experiments are insufficient. The observed meltdown failure is likely to depend strongly on the density of the input point cloud, yet this factor is neither analyzed nor explicitly specified in the experiments. In addition, all experiments are conducted solely on the GSO dataset, which limits the generality of the conclusions. Including results on at least one additional dataset would significantly strengthen the empirical support for the proposed theory.
2. In Fig. 3, the trend of conne

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing · 3D Shape Modeling and Analysis