From Transthoracic to Transesophageal: Cross-Modality Generation using LoRA Diffusion

Emmanuel Oladokun; Yuxuan Ou; Anna Novikova; Daria Kulikova; Sarina Thomas; Jurica Šprem; Vicente Grau

arXiv:2508.13077·eess.IV·August 19, 2025

Open Access

TL;DR

This paper presents a method to adapt a transthoracic echo diffusion model to transesophageal echo with minimal data and parameters, enabling high-quality image synthesis and improved segmentation performance.

Contribution

The authors introduce a novel adaptation pipeline combining Low-Rank Adaptation and MaskR$^2$ for cross-modality diffusion model transfer with minimal parameter updates.
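As a rough illustration of the Low-Rank Adaptation idea named above, the sketch below wraps a frozen weight matrix with a trainable low-rank update. It is not the authors' implementation: the class name `LoRALinear`, the layer size (512 x 512), the rank (4), and the initialisation scheme are all assumptions chosen for illustration, and NumPy stands in for a real training framework.

```python
import numpy as np


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                               # frozen base weight
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in))  # trainable
        self.B = np.zeros((d_out, rank))                   # trainable, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # x: (batch, d_in) -> (batch, d_out)
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def num_trainable(self):
        # Only A and B would receive gradients; W stays fixed.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(512, 512))      # assumed layer size
layer = LoRALinear(W, rank=4)        # assumed rank
x = rng.normal(size=(2, 512))

base_out = x @ W.T
lora_out = layer(x)
print(layer.num_trainable(), "trainable vs", W.size, "frozen parameters")
```

Because `B` starts at zero, the adapted layer reproduces the base layer exactly before any training, and the trainable parameter count grows as `rank * (d_in + d_out)` rather than `d_in * d_out`.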

Findings

01. High-fidelity TEE synthesis with limited data and small adapters

02. Effective transformation of mask formats without harming downstream tasks

03. Synthetic data augmentation improves segmentation of underrepresented structures

Abstract

Deep diffusion models excel at realistic image synthesis but demand large training sets, an obstacle in data-scarce domains like transesophageal echocardiography (TEE). While synthetic augmentation has boosted performance in transthoracic echo (TTE), TEE remains critically underrepresented, limiting the reach of deep learning in this high-impact modality. We address this gap by adapting a TTE-trained, mask-conditioned diffusion backbone to TEE with only a limited number of new cases and adapters as small as $10^{5}$ parameters. Our pipeline combines Low-Rank Adaptation with MaskR$^{2}$, a lightweight remapping layer that aligns novel mask formats with the pretrained model's conditioning channels. This design lets users adapt models to new datasets whose set of anatomical structures differs from the base model's original set. Through a targeted adaptation strategy, we find that adapting…
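The abstract does not spell out how MaskR$^{2}$ works internally, so the following is only a guess at the general shape of a mask remapping layer: a per-pixel linear map that mixes the channels of a new mask format into the channel layout the pretrained model expects. The function names `one_hot` and `remap_mask`, the channel counts (3 new classes, 4 base channels), and the hand-set mixing matrix are all hypothetical; in an actual adaptation the mixing weights would presumably be learned.

```python
import numpy as np


def one_hot(labels, num_classes):
    """(H, W) integer label map -> (num_classes, H, W) float one-hot mask."""
    return np.eye(num_classes, dtype=np.float32)[labels].transpose(2, 0, 1)


def remap_mask(mask, mixing):
    """Per-pixel linear channel map: (C_new, H, W) -> (C_base, H, W).

    Equivalent to a 1x1 convolution without bias; `mixing` has shape
    (C_base, C_new).
    """
    return np.einsum("bn,nhw->bhw", mixing, mask)


# Toy 2x2 label map in an assumed new format with 3 classes.
new_labels = np.array([[0, 1],
                       [2, 2]])
new_mask = one_hot(new_labels, num_classes=3)

# Hand-set mixing matrix for illustration: new class 0 -> base channel 0,
# new class 1 -> base channel 2, new class 2 -> base channel 3.
mixing = np.zeros((4, 3), dtype=np.float32)
mixing[0, 0] = 1.0
mixing[2, 1] = 1.0
mixing[3, 2] = 1.0

remapped = remap_mask(new_mask, mixing)
print(remapped.argmax(axis=0))
```

A map of this kind adds only `C_base * C_new` weights, which is consistent with the "lightweight" description, but whether the paper's layer takes this form is an assumption here.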

Taxonomy

Topics: Speech Recognition and Synthesis