Beyond Imaging: Vision Transformer Digital Twin Surrogates for 3D+T Biological Tissue Dynamics

Kaan Berke Ugurlar; Joaquín de Navascués; Michael Taynnan Barros

arXiv:2508.15883 · eess.IV · August 26, 2025

TL;DR

This paper introduces VT-DTSN, a deep learning framework using Vision Transformers to accurately predict and reconstruct 3D+T biological tissue dynamics from imaging data, facilitating in silico biological studies.

Contribution

The work presents a novel Vision Transformer-based surrogate model for high-fidelity, time-resolved biological tissue imaging reconstruction, integrating multi-view fusion and a specialized training loss.

Findings

01

Achieves low error rates and high structural similarity in tissue dynamics reconstruction.

02

Demonstrates robustness and consistency across biological replicates.

03

Enables efficient in silico exploration of tissue behaviors.

Abstract

Understanding the dynamic organization and homeostasis of living tissues requires high-resolution, time-resolved imaging coupled with methods capable of extracting interpretable, predictive insights from complex datasets. Here, we present the Vision Transformer Digital Twin Surrogate Network (VT-DTSN), a deep learning framework for predictive modeling of 3D+T imaging data from biological tissue. By leveraging Vision Transformers pretrained with DINO (Self-Distillation with NO Labels) and employing a multi-view fusion strategy, VT-DTSN learns to reconstruct high-fidelity, time-resolved dynamics of a Drosophila midgut while preserving morphological and feature-level integrity across imaging depths. The model is trained with a composite loss prioritizing pixel-level accuracy, perceptual structure, and feature-space alignment, ensuring biologically meaningful outputs suitable for in silico…
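The composite loss described in the abstract combines three kinds of terms: pixel-level accuracy, perceptual structure, and feature-space alignment. The sketch below illustrates how such a weighted sum could be assembled. It is not the authors' implementation: the specific terms (mean squared error, a single-window global SSIM, and cosine distance between feature vectors), the function names, and the weights `w_pixel`, `w_struct`, and `w_feat` are all assumptions chosen for illustration, and images are represented as flat lists of intensities in [0, 1] to keep the example dependency-free.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Pixel-level term: mean squared error over flattened intensities."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def global_ssim(pred: Sequence[float], target: Sequence[float],
                c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> float:
    """Structure term: SSIM computed over one global window.

    Assumption: a real pipeline would use a sliding-window SSIM; the global
    form is used here only to keep the sketch short.
    """
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    numerator = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Feature-space term: 1 - cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)


def composite_loss(pred: Sequence[float], target: Sequence[float],
                   feat_pred: Sequence[float], feat_target: Sequence[float],
                   w_pixel: float = 1.0, w_struct: float = 1.0,
                   w_feat: float = 1.0) -> float:
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_pixel * mse(pred, target)
            + w_struct * (1.0 - global_ssim(pred, target))
            + w_feat * cosine_distance(feat_pred, feat_target))


# Usage: a perfect reconstruction should score (numerically) zero,
# while a mismatched one should score strictly higher.
image = [0.1, 0.5, 0.9, 0.3]
features = [1.0, 2.0, 3.0]
perfect = composite_loss(image, image, features, features)
mismatch = composite_loss([0.9, 0.5, 0.1, 0.3], image, [3.0, 2.0, 1.0], features)
```

In a training setup the feature vectors would typically come from a frozen encoder applied to the predicted and ground-truth frames; here they are passed in directly so the weighting logic stays visible.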
