# Generative inpainting of incomplete Euclidean distance matrices of trajectories generated by a fractional Brownian motion

**Authors:** Alexander Lobashev, Dmitry Guskov, Kirill Polovnikov

PMC · DOI: 10.1038/s41598-025-97893-5 · Scientific Reports · 2025-05-31

## TL;DR

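This paper evaluates diffusion-based inpainting for reconstructing incomplete Euclidean distance matrices (EDMs) of fractional Brownian motion (fBm) trajectories, then applies the fBm-trained model to impute microscopy-derived (FISH) distance matrices of chromosomal segments.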

## Contribution

The paper proposes a novel physics-informed generative approach in which diffusion models trained on fractional Brownian motion impute incomplete biological distance matrices.

## Key findings

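- Conditional diffusion generation reproduces the correlations of fBm paths across sub-diffusive, Brownian, and super-diffusive memory regimes, which makes it a robust tool for statistical imputation at high missing ratios.
- At low missing ratios the remaining partial graph is rigid, so the imputation is unique and provides a reliable ground truth for inpainting.
- Diffusion behaves qualitatively differently from a simple database search, indicating generalization rather than mere memorization of the training data.
- The fBm-trained model outperforms standard bioinformatic methods on microscopy-derived (FISH) chromosomal distance matrices.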

## Abstract

Fractional Brownian motion (fBm) exhibits both randomness and strong scale-free correlations, posing a challenge for generative artificial intelligence to replicate the underlying stochastic process. In this study, we evaluate the performance of diffusion-based inpainting methods on a specific dataset of corrupted images, which represent incomplete Euclidean distance matrices (EDMs) of fBm across various memory exponents (H). Our dataset reveals that, in the regime of low missing ratios, data imputation is unique, as the remaining partial graph is rigid, thus providing a reliable ground truth for inpainting. We find that conditional diffusion generation effectively reproduces the inherent correlations of fBm paths across different memory regimes, including sub-diffusion, Brownian motion, and super-diffusion trajectories, making it a robust tool for statistical imputation in cases with high missing ratios. Moreover, while recent studies have suggested that diffusion models memorize samples from the training dataset, our findings indicate that diffusion behaves qualitatively differently from simple database searches, allowing for generalization rather than mere memorization of the training data. As a biological application, we utilize our fBm-trained diffusion model to impute microscopy-derived distance matrices of chromosomal segments (FISH data), which are incomplete due to experimental imperfections. We demonstrate that our inpainting method outperforms standard bioinformatic methods, suggesting a novel physics-informed generative approach for the enrichment of high-throughput biological datasets.
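To make the setup in the abstract concrete, the following is a minimal sketch of how an incomplete EDM of an fBm trajectory could be produced. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`fbm_trajectory`, `distance_matrix`, `mask_entries`) are hypothetical, the Cholesky sampler is one standard exact way to draw fBm and may differ from the generator used in the paper, the embedding dimension of 3 is assumed, and entries are hidden uniformly at random, which may not match the paper's corruption scheme. The sketch stores plain (not squared) distances.

```python
import numpy as np


def fbm_trajectory(n_steps, hurst, dim=3, rng=None):
    """Sample an fBm path with memory exponent `hurst` (0 < H < 1).

    Assumption: exact sampling via Cholesky factorization of the fBm
    covariance; each spatial coordinate is an independent 1D fBm.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(1, n_steps + 1, dtype=float)
    # fBm covariance: C(s, t) = 0.5 * (s^{2H} + t^{2H} - |t - s|^{2H})
    cov = 0.5 * (
        t[:, None] ** (2 * hurst)
        + t[None, :] ** (2 * hurst)
        - np.abs(t[:, None] - t[None, :]) ** (2 * hurst)
    )
    chol = np.linalg.cholesky(cov)
    return chol @ rng.standard_normal((n_steps, dim))


def distance_matrix(points):
    """Pairwise Euclidean distances between trajectory points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def mask_entries(dist, missing_ratio, rng=None):
    """Hide a fraction of off-diagonal entries symmetrically.

    Assumption: entries are dropped independently and uniformly at random.
    Returns the corrupted matrix (NaN where hidden) and the boolean mask
    of observed entries.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = dist.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    hide = rng.random(rows.size) < missing_ratio
    mask = np.ones((n, n), dtype=bool)
    mask[rows[hide], cols[hide]] = False
    mask = mask & mask.T  # mirror hidden entries to the lower triangle
    corrupted = np.where(mask, dist, np.nan)
    return corrupted, mask


# Example: a sub-diffusive path (H = 0.3) with half of the distances hidden.
rng = np.random.default_rng(0)
path = fbm_trajectory(64, hurst=0.3, rng=rng)
dist = distance_matrix(path)
corrupted, mask = mask_entries(dist, missing_ratio=0.5, rng=rng)
```

Setting `hurst` below, at, or above 0.5 gives sub-diffusive, Brownian, or super-diffusive paths respectively. The pair `(corrupted, mask)` is the kind of input a conditional inpainting model would receive, with the full `dist` serving as the reference for evaluation.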

## Full-text entities

- **Diseases:** colon cancer (MESH:D015179)
- **Chemicals:** auxin (MESH:D007210)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** HCT116 — Homo sapiens (Human), Colon carcinoma, Cancer cell line (CVCL_0291)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12126505/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12126505/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12126505/full.md

---
Source: https://tomesphere.com/paper/PMC12126505