Training-inference input alignment outweighs framework choice in longitudinal retinal image prediction

Liyin Chen; Nazlee Zebardast; Mengyu Wang; Tobias Elze; Jason I. Comander

arXiv:2604.16955·cs.CV·April 28, 2026

TL;DR

This study shows that aligning training and inference inputs matters more than model complexity when predicting retinal disease progression from longitudinal images, especially when variability dominates the disease signal.

Contribution

The paper introduces a framework for assessing how much generative complexity a task warrants, based on task entropy, and demonstrates its effectiveness on retinal image prediction tasks.

Findings

1. Training-inference alignment significantly improves prediction quality.

2. Framework-based assessment guides appropriate model complexity.

3. Deterministic models perform comparably or better when disease progression is slow.

Abstract

Predicting disease progression from longitudinal imaging is useful for clinical decision making and trial design. Recent methods have moved toward increasing generative complexity, but the conditions under which this complexity is necessary remain unclear. We propose that generative complexity should match the entropy of the predictable component of a task's conditional posterior, with training-inference input alignment required in all regimes. Two model-light measurements, a task-entropy analysis on raw image pairs and a posterior-concentration analysis on a stochastic model, let practitioners assess the complexity a task warrants before committing to a modeling framework. We validated this framework on a fundus autofluorescence (FAF) dataset by contrasting five conditioning configurations, sharing one architecture and training set, spanning standard conditional diffusion,…
