Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction

Zhao Yang; Yi Duan; Jiwei Zhu; Ying Ba; Chuan Cao; Bing Su

arXiv:2602.21550·cs.LG·March 13, 2026

Open Access · 1 Model · 3 Reviews

TL;DR

This paper shows that integrating proximal multimodal epigenomic signals with a novel framework called Prism significantly improves gene expression prediction accuracy using short DNA sequences, challenging the focus on long sequence modeling.

Contribution

The paper introduces Prism, a new framework that effectively integrates diverse epigenomic signals and mitigates confounding effects, achieving state-of-the-art gene expression prediction with short sequences.

Findings

1. Long sequence modeling can decrease prediction performance.
2. Proper integration of multimodal signals improves accuracy.
3. Prism outperforms existing methods on gene expression prediction.

Abstract

Gene expression prediction, which predicts mRNA expression levels from DNA sequences, presents significant challenges. Previous works often focus on extending input sequence length to locate distal enhancers, which may influence target genes from hundreds of kilobases away. Our work first reveals that for current models, long sequence modeling can decrease performance. Even carefully designed algorithms only mitigate the performance degradation caused by long sequences. Instead, we find that proximal multimodal epigenomic signals near target genes prove more essential. Hence we focus on how to better integrate these signals, which has been overlooked. We find that different signal types serve distinct biological roles, with some directly marking active regulatory elements while others reflect background chromatin patterns that may introduce confounding effects. Simple concatenation may…
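For orientation, the confounding issue raised in the abstract is commonly handled with the standard backdoor adjustment, and Reviewer 02 describes the method as "latent background-state weights + uniform backdoor averaging". The block below is a generic sketch of that adjustment, not the paper's exact formulation. The symbols $X$ (input sequence and epigenomic signals), $Y$ (expression level), $Z$ (latent background chromatin state), and the $n$ discrete states $z_1, \dots, z_n$ are illustrative assumptions introduced here.

```latex
% Standard backdoor adjustment over a discrete confounder Z (assumed notation):
P\big(Y \mid \mathrm{do}(X)\big) \;=\; \sum_{k=1}^{n} P\big(Y \mid X,\, Z = z_k\big)\, P\big(Z = z_k\big)

% With a uniform prior P(Z = z_k) = 1/n, this reduces to a plain average
% over the n background states:
P\big(Y \mid \mathrm{do}(X)\big) \;=\; \frac{1}{n} \sum_{k=1}^{n} P\big(Y \mid X,\, Z = z_k\big)
```

Under this reading, the $n = 0$ case that Reviewer 03 mentions would presumably correspond to dropping the adjustment altogether, although the listing does not spell that out.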

Peer Reviews

Decision·ICLR 2026 Oral

Reviewer 01 · Rating 8 · Confidence 4

Strengths

1. The introduction of confounder components for gene expression prediction, and their connection to biological intuition, is important because it completes the current causal formulation of the epigenomic signal.
2. The observation regarding the sequence length required for CAGE prediction is interesting and biologically reasonable. The provided experiments support this observation on the K562 cell line for gene expression (CAGE) prediction. I still have some doubts about whether a shorte

Weaknesses

The overall framework appears well designed and complete, and I have no further comments regarding potential improvements. My remaining concern lies in how to determine the appropriate sequence length for different prediction tasks. Furthermore, if the goal is to train a unified model for general gene expression prediction, it would be helpful to clarify how the model can adapt to varying sequence length requirements across different genes or datasets.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Clear motivation. The paper picks up on a prevalent issue in long-context DNA sequence modelling. The authors narrow in on the key issue and validate it experimentally.
2. Within gene expression prediction, using latent background-state weights + uniform backdoor averaging is relatively novel.
3. While most baselines are trained and reported at 200 kbp, Prism runs at 2 kbp and still beats prior SOTA (Table 1). This supports their claim that better multi-modal integration can offset lo

Weaknesses

1. Prism completely discards long-range sequence information by design, operating on only 2 kbp. This is presented as a strength, but I believe it is also a fundamental limitation. The model cannot discover regulatory elements or sequence variations beyond its 2 kbp window unless their effects are already captured by the provided proximal epigenomic signals. Have the authors explored how the metrics change as the context increases? Why was 2 kbp chosen?
2. Results in Table 1 are based on

Reviewer 03 · Rating 6 · Confidence 3

Strengths

The paper presents a clear and well-supported argument that genomic sequence models do not significantly benefit from longer input sequences, a strong claim that is convincingly demonstrated through extensive experimentation (specifically Table 12). The results showing the impact of epigenetic markers are compelling and supported by thorough ablation studies that highlight the individual contribution of each signal type. The analysis of the confounding effect is insightful, and the proposed solu

Weaknesses

My concerns relate to the efficiency of the proposed approach. As shown in the hyperparameter sensitivity analysis (Section 4.3), the variation in performance when tuning the parameters \alpha and \beta appears minimal, suggesting limited sensitivity to these design choices. Similarly, the number of background states n has only a minor impact on results, as even the case n=0 in Table 2a performs comparably well. This raises questions about how essential the proposed causal intervention mechanism

Code & Models

Models

🤗 yangyz1230/Prism · model · ♡ 1

Taxonomy

Topics: Genomics and Chromatin Dynamics · Machine Learning in Bioinformatics · Gene expression and cancer classification