A Matrix Variational Auto-Encoder for Variant Effect Prediction in Pharmacogenes

Antoine Honoré; Borja Rodríguez Gálvez; Yoomi Park; Yitian Zhou; Volker M. Lauschke; Ming Xiao

arXiv:2507.02624·cs.LG·July 4, 2025

3 Reviews

TL;DR

This paper introduces a transformer-based matrix variational auto-encoder that leverages deep mutational scanning data to predict variant effects in pharmacogenes, outperforming traditional MSA-based models and reducing computational costs.

Contribution

It presents a novel matrix VAE model trained on DMS data that surpasses state-of-the-art methods in zero-shot variant effect prediction and demonstrates the benefits of integrating structural data.

Findings

01. matVAE-MSA outperforms DeepSequence in zero-shot prediction

02. Incorporating AlphaFold structures improves performance

03. DMS datasets can replace MSAs with minimal predictive loss

Abstract

Variant effect predictors (VEPs) aim to assess the functional impact of protein variants, traditionally relying on multiple sequence alignments (MSAs). This approach assumes that naturally occurring variants are fit, an assumption challenged by pharmacogenomics, where some pharmacogenes experience low evolutionary pressure. Deep mutational scanning (DMS) datasets provide an alternative by offering quantitative fitness scores for variants. In this work, we propose a transformer-based matrix variational auto-encoder (matVAE) with a structured prior and evaluate its performance on 33 DMS datasets corresponding to 26 drug target and ADME proteins from the ProteinGym benchmark. Our model trained on MSAs (matVAE-MSA) outperforms the state-of-the-art DeepSequence model in zero-shot prediction on DMS datasets, despite using an order of magnitude fewer parameters and requiring less computation…
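The zero-shot evaluation described in the abstract compares model scores against measured DMS fitness scores. The visible part of the abstract does not name the metric; the sketch below assumes Spearman rank correlation, which the ProteinGym benchmark commonly reports. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: four variants of one protein.
dms_fitness = [0.9, 0.1, 0.5, 0.3]       # measured DMS fitness scores
model_scores = [-0.2, -3.1, -1.0, -2.5]  # assumed model scores, higher = fitter

rho = spearman(model_scores, dms_fitness)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based metric fits this setting because unsupervised model scores and DMS fitness values are on different scales, so only the ordering of variants is compared.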

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

- The authors explore novel architectural innovations to the DeepSequence/EVE family of models. In particular, they use self-attention layers where the attention map is determined from predicted contacts in AF2 structures. They also place more expressive priors on the latent space in matVAE-MSA. These are innovative ideas that have not yet been considered in the field.
- The authors clearly benchmark their method to other state-of-the-art methods and clearly show the impact of their architectu

Weaknesses

- The primary weakness of this paper is that their VAE model does not outperform existing unsupervised variant effect predictors (like ESM or DeepSequence). They find that using a more complicated prior does not improve performance and that self-attention layers with attention maps defined using AF2 contacts do not help.
- The authors do go on to say that their encoder-only model trained on DMS data does outperform unsupervised variant effect predictors, but they do not compare to unsupervise

Reviewer 02 · Rating 3 · Confidence 3

Strengths

* **Novelty**: To the best of my knowledge, the authors' proposed framework is novel.
* **Impact**: The authors' motivation (i.e., assessing how well evolutionary pressure corresponds to fitness and the corresponding impact on variant effect prediction) is solid, and such studies would likely be of interest to the machine learning for proteins community.

Weaknesses

Despite the potential impact of the authors' work, I believe that the authors' submission has significant issues that prevent me from recommending acceptance at this time. I provide details on the major issues below:
* **Unclear motivation for model design choices**: The authors spend a significant amount of time experimenting with certain modeling/architecture choices (e.g., using a mixture-of-Gaussians prior rather than a unimodal prior), which in the end don't have an impact on model performance.

Reviewer 03 · Rating 3 · Confidence 4

Strengths

- A clear presentation on the new transformer-based module for the encoder and decoder.
- A comprehensive investigation and analysis on the impact of different designed modules to the prediction task.

Weaknesses

- The presentation of the motivation is unclear (Q1, 3).
- The justification for the experimental design and significance of the results is not clearly articulated (Q2, 3, 4, 6).
- The design of the prediction tasks appears to be questionable (Q3, 5, 8).
- The comparison with baseline methods is incomplete (Q7).
