EHCube4P: Learning Epistatic Patterns Through Hypercube Graph Convolution Neural Network for Protein Fitness Function Estimation
Muhammad Daud, Philippe Charton, Cedric Damour, Jingbo Wang, and Frederic Cadet

TL;DR
This paper introduces EHCube4P, a novel graph convolutional neural network framework that models protein sequence landscapes as hypercubes, integrating wavelet denoising to improve fitness prediction accuracy in noisy, complex landscapes.
Contribution
The study presents a new hypercube-based GCN model combined with wavelet denoising for robust protein fitness prediction, capturing higher-order epistatic interactions beyond pairwise effects.
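The hypercube idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes binary genotypes (each site is wild-type or mutant), the standard symmetric GCN normalization, and random untrained weights; the names `hypercube_adjacency`, `normalized_propagation`, and `two_layer_gcn` are hypothetical.

```python
import numpy as np

def hypercube_adjacency(n_sites):
    """Adjacency of the n-dimensional hypercube: nodes are the 2**n binary
    genotypes, linked when they differ at exactly one site."""
    n_nodes = 2 ** n_sites
    adj = np.zeros((n_nodes, n_nodes))
    for node in range(n_nodes):
        for site in range(n_sites):
            adj[node, node ^ (1 << site)] = 1.0  # flip one site
    return adj

def normalized_propagation(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def two_layer_gcn(features, adj, w1, w2):
    """Two propagation steps: each node ends up mixing information from
    genotypes up to two mutations away."""
    prop = normalized_propagation(adj)
    hidden = np.maximum(prop @ features @ w1, 0.0)  # ReLU
    return prop @ hidden @ w2

n_sites = 3  # illustrative size only
adj = hypercube_adjacency(n_sites)
prop = normalized_propagation(adj)
one_hop = prop != 0            # reach of a single layer
two_hop = (prop @ prop) != 0   # reach of two stacked layers

# Node features: the genotype bits themselves (an assumption).
features = np.array(
    [[(node >> s) & 1 for s in range(n_sites)] for node in range(2 ** n_sites)],
    dtype=float,
)
rng = np.random.default_rng(0)
w1 = rng.normal(size=(n_sites, 4))  # untrained, for shape checking only
w2 = rng.normal(size=(4, 1))
prediction = two_layer_gcn(features, adj, w1, w2)
```

With three sites, one layer lets a genotype see itself and its 3 single-mutation neighbors, while two layers reach 7 of the 8 genotypes. One reading of "beyond pairwise" is this two-hop aggregation, though the summary does not spell out the mechanism.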
Findings
Wavelet denoising effectively suppresses noise in the fitness measurements.
The model generalizes well across different enzyme datasets.
Prediction accuracy is higher on smoother fitness landscapes.
Abstract
Understanding the relationship between protein sequences and their functions is fundamental to protein engineering, but this task is hindered by the combinatorially vast sequence space and the experimental noise inherent in fitness measurements. In this study, we present a novel framework that models the sequence landscape as a hypercube and integrates wavelet-based signal denoising with a graph convolutional neural network (GCN) to predict protein fitness across rugged fitness landscapes. Using a dataset of 419 experimentally measured mutant sequences of the Tobacco 5-Epi-Aristolochene Synthase (TEAS) enzyme, we preprocess the fitness signals using a 1-D discrete wavelet transform with a Daubechies-3 basis to suppress experimental noise while preserving local epistatic patterns. Our model comprises two GCN layers, allowing for beyond pairwise aggregation, followed by a…
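The denoising step described in the abstract can be illustrated with a simplified stand-in. The sketch below uses a single-level Haar transform with soft thresholding in plain NumPy, not the Daubechies-3 basis the abstract names; a closer version would likely use PyWavelets (`pywt.wavedec` with `'db3'`, `pywt.threshold`, `pywt.waverec`). The function name `haar_denoise`, the threshold value, the toy signal, and the way fitness values are ordered into a 1-D signal are all assumptions.

```python
import numpy as np

def haar_denoise(signal, threshold):
    """Single-level Haar DWT, soft-threshold the detail coefficients,
    then invert. A simplified stand-in for Daubechies-3 denoising."""
    signal = np.asarray(signal, dtype=float)
    if signal.size % 2:
        raise ValueError("signal length must be even for this sketch")
    even, odd = signal[0::2], signal[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass: local trend
    detail = (even - odd) / np.sqrt(2.0)   # high-pass: local fluctuation
    # Soft thresholding shrinks small (likely noisy) details toward zero.
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    out = np.empty_like(signal)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

# Toy fitness signal (made-up values, not from the TEAS dataset).
fitness = np.array([1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1])
unchanged = haar_denoise(fitness, threshold=0.0)   # no shrinkage
smoothed = haar_denoise(fitness, threshold=10.0)   # all details removed
```

A zero threshold reconstructs the input exactly, and a very large one replaces each adjacent pair with its mean. Useful thresholds lie in between, where small fluctuations are damped and larger local differences survive.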
Taxonomy
Topics: Evolution and Genetic Dynamics · Evolutionary Algorithms and Applications · Microbial Metabolic Engineering and Bioproduction
Methods: Sparse Evolutionary Training · Graph Convolutional Network
