Metadata-guided Feature Disentanglement for Functional Genomics
Alexander Rakowski, Remo Monti, Viktoriia Huryn, Marta Lemanczyk, Uwe Ohler, Christoph Lippert

TL;DR
This paper introduces Metadata-guided Feature Disentanglement (MFD), a novel deep learning approach that separates biological signals from technical biases in large genomics datasets, improving interpretability and downstream prediction tasks.
Contribution
MFD is a new method that incorporates metadata into model training to disentangle biological features from technical biases in functional genomics data.
Findings
MFD improves model interpretability by linking features to experimental factors.
MFD enhances downstream tasks like enhancer prediction and variant discovery.
The approach maintains or improves predictive performance.
Abstract
With the development of high-throughput technologies, genomics datasets rapidly grow in size, including functional genomics data. This has allowed the training of large Deep Learning (DL) models to predict epigenetic readouts, such as protein binding or histone modifications, from genome sequences. However, large dataset sizes come at a price of data consistency, often aggregating results from a large number of studies, conducted under varying experimental conditions. While data from large-scale consortia are useful as they allow studying the effects of different biological conditions, they can also contain unwanted biases from confounding experimental factors. Here, we introduce Metadata-guided Feature Disentanglement (MFD) - an approach that allows disentangling biologically relevant features from potential technical biases. MFD incorporates target metadata into model training, by…
Taxonomy
Topics: Gene expression and cancer classification
