Learning to refine domain knowledge for biological network inference

Peiwen Li; Menghua Wu

arXiv:2410.14436 · q-bio.QM · October 21, 2024

TL;DR

This paper introduces an amortized algorithm that refines biological domain knowledge using data observations, improving causal graph inference and error detection in biological networks with limited data.

Contribution

The paper combines knowledge graph refinement with data-driven learning, offering a novel approach to improving causal inference in complex biological systems.

Findings

1. Outperforms baselines in recovering ground truth causal graphs
2. Effectively identifies errors in prior biological knowledge
3. Works well with limited interventional data

Abstract

Perturbation experiments allow biologists to discover causal relationships between variables of interest, but the sparsity and high dimensionality of these data pose significant challenges for causal structure learning algorithms. Biological knowledge graphs can bootstrap the inference of causal structures in these situations, but since they compile vastly diverse information, they can bias predictions towards well-studied systems. Alternatively, amortized causal structure learning algorithms encode inductive biases through data simulation and train supervised models to recapitulate these synthetic graphs. However, realistically simulating biology is arguably even harder than understanding a specific system. In this work, we take inspiration from both strategies and propose an amortized algorithm for refining domain knowledge, based on data observations. On real and synthetic datasets,…

Taxonomy

Topics: Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks · Fractal and DNA sequence analysis