AGP: A Novel Arabidopsis thaliana Genomics-Phenomics Dataset and its HyperGraph Baseline Benchmarking
Manuel Serna-Aguilera, Fiona L. Goggin, Aranyak Goswami, Alexander Bucksch, Suxing Liu, Khoa Luu

TL;DR
This paper introduces AGP, a comprehensive multi-modal Arabidopsis thaliana dataset linking gene expression and phenotypes, and benchmarks hypergraph models for gene-trait prediction, advancing plant genomics research.
Contribution
The paper presents the first integrated multi-modal genomics-phenomics dataset for Arabidopsis thaliana and evaluates a biologically-informed hypergraph baseline for gene-trait association tasks.
Findings
Hypergraph baseline effectively models gene-trait relationships.
AGP dataset enables multi-faceted analysis of gene-phenotype links.
Benchmark results highlight potential for improved predictive models.
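For readers new to hypergraphs, the following is a minimal, generic sketch of how a hypergraph can group several genes under one hyperedge and pass information between them. It is not the AGP baseline: the gene indices, the groupings, and the function names `incidence_matrix` and `propagate` are hypothetical and chosen only for illustration. The propagation step is plain mean aggregation (nodes to hyperedges, then hyperedges back to nodes).

```python
import numpy as np

def incidence_matrix(n_nodes, hyperedges):
    """Build an (n_nodes x n_edges) 0/1 incidence matrix.

    Each hyperedge is a collection of node indices; unlike an ordinary
    graph edge, it may contain any number of nodes.
    """
    H = np.zeros((n_nodes, len(hyperedges)))
    for j, members in enumerate(hyperedges):
        H[list(members), j] = 1.0
    return H

def propagate(H, X):
    """One round of mean aggregation: nodes -> hyperedges -> nodes."""
    d_v = H.sum(axis=1)  # number of hyperedges each node belongs to
    d_e = H.sum(axis=0)  # number of nodes in each hyperedge
    # Safe reciprocals: isolated nodes and empty hyperedges map to 0.
    d_v_inv = np.divide(1.0, d_v, out=np.zeros_like(d_v), where=d_v > 0)
    d_e_inv = np.divide(1.0, d_e, out=np.zeros_like(d_e), where=d_e > 0)
    edge_means = (H.T @ X) * d_e_inv[:, None]   # average feature per hyperedge
    return (H @ edge_means) * d_v_inv[:, None]  # average over a node's hyperedges

# Hypothetical example: 4 genes, two gene groups sharing gene 2.
hyperedges = [{0, 1, 2}, {2, 3}]
H = incidence_matrix(4, hyperedges)
X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one made-up feature per gene
X_new = propagate(H, X)
```

Gene 2 belongs to both groups, so its updated feature mixes the averages of both hyperedges, while genes 0, 1, and 3 each take the average of their single group.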
Abstract
Understanding which genes control which traits in an organism remains one of the central challenges in biology. Despite significant advances in data collection technology, our ability to map genes to traits is still limited. This genome-to-phenome (G2P) challenge spans several problem domains, including plant breeding, and requires models capable of reasoning over high-dimensional, heterogeneous, and biologically structured data. Currently, however, many datasets capture only genetic information or only phenotype information. Additionally, phenotype data is highly heterogeneous, and many datasets do not fully capture that heterogeneity. The critical drawback is that these datasets are not integrated: they do not link to one another to describe the same biological specimens. This limits machine learning models' ability to be informed on the various aspects of these specimens,…
Taxonomy
Topics: Plant Molecular Biology Research · Genetic Mapping and Diversity in Plants and Animals · Bioinformatics and Genomic Networks
