Reconciling Inconsistent Molecular Structures from Biochemical Databases
Casper Asbjørn Eriksen, Jakob Lykke Andersen, Rolf Fagerberg, Daniel Merkle

TL;DR
This paper introduces StructRecon, a tool that integrates and standardizes molecular structures from multiple biochemical databases, resolving inconsistencies and selecting the most likely structures through a novel graph-based approach.
Contribution
StructRecon is a new method that constructs an identifier graph and standardizes structures to reconcile discrepancies across databases, improving data consistency.
Findings
Resolved a unique structure for 85.11% of identifiers in EColiCore2
Supports multiple levels of structural detail for flexible standardization
Open-source and modular design enables future database integration
Abstract
Information on the structure of molecules, retrieved via biochemical databases, plays a pivotal role in various disciplines, such as metabolomics, systems biology, and drug discovery. However, no such database can be complete, and the chemical structure for a given compound is not necessarily consistent between databases. This paper presents StructRecon, a novel tool for resolving unique and correct molecular structures from database identifiers. StructRecon traverses the cross-links between database entries in different databases to construct what we call an identifier graph, which offers a more complete view of the total information available on a particular compound across all the databases. In order to reconcile discrepancies between databases, we first present an extensible model for chemical structure which supports multiple independent levels of detail, allowing standardisation…
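The identifier graph idea described in the abstract can be illustrated with a minimal sketch. This is not StructRecon's implementation: the database names, identifiers, structure strings, and the helper functions `identifier_graph` and `structures_by_source` are all hypothetical, chosen only to show how following cross-links from one entry gathers the entries that plausibly describe the same compound, and how disagreeing structures then become visible.

```python
from collections import deque

# Hypothetical toy data (not taken from any real database):
# (database, identifier) -> outgoing cross-links and an optional structure string.
ENTRIES = {
    ("DB_A", "a1"): {"links": [("DB_B", "b7")], "structure": "CCO"},
    ("DB_B", "b7"): {"links": [("DB_A", "a1"), ("DB_C", "c3")], "structure": "CCO"},
    ("DB_C", "c3"): {"links": [("DB_B", "b7")], "structure": "CC[O-]"},
    ("DB_C", "c9"): {"links": [], "structure": "O"},  # unrelated entry, never reached
}


def identifier_graph(start, entries):
    """Breadth-first traversal of cross-links; returns all reachable identifiers."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in entries.get(node, {}).get("links", []):
            # Skip links that point to entries we have no record for.
            if neighbour in entries and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def structures_by_source(component, entries):
    """Group the identifiers in a component by the structure each one reports."""
    found = {}
    for node in component:
        structure = entries[node].get("structure")
        if structure is not None:
            found.setdefault(structure, set()).add(node)
    return found


component = identifier_graph(("DB_A", "a1"), ENTRIES)
candidates = structures_by_source(component, ENTRIES)
for structure, sources in candidates.items():
    print(structure, sorted(sources))
```

In this toy example two entries report a neutral structure and one reports a charged variant, which is the kind of discrepancy a reconciliation step would then have to resolve. How StructRecon standardises and ranks such candidates is not shown here.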
Taxonomy
Topics: Computational Drug Discovery Methods · Metabolomics and Mass Spectrometry Studies · Analytical Chemistry and Chromatography
