Dock2D: Synthetic data for the molecular recognition problem
Siddharth Bhadra-Lobo, Georgy Derevyanko, Guillaume Lamoureux

TL;DR
Dock2D introduces simplified 2D shape datasets for studying molecular interactions, enabling easier algorithm testing and interpretation compared to complex protein structure data.
Contribution
The paper presents two new toy datasets, Dock2D-IP and Dock2D-IF, for benchmarking algorithms predicting molecular interactions using 2D shapes.
Findings
Baseline solutions demonstrate that an energy function can be learned from interaction pose data.
The same energy function applies to both docking and binding energy estimation tasks.
Datasets facilitate algorithm development and interpretation in molecular recognition.
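The idea that a single energy function can serve both docking and binding energy estimation can be illustrated with a toy sketch. This is not the paper's baseline. It assumes a hand-written overlap energy rather than a learned one, considers cyclic translations only (no rotations), and uses hypothetical names (`pose_energies`, `best_pose`, `free_energy`, `interacts`) and a hypothetical decision threshold. Docking picks the lowest-energy pose; the interaction decision aggregates the energies of all poses into one free-energy-like score.

```python
import numpy as np

def pose_energies(receptor, ligand):
    # Assumed toy energy: E(t) = -sum_x receptor[x] * ligand[x - t],
    # evaluated for every cyclic translation t at once via the FFT.
    corr = np.fft.ifft2(np.fft.fft2(receptor) * np.conj(np.fft.fft2(ligand))).real
    return -corr

def best_pose(energies):
    # Docking: the predicted pose is the translation with the lowest energy.
    idx = np.unravel_index(np.argmin(energies), energies.shape)
    return tuple(int(i) for i in idx)

def free_energy(energies):
    # Binding score: F = -log sum_t exp(-E(t)), computed in a numerically stable way.
    neg = -energies.ravel()
    m = neg.max()
    return float(-(m + np.log(np.exp(neg - m).sum())))

def interacts(energies, threshold):
    # Interaction decision: the pair is called "interacting" if F is below
    # a threshold (hypothetical value, chosen by the caller).
    return free_energy(energies) < threshold

# Two single-pixel "shapes" on a 16x16 grid, purely for illustration.
receptor = np.zeros((16, 16))
receptor[5, 7] = 1.0
ligand = np.zeros((16, 16))
ligand[2, 3] = 1.0

energies = pose_energies(receptor, ligand)
print(best_pose(energies))               # translation that aligns the two pixels
print(interacts(energies, threshold=-5.0))
```

In this sketch the same `energies` array feeds both tasks, so any improvement to the energy function would carry over to pose prediction and to the interaction decision alike.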
Abstract
Predicting the physical interaction of proteins is a cornerstone problem in computational biology. New classes of learning-based algorithms are actively being developed, and are typically trained end-to-end on protein complex structures extracted from the Protein Data Bank. These training datasets tend to be large and difficult to use for prototyping and, unlike image or natural language datasets, they are not easily interpretable by non-experts. We present Dock2D-IP and Dock2D-IF, two "toy" datasets that can be used to select algorithms predicting protein-protein interactions or any other type of molecular interactions. Using two-dimensional shapes as input, each example from Dock2D-IP ("interaction pose") describes the interaction pose of two shapes known to interact and each example from Dock2D-IF ("interaction fact") describes whether two shapes form a stable complex…
Taxonomy
Topics: Protein Structure and Dynamics · Computational Drug Discovery Methods · Machine Learning in Materials Science
