Analysis of the first Genetic Engineering Attribution Challenge
Oliver M. Crook, Kelsey Lane Warmbrod, Greg Lipstein, Christine Chung, Christopher W. Bakerlee, T. Greg McKelvey Jr., Shelly R. Holland, Jacob L. Swett, Kevin M. Esvelt, Ethan C. Alley, and William J. Bradshaw

TL;DR
This paper reports on the first Genetic Engineering Attribution Challenge, demonstrating significant improvements in identifying the lab-of-origin of engineered sequences using machine learning, especially CNNs and ensemble methods.
Contribution
It introduces a public competition that advances GEA technology, showcasing new high-performance models and metrics for attribution accuracy and confidence.
Findings
Top models outperformed previous approaches by 10 percentage points in both top-1 and top-10 accuracy.
Ensemble models further improved attribution performance.
Both CNN-based and neural-network-free approaches achieved high accuracy.
Abstract
The ability to identify the designer of engineered biological sequences -- termed genetic engineering attribution (GEA) -- would help ensure due credit for biotechnological innovation, while holding designers accountable to the communities they affect. Here, we present the results of the first Genetic Engineering Attribution Challenge, a public data-science competition to advance GEA. Top-scoring teams dramatically outperformed previous models at identifying the true lab-of-origin of engineered sequences, including an increase in top-1 and top-10 accuracy of 10 percentage points. A simple ensemble of prizewinning models further increased performance. New metrics, designed to assess a model's ability to confidently exclude candidate labs, also showed major improvements, especially for the ensemble. Most winning teams adopted CNN-based machine-learning approaches; however, one team…
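The abstract's headline numbers are top-1 and top-10 accuracy, plus the gain from a simple ensemble. As a rough illustration of what those terms mean, the sketch below computes top-k accuracy and an unweighted average of per-lab scores on toy data. The function names, the toy scores, and the choice of plain averaging are assumptions made for this example; the summary does not say how the competition's ensemble was actually built.

```python
from typing import List, Sequence

def top_k_accuracy(scores: Sequence[Sequence[float]],
                   true_labs: Sequence[int], k: int) -> float:
    """Fraction of sequences whose true lab is among the k highest-scoring labs."""
    hits = 0
    for row, truth in zip(scores, true_labs):
        # Rank lab indices from highest to lowest predicted score.
        ranked = sorted(range(len(row)), key=lambda lab: row[lab], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labs)

def average_ensemble(model_scores: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Unweighted mean of per-lab scores across models (an assumed, simplest-case ensemble)."""
    n_models = len(model_scores)
    return [
        [sum(per_lab) / n_models for per_lab in zip(*rows)]
        for rows in zip(*model_scores)  # one tuple of rows per sequence
    ]

# Hypothetical toy data: 3 sequences, 4 candidate labs, 2 models.
true_labs = [0, 1, 2]
model_a = [[0.6, 0.2, 0.1, 0.1],
           [0.4, 0.3, 0.2, 0.1],
           [0.1, 0.2, 0.3, 0.4]]
model_b = [[0.5, 0.3, 0.1, 0.1],
           [0.1, 0.6, 0.2, 0.1],
           [0.1, 0.5, 0.3, 0.1]]

ensemble = average_ensemble([model_a, model_b])

for name, scores in [("model_a", model_a), ("model_b", model_b), ("ensemble", ensemble)]:
    print(name,
          "top-1:", round(top_k_accuracy(scores, true_labs, 1), 3),
          "top-2:", round(top_k_accuracy(scores, true_labs, 2), 3))
```

On real data the score rows would have one entry per candidate lab, and k would be set to 1 or 10 to match the metrics reported in the abstract.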
Taxonomy
Topics: CRISPR and Genetic Engineering · Cell Image Analysis Techniques · Biomedical and Engineering Education
