Curated AI beats frontier LLMs at pharma asset discovery

Łukasz Kidziński; Kevin Thomas

arXiv:2605.04908·cs.AI·May 7, 2026

TL;DR

Gosset, a curated AI platform, outperforms leading LLMs with web search at pharmaceutical asset discovery: on ten niche oncology and immunology targets it returns 3.2x more verified drugs per query than the best frontier system, at perfect precision and 100% recall against the cross-system union of verified drugs.

Contribution

Introduction of Gosset, a curated drug-asset database integrated with an AI chat interface, demonstrating superior performance over frontier web-access LLMs in pharma asset discovery.

Findings

01

Gosset returns 3.2x more verified drugs per query than the best frontier system.

02

Gosset achieves perfect precision and 100% recall against the cross-system union of verified drugs on the ten tested targets.

03

The curated index is exposed as a Gosset MCP server that other frontier models can call as a tool to improve their recall.

Abstract

General-purpose LLMs with web search are increasingly used to scout the competitive landscape of pharmaceutical pipelines. We benchmark Gosset -- an AI platform with a chat interface backed by curated target-, modality-, and indication-level drug-asset annotations -- against four frontier systems with web access (Claude Opus 4.7, GPT 5.5, Gemini 3.1 Pro, Perplexity sonar-pro) on ten niche oncology/immunology targets where most of the pipeline lives in the long tail of preclinical and Asian-developed assets. All five systems receive the same natural-language query and the same JSON output schema. Across 10 targets Gosset returns 3.2x more verified drugs per query than the best frontier system, at perfect precision and 100% recall against the cross-system union of verified drugs. The same curated index is exposed as a Gosset MCP server that any frontier model can call as a tool,…
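The scoring described in the abstract (precision, plus recall against the cross-system union of verified drugs) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `evaluate`, the system labels, and the drug identifiers are all hypothetical, and it assumes verification has already produced a set of confirmed drugs for the target.

```python
from typing import Dict, Set, Tuple


def evaluate(
    returned: Dict[str, Set[str]],
    verified: Set[str],
) -> Dict[str, Tuple[float, float]]:
    """Score each system for one target.

    returned: system name -> drug names that system returned (hypothetical data).
    verified: drug names that passed verification (assumed to be given).
    Returns system name -> (precision, recall).
    """
    # Reference set: every verified drug returned by at least one system,
    # i.e. the cross-system union of verified drugs.
    reference: Set[str] = set()
    for drugs in returned.values():
        reference |= drugs & verified

    scores: Dict[str, Tuple[float, float]] = {}
    for system, drugs in returned.items():
        hits = drugs & verified
        # Precision: share of a system's returned drugs that are verified.
        precision = len(hits) / len(drugs) if drugs else 0.0
        # Recall: share of the reference set that the system recovered.
        recall = len(hits) / len(reference) if reference else 0.0
        scores[system] = (precision, recall)
    return scores


# Toy example with made-up identifiers; these are not results from the paper.
returned = {
    "curated": {"drug_a", "drug_b", "drug_c", "drug_d"},
    "llm_x": {"drug_a", "drug_e"},
}
verified = {"drug_a", "drug_b", "drug_c", "drug_d"}
scores = evaluate(returned, verified)
```

Because the reference set is built only from drugs that at least one benchmarked system returned, recall here is relative to the systems compared, not to the full pipeline for a target.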
