AskBeacon—performing genomic data exchange and analytics with natural language
Anuradha Wickramarachchi, Shakila Tonni, Sonali Majumdar, Sarvnaz Karimi, Sulev Kõks, Brendan Hosking, Jordi Rambla, Natalie A Twine, Yatish Jain, Denis C Bauer

TL;DR
AskBeacon allows users to ask genomic data questions in natural language and get secure, publication-ready insights.
Contribution
AskBeacon introduces a secure natural language interface for querying genomic data using LLMs and the Beacon protocol.
Findings
AskBeacon found autosomal markers occurred 1.4 times more in males with Parkinson’s disease than females.
The system ensures genomic data is not directly exposed to LLMs, preventing leaks or falsification.
Different LLMs and architectures were evaluated to optimize translation of research questions into Beacon queries.
Abstract
Enabling clinicians and researchers to directly interact with global genomic data resources by removing technological barriers is vital for medical genomics. AskBeacon enables large language models (LLMs) to be applied to securely shared cohorts via the Global Alliance for Genomics and Health Beacon protocol. By simply “asking” Beacon, actionable insights can be gained, analyzed, and made publication-ready. In the Parkinson's Progression Markers Initiative (PPMI), we use natural language to ask whether the sex-differences observed in Parkinson's disease are due to X-linked or autosomal markers. AskBeacon returns a publication-ready visualization showing that for PPMI the autosomal marker occurred 1.4 times more often in males with Parkinson’s disease than females, compared to no differences for the X-linked marker. We evaluate commercial and open-weight LLM models, as well as different…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1- —CSIRO10.13039/501100000943
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenomics and Rare Diseases · Biomedical Text Mining and Ontologies
1 Introduction
Global Alliance for Genomics and Health (GA4GH) introduced the Beacon protocol (Rueda et al. 2022) to standardize the exchange of genomic and phenotypic information. The underlying schema is designed for clinical and research use and enables industry-standard security and data governance practices. Beacon can jointly query genotypic and phenotypic data, with metadata information encoded through ontologies. However, empowering the community to perform these advanced queries requires a user-interface that can hide the underlying complexities, such as query combinations across collections (Cohorts and Datasets), entity types (Individuals, Biosamples, Runs, Analyses and Genomic variants), and return types (Records, Boolean, Counts), as well as translating medical or colloquial terminology to an ontology code, e.g. SNOMED.
We developed AskBeacon to abstract the complexities of the Beacon schema using large language models (LLMs). AskBeacon is a web interface (Supplementary Section S1) on top of sBeacon (Wickramarachchi et al. 2023), a cloud-based production-ready implementation of the Beacon protocol. It acts as the interpreter between a clinical or research question and the Beacon schema-formatted query (Fig. 1).
Gaining actionable insights on genomic data. (1) AskBeacon works with the user to create a valid beacon query. (2) Checks with the User that all inferred information is correct (e.g. ontology term, locations, etc.). (3) The user interface has different tabs to process queries in parallel and keeping contexts separate. (4) AskBeacon performs data analytics from prompts, presenting the resulting code to the user for optional verification and refinement (5). It creates publication-ready visualizations (6).
For example, using AskBeacon, clinicians and researchers can validate in their data whether the genetically determined sex differences in Parkinson's Disease (Klein and König 2021) are due to X-linked [RPL10 (Le Guen et al. 2021)] or autosomal [SNCA (Angelini et al. 2024)] genetic factors. Using natural language, users can perform currently expert-only steps such as (i) translating “Parkinson’s disease” to an ontology code, (ii) identifying individuals in the cohort with the relevant genotypes, (iii) constructing the right query to obtain the data, and (iv) analysing the data using custom scripts to visualize the results.
AskBeacon can query across federated repositories (Sever 2023) in the global Beacon Network (Fiume et al. 2019), where each node can be stood up securely and efficiently using an implementation of the Beacon protocol (Fromont et al. 2024), like sBeacon. This enables even small clinical and research groups, e.g. from underrepresented populations, to share genomic data and enable the secure, fully consented, and controlled query across human genetic diversity (Supplementary Section S2).
With our examples (Supplementary Section S6.3), we aim to inspire end-users like clinicians and researchers, and by addressing the paper to data providers we hope to empower them to meet the demand for interactivity by standing up an askBeacon-augmented sBeacon (Wickramarachchi et al. 2023) instance.
2 Methods
AskBeacon implements LLM connectors for the Beacon schema. AskBeacon is methodology agnostic, enabling new LLMs to be added as they become available. We tested the chat facility of currently available open [Ollama (2024), HuggingFace (Vaswani et al. 2017)] and commercial or closed-weighted models [OpenAI models GPT3.5 (Ye et al. 2023) and GPT 4 (Achiam et al. 2023), Anthropic model Claude 3.5 (Anthropic 2024)], as well as two different architectures (parallel and multi-step) for extracting the necessary information to populate a valid Beacon query (Supplementary Section S3). Specifically, we tested the individual components of a successful query, such as scope extraction, granularity extraction, variants extraction, and filter extraction as well as query validity (Supplementary Section S4).
3 Results
3.1 AskBeacon joins convenience of LLM with security of Beacon
For the research question “I want to study how sex differences affect individuals with disease,” AskBeacon guides the user with prompts until all information for a successful Beacon query is presented (Fig. 1.1). AskBeacon does the heavy lifting of obtaining ontology terms, interpreting genomic locations, inferring the query structure, and confirming all of this with the user (Fig. 1.2). The actual query is executed by Beacon’s software development kit (SDK, Supplementary Section S6), guaranteeing deterministic results. In our example, the data extraction obtains Individuals with mutations in rs28602900 (RPL10) and rs356181 (SNCA) from the PPMI (Chahine et al. 2018) (http://www.ppmi-info.org see acknowledgement). For beacon instance that are authorized to provide numeric return values, AskBeacon also supports data analytics with prompts such as “Plot a pie chart for karyotypic sex” (Fig. 1.4). It automatically generates the required analysis code and displays it for the user to confirm or adjust (Fig. 1.5). In our case, Fig. 1.6 shows that the X-linked marker (RPL10) has no sex difference in PPMI, while the autosomal marker on chromosome 4 occurs 1.4 times more often in males with Parkinson’s than females.
3.2 Modular use of LLM to keep pace with innovation
The parallel extraction approach was more resilient to network failures, incorrect inputs, partial inputs, and malformed LLM, however, it consumes more tokens compared to multi-step extractor. The multi-step approach can instead pick the suitable next chain depending on previous input or terminate chains early resulting in lower token consumption. However, failures at any point can terminate the query without extracting any information (Supplementary Section S4). Parallel workflows better scale to future throughputs with the elevated token consumption offset by an expected decrease in cost.
Overall, Gemma 2 was the most suitable open model for both parallel and multistep workflows (Supplementary Tables S4.1 and S4.2) with average F1 scores of 0.92 and 0.81 for scope, granularity, variant, and filter extraction whereas GPT models were the most suitable commercial models with F1 scores of 0.91 and 0.81, respectively. Gemma 2’s superior performance in parallel workflow may be due to its training on knowledge distillation rather than the next token prediction of GPT-4 (Riviere et al. 2024). Commercial models overall had greater performance than open-weight models, probably due to the larger model size and up-to-date knowledge of bioinformatics and genome beacons. We have also presented LLM specific analyses of performance in Supplementary Section S5.
3.3 Context management for building progressively complex interactions across multiple topics
AskBeacon manages chat histories to enable users to build on previous queries. These history objects contain the summary of a conversation, specifically the variants, filters, chosen scope, and granularity. This enables AskBeacon to request additional information to generate a Beacon compliant query. As shown in Fig. 1.3 the sBeacon User Interface (UI) allows the addition of multiple tabs so different queries can be maintained in parallel, enabling concurrent but independent querying.
3.4 AskBeacon keeps the human in the loop
AskBeacon keeps data extraction separate from data analysis to enable checkpointing from the human expert. All inferences are verified with the user to ensure elements such as ontology terms or genomic locations queried are aligned with the research question (Fig. 1.2). For users with programming capability the generated code for data transformation and plotting can be reviewed and amended in an editor before execution (Fig. 1.5). This optional step is further supported by also displaying both standard output and error streams aiding debugging (Supplementary Section S6).
3.5 Security measures
The analysis and exchange of sensitive data demands strict guard-rails to prevent misuse, leakage, or exploitation by malicious actors. While all LLM vendors assure data privacy and security contractually, AskBeacon augments this by keeping sBeacon as the conduit between the data and the LLM, so that data is never exposed to the LLM directly. sBeacon ensures the safety of underlying data through state-of-the-art data security, with user management adhering to GA4GH protocol guidelines. sBeacon also ensures that AskBeacon’s data extractors operate at the same level of access as the active user, hence no data can be extracted through AskBeacon that the user does not already have access to. AskBeacon’s analytics is protected by using static code analysis and sandboxing, preventing accidental exposure of source code, files, or runtime variables. We also perform static code analysis before any code execution and remove code with adversarial effect.
4 Conclusion
We built AskBeacon to leverage the GA4GH’s Beacon protocol and LLMs to enable genomic data analysis using natural language across the global Beacon network. In the future, we will extend the conversation capability to combine data extraction and analytics task, while retaining our extensive human checkpointing. This will enable users to ask comparative questions between cohorts, with AskBeacon automatically choosing the best statistical and visualization method. Given that Beacon is a discovery tool, differences in return values need to be considered, e.g. while some beacons return individual genotypes others might return summary statistics that are incompatible with each other. This also extends to differences in ontologies. We will leverage efforts from the terminology community to translate between different dictionaries and develop principles for beacons to communicate across different abstraction levels.
Supplementary Material
btaf079_Supplementary_Data
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Achiam J , Adler S, Agarwal S et al; Open AI. GPT-4 Technical Report. 2023. https://arxiv.org/abs/2303.08774
- 2Angelini G , Malvaso A, Schirripa A et al Unraveling sex differences in Parkinson’s disease through explainable machine learning. J Neurol Sci 2024;462:123091. 10.1016/j.jns.2024.12309138870732 · doi ↗ · pubmed ↗
- 3Anthropic AI. The Claude 3 model family: Opus, sonnet, haiku. Claude-3 Model Card 2024;1. https://www.anthropic.com/news/claude-3-5-sonnet, https://www-cdn.anthropic.com/de 8ba 9b 01c 9ab 7cbabf 5c 33b 80b 7bbc 618857627/Model_Card_Claude_3.pdf
- 4Chahine LM , Urbe L, Caspell-Garcia C et al; Parkinson’s Progression Markers Initiative. Cognition among individuals along a spectrum of increased risk for Parkinson’s disease. P Lo S ONE 2018;13:e 0201964. 10.1371/journal.pone.020196430125297 PMC 6101368 · doi ↗ · pubmed ↗
- 5Fiume M , Cupak M, Keenan S et al Federated discovery and sharing of genomic data using Beacons. Nat Biotechnol 2019;37:220–4. 10.1038/s 41587-019-0046-x 30833764 PMC 6728157 · doi ↗ · pubmed ↗
- 6Fromont LA , Moldes M, Baudis M et al Twelve quick tips for deploying a Beacon. P Lo S Comput Biol 2024;20:e 1011817. 10.1371/journal.pcbi.101181738427629 PMC 10906850 · doi ↗ · pubmed ↗
- 7Klein C , König IR. Exploring uncharted territory: genetically determined sex differences in Parkinson’s disease. Ann Neurol 2021;90:15–8. 10.1002/ana.2609133938006 · doi ↗ · pubmed ↗
- 8Le Guen Y , Napolioni V, Belloy ME et al Common X‐chromosome variants are associated with Parkinson disease risk. Ann Neurol 2021;90:22–34. 10.1002/ana.2605133583074 PMC 8601399 · doi ↗ · pubmed ↗
