biorecap: an R package for summarizing bioRxiv preprints with a local LLM
Stephen D. Turner

TL;DR
biorecap is an R package that enables researchers to efficiently retrieve and summarize bioRxiv preprints using local large language models, helping manage information overload in life sciences research.
Contribution
The paper introduces an R package that pairs bioRxiv preprint retrieval with summarization by a local LLM, emphasizing security, flexibility, and user-friendly design.
Findings
Allows local LLM-based summarization of bioRxiv preprints
Generates timestamped CSV and HTML reports for recent preprints
Helps researchers stay current with new life sciences preprints as part of their regular workflow
Abstract
The establishment of bioRxiv facilitated the rapid adoption of preprints in the life sciences, accelerating the dissemination of new research findings. However, the sheer volume of preprints published daily can be overwhelming, making it challenging for researchers to stay updated on the latest developments. Here, I introduce biorecap, an R package that retrieves and summarizes bioRxiv preprints using a large language model (LLM) running locally on nearly any commodity laptop. biorecap leverages the ollamar package to interface with the Ollama server and API endpoints, allowing users to prompt any local LLM available through Ollama. The package follows tidyverse conventions, enabling users to pipe the output of one function as input to another. Additionally, biorecap provides a single wrapper function that generates a timestamped CSV file and HTML report containing short summaries of…
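The workflow the abstract describes (build a prompt for each preprint, send it to a locally running Ollama server, then write the summaries to a timestamped CSV) can be sketched in a few lines. This is a minimal illustration in Python, not biorecap's R API: the function names, the prompt wording, the output file name, and the default model "llama3.2" are all assumptions made for the example. The only external interface it relies on is Ollama's local REST endpoint, /api/generate, on the default port 11434.

```python
import csv
import json
import urllib.request
from datetime import datetime
from pathlib import Path

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(title, abstract, sentences=2):
    """Build a summarization prompt (wording is illustrative, not biorecap's)."""
    return (
        f"Summarize the following preprint in {sentences} sentences.\n\n"
        f"Title: {title}\nAbstract: {abstract}"
    )


def ollama_generate(prompt, model="llama3.2"):
    """Send one prompt to the local Ollama server and return the reply text.

    The model name is an assumption; use any model already pulled locally.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()


def summarize_preprints(preprints, generate=ollama_generate):
    """Return a copy of each preprint dict with a 'summary' field added."""
    return [
        {**p, "summary": generate(build_prompt(p["title"], p["abstract"]))}
        for p in preprints
    ]


def write_report_csv(rows, output_dir=".", now=None):
    """Write rows to a timestamped CSV file and return its path."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H%M%S")
    path = Path(output_dir) / f"preprint-summaries-{stamp}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=["title", "abstract", "summary"],
            extrasaction="ignore",  # skip any extra fields on the input dicts
        )
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Because the summarizer is passed in as the generate argument, the chain can be exercised without a running server by supplying a stub function. Each step takes the previous step's output as input, which loosely mirrors the piping style the abstract attributes to the package.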
Taxonomy
Topics: Academic Publishing and Open Access · Biomedical Text Mining and Ontologies · Research Data Management Practices
