# HERMES: an open-source mining tool for open-access literature

**Authors:** Julien Charest, Katarina Priselac, Georg H Reischer, Andreas H Farnleitner, Robert L Mach, Astrid R Mach-Aigner

PMC · DOI: 10.1093/bioadv/vbag058 · Bioinformatics Advances · 2026-02-17

## TL;DR

HERMES is an open-source tool that retrieves full-text open-access papers from PubMed Central (PMC) and ranks them with a user-configurable composite score based on keyword frequency, citation counts, and publication age.

## Contribution

HERMES introduces a customizable, reproducible framework for mining open-access literature, combining a composite scoring algorithm with a graphical user interface.

## Key findings

- HERMES integrates keyword frequency, citation counts, and publication age to rank relevant papers.
- The tool supports biomedical entity recognition and generates PDF reports for efficient literature analysis.
- HERMES offers a GUI for non-programmers and multithreaded processing for large-scale queries.
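
The multithreaded processing mentioned in the last finding can be illustrated with a minimal sketch using Python's standard `concurrent.futures` module. This is not taken from the HERMES source code: the function names `count_keyword` and `count_in_parallel` and the `max_workers` default are hypothetical, and the per-document work here is a trivial in-memory keyword count standing in for what would more plausibly be an I/O-bound step such as downloading and parsing one article.

```python
from concurrent.futures import ThreadPoolExecutor


def count_keyword(document: str, keyword: str) -> int:
    """Count case-insensitive substring occurrences of one keyword."""
    return document.lower().count(keyword.lower())


def count_in_parallel(documents: list[str], keyword: str, max_workers: int = 4) -> list[int]:
    """Process documents on a thread pool (hypothetical helper).

    pool.map returns results in the same order as the input documents,
    regardless of which worker finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda doc: count_keyword(doc, keyword), documents))


if __name__ == "__main__":
    docs = ["Xyr1 regulates xyr1 expression", "no match here"]
    print(count_in_parallel(docs, "xyr1"))
```

In CPython, threads mainly help when the per-document work waits on the network or disk, which is the usual situation when fetching many articles.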

## Abstract

The exponential growth of open-access scientific literature presents researchers with unprecedented opportunities but also poses a significant challenge: how to efficiently identify and prioritize relevant publications in a transparent and customizable manner. Existing search engines index large volumes of biomedical literature but rarely provide user-defined ranking options, reproducibility, or integration of domain-specific criteria. This gap is particularly limiting for specialized fields, where nuanced keyword combinations, literature recency, and contextual interpretation are critical.

We present HERMES, an open-source literature mining tool for targeted retrieval and ranking of full-text open-access publications from PubMed Central (PMC). HERMES employs a composite scoring algorithm that integrates keyword frequency, citation counts, and publication age to prioritize publications. It further supports summarization, biomedical entity recognition, and PDF report generation. An intuitive graphical user interface (GUI) allows researchers without programming expertise to perform complex literature mining tasks, while multithreaded processing ensures efficiency for large-scale queries. HERMES provides a reproducible and adaptable framework for literature discovery, empowering researchers to rapidly identify relevant literature and promoting transparency and community-driven extension.
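
To make the idea of a composite score concrete, the sketch below combines the three signals named above (keyword frequency, citation counts, and publication age) into a single ranking value. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the `Paper` fields, the log damping of counts, the `1 / (1 + age)` recency term, and the weights `w_kw`, `w_cit`, and `w_age` are hypothetical and may differ from what HERMES implements.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record; not the data model used by HERMES.
    title: str
    text: str
    citations: int
    year: int


def keyword_frequency(text: str, keywords: list[str]) -> int:
    """Count case-insensitive whole-word occurrences of all keywords."""
    total = 0
    for kw in keywords:
        pattern = r"\b" + re.escape(kw) + r"\b"
        total += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return total


def composite_score(paper: Paper, keywords: list[str], current_year: int,
                    w_kw: float = 1.0, w_cit: float = 1.0, w_age: float = 1.0) -> float:
    """Weighted sum of three terms (assumed form, for illustration only)."""
    kw_term = math.log1p(keyword_frequency(paper.text, keywords))  # damp large counts
    cit_term = math.log1p(paper.citations)                         # damp large counts
    age = max(current_year - paper.year, 0)
    recency_term = 1.0 / (1.0 + age)                               # newer papers score higher
    return w_kw * kw_term + w_cit * cit_term + w_age * recency_term


def rank(papers: list[Paper], keywords: list[str], current_year: int, **weights: float) -> list[Paper]:
    """Return papers sorted from highest to lowest composite score."""
    return sorted(papers,
                  key=lambda p: composite_score(p, keywords, current_year, **weights),
                  reverse=True)


if __name__ == "__main__":
    papers = [
        Paper("Older study", "cellulase cellulase", citations=0, year=2000),
        Paper("Recent study", "cellulase cellulase", citations=0, year=2025),
    ]
    for p in rank(papers, ["cellulase"], current_year=2026):
        print(p.title)
```

Exposing the weights as parameters mirrors the customizable ranking described in the abstract: setting `w_age=0`, for example, would rank purely on keyword matches and citations in this sketch.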

HERMES (version 1.2) is implemented in Python (3.11). The source code is freely available on GitHub at https://github.com/julien-charest/hermes and is distributed under the GPL-3 license.

## Full-text entities

- **Genes:** che-1 (C2H2-type domain-containing protein;Transcription factor che-1) [NCBI Gene 183847], lsy-6 (ncRNA) [NCBI Gene 3565492]
- **Diseases:** toxic (MESH:D064420), PMC (MESH:D020210)
- **Species:** Trichoderma reesei (species) [taxon 51453], Caenorhabditis elegans (species) [taxon 6239], C. elegans [taxon 328850]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12952204/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12952204/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/PMC12952204/full.md

---
Source: https://tomesphere.com/paper/PMC12952204