# Sometimes the apple does fall far from the tree: a case study on automatic indexing precision errors in PubMed

**Authors:** Paije Wilson

PMC · DOI: 10.5195/jmla.2025.2110 · Journal of the Medical Library Association : JMLA · 2025-10-23

## TL;DR

This case study measures how often PubMed's automatic indexing wrongly assigns the MeSH term Malus (the genus of apple trees) to MEDLINE records that are not about literal apples or apple trees.

## Contribution

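The study quantifies precision errors in automatic MeSH indexing for a single term (Malus) and classifies them by cause, showing that they stem from non-literal or alternative uses of the word "apple" in titles and abstracts.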

## Key findings

- 7.9% of automatically indexed MEDLINE records with the MeSH term Malus were incorrectly indexed.
- Most errors occurred when "apple" was used in a simile, metaphor, or idiom (80 records, 59.2%) or appeared in a name or term (50 records, 37%).
- The remaining errors arose from "apple" in acronyms and, in one case, from a reference to Sir Isaac Newton.

## Abstract

This case study identifies the presence and prevalence of precision indexing errors in a subset of automatically indexed MEDLINE records in PubMed (specifically, all MEDLINE records automatically indexed with the MeSH term Malus, the genus name for apple trees). In short, how well does automatic indexing compare [figurative] apples to [literal] apples?

1,705 MEDLINE records automatically indexed with the MeSH term Malus underwent title/abstract and full text screening to determine whether they were correctly indexed (i.e., the records were about Malus, meaning they discussed the literal fruit or tree) or incorrectly indexed (i.e., they were not about Malus, meaning they did not discuss the literal fruit or tree). The context and type of indexing error were documented for each erroneously indexed record.

135 (7.9%) records were incorrectly indexed with the MeSH term Malus. The most common indexing error was due to the word “apple” being used in similes, metaphors, and idioms (80, or 59.2%), with the next most common error being due to “apple” being present in a name or term (50, or 37%). Additional indexing errors were attributed to the use of “apple” in acronyms, and, in one case, a reference to Sir Isaac Newton.
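The proportions above can be rechecked from the reported counts. The following is a minimal sketch, not the author's analysis code; the variable names are illustrative, and the `other` count is derived by subtraction on the assumption that the error categories are mutually exclusive (the abstract does not state that figure directly).

```python
# Counts as reported in the abstract.
total_records = 1705   # records automatically indexed with Malus
incorrect = 135        # records judged incorrectly indexed

error_counts = {
    "simile, metaphor, or idiom": 80,
    "name or term": 50,
}

# Assumption: categories do not overlap, so the remainder covers
# acronyms plus the single Sir Isaac Newton reference.
other = incorrect - sum(error_counts.values())


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100 * part / whole


error_rate = pct(incorrect, total_records)   # share of records indexed in error
precision = 100 - error_rate                 # share of records indexed correctly

print(f"error rate: {error_rate:.2f}%  precision: {precision:.2f}%")
for label, count in error_counts.items():
    print(f"{label}: {count} ({pct(count, incorrect):.2f}% of errors)")
print(f"other: {other} ({pct(other, incorrect):.2f}% of errors)")
```

Computed this way, 80 of 135 is about 59.26%, which the abstract reports as 59.2%; the other figures match the reported values after rounding to one decimal place.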

As indicated by this study's findings, automatic indexing can commit errors when indexing records that have words with non-literal or alternative meanings in their titles or abstracts. Librarians should be mindful of the existence of automatic indexing errors, and instruct authors on how best to ameliorate the effects of them within their own manuscripts.
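To make the failure mode concrete, here is a deliberately naive toy tagger that assigns Malus whenever the word "apple" appears. It is an illustration only and does not represent the algorithm used for MEDLINE automatic indexing; the function name and the three example titles are invented for this sketch and are not records from the study.

```python
import re


def naive_malus_tagger(text: str) -> bool:
    """Toy rule: tag as Malus if 'apple' or 'apples' appears as a whole word."""
    return re.search(r"\bapples?\b", text, flags=re.IGNORECASE) is not None


# Hypothetical titles paired with whether they are really about the fruit or tree.
examples = [
    ("Fire blight resistance in apple orchards", True),                    # literal
    ("Comparing apples to oranges: a review of outcome measures", False),  # idiom
    ("Adam's apple reduction surgery outcomes", False),                    # name/term
]

tagged = [(title, truth) for title, truth in examples if naive_malus_tagger(title)]
false_positives = [title for title, truth in tagged if not truth]
toy_precision = sum(truth for _, truth in tagged) / len(tagged)

print(f"tagged {len(tagged)} titles, {len(false_positives)} false positives")
```

All three titles are tagged, but only the first is about Malus, so the idiom and the anatomical term become precision errors of the kinds the study describes.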

## Linked entities

- **Species:** Malus (taxon 3749)

## Full-text entities

- **Species:** Malus domestica (apple, species) [taxon 3750]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12606386/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12606386/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/PMC12606386/full.md

---
Source: https://tomesphere.com/paper/PMC12606386