# Automating the analysis of public saliency and attitudes toward biodiversity from digital media

**Authors:** Noah Giebink, Amrita Gupta, Diogo Veríssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett G. Dickson, Alex Bowmer, Jonathan Baillie

PMC · DOI: 10.1111/cobi.70217 · 2026-01-18

## TL;DR

This paper introduces a method to automatically analyze public attitudes toward wildlife from digital media, using natural language processing tools to filter and interpret large volumes of data.

## Contribution

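The paper introduces a two-stage relevance filter for biodiversity-related news and social media: unsupervised learning reveals common topics, and an open-source zero-shot LLM assigns topics to article titles and estimates relevance. It pairs the filter with a folk taxonomy approach to search term selection and TF-IDF cosine similarity for detecting syndicated articles.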

## Key findings

- Up to 62% of articles containing bat search terms were unrelated to bats as wildlife, underscoring the importance of relevance filtering.
- News articles mentioning horseshoe bats increased significantly in January 2020; significant sentiment shifts in news and X posts mentioning horseshoe bats emerged later, in October 2020.
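- The methods offer a practical application of general-purpose NLP tools, including LLMs, for analyzing public perceptions of biodiversity relative to current events or conservation outreach and marketing campaigns.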

## Abstract

Measuring public attitudes toward wildlife provides crucial insights into human relationships with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale presents challenges. Digital news and social media offer a rich record of public discourse, but extracting information about attitudes toward wildlife from these sources is not straightforward. Selecting effective search terms is complicated by differences between everyday names for taxa and their scientific or formal common names, and raw news and social media data are often cluttered with irrelevant content and syndicated articles. To address search term selection, we used a folk taxonomy approach that derives recognizable species groupings from shared common name endings. We identified syndicated articles by using cosine similarity on term frequency-inverse document frequency vectors. To filter out irrelevant content while minimizing the need for corpus-specific annotation and model training, we developed a 2-stage relevance filter that uses unsupervised learning to reveal common topics and an open-source zero-shot large language model (LLM) to assign topics to article titles and estimate relevance. We conducted sentiment, topic, and volume analyses on the resulting data. To illustrate our method, we examined news and X posts containing search terms for bats, pangolins, elephants, and gorillas from 2019 through 2021, a period that covers the onset of the COVID-19 pandemic. Up to 62% of articles containing bat search terms were unrelated to bats as wildlife, underscoring the importance of relevance filtering. News articles mentioning horseshoe bats, initially implicated in the outbreak, increased significantly in January 2020, with significant sentiment shifts in news and X posts mentioning horseshoe bats emerging later (October 2020). Our methods provide a practical application of modern, general-purpose natural language processing (NLP) tools, including LLMs, for analyzing public perceptions of biodiversity relative to current events or conservation outreach and marketing campaigns.
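
The syndication check mentioned in the abstract (cosine similarity on TF-IDF vectors) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tokenizer, the smoothed IDF formula, the function names (`tfidf_vectors`, `cosine`, `find_syndicated_pairs`), the toy articles, and the 0.85 similarity threshold are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a document and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant) keeps every weight positive.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_syndicated_pairs(docs, threshold=0.85):
    """Return index pairs whose TF-IDF cosine similarity meets the threshold.

    The threshold is a hypothetical default, not a value from the paper.
    """
    vectors = tfidf_vectors(docs)
    return [
        (i, j)
        for i, j in combinations(range(len(docs)), 2)
        if cosine(vectors[i], vectors[j]) >= threshold
    ]


if __name__ == "__main__":
    # Toy articles invented for this example.
    articles = [
        "Horseshoe bats roost in caves across the region, researchers report.",
        "Horseshoe bats roost in caves across the region, researchers report. Reuters",
        "The baseball team ordered new bats before the season opener.",
    ]
    print(find_syndicated_pairs(articles, threshold=0.85))
```

With these toy articles, only the first two (the original and its near-verbatim reprint) exceed the threshold. The baseball article shares the word "bats" but scores far lower, which also hints at why a separate relevance filter is needed on top of duplicate detection. Comparing all pairs is quadratic in the number of documents, so a corpus-scale version would likely need blocking or approximate nearest-neighbor search.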

## Linked entities

- **Diseases:** COVID-19 (MONDO:0100096)

## Full-text entities

- **Diseases:** COVID-19 (MESH:D000086382)
- **Species:** Bacillus sp. AT (species) [taxon 1196779], Chiroptera (bats, order) [taxon 9397], Rhinolophus (genus) [taxon 49442], Homo sapiens (human, species) [taxon 9606]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13036311/full.md

---
Source: https://tomesphere.com/paper/PMC13036311