# Helping authors produce FAIR taxonomic data: evaluation of an author-driven phenotype data production prototype

**Authors:** Limin Zhang, Julian Starr, Bruce Ford, Anton Reznicek, Yuxuan Zhou, Étienne Léveillé-Bourret, Étienne Lacroix-Carignan, Jacques Cayouette, Tyler W Smith, Donald Sutherland, Paul Catling, Jeffery M Saarela, Hong Cui, James Macklin

PMC · DOI: 10.1093/database/baae097 · 2025-01-29

## TL;DR

This paper introduces a prototype system to help biologists create standardized, FAIR-compliant phenotype data using ontologies and evaluates its effectiveness with students and experts.

## Contribution

A prototype system of ontology-enhanced tools for producing FAIR phenotype data, together with two formal user evaluations of its main interface, Character Recorder.

## Key findings

- Character Recorder was found to be quickly learnable and comparable to Excel in cognitive demand.
- Participants agreed that the data produced with Character Recorder were of higher quality than the data produced with Excel.
- Carex experts were keen to recommend the tool and to help develop it from a prototype into a comprehensive system.

## Abstract

It is well known that the use of vocabulary in phenotype treatments is often inconsistent. An earlier survey of biologists who create or use phenotypic characters revealed that this lack of standardization leads to ambiguities, frustrating both the consumers and producers of phenotypic data. Such ambiguities are challenging for biologists to resolve, and more so for Artificial Intelligence. That survey also indicated a strong interest in a new authoring workflow supported by ontologies to ensure that published phenotype data are FAIR (Findable, Accessible, Interoperable, and Reusable) and suitable for large-scale computational analyses.

In this article, we introduce a prototype software system designed for authors to produce computable phenotype data. This platform includes a web-based, ontology-enhanced editor for taxonomic characters (Character Recorder), an Ontology Backend holding standardized vocabulary (the Carex Ontology), and a mobile application for resolving ontological conflicts (Conflict Resolver). We present two formal user evaluations of Character Recorder, the main interface authors would interact with to produce FAIR data. The evaluations were conducted with undergraduate biology students and Carex experts. We compared Character Recorder with Microsoft Excel on effectiveness, efficiency, and the cognitive demand placed on users when producing computable taxon-by-character matrices.

The evaluations showed that Character Recorder is quickly learnable for both student and professional participants, with its cognitive demand comparable to Excel’s. Participants agreed that the quality of the data Character Recorder yielded was superior. Students praised Character Recorder’s educational value, while Carex experts were keen to recommend it and help evolve it from a prototype into a comprehensive tool. Feature improvements recommended by expert participants were implemented after the evaluation.
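
To make the idea of a computable taxon-by-character matrix backed by a standardized vocabulary more concrete, here is a minimal Python sketch. It is not Character Recorder's implementation. The vocabulary entries, the `normalize` function, the `Matrix` class, and the taxon names are all hypothetical, chosen only to illustrate how an ontology-enhanced editor might map an author's free-text value onto a preferred term and set aside anything it cannot resolve.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical controlled vocabulary: preferred term -> accepted synonyms.
# Illustrative only; these entries are not taken from the Carex Ontology.
VOCAB: Dict[str, Set[str]] = {
    "cespitose": {"caespitose", "tufted"},
    "rhizomatous": {"long-rhizomatous"},
}


def normalize(term: str) -> Optional[str]:
    """Return the preferred term for `term`, or None if it is unknown."""
    cleaned = term.strip().lower()
    for preferred, synonyms in VOCAB.items():
        if cleaned == preferred or cleaned in synonyms:
            return preferred
    return None


@dataclass
class Matrix:
    """A taxon-by-character matrix that only stores vocabulary terms."""

    # (taxon, character) -> preferred vocabulary term
    cells: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # Values that matched no vocabulary entry, kept for later review
    unresolved: List[Tuple[str, str, str]] = field(default_factory=list)

    def record(self, taxon: str, character: str, raw_value: str) -> bool:
        """Store a normalized value; return False if it needs review."""
        value = normalize(raw_value)
        if value is None:
            self.unresolved.append((taxon, character, raw_value))
            return False
        self.cells[(taxon, character)] = value
        return True


matrix = Matrix()
matrix.record("Taxon A", "growth form", "Caespitose")  # known synonym
matrix.record("Taxon B", "growth form", "bushy")       # not in vocabulary
```

In this sketch, spelling variants such as "caespitose" collapse to one preferred term, so two authors' matrices stay comparable, while unmatched values are queued rather than silently stored. That queue loosely mirrors the role the abstract assigns to a separate conflict-resolution step.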

## Full-text entities

- **Chemicals:** CR (MESH:D002857), caespitose (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Hypogeophis rostratus (Frigate Island caecilian, species) [taxon 8450]

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11928229/full.md

---
Source: https://tomesphere.com/paper/PMC11928229