# Review of autism spectrum disorder databases for the identification of candidate genes

**Authors:** Diana Martínez-Minguet, René Noel, Alberto García S., Mireia Costa, Oscar Pastor

PMC · DOI: 10.1093/database/baaf067 · Database: The Journal of Biological Databases and Curation · 2025-10-15

## TL;DR

This paper reviews 13 specialized ASD genetic databases, evaluates four of them in depth (AutDB, SFARI Gene, GeisingerDBD, and SysNDD), and finds that they agree on only 1.5% of their high-confidence ASD candidate gene classifications.

## Contribution

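The study combines a Systematic Mapping Study with a two-stage Data Quality Approach to evaluate the quality and reliability of ASD genetic databases: databases are first screened on Accessibility, Currency, and Relevance, and the selected ones are then assessed for Completeness (at schema and data level) and Consistency.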

## Key findings

- SFARI Gene showed the highest schema-level completeness (89%), while AutDB showed the highest data-level completeness (90%).
- Only 1.5% consistency was observed in high-confidence ASD candidate gene classifications across the four selected databases (AutDB, SFARI Gene, GeisingerDBD, and SysNDD).
- Differences in scoring criteria and evidence sources drive inconsistencies in gene classification.
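
To make the consistency figure concrete, here is a minimal sketch of one way such an overlap could be computed. It assumes consistency means the share of genes rated high-confidence by every database relative to the union of all high-confidence genes; the paper's exact definition may differ. The function name `cross_database_consistency`, the database labels, and the gene symbols are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Set


def cross_database_consistency(gene_sets: Dict[str, Set[str]]) -> float:
    """Fraction of genes present in every set, relative to the union of all sets.

    Assumption: this intersection-over-union ratio is only one plausible
    reading of "consistency"; it is not taken from the paper.
    """
    if not gene_sets:
        raise ValueError("need at least one gene set")
    sets = list(gene_sets.values())
    shared = set.intersection(*sets)  # genes every database agrees on
    union = set.union(*sets)          # genes listed by at least one database
    return len(shared) / len(union) if union else 0.0


# Toy input with made-up gene symbols (hypothetical, for illustration only).
toy_sets = {
    "db_a": {"GENE1", "GENE2", "GENE3"},
    "db_b": {"GENE1", "GENE3", "GENE4"},
    "db_c": {"GENE1", "GENE5"},
    "db_d": {"GENE1", "GENE2", "GENE6"},
}

toy_consistency = cross_database_consistency(toy_sets)
print(f"{toy_consistency:.1%}")
```

In the toy input only one of six distinct genes appears in all four sets, which shows how quickly the ratio drops as each database adds genes the others do not list.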

## Abstract

Research into the genetics of autism spectrum disorder (ASD) seeks to unravel its complex genetic background by identifying genes associated with the condition at varying levels of confidence. While these findings hold significant potential for clinical applications, the dispersed nature of scientific evidence presents a challenge for the reliable identification of ASD candidate genes. Although ASD candidate genes are gathered in genetic databases, these vary widely in the gene sets, biological information, and confidence level classification methods, leading to inconsistencies and complicating research efforts. This study aims to identify and assess the quality and reliability of ASD genetic databases to support more robust identification of ASD candidate genes. Using a Systematic Mapping Study, we identified 13 specialized databases. We then followed a Data Quality Approach in two stages, first assessing Accessibility, Currency, and Relevance dimensions to select the potentially relevant databases to be used as ASD candidate gene sources. The selected databases were then analysed by assessing Completeness (at schema and data level) and Consistency between high-confidence ASD genes. The four selected databases are: AutDB, SFARI Gene, GeisingerDBD, and SysNDD. SFARI Gene demonstrated the highest completeness at schema level (89%), while AutDB showed the highest completeness at data level (90%). However, only 1.5% consistency was observed across the four databases in their classification of high-confidence ASD candidate genes. Our findings highlight the unique contributions of each database and reveal substantial inconsistencies in gene classification, driven by differences in scoring criteria and the scientific evidence considered. These inconsistencies have important implications for both clinical users and researchers, as conclusions may vary depending on the database used. This study supports researchers when using ASD genetic databases, promoting consistent interpretation and improved clinical decisions.
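
The two completeness levels mentioned in the abstract can be illustrated with a short sketch. It assumes that schema-level completeness is the share of a reference attribute list covered by a database's schema, and that data-level completeness is the share of non-empty values across the attributes the database does expose. Both definitions, the function names, and the field names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence


def schema_completeness(schema_fields: Sequence[str],
                        reference_fields: Sequence[str]) -> float:
    """Share of reference attributes that the database schema covers (assumed definition)."""
    reference = set(reference_fields)
    if not reference:
        raise ValueError("reference_fields must not be empty")
    return len(reference & set(schema_fields)) / len(reference)


def data_completeness(records: List[Dict[str, Optional[str]]],
                      fields: Sequence[str]) -> float:
    """Share of non-empty cells over all records and the given fields (assumed definition)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in fields
        if record.get(field) not in (None, "")  # missing, None, and "" count as empty
    )
    return filled / total


# Hypothetical reference attributes and a toy database with two records.
reference = ["gene_symbol", "chromosome", "score", "evidence"]
schema = ["gene_symbol", "score", "evidence"]
records = [
    {"gene_symbol": "GENE1", "score": "1", "evidence": "PMID placeholder"},
    {"gene_symbol": "GENE2", "score": "", "evidence": None},
]

schema_level = schema_completeness(schema, reference)
data_level = data_completeness(records, schema)
print(f"schema: {schema_level:.0%}, data: {data_level:.0%}")
```

Keeping the two measures separate reflects the distinction in the abstract: a database can define many attributes yet leave them sparsely populated, or define few attributes and fill them thoroughly.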

## Linked entities

- **Diseases:** autism spectrum disorder (MONDO:0005258), ASD (MONDO:0006664)

## Full-text entities

- **Diseases:** ASD (MESH:D000067877)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12527254/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12527254/full.md

## References

73 references — full list in the complete paper: https://tomesphere.com/paper/PMC12527254/full.md

---
Source: https://tomesphere.com/paper/PMC12527254