# LinkML: an open data modeling framework

**Authors:** Sierra A T Moxon, Harold Solbrig, Nomi L Harris, Patrick Kalita, Mark A Miller, Sujay Patil, Kevin Schaper, Chris Bizon, J Harry Caufield, Silvano Cirujano Cuesta, Corey Cox, Frank Dekervel, Damion M Dooley, William D Duncan, Tim Fliss, Sarah Gehrke, Adam S L Graefe, Harshad Hegde, A J Ireland, Julius O B Jacobsen, Madan Krishnamurthy, Carlo Kroll, David Linke, Ryan Ly, Nicolas Matentzoglu, James A Overton, Jonny L Saunders, Deepak R Unni, Gaurav Vaidya, Wouter-Michiel A M Vierdag, Richard M Bruskiewich, Seth Carbon, Eric Cavanna, John-Marc Chandonia, Shreyas Cholia, Ben Dichter, Emiley A Eloe-Fadrosh, Vincent Emonet, Shahim Essaid, James A Fellows Yates, Joseph Flack, Satrajit S Ghosh, Damien Goutte-Gattat, Dorota Jarecka, Dazhi Jiao, Marcin P Joachimiak, Vlad Korolev, Volodymyr Lapkin, Noel McLoughlin, Sierra D Miller, Michael Milton, Josh Moore, Moni Munoz-Torres, B Nolan Nichols, Justin T Reese, Victoria Savage, Philip Stroemert, Jeremy Teoh, Anne Thessen, Isaac To, Puja Trivedi, Vincent Vialard, Trish Whetzel, Oliver Ruebel, Christopher G Chute, Matthew H Brush, Melissa A Haendel, Christopher J Mungall

PMC · DOI: 10.1093/gigascience/giaf152 · GigaScience · 2025-12-12

## TL;DR

LinkML is an open framework that helps standardize and share scientific data, making it easier to integrate and reuse across different fields.

## Contribution

The paper presents LinkML, a flexible and accessible data modeling language that supports compliance with FAIR data standards and promotes interoperability.

## Key findings

- LinkML enables the creation of standardized data models that can be shared and reused across disciplines.
- The framework supports complex data structures and integrates with existing systems, reducing data heterogeneity.
- LinkML has been adopted in diverse fields like biology, biomedicine, and engineering to standardize data at the source.

## Abstract

Scientific research relies on well-structured, standardized data; however, much of it is stored in formats such as free-text lab notebooks, nonstandardized spreadsheets, or data repositories. This lack of structure challenges interoperability, making data integration, validation, and reuse difficult.

LinkML (Linked Data Modeling Language) is an open framework that simplifies the process of authoring, validating, and sharing data. LinkML can describe a range of data structures, from flat, list-based models to complex, interrelated, and normalized models that utilize polymorphism and compound inheritance. It offers an approachable syntax that is not tied to any one technical architecture and can be integrated seamlessly with many existing frameworks. The LinkML syntax provides a standard way to describe schemas, classes, and relationships, allowing modelers to build well-defined, stable, and optionally ontology-aligned data structures. Once defined, LinkML schemas may be imported into other LinkML schemas. These key features make LinkML an accessible platform for interdisciplinary collaboration and a reliable way to define and share data semantics.
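To make the idea of classes with compound inheritance concrete, here is a minimal sketch in plain Python. It is not the LinkML toolchain: the schema is written as a dict rather than in LinkML's YAML syntax, the class and slot names (`NamedThing`, `HasContact`, `Person`, `email`, `age`) are hypothetical, and the `induced_slots` helper with its resolution order (parent, then mixins, then the class itself) is a simplifying assumption made for illustration. Only the keys `is_a`, `mixins`, `mixin`, `abstract`, and `slots` are borrowed from LinkML's vocabulary.

```python
# Toy illustration only: a LinkML-style schema expressed as a Python dict.
# Real LinkML schemas are authored in YAML and handled by the LinkML tools.
SCHEMA = {
    "classes": {
        # Hypothetical abstract base class carrying shared slots.
        "NamedThing": {"abstract": True, "slots": ["id", "name"]},
        # Hypothetical mixin that contributes one extra slot.
        "HasContact": {"mixin": True, "slots": ["email"]},
        # Person inherits from one parent (is_a) and one mixin.
        "Person": {
            "is_a": "NamedThing",
            "mixins": ["HasContact"],
            "slots": ["age"],
        },
    },
}


def induced_slots(schema, class_name):
    """Collect slots from the is_a parent, then mixins, then the class itself."""
    cls = schema["classes"][class_name]
    parents = ([cls["is_a"]] if "is_a" in cls else []) + cls.get("mixins", [])
    collected = []
    for parent in parents:
        for slot in induced_slots(schema, parent):
            if slot not in collected:
                collected.append(slot)
    for slot in cls.get("slots", []):
        if slot not in collected:
            collected.append(slot)
    return collected


print(induced_slots(SCHEMA, "Person"))  # ['id', 'name', 'email', 'age']
```

The point of the sketch is that a class such as `Person` declares only what is specific to it, while the shared slots are derived from the classes it builds on.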

LinkML helps reduce heterogeneity, complexity, and the proliferation of single-use data models while simultaneously enabling compliance with FAIR (Findable, Accessible, Interoperable, and Reusable) data standards. LinkML has seen increasing adoption in various fields, including biology, chemistry, biomedicine, microbiome research, finance, electrical engineering, transportation, and commercial software development. In short, LinkML makes implicit models explicitly computable and allows data to be standardized at their origin. LinkML documentation and code are available at https://linkml.io/.
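As a rough picture of what "explicitly computable" can mean in practice, the sketch below checks records against a small schema at the point where they are created. It is a stand-in written in plain Python, not LinkML's own validator: the `Sample` class, its slots, the `SampleType` enum, and the `validate` function are all hypothetical, and only the keys `range`, `required`, and `permissible_values` are taken from LinkML's vocabulary.

```python
# Toy illustration only: a LinkML-style schema expressed as a Python dict.
SCHEMA = {
    "enums": {
        # Hypothetical enum restricting a slot to a fixed set of values.
        "SampleType": {"permissible_values": {"soil": {}, "water": {}}},
    },
    "slots": {
        "id": {"range": "string", "required": True},
        "depth_m": {"range": "float"},
        "sample_type": {"range": "SampleType"},
    },
    "classes": {"Sample": {"slots": ["id", "depth_m", "sample_type"]}},
}

# Assumed mapping from primitive range names to Python types.
PRIMITIVES = {"string": str, "integer": int, "float": (int, float)}


def validate(schema, class_name, record):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    allowed = schema["classes"][class_name]["slots"]
    for key in record:
        if key not in allowed:
            problems.append(f"unknown slot: {key}")
    for slot_name in allowed:
        slot = schema["slots"][slot_name]
        value = record.get(slot_name)
        if value is None:
            if slot.get("required"):
                problems.append(f"missing required slot: {slot_name}")
            continue
        rng = slot.get("range", "string")
        if rng in schema["enums"]:
            if value not in schema["enums"][rng]["permissible_values"]:
                problems.append(f"{slot_name}: {value!r} is not permitted by {rng}")
        elif not isinstance(value, PRIMITIVES[rng]):
            problems.append(f"{slot_name}: expected {rng}, got {type(value).__name__}")
    return problems


good = {"id": "S1", "depth_m": 0.5, "sample_type": "soil"}
bad = {"depth_m": "deep", "sample_type": "rock"}
print(validate(SCHEMA, "Sample", good))  # []
for problem in validate(SCHEMA, "Sample", bad):
    print(problem)
```

Because the constraints live in the schema rather than in ad hoc scripts, the same definition can drive checks wherever records are produced, which is the sense in which data are standardized at their origin.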

## Full-text entities

- **Diseases:** rare diseases (MESH:D035583)
- **Chemicals:** Carbon (MESH:D002244)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12993438/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12993438/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC12993438/full.md

---
Source: https://tomesphere.com/paper/PMC12993438