LinkML: Bring structure to your data.

An open framework that simplifies authoring, validating, and sharing data — providing an accessible platform for interdisciplinary collaboration and a reliable way to define and share data semantics.


Model your data easily by authoring YAML files

Use the LinkML modeling language to author schemas and data dictionaries. LinkML supports flexible inheritance, semantic enumerations, control of JSON inlining, pattern-based validation, and logical constraints.
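As a rough illustration of the features listed above, the sketch below builds a small schema as a Python dictionary and serializes it as JSON, which is for practical purposes also valid YAML. The keys (`is_a`, `attributes`, `pattern`, `inlined`, `permissible_values`, `meaning`) are LinkML metamodel terms. The schema itself (Person, NamedThing, FamilialRole, the example.org prefixes) is hypothetical and only meant to show the shape of a schema file.

```python
import json

# Hypothetical schema for illustration only: the class, enum, and prefix
# names below are assumptions, not taken from a real LinkML project.
schema = {
    "id": "https://example.org/personinfo",
    "name": "personinfo",
    "prefixes": {
        "linkml": "https://w3id.org/linkml/",
        "ex": "https://example.org/personinfo/",
        "famrel": "https://example.org/famrel/",
    },
    "default_prefix": "ex",
    "default_range": "string",
    "imports": ["linkml:types"],
    "classes": {
        "NamedThing": {
            "abstract": True,
            "attributes": {
                # pattern-based validation on the identifier
                "id": {"identifier": True, "pattern": "^P\\d+$"},
                "name": {"required": True},
            },
        },
        "Person": {
            # inheritance: Person gets id and name from NamedThing
            "is_a": "NamedThing",
            "attributes": {
                "age": {"range": "integer", "minimum_value": 0},
                "role": {"range": "FamilialRole"},
                # control of JSON inlining: friends are referenced by id
                "friends": {"range": "Person", "multivalued": True, "inlined": False},
            },
        },
    },
    "enums": {
        "FamilialRole": {
            # semantic enumeration: each value can map to an ontology term
            "permissible_values": {
                "SIBLING_OF": {"meaning": "famrel:01"},
                "PARENT_OF": {"meaning": "famrel:02"},
            }
        }
    },
}

# JSON text is accepted by YAML parsers, so this can be saved as a .yaml file.
schema_text = json.dumps(schema, indent=2)
print(schema_text)
```

In practice you would write the same structure directly in YAML; the dictionary form is used here only so the example runs with the Python standard library.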

The LinkML metamodel is itself written in LinkML — a mark of the language's expressive power and consistency.

Non-developers can author schemas using Schemasheets — define your data model directly in Excel or Google Sheets, no YAML required.


FAIR data compliance by default

In LinkML, every class, slot, and enumeration automatically gets a globally unique URI or IRI. This means your schemas satisfy FAIR principles — Findable, Accessible, Interoperable, and Reusable — from day one, without extra tooling.
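To make the URI assignment concrete, here is a simplified sketch of how an element name can be turned into a full URI from a schema's prefix map and default prefix. The function names and the example prefixes are assumptions made for illustration. LinkML's own resolution logic covers more cases; the `explicit_uri` parameter only loosely mimics an override such as `class_uri` or `slot_uri`.

```python
from typing import Optional

# Hypothetical prefix map, as it might appear in a schema's `prefixes` block.
PREFIXES = {
    "ex": "https://example.org/personinfo/",
    "schema": "http://schema.org/",
}


def expand_curie(curie: str, prefixes: dict[str, str]) -> str:
    """Expand a compact URI such as 'schema:Person' using a prefix map."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}: unknown or missing prefix")
    return prefixes[prefix] + local


def element_uri(
    name: str,
    default_prefix: str,
    prefixes: dict[str, str],
    explicit_uri: Optional[str] = None,
) -> str:
    """Return a URI for a class, slot, or enum.

    If an explicit CURIE is supplied it wins; otherwise the element name is
    placed in the schema's default prefix. This is a simplification.
    """
    curie = explicit_uri if explicit_uri else f"{default_prefix}:{name}"
    return expand_curie(curie, prefixes)


print(element_uri("age", "ex", PREFIXES))
print(element_uri("Person", "ex", PREFIXES, explicit_uri="schema:Person"))
```

The second call shows why the mechanism matters for interoperability: an element can point at a term in an existing vocabulary instead of minting a new one.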

Generate JSON-LD context files for semantic-web interoperability while keeping your data in developer-friendly JSON or YAML. Developers can ignore the semantics until they need them.
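The sketch below shows the idea with a hand-written JSON-LD context. A context that LinkML generates from a real schema will look different, and the field names and example.org URIs here are assumptions. The point is that the record stays plain JSON, and the semantics are layered on by attaching an `@context`.

```python
import json

# Hand-written, hypothetical context: maps plain JSON keys to URIs.
CONTEXT = {
    "ex": "https://example.org/personinfo/",
    "schema": "http://schema.org/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "id": "@id",
    "name": "schema:name",
    "age": {"@id": "ex:age", "@type": "xsd:integer"},
}

# Plain JSON record, as a developer would normally work with it.
record = {"id": "ex:P001", "name": "Ada", "age": 36}


def as_jsonld(plain: dict, context: dict) -> dict:
    """Attach a context to a plain record, leaving the data keys untouched."""
    return {"@context": context, **plain}


def strip_context(doc: dict) -> dict:
    """Drop the context again to get back the developer-friendly JSON."""
    return {k: v for k, v in doc.items() if k != "@context"}


jsonld_doc = as_jsonld(record, CONTEXT)
print(json.dumps(jsonld_doc, indent=2))
```

A JSON-LD processor reading `jsonld_doc` can interpret `name` as `http://schema.org/name`, while code that ignores `@context` sees the same keys and values as before.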


Generate 30+ schema artifacts

The LinkML Generator framework compiles your schema into downstream artifacts across every major toolchain. The philosophy: embrace and reuse existing frameworks rather than replace them.

Schema frameworks: JSON Schema, GraphQL, ProtoBuf
Linked data: OWL, ShEx, SHACL, RDF, JSON-LD, SPARQL, YARRRML
Code generation: Python, Pydantic, TypeScript, Java, Rust
Databases: SQL DDL, SQLAlchemy, TypeDB
Documentation: Searchable docs, ER Diagrams, Excel / CSV
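To give a feel for what "compiling a schema into an artifact" means, here is a deliberately tiny toy converter from a LinkML-style class definition to a JSON Schema object. It is not how the LinkML generators are implemented and it ignores inheritance, enums, and class-valued ranges. The function name, type map, and example class are all assumptions made for this sketch.

```python
import json

# Toy mapping from a few LinkML built-in type names to JSON Schema types.
TYPE_MAP = {
    "string": "string",
    "integer": "integer",
    "float": "number",
    "boolean": "boolean",
}


def class_to_json_schema(name: str, cls: dict) -> dict:
    """Translate one class (a dict with 'attributes') into JSON Schema."""
    properties = {}
    required = []
    for slot_name, slot in cls.get("attributes", {}).items():
        # Unknown ranges fall back to "string" in this toy version.
        prop = {"type": TYPE_MAP.get(slot.get("range", "string"), "string")}
        if "pattern" in slot:
            prop["pattern"] = slot["pattern"]
        if slot.get("multivalued"):
            prop = {"type": "array", "items": prop}
        if slot.get("required") or slot.get("identifier"):
            required.append(slot_name)
        properties[slot_name] = prop
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": name,
        "type": "object",
        "properties": properties,
        "required": required,
        "additionalProperties": False,
    }


# Hypothetical class definition used only for this example.
PERSON = {
    "attributes": {
        "id": {"identifier": True, "pattern": "^P\\d+$"},
        "name": {"required": True},
        "age": {"range": "integer"},
        "aliases": {"multivalued": True},
    }
}

person_schema = class_to_json_schema("Person", PERSON)
print(json.dumps(person_schema, indent=2))
```

The real generators follow the same general pattern of reading one schema and emitting a target format, which is what lets a single YAML file feed JSON Schema, SQL DDL, OWL, and the other outputs listed above.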

Generate code in your language

LinkML generates native data classes for multiple programming languages: Python dataclasses and Pydantic models, plus TypeScript, Java, and Rust.

The LinkML runtime lets generated Python classes be loaded from and dumped to YAML, JSON, CSV, and RDF, which keeps your data pipeline flexible.
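The load-and-dump round trip can be sketched with the standard library alone. The `Person` dataclass below is a hand-written stand-in for a generated class, and `load_person` and `dump_person` are hypothetical helpers. The actual runtime provides its own loaders and dumpers and also handles YAML, CSV, and RDF, which this sketch does not.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Person:
    """Stand-in for a class that would normally be generated from a schema."""

    id: str
    name: str
    age: Optional[int] = None
    aliases: list[str] = field(default_factory=list)


def load_person(text: str) -> Person:
    """Parse a JSON document into a typed Person object."""
    return Person(**json.loads(text))


def dump_person(person: Person) -> str:
    """Serialize a Person back to JSON with stable key order."""
    return json.dumps(asdict(person), sort_keys=True)


ada = load_person('{"id": "P001", "name": "Ada", "age": 36}')
print(ada)
print(dump_person(ada))
```

Working with typed objects in the middle and plain serialization formats at the edges is the pattern the runtime supports, so the storage format can change without touching application code.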

Publish your schema on the web

Use gen-doc to generate searchable documentation sites for your schema, and gen-erdiagram to produce Mermaid entity-relationship diagrams — all automatically from your YAML file.

LinkML also helps you publish schema artifacts under stable, resolvable URIs via w3id.org.


A growing ecosystem

LinkML is more than a language — it's an ecosystem of tools for working with structured data across the research and data engineering communities.

Used across research and industry

LinkML powers data standards in biomedical research, environmental science, AI governance, and infrastructure projects worldwide.

National Microbiome Data Collaborative (NMDC), Center for Cancer Data Harmonization (CCDH), Alliance of Genome Resources, Biomedical Data Translator, NIH INCLUDE Project, Bridge2AI, Monarch Initiative, ENTSO-E (power systems), NFDI (German research infrastructure), iSamples
30+ output formats
100+ projects in the registry
Apache 2.0 open source license

Moxon SAT, et al. "LinkML: an open data modeling framework." GigaScience. 2026;15:giaf152.

Steps to Build a Model

You can get started right away!

  1. Use linkml-project-copier in GitHub to scaffold a new project.

  2. Edit your YAML schema file.

  3. Add example data to your project.

  4. Validate your data against the schema using linkml-validate.

  5. Convert between YAML, JSON, and RDF using linkml-convert.

  6. Use the just command runner to generate all downstream artefacts.
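The validate and convert steps above can be scripted. The sketch below only assembles the command lines and runs them if the tools are installed. The file names are hypothetical, and the `-s` (schema) and `-t` (output format) flags are written as I recall them from the LinkML tutorial, so check each command's `--help` before relying on them. The scaffolding and `just` steps are left out because their exact invocation depends on the project template.

```python
import shutil
import subprocess
from typing import Optional

# Hypothetical file names; substitute the paths from your own project.
SCHEMA = "my_schema.yaml"
DATA = "examples/data.yaml"


def build_commands(schema: str, data: str) -> list[list[str]]:
    """Return the validate and convert command lines as argument lists."""
    return [
        # Validate the example data against the schema.
        ["linkml-validate", "-s", schema, data],
        # Convert the same data to JSON (flag assumed from the tutorial).
        ["linkml-convert", "-s", schema, "-t", "json", data],
    ]


def run_if_available(cmd: list[str]) -> Optional[int]:
    """Run a command if its executable is on PATH; otherwise skip it."""
    if shutil.which(cmd[0]) is None:
        print(f"skipping (not installed): {' '.join(cmd)}")
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    for command in build_commands(SCHEMA, DATA):
        run_if_available(command)
```

Passing arguments as lists rather than a single shell string avoids quoting problems when file paths contain spaces.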