LinkML at a glance#

LinkML is a flexible modeling language that allows you to author schemas (“models”) in YAML that describe the structure of your data. The language is designed to support use cases ranging from the simple, such as describing the column headers in a spreadsheet, to the complex, such as creating a richly interlinked schema.

LinkML is designed to work in harmony with other frameworks, including semantic RDF-based frameworks as well as frameworks that are more familiar to developers, such as JSON.

Feature: Easy to author schemas#

LinkML models are organized around the core concepts of Classes and Slots. They are authored in YAML and offer rich expressivity while staying simple enough for non-technical domain modelers to contribute.

Example schema fragment, for modeling a simple “Person” concept:

classes:
  Person:
    is_a: NamedThing  ## parent class, defines id, name, ...
    description: >-
      A person (alive, dead, undead, or fictional).
    class_uri: schema:Person
    mixins:
      - HasAliases
    slots:
      - primary_email
      - birth_date
      - age_in_years
      - gender
      - current_address
      - has_employment_history
      - has_familial_relationships
      - has_medical_history
...

See PersonSchema for the complete example.
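
To make the roles of is_a and mixins concrete, here is a minimal Python sketch of how the slots available on Person could be collected from its parent class and its mixins. It works on plain dictionaries and does not use the LinkML toolchain. The slots on NamedThing follow the comment in the fragment above (id, name); the `aliases` slot on HasAliases is an assumption, the Person slot list is abbreviated, and `collect_slots` is a hypothetical helper rather than a LinkML API.

```python
# Hand-rolled illustration of slot inheritance; not the LinkML implementation.
CLASSES = {
    "NamedThing": {"slots": ["id", "name"]},  # per the comment in the fragment
    "HasAliases": {"slots": ["aliases"]},     # assumed slot name
    "Person": {
        "is_a": "NamedThing",
        "mixins": ["HasAliases"],
        "slots": ["primary_email", "birth_date", "age_in_years"],  # abbreviated
    },
}


def collect_slots(class_name, classes):
    """Return the slots of a class, including those from is_a parents and mixins."""
    definition = classes[class_name]
    slots = []
    parent = definition.get("is_a")
    if parent:
        slots.extend(collect_slots(parent, classes))
    for mixin in definition.get("mixins", []):
        slots.extend(collect_slots(mixin, classes))
    slots.extend(definition.get("slots", []))
    # Drop duplicates while preserving first-seen order.
    return list(dict.fromkeys(slots))


print(collect_slots("Person", CLASSES))
```

In this sketch the parent's slots come first, then the mixins' slots, then the class's own slots. That ordering is a choice made for the example, not a statement about how LinkML orders them.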

Feature: Rich modeling language#

LinkML offers many features of use to data modelers, while retaining a simple core.

A bridge between frameworks#

Many frameworks lock you into a particular view of the world or a particular technology. This can lead to silos, and to the need to create mappings and transformations between different representations of the same data; for example, if your JSON documents need to work in concert with your relational database or graph store.

LinkML provides many generators that translate a LinkML schema into the native formats of existing frameworks:

  • Convert to JSON-LD contexts, and instantly port your data to RDF

  • Convert to JSON Schema and use JSON Schema validators

  • Convert to SHACL or ShEx and validate your RDF data

  • Convert to Python dataclasses or pydantic for easy use within applications

  • Generate SQL schemas or SQLAlchemy models for use with relational databases
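
As a rough illustration of what such a translation involves, the sketch below maps a simplified class description onto a JSON Schema object using only the Python standard library. It is not the LinkML generator and makes no claim about the output that generator produces: `to_json_schema`, the `SLOT_RANGES` table, the ranges assigned to each slot, and the choice of `id` as a required field are all assumptions made for this example.

```python
import json

# Assumed slot ranges for illustration; the real PersonSchema may differ.
SLOT_RANGES = {
    "id": "string",
    "name": "string",
    "primary_email": "string",
    "age_in_years": "integer",
}

PERSON = {
    "description": "A person (alive, dead, undead, or fictional).",
    "slots": ["id", "name", "primary_email", "age_in_years"],
    "required": ["id"],  # assumed: identifiers are mandatory
}


def to_json_schema(class_name, definition, slot_ranges):
    """Build a JSON Schema object from a simplified class description."""
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": class_name,
        "description": definition["description"],
        "type": "object",
        "properties": {
            slot: {"type": slot_ranges[slot]} for slot in definition["slots"]
        },
        "required": list(definition.get("required", [])),
        "additionalProperties": False,
    }


person_schema = to_json_schema("Person", PERSON, SLOT_RANGES)
print(json.dumps(person_schema, indent=2))
```

The point of the sketch is that the class and slot definitions are the single source of truth, and each target format is derived from them rather than maintained by hand.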

Feature: Generation of documentation and websites#

Using the LinkML toolchain, you can go from a schema to a statically hosted, searchable website in minutes, with a page for each of your schema elements. Using lightweight namespace registries such as w3id.org, you can easily have resolvable URIs for all your concepts.


A rapidly growing toolchain#

LinkML can be thought of as two interlocking parts:

  • A standard for representing schemas, data dictionaries, standards, and metadata

  • A reference tool stack for doing things with artefacts that conform to the standard

The core LinkML toolchain is written in Python and allows for:

  • generating downstream schema artefacts, including:

    • documentation and static sites

    • code for use by developers (data classes in Python, Java, and TypeScript, ORMs, enumerations)

    • conversion between alternate representations like JSON-Schema, SQL DDL, RDF Shapes, Protobuf, …

  • validation and linting of schemas

  • data conversion between JSON, TSV, and RDF (where that data conforms to a LinkML schema)

  • data validation of JSON, TSV, or RDF using JSON Schema, SPARQL, or ShEx

  • easy programmatic manipulation of schemas
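
The data conversion and validation items above can be pictured with a small standard-library sketch that reads TSV rows, coerces each column according to a slot definition, and reports missing required values. This is an illustration only and does not call the LinkML toolchain: the `SLOTS` table, its ranges and required flags, the sample rows, and the helpers `tsv_to_records` and `validate` are all hypothetical.

```python
import csv
import io
import json

# Assumed slot definitions for illustration only.
SLOTS = {
    "id": {"range": str, "required": True},
    "name": {"range": str, "required": False},
    "age_in_years": {"range": int, "required": False},
}

# Made-up sample data.
TSV = "id\tname\tage_in_years\nP:001\tAda\t36\nP:002\tGrace\t\n"


def tsv_to_records(text, slots):
    """Parse TSV text into dicts, coercing each known column to its slot's range."""
    records = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        record = {}
        for column, value in row.items():
            if value in ("", None):
                continue  # treat empty cells as missing values
            spec = slots.get(column)
            # Unknown columns are kept as raw strings so validate() can flag them.
            record[column] = spec["range"](value) if spec else value
        records.append(record)
    return records


def validate(record, slots):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = [f"unknown slot: {key}" for key in record if key not in slots]
    problems += [
        f"missing required slot: {name}"
        for name, spec in slots.items()
        if spec["required"] and name not in record
    ]
    return problems


records = tsv_to_records(TSV, SLOTS)
print(json.dumps(records, indent=2))
print([validate(record, SLOTS) for record in records])
```

Because the same slot definitions drive both the coercion and the checks, the TSV and JSON forms of the data stay consistent with one another.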

LinkML is part of a growing ecosystem of general-purpose tools that make curating, mapping, ingesting, and organizing data much easier:

  • schema-automator bootstraps schemas from existing structured and semi-structured sources

  • LinkML-OWL allows for generation of complex OWL axioms from data models

  • SchemaSheets converts between spreadsheets and schemas

  • DataHarmonizer is an ontology-based curation tool that is being adapted to LinkML

We eat our own dogfood!#

The LinkML schema language is itself defined in LinkML, and we use our own toolchain for working with it!

More examples#

See the examples pages.