Introduction
This document is a functional draft specification for the Linked Data Modeling Language (LinkML).
LinkML is a data modeling language for describing the structure of a collection of instances, where instances are tree-like, object-oriented structures. Each instance instantiates a class from the LinkML metamodel. This is either a primitive class, such as a scalar type, reference, or enumeration, or a class proper, whose instances are associated with slot-value assignments.
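As an informal illustration of the tree-like structure described above, the following Python sketch models the four kinds of instance as plain data classes. All names here (`Scalar`, `Reference`, `EnumValue`, `ObjectInstance`, `depth`) are hypothetical and chosen for this example only; the normative instance model is the one defined in Part 2, and this sketch omits anything not mentioned in this introduction.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


# Hypothetical names for illustration only; not part of the LinkML metamodel.
@dataclass(frozen=True)
class Scalar:
    type_name: str  # e.g. "string" or "integer"
    value: object


@dataclass(frozen=True)
class Reference:
    class_name: str  # the class of the referenced object
    identifier: str  # the identifier of the referenced object


@dataclass(frozen=True)
class EnumValue:
    enum_name: str
    permissible_value: str


@dataclass
class ObjectInstance:
    class_name: str
    # Slot-value assignments: slot name -> nested instance.
    assignments: Dict[str, "Instance"] = field(default_factory=dict)


Instance = Union[Scalar, Reference, EnumValue, ObjectInstance]


def depth(instance: Instance) -> int:
    """Return the nesting depth of an instance tree (primitives have depth 1)."""
    if isinstance(instance, ObjectInstance):
        children = instance.assignments.values()
        return 1 + max((depth(child) for child in children), default=0)
    return 1


# An assumed example: a Person whose slots hold each kind of instance.
person = ObjectInstance("Person", {
    "name": Scalar("string", "Ada"),
    "status": EnumValue("Status", "ALIVE"),
    "employer": Reference("Organization", "org:1"),
    "address": ObjectInstance("Address", {"city": Scalar("string", "London")}),
})
```

Only `ObjectInstance` carries slot-value assignments, which is what makes the structure a tree: primitive instances are always leaves.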
LinkML schemas also specify rules for determining whether instances conform to the schema, and for adding implicit information to an instance collection.
LinkML is independent of any programming language and of any concrete form for serializing instances of schemas. Mappings are provided for serializing instances as JSON, YAML, RDF, flat tables, or relational models, and for mapping to programming language structures, but LinkML itself does not depend on any of these. Schemas are typically expressed using the YAML serialization, but this specification is independent of that serialization.
LinkML is self-describing: any LinkML schema is itself a collection of instances that instantiate elements in a special schema called the LinkML metamodel.
Audience
This document is intended for LinkML tool and framework implementors. It aims to provide formal clarity about the structure and semantics of LinkML.
For a more lightweight introduction, consult the material on the main LinkML site, including the LinkML tutorial.
Conventions and terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
BNF
Grammars in this document are written using the BNF notation, summarized below:
Construct | Syntax |
---|---|
terminal symbols | enclosed in single quotes |
a set of terminal symbols described in English | italic |
nonterminal symbols | boldface |
zero or more | curly braces |
zero or one | square brackets |
alternative | vertical bar |
We also include a meta-production rule for expressing comma-delimited lists:
<NT>List ::= [ <NT> { ',' <NT> } ]
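Reading this meta-production as "zero or more occurrences of the nonterminal, separated by commas", a recognizer can be sketched in Python as below. This is a minimal illustration and not part of the specification: the function names `parse_list` and `parse_name`, and the representation of input as a pre-tokenized list of strings, are assumptions made for this example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# An item parser returns (item, next_position), or None if no item starts here.
ItemParser = Callable[[Sequence[str], int], Optional[Tuple[str, int]]]


def parse_list(tokens: Sequence[str], parse_item: ItemParser,
               pos: int = 0) -> Tuple[List[str], int]:
    """Parse zero or more comma-separated items starting at tokens[pos].

    Returns the parsed items and the position of the first unconsumed token.
    """
    items: List[str] = []
    # The whole production is optional (square brackets): an empty list is valid.
    result = parse_item(tokens, pos)
    if result is None:
        return items, pos
    item, pos = result
    items.append(item)
    # Zero or more repetitions (curly braces) of: ',' followed by an item.
    while pos < len(tokens) and tokens[pos] == ",":
        result = parse_item(tokens, pos + 1)
        if result is None:
            raise ValueError(f"expected an item after ',' at token {pos}")
        item, pos = result
        items.append(item)
    return items, pos


def parse_name(tokens: Sequence[str], pos: int) -> Optional[Tuple[str, int]]:
    """Hypothetical item parser: accept any alphanumeric token."""
    if pos < len(tokens) and tokens[pos].isalnum():
        return tokens[pos], pos + 1
    return None
```

Under this reading, an empty token sequence yields an empty list, and a comma that is not followed by an item is rejected.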
Outline
The specification is organized into six parts. The parts cannot be read independently, as each part builds on concepts introduced in previous parts.
Part 1: Introduction
This section provides background information and preliminary definitions.
Part 2: Structure and Syntax of Instances
This section specifies what an instance is in the context of LinkML.
The instance data model is shown as UML. A normative functional-style syntax is provided for instances, and this syntax is used throughout the specification.
This section also introduces a path accessor syntax for specifying how to traverse LinkML instances.
Part 3: Structure of Schemas
This section specifies the core elements of a LinkML schema.
Part 4: Derived Schemas and Schema Semantics
This section specifies rules for inferring derived schemas, which can be used for purposes such as validation.
Part 5: Validation of Instance Data
This section specifies the procedure for validating LinkML instances using a derived schema.
Part 6: Mapping of Instance Data
This section specifies how LinkML instances are mapped to other data models and syntaxes, including:
- JSON/YAML
- RDF