URIs and Mappings#
One feature that sets LinkML apart from frameworks such as JSON-Schema and UML is the fact that each element of your schema has a globally unique IRI/URI. This works largely behind the scenes, so you can ignore it if you like, but it is also easy to use, and doing so brings benefits when reusing and linking schemas and when working with the linked data stack.
Background: URIs, IRIs, and CURIEs#
URIs and IRIs are generalizations of URLs. URIs are used as identifiers in linked data standards and vocabularies.
URIs can be shortened as CURIEs (Compact URIs). Given a prefix declaration that maps schema to http://schema.org/, we can use the CURIE schema:Person to denote the person concept.
For more on URIs and their importance in Linked Data, see
A typical header for a LinkML schema may look like this:
id: https://w3id.org/linkml/examples/personinfo
name: personinfo
default_curi_maps:
  - semweb_context
prefixes:
  personinfo: https://w3id.org/linkml/examples/personinfo/
  linkml: https://w3id.org/linkml/
  schema: http://schema.org/
  rdfs: http://www.w3.org/2000/01/rdf-schema#
  prov: http://www.w3.org/ns/prov#
default_prefix: personinfo
emit_prefixes:
  - rdf
  - rdfs
  - xsd
  - skos
imports:
  - linkml:types
The prefixes section contains a list of prefix expansions that can be used to specify CURIEs. Additionally, prefixmaps can be imported from prefixcommons.
With the above prefixmap, the CURIE schema:Person will expand to http://schema.org/Person.
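The expansion rule can be sketched in a few lines of plain Python. This is only an illustration of the idea, not LinkML's own implementation: `expand_curie` is a hypothetical helper, and the prefix map is copied by hand from the header above.

```python
# Prefix map copied by hand from the schema header above.
PREFIXES = {
    "personinfo": "https://w3id.org/linkml/examples/personinfo/",
    "linkml": "https://w3id.org/linkml/",
    "schema": "http://schema.org/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "prov": "http://www.w3.org/ns/prov#",
}


def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand 'prefix:local_id' to a full URI.

    Hypothetical helper for illustration; not part of the LinkML API.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}: unknown or missing prefix")
    # Expansion is plain string concatenation of namespace and local part.
    return prefixes[prefix] + local_id


print(expand_curie("schema:Person", PREFIXES))  # http://schema.org/Person
```

Because expansion is simple concatenation, the trailing `/` or `#` in each prefix declaration matters: omitting it would yield a different URI.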
LinkML also provides a way to import prefix maps to avoid repetitively declaring them, through default_curi_maps. However, we now consider it best practice to explicitly declare prefix maps, and to use the linter to check for consistency with standard prefix registries like bioregistry and prefix.cc.
class_uri and slot_uri#
Slot and class URIs in LinkML provide the meaning for a class or slot, and give a robust and unique place to consistently find information about a class or slot in a LinkML model on the web.
The two metamodel slots class_uri and slot_uri can be used to declare URIs for classes and slots respectively. These are typically specified as CURIEs.
classes:
  Person:
    is_a: NamedThing
    description: >-
      A person (alive, dead, undead, or fictional).
    class_uri: schema:Person
    ...
slots:
  id:
    identifier: true
    slot_uri: schema:identifier
  name:
    slot_uri: schema:name
  ...
When a JSON-LD context is generated for this schema, it will map JSON elements to their full linked data IRIs.
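To give a rough feel for what such a mapping does, here is a hand-written, JSON-LD-style context held in a Python dict, together with a small lookup function. Both the context contents and `resolve_term` are assumptions made for illustration; the context that LinkML actually generates may be structured differently.

```python
# A hand-written, JSON-LD-style context (an assumption for illustration,
# not the output of any LinkML generator).
CONTEXT = {
    "schema": "http://schema.org/",
    "name": {"@id": "schema:name"},
    "Person": {"@id": "schema:Person"},
}


def resolve_term(term: str, context: dict) -> str:
    """Resolve a JSON key to its full IRI via the context (hypothetical helper)."""
    entry = context[term]
    # A term definition is either a bare string or a dict carrying "@id".
    curie = entry["@id"] if isinstance(entry, dict) else entry
    prefix, _, local_id = curie.partition(":")
    return context[prefix] + local_id


print(resolve_term("name", CONTEXT))  # http://schema.org/name
```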
If class and slot URIs are omitted, they are still generated behind the scenes, using the default_prefix slot at the schema level.
For example, if Person did not declare a class_uri, then the CURIE personinfo:Person would be used, which would expand to https://w3id.org/linkml/examples/personinfo/Person.
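The fallback can be expressed as a short sketch: use the declared CURIE when there is one, otherwise combine default_prefix with the element name. The function `element_uri` is hypothetical and only mirrors the rule described here; it is not how LinkML is implemented internally.

```python
from typing import Optional

# Only the two prefixes needed for this example, taken from the schema header.
PREFIXES = {
    "personinfo": "https://w3id.org/linkml/examples/personinfo/",
    "schema": "http://schema.org/",
}


def element_uri(name: str, declared_curie: Optional[str],
                default_prefix: str, prefixes: dict) -> tuple:
    """Return (curie, expanded_uri) for a class or slot (hypothetical helper)."""
    # Fall back to default_prefix:name when no class_uri/slot_uri is declared.
    curie = declared_curie or f"{default_prefix}:{name}"
    prefix, _, local_id = curie.partition(":")
    return curie, prefixes[prefix] + local_id


# Person with an explicit class_uri
print(element_uri("Person", "schema:Person", "personinfo", PREFIXES))
# Person without a class_uri: the default prefix is used
print(element_uri("Person", None, "personinfo", PREFIXES))
```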
Enumerations and the meaning slot#
For example, two of the permissible values in the following enumeration map to ontology terms via the meaning slot:
enums:
  PersonStatus:
    permissible_values:
      ALIVE:
        description: the person is living
        meaning: PATO:0001421
      DEAD:
        description: the person is deceased
        meaning: PATO:0001422
      UNKNOWN:
        description: the vital status is not known
        todos:
          - map this to an ontology
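As a sketch of how a consumer might use these meanings, the snippet below looks up the ontology term for a permissible value and expands it to an IRI. The enum is transcribed by hand into a dict, `meaning_iri` is a hypothetical helper, and the PATO namespace is an assumption based on the usual OBO PURL pattern, since the schema header shown earlier does not declare a PATO prefix.

```python
from typing import Optional

# Assumption: PATO follows the standard OBO PURL pattern. This prefix is
# not declared in the schema header shown earlier.
PREFIXES = {"PATO": "http://purl.obolibrary.org/obo/PATO_"}

# Permissible values of PersonStatus and their meanings, transcribed by hand.
PERSON_STATUS_MEANINGS = {
    "ALIVE": "PATO:0001421",
    "DEAD": "PATO:0001422",
    "UNKNOWN": None,  # not yet mapped to an ontology term
}


def meaning_iri(value: str) -> Optional[str]:
    """Return the expanded IRI for a permissible value, or None if unmapped."""
    curie = PERSON_STATUS_MEANINGS[value]
    if curie is None:
        return None
    prefix, _, local_id = curie.partition(":")
    return PREFIXES[prefix] + local_id


for status in PERSON_STATUS_MEANINGS:
    print(status, meaning_iri(status))
```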
Relationship to ISO-11179-3#
The LinkML metamodel is strongly influenced by and attempts to conform to the model in ISO/IEC 11179-3 - Metadata registries – Part 3: Registry metamodel and basic attributes.
For those familiar with the ISO 11179-3 model, the class_uri identifies the object class referenced by the data element. The slot_uri names the particular property.
For more, see this short slide deck:
You may wish to avoid committing to completely reusing a linked data concept, whilst wanting to retain a mapping. LinkML makes use of SKOS predicates as metamodel slots:
NamedThing:
  description: >-
    A generic grouping for any identifiable entity
  slots:
    - id
    - name
    - description
    - image
  close_mappings:
    - schema:Thing
The id_prefixes slot can be used to define a list of valid ID prefixes that instances of this class ought to have as part of their CURIE.
The order of the list matters: it is a prioritized list, with the highest-priority ID prefix appearing at the top.
For example, the biolink model defines a list of allowed id_prefixes for gene objects:
gene:
  slots:
    - id
    - name
    - symbol
    - description
    - synonym
    - xref
  exact_mappings:
    - SO:0000704
    - SIO:010035
    - WIKIDATA:Q7187
  id_prefixes:
    - NCBIGene
    - ENSEMBL
    - HGNC
    - UniProtKB
    - MGI
    - ZFIN
    - dictyBase
    - WB
    - WormBase
    - FlyBase
    - FB
    - RGD
    - SGD
    - PomBase
Here we define the entity class gene to have a list of ID prefixes, with NCBIGene having the highest priority.
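One way a consumer could act on this ordering is to pick, from several equivalent identifiers, the one whose prefix ranks highest in id_prefixes. The sketch below assumes exactly that use; `preferred_id` is a hypothetical helper and the sample identifiers are illustrative only.

```python
from typing import Optional

# id_prefixes for gene, in priority order, copied from the example above.
GENE_ID_PREFIXES = [
    "NCBIGene", "ENSEMBL", "HGNC", "UniProtKB", "MGI", "ZFIN", "dictyBase",
    "WB", "WormBase", "FlyBase", "FB", "RGD", "SGD", "PomBase",
]


def preferred_id(curies: list, id_prefixes: list) -> Optional[str]:
    """Return the CURIE whose prefix appears earliest in id_prefixes.

    CURIEs with a prefix outside the list are ignored; returns None if
    no CURIE has an allowed prefix. Hypothetical helper for illustration.
    """
    rank = {prefix: i for i, prefix in enumerate(id_prefixes)}
    allowed = [c for c in curies if c.split(":", 1)[0] in rank]
    if not allowed:
        return None
    return min(allowed, key=lambda c: rank[c.split(":", 1)[0]])


# Illustrative identifiers; OMIM is not among the listed prefixes.
ids = ["HGNC:1100", "OMIM:113705", "NCBIGene:672"]
print(preferred_id(ids, GENE_ID_PREFIXES))  # NCBIGene:672
```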
This notebook demonstrates some potential pitfalls of JSON-LD 1.0 with some forms of CURIEs.