Semantic Enumerations

Enumerations are common features in modeling frameworks. These can be thought of as a “drop-down” of permissible values for a field/slot. For example, a “vital status” slot may have an enumeration with permissible values LIVING or DEAD.

LinkML supports enumerations, and goes beyond what is possible in frameworks like JSON-Schema:

  • Permissible Values can be mapped to ontology terms, enhancing interoperability and FAIRness

  • Enumerations can be static or dynamic, where dynamic enums are defined by a set of constraints (e.g. a branch of an ontology)

Basic Enums

The core enumeration model is the same as for familiar systems, where there is a set of allowed string values:

enums:
  FamilialRelationshipType:
    permissible_values:
      SIBLING_OF:
      PARENT_OF:
      CHILD_OF:

You can also make your enums into a richer controlled vocabulary, with definitions built in:

enums:
  FamilialRelationshipType:
    permissible_values:
      SIBLING_OF:
        description: A family relationship where the two members have a parent in common
      PARENT_OF:
        description: A family relationship between offspring and their parent
      CHILD_OF:
        description: inverse of the PARENT_OF relationship

Mapping Permissible Values to Ontologies

As an example, we will map the Permissible Values above to terms from the GA4GH pedigree standard kinship ontology.

We will first add a base prefix declaration (KIN concepts have PURLs of the form http://purl.org/ga4gh/kin.owl#KIN_007):

prefixes:
  kin: http://purl.org/ga4gh/kin.owl#
  ...

Then further down we can annotate our enums using meaning slots:

enums:
  FamilialRelationshipType:
    permissible_values:
      SIBLING_OF:
        description: A family relationship where the two members have a parent in common
        meaning: kin:KIN_007
      PARENT_OF:
        description: A family relationship between offspring and their parent
        meaning: kin:KIN_003
      CHILD_OF:
        description: inverse of the PARENT_OF relationship
        meaning: kin:KIN_002
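To make the role of the prefix declaration concrete, the following minimal Python sketch shows how a meaning CURIE such as kin:KIN_007 expands to a full URI. The PREFIXES dictionary and the expand_curie helper are hypothetical names chosen for illustration; they are not part of the LinkML API, and LinkML tooling performs this expansion itself.

```python
# Illustrative only: PREFIXES and expand_curie are hypothetical names,
# not part of the LinkML API.
PREFIXES = {"kin": "http://purl.org/ga4gh/kin.owl#"}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a CURIE (prefix:local_id) into a full URI using a prefix map."""
    prefix, _, local_id = curie.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"Unknown prefix: {prefix!r}")
    return prefixes[prefix] + local_id


print(expand_curie("kin:KIN_007"))
```

The expanded URI matches the PURL form quoted in the prefix declaration step.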

Working with Enums in Python

Enumerations are mapped directly to Python enums.
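As a rough illustration of this mapping, the sketch below hand-writes a standard-library Enum for the FamilialRelationshipType example. It is a stand-in under stated assumptions: the classes actually emitted by LinkML's Python generators may use a different base class and layout.

```python
from enum import Enum


class FamilialRelationshipType(str, Enum):
    """Hand-written stand-in for the enum defined in the schema."""

    SIBLING_OF = "SIBLING_OF"
    PARENT_OF = "PARENT_OF"
    CHILD_OF = "CHILD_OF"


# Look up a member from the string value found in the data
rel = FamilialRelationshipType("PARENT_OF")
print(rel.name)

# A value outside the permissible values is rejected
try:
    FamilialRelationshipType("COUSIN_OF")
except ValueError as err:
    print(f"rejected: {err}")
```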

Dynamic Enums

Starting with LinkML 1.3, enums do not have to be a static hardcoded list; instead they can be dynamic, populated by a query.

This allows the enum to be synced with some upstream source, and avoids hardcoding very long lists when there are many possible values.

The following example defines an enumeration that selects any subtype of “neuron” from the OBO cell type ontology:

enums:
  NeuronTypeEnum:
    reachable_from:
      source_ontology: obo:cl
      source_nodes:
        - CL:0000540 ## neuron
      include_self: false
      relationship_types:
        - rdfs:subClassOf
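Conceptually, a reachable_from query walks the chosen relationship from the source nodes and collects everything it finds. The Python sketch below illustrates that idea on a toy in-memory graph. The SUBCLASS_OF table, the EX: identifiers, and the reachable_from function are all hypothetical; real tooling would query the ontology itself rather than a hand-built dictionary.

```python
# Toy child -> parents edges standing in for rdfs:subClassOf in an ontology.
# The identifiers are placeholders, not real CL terms.
SUBCLASS_OF = {
    "EX:motor_neuron": ["EX:neuron"],
    "EX:sensory_neuron": ["EX:neuron"],
    "EX:alpha_motor_neuron": ["EX:motor_neuron"],
    "EX:neuron": ["EX:cell"],
    "EX:hepatocyte": ["EX:cell"],
}


def reachable_from(source_nodes, edges, include_self=False):
    """Return all transitive descendants of the source nodes."""
    # Invert the edges so we can walk from a parent down to its children
    children = {}
    for child, parents in edges.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)

    result = set()
    stack = list(source_nodes)
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in result:
                result.add(child)
                stack.append(child)
    if include_self:
        result.update(source_nodes)
    return result


print(sorted(reachable_from(["EX:neuron"], SUBCLASS_OF)))
```

With include_self left at false, the source node itself is excluded and only its subtypes are returned, mirroring the include_self: false setting in the schema.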

Arbitrarily nested boolean expressions can be used, combined with the minus operator to subtract from sets:

enums:
  LoincExample:
    enum_uri: http://hl7.org/fhir/ValueSet/example-intensional
    see_also:
      - https://build.fhir.org/valueset-example-intensional.json.html
    include:
      - reachable_from:
          source_ontology: bioregistry:loinc
          source_nodes:
            - LOINC:LP43571-6
          is_direct: true
    minus:
      concepts:
        - LOINC:5932-9
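One way to read the include and minus clauses is as plain set arithmetic: take the union of everything included, then subtract everything listed under minus. The Python sketch below shows this reading with placeholder identifiers. The function name and the EX: values are hypothetical, and the evaluation order is an assumption made for illustration rather than a statement of how any particular tool implements it.

```python
# Placeholder identifiers; in a real schema each set would come from an
# ontology query (include) or an explicit list of concepts (minus).
include = [{"EX:1", "EX:2", "EX:3"}, {"EX:3", "EX:4"}]
minus = [{"EX:2"}]


def apply_include_minus(include, minus):
    """Union every included set, then remove anything listed under minus."""
    included = set().union(*include)
    excluded = set().union(*minus)
    return included - excluded


values = apply_include_minus(include, minus)
print(sorted(values))
```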

Enums can extend other enums using inherits:

enums:
  Disease:
    reachable_from:
      source_ontology: bioregistry:mondo
      source_nodes:
        - MONDO:0000001 ## disease or disorder
      is_direct: false
      relationship_types:
        - rdfs:subClassOf
    minus:
      permissible_values:
        root_node:
          meaning: MONDO:0000001 ## disease or disorder

  HumanDisease:
    description: Extends the Disease value set, including NCIT neoplasms, excluding non-human diseases
    inherits:
      - Disease
    include:
      - reachable_from:
          source_ontology: bioregistry:ncit
          source_nodes:
            - NCIT:C3262
    minus:
      - reachable_from:
          source_ontology: bioregistry:mondo
          source_nodes:
            - MONDO:0005583 ## non-human animal disease
          relationship_types:
            - rdfs:subClassOf
      - permissible_values:
          NOT_THIS_ONE:
            meaning: MONDO:9999
            description: Example of excluding a single node
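The effect of inherits can be pictured as resolving the parent enum first and then applying the child's own include and minus clauses on top. The following Python sketch works through that on already-expanded placeholder sets. The ENUMS table, the EX: identifiers, and resolve_enum are hypothetical, and the order of operations (inherited plus included, then minus) is an assumption for illustration.

```python
# Hypothetical, already-expanded enum definitions: each clause holds plain
# sets of placeholder identifiers rather than live ontology queries.
ENUMS = {
    "Disease": {
        "include": [{"EX:disease", "EX:flu", "EX:canine_distemper"}],
        "minus": [{"EX:disease"}],
    },
    "HumanDisease": {
        "inherits": ["Disease"],
        "include": [{"EX:neoplasm"}],
        "minus": [{"EX:canine_distemper"}],
    },
}


def resolve_enum(name, enums):
    """Resolve an enum to a set: (inherited + included) - minus."""
    definition = enums[name]
    values = set()
    for parent in definition.get("inherits", []):
        values |= resolve_enum(parent, enums)
    for clause in definition.get("include", []):
        values |= clause
    for clause in definition.get("minus", []):
        values -= clause
    return values


print(sorted(resolve_enum("HumanDisease", ENUMS)))
```

In this toy data the root term is removed by the parent's own minus clause, so it never reaches the child enum.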

Tooling to support dynamic enums

Different tool chains may choose to implement dynamic enums differently.

For example, if you have a stack that uses JSON-Schema for validation, then tools may choose to materialize a dynamic query into a static list of terms at the time of schema compilation.
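As a sketch of what such materialization might produce, the Python snippet below turns an already-expanded set of terms into a JSON Schema fragment using the standard enum keyword. The to_json_schema_enum helper and the EX: identifiers are hypothetical; this is not the output format of any specific LinkML generator.

```python
import json


def to_json_schema_enum(expanded_values):
    """Build a JSON Schema fragment restricting a string to the given values."""
    return {"type": "string", "enum": sorted(expanded_values)}


# Placeholder identifiers standing in for the result of expanding a dynamic enum
fragment = to_json_schema_enum({"EX:motor_neuron", "EX:sensory_neuron"})
print(json.dumps(fragment, indent=2))
```

Sorting the values keeps the generated fragment stable between schema compilations, which makes diffs easier to review.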

Other tools may choose to perform the query at runtime. For example, a data entry tool may choose to use an advanced autocomplete API to restrict autocomplete to defined values.

Tooling support for dynamic enums is still maturing, but you can already use them in your schemas. By default, tools that do not yet evaluate the query will be too permissive, accepting values outside the intended set; however, you still gain additional clarity in your schema documentation.

The Ontology Access Kit (OAK) has a tool called vskit for expanding value sets.

To run:

pip install oaklib
vskit expand -s my_schema.yaml -o my_schema_expanded.yaml