Part 4: Working with RDF

Previously we saw how to do basic validation of data using JSON-Schema.

This section demonstrates how to work with LinkML in conjunction with Linked Data/RDF. If this is not of interest, you can skip to the next section, though you may wish to revisit this one later. LinkML is intended to make it easy to get the benefits of Linked Data while staying simple and working within a stack many developers are familiar with.

And even if you aren’t using RDF yourself, declaring URIs for your schema elements can help make your data FAIR. In particular, these URIs can serve as hooks for making data interoperable.

Example schema

Let’s start with the schema we developed in the previous section, with some minor modifications:

personinfo.yaml:

id: https://w3id.org/linkml/examples/personinfo
name: personinfo
prefixes:
  linkml: https://w3id.org/linkml/
  personinfo: https://w3id.org/linkml/examples/personinfo/
  ORCID: https://orcid.org/
default_curi_maps:
  - semweb_context
imports:
  - linkml:types
default_prefix: personinfo
default_range: string
  
classes:
  Person:
    attributes:
      id:
        identifier: true
      full_name:
        required: true
        description:
          name of the person
      aliases:
        multivalued: true
        description:
          other names for the person
      phone:
        pattern: "^[\\d\\(\\)\\-]+$"
      age:
        range: integer
        minimum_value: 0
        maximum_value: 200
  Container:
    attributes:
      persons:
        multivalued: true
        inlined_as_list: true
        range: Person

We extended the previous schema in a few ways:

  • we included a prefix declaration for the ORCID IDs in our data records

  • we pulled in standard semantic web prefixes from semweb_context (via default_curi_maps)

We will use this schema with a collection of data records:

data.yaml:

persons:
  - id: ORCID:1234
    full_name: Clark Kent
    age: 33
    phone: 555-555-5555
  - id: ORCID:4567
    full_name: Lois Lane
    age: 34

We can use the LinkML conversion tool, linkml-convert, to translate this to RDF (Turtle syntax by default):

linkml-convert -s personinfo.yaml -t rdf data.yaml

Outputs:

@prefix ns1: <https://w3id.org/linkml/examples/personinfo/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://orcid.org/1234> a ns1:Person ;
    ns1:age 33 ;
    ns1:full_name "Clark Kent" ;
    ns1:phone "555-555-5555" .

<https://orcid.org/4567> a ns1:Person ;
    ns1:age 34 ;
    ns1:full_name "Lois Lane" .

[] a ns1:Container ;
    ns1:persons <https://orcid.org/1234>,
        <https://orcid.org/4567> .

Each person is now represented by an ORCID URI. This is a start, but we are still using classes and properties in our own namespace, even though existing vocabularies such as schema.org could be reused here.

Adding URIs to our schema

Let’s enhance our schema, using two schema slots:

  • class_uri: to declare a URI/IRI for a class

  • slot_uri: to declare a URI/IRI for a slot

In both cases, we provide the value as a CURIE, and include a prefixes map that maps CURIEs to URIs.
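The CURIE-to-URI mapping can be illustrated with a few lines of Python. This is only a minimal sketch: `expand_curie` is a hypothetical helper written for this example, not a LinkML API, and the prefix map simply mirrors the prefixes section of personinfo-semantic.yaml.

```python
# Prefix map mirroring the `prefixes` section of personinfo-semantic.yaml.
PREFIXES = {
    "linkml": "https://w3id.org/linkml/",
    "schema": "http://schema.org/",
    "personinfo": "https://w3id.org/linkml/examples/personinfo/",
    "ORCID": "https://orcid.org/",
}


def expand_curie(curie: str, prefixes: dict = PREFIXES) -> str:
    """Expand 'prefix:local' into a full URI.

    Hypothetical helper for illustration only; not part of LinkML.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand_curie("ORCID:1234"))     # https://orcid.org/1234
print(expand_curie("schema:Person"))  # http://schema.org/Person
```

An identifier such as ORCID:1234 in the data and a class_uri such as schema:Person in the schema are both resolved through the same prefix map.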

personinfo-semantic.yaml:

id: https://w3id.org/linkml/examples/personinfo
name: personinfo
prefixes:                                  ## Note: we are adding the schema prefix here
  linkml: https://w3id.org/linkml/
  schema: http://schema.org/
  personinfo: https://w3id.org/linkml/examples/personinfo/
  ORCID: https://orcid.org/
imports:
  - linkml:types
default_curi_maps:
  - semweb_context
default_prefix: personinfo
default_range: string
  
classes:
  Person:
    class_uri: schema:Person              ## reuse schema.org vocabulary
    attributes:
      id:
        identifier: true
      full_name:
        required: true
        description:
          name of the person
        slot_uri: schema:name             ## reuse schema.org vocabulary
      aliases:
        multivalued: true
        description:
          other names for the person
      phone:
        pattern: "^[\\d\\(\\)\\-]+$"
        slot_uri: schema:telephone       ## reuse schema.org vocabulary
      age:
        range: integer
        minimum_value: 0
        maximum_value: 200
    id_prefixes:
      - ORCID
  Container:
    attributes:
      persons:
        multivalued: true
        inlined_as_list: true
        range: Person

Now let’s try converting the same YAML/JSON using the enhanced schema:

linkml-convert -s personinfo-semantic.yaml -t rdf data.yaml

Outputs:

@prefix ns1: <http://schema.org/> .
@prefix ns2: <https://w3id.org/linkml/examples/personinfo/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://orcid.org/1234> a ns1:Person ;
    ns1:name "Clark Kent" ;
    ns1:telephone "555-555-5555" ;
    ns2:age 33 .

<https://orcid.org/4567> a ns1:Person ;
    ns1:name "Lois Lane" ;
    ns2:age 34 .

[] a ns2:Container ;
    ns2:persons <https://orcid.org/1234>,
        <https://orcid.org/4567> .

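Note that the serializer assigns generic prefix names (ns1, ns2) rather than the names declared in the schema, but the effect is to reuse URIs such as schema:telephone (shown here as ns1:telephone).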

This can be visualized using rdf-grapher as:

[Figure: rdf-visualization, a graph rendering of the RDF output above]
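To make the effect of slot_uri on the RDF output more concrete, here is a rough Python sketch of how a record's slots might be mapped to predicates: slots with a declared slot_uri use that URI, and all others fall back to the default prefix. The helper names (`predicate_for`, `person_triples`) are hypothetical, and this is not how linkml-convert is implemented.

```python
DEFAULT_NS = "https://w3id.org/linkml/examples/personinfo/"  # default_prefix
SCHEMA_NS = "http://schema.org/"
ORCID_NS = "https://orcid.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Slots that declare a slot_uri in personinfo-semantic.yaml.
SLOT_URIS = {
    "full_name": SCHEMA_NS + "name",
    "phone": SCHEMA_NS + "telephone",
}


def predicate_for(slot: str) -> str:
    """Use the declared slot_uri if present, else default namespace + slot name."""
    return SLOT_URIS.get(slot, DEFAULT_NS + slot)


def person_triples(record: dict) -> list:
    """Turn one Person record into (subject, predicate, object) tuples."""
    subject = record["id"].replace("ORCID:", ORCID_NS, 1)
    triples = [(subject, RDF_TYPE, SCHEMA_NS + "Person")]  # from class_uri
    for slot, value in record.items():
        if slot != "id":  # the identifier becomes the subject, not a property
            triples.append((subject, predicate_for(slot), value))
    return triples


clark = {"id": "ORCID:1234", "full_name": "Clark Kent", "age": 33, "phone": "555-555-5555"}
for triple in person_triples(clark):
    print(triple)
```

Here age keeps a predicate in the personinfo namespace because the schema declares no slot_uri for it, which matches the ns2:age lines in the Turtle output.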

JSON-LD contexts

You can also generate JSON-LD context files that can be used to add semantics to JSON documents:

gen-jsonld-context --no-metadata personinfo-semantic.yaml

Outputs:

{
   "@context": {
      "ORCID": "https://orcid.org/",
      "linkml": "https://w3id.org/linkml/",
      "personinfo": "https://w3id.org/linkml/examples/personinfo/",
      "schema": "http://schema.org/",
      "@vocab": "https://w3id.org/linkml/examples/personinfo/",
      "persons": {
         "@type": "@id"
      },
      "age": {
         "@type": "xsd:integer"
      },
      "full_name": {
         "@id": "schema:name"
      },
      "id": "@id",
      "phone": {
         "@id": "schema:telephone"
      },
      "Person": {
         "@id": "schema:Person"
      }
   }
}

NOTE: currently you need to declare your own type object and map this to rdf:type for typing information to be shown
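As an illustration of how such a context adds semantics to a plain JSON document, the following sketch attaches a subset of the generated context to the example data using only the Python standard library. The `with_context` helper is hypothetical, and interpreting the result as RDF would still require a JSON-LD processor, which is not shown here.

```python
import json

# Subset of the context generated by gen-jsonld-context above.
context = {
    "ORCID": "https://orcid.org/",
    "schema": "http://schema.org/",
    "@vocab": "https://w3id.org/linkml/examples/personinfo/",
    "id": "@id",
    "full_name": {"@id": "schema:name"},
    "phone": {"@id": "schema:telephone"},
    "persons": {"@type": "@id"},
}

# The same records as data.yaml, expressed as a Python dict.
data = {
    "persons": [
        {"id": "ORCID:1234", "full_name": "Clark Kent", "age": 33, "phone": "555-555-5555"},
        {"id": "ORCID:4567", "full_name": "Lois Lane", "age": 34},
    ]
}


def with_context(document: dict, ctx: dict) -> dict:
    """Return a copy of the document with an @context key added (hypothetical helper)."""
    return {"@context": ctx, **document}


doc = with_context(data, context)
print(json.dumps(doc, indent=2))
```

The data itself is unchanged; only the @context key is added, so existing JSON tooling keeps working on the same document.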

Using Shape Languages

In the previous section we saw how to use JSON-Schema validators. For RDF data, LinkML can also generate schemas in shape languages: ShEx and SHACL, both shown below (future versions will allow SPARQL). To generate ShEx:

gen-shex --no-metadata personinfo-semantic.yaml

Outputs:

BASE <https://w3id.org/linkml/examples/personinfo/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX linkml: <https://w3id.org/linkml/>
PREFIX schema: <http://schema.org/>


linkml:String xsd:string

linkml:Integer xsd:integer

linkml:Boolean xsd:boolean

linkml:Float xsd:float

linkml:Double xsd:double

linkml:Decimal xsd:decimal

linkml:Time xsd:dateTime

linkml:Date xsd:date

linkml:Datetime xsd:dateTime

linkml:Uriorcurie IRI

linkml:Uri IRI

linkml:Ncname xsd:string

linkml:Objectidentifier IRI

linkml:Nodeidentifier NONLITERAL

<Container> CLOSED {
    (  $<Container_tes> <persons> @<Person> * ;
       rdf:type [ <Container> ] ?
    )
}

<Person> CLOSED {
    (  $<Person_tes> (  schema:name @linkml:String ;
          <aliases> @linkml:String * ;
          schema:telephone @linkml:String ? ;
          <age> @linkml:Integer ?
       ) ;
       rdf:type [ schema:Person ]
    )
}
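
SHACL shapes can be generated in the same way, here written to a file:

gen-shacl --no-metadata personinfo-semantic.yaml > personinfo.shacl.ttl

The resulting personinfo.shacl.ttl contains: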

@prefix personinfo: <https://w3id.org/linkml/examples/personinfo/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

personinfo:Container a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [ sh:class schema:Person ;
            sh:nodeKind sh:IRI ;
            sh:order 0 ;
            sh:path personinfo:persons ] ;
    sh:targetClass personinfo:Container .

schema:Person a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [ sh:maxCount 1 ;
            sh:maxInclusive 200 ;
            sh:minInclusive 0 ;
            sh:order 4 ;
            sh:path personinfo:age ],
        [ sh:description "name of the person" ;
            sh:maxCount 1 ;
            sh:minCount 1 ;
            sh:order 1 ;
            sh:path schema:name ],
        [ sh:maxCount 1 ;
            sh:order 0 ;
            sh:path personinfo:id ],
        [ sh:description "other names for the person" ;
            sh:order 2 ;
            sh:path personinfo:aliases ],
        [ sh:maxCount 1 ;
            sh:order 3 ;
            sh:path schema:telephone ;
            sh:pattern "^[\\d\\(\\)\\-]+$" ] ;
    sh:targetClass schema:Person .
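
To show what the Person shape above constrains, here is a plain-Python sketch that checks the same rules against a record: a closed set of slots, a required full_name, the phone pattern, and the age range. It is only an illustration of the constraints, not a SHACL or ShEx validator, and `check_person` is a hypothetical helper.

```python
import re

# Pattern copied from the phone slot (sh:pattern in the SHACL output).
PHONE_PATTERN = re.compile(r"^[\d\(\)\-]+$")
# Slots declared on Person; the shape is closed, so other keys are rejected.
PERSON_SLOTS = {"id", "full_name", "aliases", "phone", "age"}


def check_person(record: dict) -> list:
    """Return a list of human-readable constraint violations for one record."""
    problems = []
    for key in sorted(set(record) - PERSON_SLOTS):
        problems.append(f"unexpected slot: {key}")
    if record.get("full_name") is None:  # sh:minCount 1
        problems.append("full_name is required")
    phone = record.get("phone")
    if phone is not None and not PHONE_PATTERN.search(phone):
        problems.append(f"phone does not match pattern: {phone!r}")
    age = record.get("age")
    if age is not None and not 0 <= age <= 200:  # sh:minInclusive / sh:maxInclusive
        problems.append(f"age out of range 0..200: {age}")
    return problems


clark = {"id": "ORCID:1234", "full_name": "Clark Kent", "age": 33, "phone": "555-555-5555"}
bad = {"id": "ORCID:9999", "phone": "call me", "age": 250, "nickname": "X"}

print(check_person(clark))
for problem in check_person(bad):
    print(problem)
```

The record named bad is invented for this example; it violates each of the four rules once, while the Clark Kent record from data.yaml passes all of them.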

More Info