FAQ: Tools

What tools do I need for LinkML?

Formally, LinkML is a specification for modeling data, and is independent of any set of tools.

However, for practical purposes, you will find the core Python toolchain useful, whether you use it as a Python library or as a command-line tool.

This includes functionality like:

The GitHub repo is https://github.com/linkml/linkml

For installation instructions, see the installation guide.

There are other tools in the LinkML ecosystem that you may find useful:

How do I install the LinkML tools?

See the installation guide.

What tools are available for authoring schemas?

Currently the main way to author a schema is to edit schema YAML files in a text editor or IDE (Integrated Development Environment).

We recommend using an IDE that has support for YAML format.

The metamodel JSON Schema at https://w3id.org/linkml/meta.schema.json can be registered in PyCharm to get syntax validation while you develop a model. For PyCharm-specific details, see https://www.jetbrains.com/help/pycharm/json.html#ws_json_schema_add_custom

See the section below on “Are there tools for editing my data?” for suggestions. Note that your schema is itself data: schemas instantiate the schema class in the metamodel.

One possible alternative to authoring schemas in YAML is to enter the schema in a spreadsheet, which is our next question…

Is there a tool to manage schemas as spreadsheets?

Yes! See:

How do I browse a schema?

For small schemas with limited inheritance, it should be possible to mentally picture the structure just by examining the source YAML. For larger schemas, with deep inheritance, it can help to have some kind of hierarchical browsing tool.

There are a few strategies:

  • Use gen-markdown to make markdown that can be viewed using mkdocs

  • Use gen-owl to make an OWL ontology, which can be browsed:

    • Using an ontology editing tool like Protege

    • By publishing the ontology with an ontology repository and using a web ontology browser

    • By running the Ontology Lookup Service docker image and browsing using a web browser
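For a quick look at the inheritance structure without leaving the terminal, a short script can print the is_a hierarchy as an indented tree. The following is a minimal sketch and not part of the LinkML toolchain: it assumes the schema has already been loaded into a plain dict (for example with a YAML library), and the example classes and the helper name class_tree_lines are hypothetical. It follows is_a only and ignores mixins.

```python
from collections import defaultdict

# Hypothetical schema fragment, shaped like the dict a YAML loader would return.
schema = {
    "classes": {
        "NamedThing": {},
        "Person": {"is_a": "NamedThing"},
        "Organization": {"is_a": "NamedThing"},
        "Employee": {"is_a": "Person"},
    }
}


def class_tree_lines(schema: dict) -> list[str]:
    """Return the is_a hierarchy as indented lines, roots first."""
    classes = schema.get("classes", {})
    children = defaultdict(list)
    roots = []
    for name, defn in classes.items():
        parent = (defn or {}).get("is_a")
        # Classes without a locally defined parent are treated as roots.
        if parent is None or parent not in classes:
            roots.append(name)
        else:
            children[parent].append(name)

    lines = []

    def walk(name: str, depth: int) -> None:
        lines.append("  " * depth + name)
        for child in sorted(children[name]):
            walk(child, depth + 1)

    for root in sorted(roots):
        walk(root, 0)
    return lines


print("\n".join(class_tree_lines(schema)))
```

For anything beyond a quick overview, the strategies listed above give a much richer view, including slots, mixins, and documentation.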

How can I check my schema is valid?

You can run any of the generator tools distributed as part of LinkML over your schema to check it for errors.

Are there tools to create a schema from JSON-Schema/SHACL/SQL DDL/…?

Currently the core LinkML framework converts in the opposite direction: its generators, which are part of the core framework, produce schemas for other frameworks from a LinkML schema.

We have experimental importers as part of the linkml-model-enrichment project, which can generate a schema from:

  • An OWL ontology

  • JSON-Schema

Others may be added in the future.

However, these importers are not part of the core, and they may be incomplete, less well supported, and less well documented. You may still find them useful to kick-start a schema, but you should not rely on them in a production environment.

Are there tools to infer a schema from data?

The linkml-model-enrichment framework can seed a schema from:

  • CSV/TSV files

  • JSON data

  • RDF triples

Note that a number of heuristic measures are applied, and the results are not guaranteed to be correct. You may still find them useful to bootstrap a new schema.

This framework also has tools to:

  • Automatically annotate mappings in a schema using bioportal annotator service

  • Automatically assign meaning fields in enums using bioportal and OLS annotators

Again, this is a text-mining based approach, and will yield both false positives and negatives.
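To give a feel for the kind of heuristics involved in seeding a schema from tabular data, here is a deliberately simplified sketch using only the Python standard library. It is not the linkml-model-enrichment implementation: the function names infer_range and seed_schema, the sample data, and the example.org schema id are all hypothetical. It guesses a range for each column from its values and prints a LinkML-style schema as JSON, which YAML parsers generally accept.

```python
import csv
import io
import json

# Hypothetical sample data; a real run would read a CSV/TSV file instead.
SAMPLE_CSV = """id,name,age,height_m
P1,Alice,34,1.70
P2,Bob,,1.82
"""


def infer_range(values: list[str]) -> str:
    """Guess a LinkML type name from the non-empty string values of a column."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    for type_name, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def seed_schema(text: str, class_name: str) -> dict:
    """Build a minimal schema dict with one class and one attribute per column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    attributes = {}
    for col in columns:
        # Short rows yield None for missing cells; treat those as blanks.
        values = [row[col] or "" for row in rows]
        attr = {"range": infer_range(values)}
        # Heuristic: a column with no blanks is assumed to be required.
        if all(v != "" for v in values):
            attr["required"] = True
        attributes[col] = attr
    return {
        "id": "https://example.org/inferred-schema",  # placeholder id
        "name": "inferred_schema",  # placeholder name
        "prefixes": {"linkml": "https://w3id.org/linkml/"},
        "imports": ["linkml:types"],
        "classes": {class_name: {"attributes": attributes}},
    }


schema = seed_schema(SAMPLE_CSV, "Person")
print(json.dumps(schema, indent=2))
```

As with the real tools, the guesses can be wrong: here a blank cell makes a column optional, and an identifier column that happens to contain only digits would be typed as integer. Treat the output as a starting point to review by hand.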

How do I programmatically create schemas?

As LinkML schemas are YAML files, you can use any library that writes YAML.

For example, in Python you can write code like this:

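import yaml

my_schema_url = "https://example.org/my-schema"  # placeholder schema URI

schema = {
  "id": my_schema_url,
  "name": "my_schema",  # placeholder schema name
  # classes is a dictionary keyed by class name
  "classes": {
    "Person": {
      "description": "any person, living or dead",
      "attributes": {
          # ... attribute definitions go here
       }
    }
  }
}
print(yaml.dump(schema))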

You can also write similar code in most languages.

While this should work fine, the approach has some disadvantages. In particular you get no IDE support and there is no guard against making mistakes in key names or structure until you come to run the code.

A better approach for Python developers is to use the Python object model that is generated from the metamodel.

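from linkml_runtime.linkml_model.meta import SchemaDefinition, ClassDefinition

my_schema_id = "https://example.org/my-schema"  # placeholder schema URI

s = SchemaDefinition(id=my_schema_id,
                     name="my_schema",  # placeholder; a schema name is required
                     classes=[
                         ClassDefinition(name="Person",
                                         description="any person, living or dead"),
                         # ... more classes
                     ])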

You can also use the SchemaView classes; see the developers guide section on manipulating schemas.

How can I check my data is valid?

If you have data in RDF, JSON, or TSV, you can check its validity using linkml-validate.

See validating data for more details.

Are there tools for editing my data?

The same LinkML data can be rendered as JSON or RDF, and for schemas that have a relatively flat structure, TSVs can be used. Any editing tool that works with those formats can be used for LinkML data. For example, you can turn your schema into OWL and then use Protege to edit instance data. Or you can simply edit your data in a TSV.

For “flat” schemas such as those for collecting sample or specimen metadata, the DataHarmonizer accepts LinkML as a schema language.

If you are comfortable using an IDE like PyCharm and editing your data as JSON, you can use your LinkML schema to provide dynamic schema validation and autocompletion while editing. See these slides for a guide.

Are there guides for developing LinkML compliant tools?

See the tool developer guide

Can I generate a website from a LinkML schema?

Yes!

See the markdown generator for details.

If you run:

gen-markdown -d docs personinfo.yaml

It will place all the markdown documents you need to run a mkdocs site in the docs directory.

Can I customize the Markdown generation for my schema site?

For some purposes, the generic schema documentation provided by gen-markdown may look too… generic.

You can customize markdown generation using your own templates. This requires a basic understanding of Jinja2 templates.

The protocol is:

  1. copy the jinja templates from docgen to your own repo in a folder templates

  2. customize these templates

  3. run gen-doc --template-directory templates -d docs my_schema.yaml

  4. run mkdocs serve to test locally

  5. iterate until they look how you want, then deploy (e.g. mkdocs gh-deploy)

An example repo that uses highly customized templates: GSC MIxS

Can I use my schema to do reasoning over my data?

There are a number of strategies for performing deductive inference:

What does _if_missing mean in my JSON output?

If you pass a LinkML object directly to json.dump, you will see internal hidden fields. These start with an underscore, e.g. _if_missing.

We recommend instead using json_dumper.dump in the linkml-runtime package, which will give the canonical JSON representation of a LinkML object.
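To illustrate the symptom with nothing but the standard library, the sketch below uses a hypothetical stand-in class (FakeLinkMLObject is not a real LinkML class) that carries an underscore-prefixed internal attribute. Serializing its attributes naively leaks the internal field, while a serializer that skips underscore-prefixed keys does not. This only mimics the idea and makes no claim about how json_dumper is implemented; for real LinkML objects, json_dumper.dump remains the recommended route.

```python
import json


class FakeLinkMLObject:
    """Hypothetical stand-in for a generated LinkML class; not part of linkml-runtime."""

    def __init__(self, name: str):
        self.name = name
        self._if_missing = None  # internal bookkeeping, not part of the data


def naive_default(obj):
    # Dumps every instance attribute, including internal ones.
    return vars(obj)


def public_default(obj):
    # Keeps only attributes that do not start with an underscore.
    return {k: v for k, v in vars(obj).items() if not k.startswith("_")}


person = FakeLinkMLObject("Alice")
naive = json.dumps(person, default=naive_default)
clean = json.dumps(person, default=public_default)
print(naive)  # includes the internal _if_missing field
print(clean)  # only the public data
```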

See: