Converting between different representations#
LinkML allows you to specify schemas for data in a variety of forms:
JSON / YAML
Python object models
SQL databases
Spreadsheets and tabular data
RDF/Linked Data
Property Graphs
The process of loading from one of these formats into an internal representation is called loading. The opposite process, going from an internal representation into an external format is called dumping
The “native” form for LinkML can be considered JSON/YAML.
See PersonSchema/data for example toy data files
Specification:
Part 6 of the LinkML specification provides a more formal treatment
LinkML-Convert#
The linkml-convert
script can be used to convert data from one form
to another, following a schema
This makes use of loaders and dumpers in the linkml-runtime.
See below for command line docs
Loading from and dumping to JSON#
You can use the linkml-convert script to load or dump from JSON into another representation.
Dumping to JSON can be lossy; if your objects contain typing information that cannot be inferred from range constraints.
For example, if you have a schema:
classes:
Person:
attributes:
employed_at:
range: Organization
Organization:
...
NonProfit:
is_a: Organization
...
Corportation:
is_a: Organization
...
and a person object:
{
"employed_at": {
"...": "..."
}
}
Then there is insufficient information to determine whether the internal representation of the organization the person is employed at should be instantiated as a NonProfit or a Corporation.
LinkML allows a slot to be set with designates_type, the value of which is a name of a class from the schema. However, the loaders currently do not yet make use of this when loading from JSON into the internal representation.
Loading from and dumping to YAML#
The native YAML representation for LinkML is essentially identical to JSON.
In future there may be support for a direct translation to YAML that utilizes YAML tags to encode typing information.
Loading from and dumping to RDF#
Loading and dumping works in a similar fashion for RDF. One difference is that the schema must be present as this contains crucial information for being able to map classes and slots to URIs.
See RDF for more details
Loading from and dumping to CSVs#
See CSVs for more details
Inferring missing values#
The --infer
flag can be provided to perform missing value inference
See advanced schemas for more information on inference
Programmatic usage#
See developer docs for documentation of the relevant python classes
Command Line#
linkml-convert#
Converts instance data to and from different LinkML Runtime serialization formats.
The instance data must conform to a LinkML model, and either a path to a python module must be passed, or a path to a schema.
The converter works by first using a linkml-runtime loader to instantiate in-memory model objects, then a dumper is used to serialize. A validation step is optionally performed in between
When converting to or from RDF, a path to a schema must be provided.
For more information, see https://linkml.io/linkml/data/index.html
linkml-convert [OPTIONS] INPUT
Options
- -m, --module <module>#
Path to python datamodel module
- -o, --output <output>#
Path to output file
- -f, --input-format <input_format>#
Input format. Inferred from input suffix if not specified
- Options:
yml | yaml | json | rdf | ttl | json-ld | csv | tsv
- -t, --output-format <output_format>#
Output format. Inferred from output suffix if not specified
- Options:
yml | yaml | json | rdf | ttl | json-ld | csv | tsv
- -C, --target-class <target_class>#
name of class in datamodel that the root node instantiates
- --target-class-from-path, --no-target-class-from-path#
Infer the target class from the filename, should be ClassName-<other-chars>.{yaml,json,…}
- Default:
False
- -S, --index-slot <index_slot>#
top level slot. Required for CSV dumping/loading
- -s, --schema <schema>#
Path to schema specified as LinkML yaml
- -P, --prefix <prefix>#
Prefixmap base=URI pairs
- --validate, --no-validate#
Validate against the schema
- Default:
True
- --infer, --no-infer#
Infer missing slot values
- Default:
False
- -c, --context <context>#
path to JSON-LD context file
- -V, --version#
Show the version and exit.
Arguments
- INPUT#
Required argument