JSON Schema#
Example Output#
Overview#
JSON Schema is a schema language for JSON documents.
JSON-Schema can be generated from a LinkML schema and used to validate JSON documents using standard JSON-Schema validators.
To run:
gen-json-schema personinfo.yaml > personinfo.schema.json
To use this in combination with the standard Python jsonschema validator (bundled with linkml):
jsonschema -i data/example_personinfo_data.yaml personinfo.schema.json
See also
Data Validation for other validation strategies
Note
Note that any JSON that conforms to the derived JSON Schema can be converted to RDF using the derived JSON-LD context.
Inheritance#
Because JSON-Schema does not support inheritance hierarchies, slots are “rolled down” from parent classes and mixins.
For example, in the personinfo schema, slots such as id and name are inherited from NamedThing, and aliases are inherited from a mixin:
NamedThing:
  slots:
    - id
    - name
HasAliases:
  mixin: true
  attributes:
    aliases:
      multivalued: true
Person:
  is_a: NamedThing
  mixins:
    - HasAliases
  slots:
    - birth_date
    - age_in_years
    - gender
(some parts truncated for brevity)
This would generate the following JSON-Schema:
"Person": {
    "additionalProperties": false,
    "description": "A person (alive, dead, undead, or fictional).",
    "properties": {
        "age_in_years": {
            "type": "integer"
        },
        "aliases": {
            "items": {
                "type": "string"
            },
            "type": "array"
        },
        "birth_date": {
            "type": "string"
        },
        "gender": {
            "$ref": "#/definitions/GenderType"
        },
        "id": {
            "type": "string"
        },
        "name": {
            "type": "string"
        }
    },
    "required": [
        "id"
    ],
    "title": "Person",
    "type": "object"
},
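The roll-down can be pictured with a minimal Python sketch. It is not the generator's actual implementation: the CLASSES table is a simplified stand-in for the personinfo fragment above, and rolled_down_slots is a hypothetical helper that only handles is_a, mixins, slots and attributes.

```python
from typing import Dict, List

# Hypothetical, simplified class table mirroring the personinfo fragment above.
CLASSES: Dict[str, dict] = {
    "NamedThing": {"slots": ["id", "name"]},
    "HasAliases": {"mixin": True, "attributes": {"aliases": {"multivalued": True}}},
    "Person": {
        "is_a": "NamedThing",
        "mixins": ["HasAliases"],
        "slots": ["birth_date", "age_in_years", "gender"],
    },
}


def rolled_down_slots(name: str, classes: Dict[str, dict]) -> List[str]:
    """Collect slot names from a class, its is_a parent, and its mixins."""
    cls = classes[name]
    collected: List[str] = []
    # Ancestors first, so inherited slots precede locally declared ones.
    parents = ([cls["is_a"]] if "is_a" in cls else []) + cls.get("mixins", [])
    for parent in parents:
        collected.extend(rolled_down_slots(parent, classes))
    collected.extend(cls.get("slots", []))
    collected.extend(cls.get("attributes", {}))  # extends with the attribute names
    # Drop duplicates while preserving first-seen order.
    return list(dict.fromkeys(collected))


person_properties = sorted(rolled_down_slots("Person", CLASSES))
print(person_properties)
```

Sorted alphabetically, the collected names line up with the keys under "properties" in the generated Person definition.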
Composition#
JSON-Schema supports schema composition through:
allOf
anyOf
oneOf
not
LinkML supports analogous elements:
all_of
any_of
exactly_one_of
none_of
Use of these elements will be translated into the appropriate JSON-Schema construct.
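As a rough illustration of that translation, the sketch below renames composition keys in a nested expression. It assumes the LinkML boolean slots are all_of, any_of, exactly_one_of and none_of, and that the operands are already JSON-Schema subschemas. translate_composition is a hypothetical helper, not part of linkml, and the real generator may emit a different shape (in particular for none_of).

```python
from typing import Any, Dict

# Assumed correspondence between LinkML boolean slots and JSON-Schema keywords.
LINKML_TO_JSON_SCHEMA = {
    "all_of": "allOf",
    "any_of": "anyOf",
    "exactly_one_of": "oneOf",
}


def translate_composition(expr: Dict[str, Any]) -> Dict[str, Any]:
    """Rename LinkML composition keys to JSON-Schema ones (hypothetical helper)."""
    out: Dict[str, Any] = {}
    for key, value in expr.items():
        if key == "none_of":
            # JSON-Schema "not" takes a single subschema, so wrap the list in anyOf.
            out["not"] = {"anyOf": [translate_composition(v) for v in value]}
        elif key in LINKML_TO_JSON_SCHEMA:
            out[LINKML_TO_JSON_SCHEMA[key]] = [translate_composition(v) for v in value]
        else:
            # Anything else is passed through unchanged.
            out[key] = value
    return out


result = translate_composition({"any_of": [{"type": "string"}, {"type": "integer"}]})
negated = translate_composition({"none_of": [{"type": "null"}]})
print(result)
print(negated)
```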
Inlining#
LinkML separates the underlying logical model from choices of how references are inlined in JSON.
If an inlined directive is added to a slot definition as follows:
has_employment_history:
  range: EmploymentEvent
  multivalued: true
  inlined: true
  inlined_as_list: true
then the JSON-Schema will use a $ref:
"has_employment_history": {
    "items": {
        "$ref": "#/definitions/EmploymentEvent"
    },
    "type": "array"
},
However, if a slot is not inlined and the range is a class with an identifier, then the reference is by key.
For example, given:
FamilialRelationship:
  is_a: Relationship
  slot_usage:
    related to:
      range: Person
      required: true
Here the value of related_to is expected to be a string that is the identifier of a Person object. The range is treated as a simple string in the JSON-Schema:
"FamilialRelationship": {
    "additionalProperties": false,
    "description": "",
    "properties": {
        "ended_at_time": {
            "format": "date",
            "type": "string"
        },
        "related_to": {
            "type": "string"
        },
        "started_at_time": {
            "format": "date",
            "type": "string"
        }
    },
    "required": [
        "type",
        "related_to"
    ],
    "title": "FamilialRelationship",
    "type": "object"
},
Thus the JSON-Schema loses some information that is useful for validation, and for understanding of the schema.
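One consequence is that a dangling reference passes JSON-Schema validation, because any string is accepted for related_to. A check like the following sketch would have to run outside JSON-Schema. The dangling_references helper and the identifiers P:001, P:002 and P:999 are made up for illustration.

```python
from typing import Dict, Iterable, List


def dangling_references(
    relationships: Iterable[Dict[str, str]], person_ids: Iterable[str]
) -> List[str]:
    """Return related_to values that do not match any known Person identifier."""
    known = set(person_ids)
    return [r["related_to"] for r in relationships if r["related_to"] not in known]


# Illustrative data only; both relationships satisfy the JSON-Schema above.
people = [{"id": "P:001"}, {"id": "P:002"}]
relationships = [
    {"type": "SIBLING_OF", "related_to": "P:002"},
    {"type": "PARENT_OF", "related_to": "P:999"},  # no such Person
]

missing = dangling_references(relationships, (p["id"] for p in people))
print(missing)
```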
LinkML also supports the ability to inline multivalued slots as dictionaries, where the key is the object identifier. See Inlining.
This example schema supports inlining a list of people as a dictionary:
classes:
  Container:
    tree_root: true
    attributes:
      persons:
        range: Person
        inlined: true
        multivalued: true
  Person:
    attributes:
      name:
        identifier: true
      age:
        range: integer
        required: true
      gender:
        range: string
        required: true
The following data is conformant according to LinkML semantics:
{
    "persons": {
        "Bob": {
            "age": 42,
            "gender": "male"
        },
        "Alice": {
            "age": 37,
            "gender": "female"
        }
    }
}
This presents an additional complication when generating JSON-Schema: semantically, the name field is required (identifier slots are always emitted as required in the generated JSON-Schema). However, we don’t want it to be required in the body of the dictionary, since it is already present as the key.
The JSON-Schema generator handles this by emitting an alternative, “laxer” version of the Person class that is used for validating the values of the persons dict.
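The idea can be sketched in a few lines of Python. Both helpers are hypothetical and only illustrate the reasoning: expand_inlined_dict shows that the dictionary key carries the identifier, and identifier_optional shows how a laxer "required" list could be derived from the strict one.

```python
from typing import Any, Dict, List


def expand_inlined_dict(entries: Dict[str, Dict[str, Any]], identifier: str) -> List[Dict[str, Any]]:
    """Turn {key: body} into a list of full objects, restoring the identifier."""
    expanded = []
    for key, body in entries.items():
        # If the body repeats the identifier, it must agree with the key.
        if identifier in body and body[identifier] != key:
            raise ValueError(f"{identifier} {body[identifier]!r} does not match key {key!r}")
        expanded.append({identifier: key, **body})
    return expanded


def identifier_optional(required: List[str], identifier: str) -> List[str]:
    """Derive the laxer 'required' list used for dictionary bodies."""
    return [name for name in required if name != identifier]


persons = {
    "Bob": {"age": 42, "gender": "male"},
    "Alice": {"age": 37, "gender": "female"},
}
as_list = expand_inlined_dict(persons, "name")
lax_required = identifier_optional(["name", "age", "gender"], "name")
print(as_list)
print(lax_required)
```

The expanded list form satisfies the strict requirement on name, while the dictionary bodies only need to satisfy the laxer list.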
This is what the underlying JSON-Schema looks like:
{
    "$defs": {
        "Person": {
            "additionalProperties": false,
            "description": "",
            "properties": {
                "age": {
                    "type": "integer"
                },
                "gender": {
                    "type": "string"
                },
                "name": {
                    "type": "string"
                }
            },
            "required": [
                "name",
                "age",
                "gender"
            ],
            "title": "Person",
            "type": "object"
        },
        "Person__identifier_optional": {
            "additionalProperties": false,
            "description": "",
            "properties": {
                "age": {
                    "type": "integer"
                },
                "gender": {
                    "type": "string"
                },
                "name": {
                    "type": "string"
                }
            },
            "required": [
                "age",
                "gender"
            ],
            "title": "Person",
            "type": "object"
        }
    },
    "$id": "http://example.org",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "additionalProperties": false,
    "properties": {
        "persons": {
            "additionalProperties": {
                "$ref": "#/$defs/Person__identifier_optional"
            }
        }
    },
    "title": "example.org",
    "type": "object"
}
Patterns#
Both LinkML and JSON-Schema support the same subset of ECMA-262 regular expressions.
See Regular Expressions.
For example, the following schema fragment
classes:
  # ...
  Person:
    # ...
    slot_usage:
      primary_email:
        pattern: "^\\S+@[\\S+\\.]+\\S+"
will generate:
"primary_email": {
    "pattern": "^\\S+@[\\S+\\.]+\\S+",
    "type": "string"
}
LinkML also supports Structured patterns; these are compiled down to patterns during JSON Schema generation.
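The primary_email pattern can be tried out quickly with Python's re module. This is only an approximation of what a JSON-Schema validator does: Python's regex dialect is not ECMA-262, although this pattern appears to use only constructs common to both. The sample addresses are made up.

```python
import json
import re

# The generated fragment as JSON text; json.loads undoes the JSON string escaping.
fragment = json.loads(r'{"pattern": "^\\S+@[\\S+\\.]+\\S+", "type": "string"}')
pattern = fragment["pattern"]

# JSON-Schema patterns are not implicitly anchored, so search() is the closer analogue.
ok = re.search(pattern, "alice@example.org") is not None
bad = re.search(pattern, "not an email") is not None
print(pattern, ok, bad)
```

Note that the doubled backslashes exist only in the JSON text; the regular expression itself contains single backslashes.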
Rules#
LinkML supports Rules which allow for conditional application of constraints.
These are converted to if/then/else constructs in JSON-Schema.
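The shape of such a conditional can be sketched as follows. The sketch assumes that a LinkML rule's preconditions, postconditions and elseconditions correspond to if, then and else respectively, and that each has already been translated to a JSON-Schema subschema. rule_to_json_schema and the country/postal_code rule are hypothetical.

```python
from typing import Any, Dict, Optional


def rule_to_json_schema(
    preconditions: Dict[str, Any],
    postconditions: Dict[str, Any],
    elseconditions: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble an if/then/else construct from translated rule parts."""
    schema: Dict[str, Any] = {"if": preconditions, "then": postconditions}
    if elseconditions is not None:
        schema["else"] = elseconditions
    return schema


# Hypothetical rule: when country is "USA", postal_code must be five digits.
conditional = rule_to_json_schema(
    preconditions={"properties": {"country": {"const": "USA"}}, "required": ["country"]},
    postconditions={"properties": {"postal_code": {"pattern": "^[0-9]{5}$"}}},
)
print(conditional)
```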
Uniqueness constraints#
LinkML provides different mechanisms for stating uniqueness constraints:
The identifier and key metamodel slots allow a class to have a single primary key
The unique_keys slot allows for additional unique keys. These can be singular or compound.
JSON-Schema does not currently support unique keys. See this Stack Overflow question for a discussion.
It is possible to get a limited form of unique key checking in JSON-Schema: slots marked as identifier or key that are also inlined are enforced to be unique, because the slot is used as the key in a dictionary, and dictionaries in JSON cannot have duplicate keys.
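Compound unique keys therefore have to be checked outside JSON-Schema. A minimal sketch of such a check is shown below. duplicate_keys is a hypothetical helper, and the records and the (name, birth_date) key are illustrative only.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Sequence, Tuple


def duplicate_keys(
    records: Iterable[Dict[str, Any]], key_slots: Sequence[str]
) -> List[Tuple[Any, ...]]:
    """Return each compound key value that occurs in more than one record."""
    counts = Counter(tuple(record.get(slot) for slot in key_slots) for record in records)
    return [key for key, n in counts.items() if n > 1]


# Illustrative records; the first and third share the same compound key.
records = [
    {"name": "Alice", "birth_date": "1990-01-01"},
    {"name": "Alice", "birth_date": "1985-06-30"},
    {"name": "Alice", "birth_date": "1990-01-01"},
]
dups = duplicate_keys(records, ("name", "birth_date"))
print(dups)
```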
Enums#
Enumerations are treated as simple strings. If the LinkML schema has additional metadata about the enumeration values, this is lost in translation.
Example:
classes:
  # ...
  FamilialRelationship:
    is_a: Relationship
    slot_usage:
      type:
        range: FamilialRelationshipType
        required: true
      related to:
        range: Person
        required: true
#...
enums:
  FamilialRelationshipType:
    permissible_values:
      SIBLING_OF:
        description: a relationship between two individuals who share a parent
      PARENT_OF:
        description: a relationship between a parent (biological or non-biological) and their child
      CHILD_OF:
        description: inverse of the PARENT_OF type
Generates:
"FamilialRelationship": {
    "additionalProperties": false,
    "description": "",
    "properties": {
        "ended_at_time": {
            "format": "date",
            "type": "string"
        },
        "related_to": {
            "type": "string"
        },
        "started_at_time": {
            "format": "date",
            "type": "string"
        },
        "type": {
            "$ref": "#/definitions/FamilialRelationshipType"
        }
    },
    "required": [
        "type",
        "related_to"
    ],
    "title": "FamilialRelationship",
    "type": "object"
},
"FamilialRelationshipType": {
    "description": "",
    "enum": [
        "SIBLING_OF",
        "PARENT_OF",
        "CHILD_OF"
    ],
    "title": "FamilialRelationshipType",
    "type": "string"
},
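The loss of metadata is easy to see in a small sketch that mimics the translation. enum_to_json_schema is a hypothetical helper, not the generator's code: it keeps only the names of the permissible values, so the per-value descriptions never reach the output.

```python
import json
from typing import Any, Dict


def enum_to_json_schema(name: str, permissible_values: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build a string enum from the permissible value names only."""
    return {
        "description": "",
        "enum": list(permissible_values),  # names only; per-value metadata is dropped
        "title": name,
        "type": "string",
    }


permissible_values = {
    "SIBLING_OF": {"description": "a relationship between two individuals who share a parent"},
    "PARENT_OF": {"description": "a relationship between a parent (biological or non-biological) and their child"},
    "CHILD_OF": {"description": "inverse of the PARENT_OF type"},
}
enum_schema = enum_to_json_schema("FamilialRelationshipType", permissible_values)
print(json.dumps(enum_schema, indent=4))
```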
Docs#
Command Line#
gen-json-schema#
Generate JSON Schema representation of a LinkML model
gen-json-schema [OPTIONS] YAMLFILE
Options
- -i, --inline#
Generate references to types rather than inlining them. Note that declaring a slot as inlined: true will always inline the class
- -t, --top-class <top_class>#
Top level class; slots of this class will become top level properties in the json-schema
- --not-closed, --closed#
Set additionalProperties=False at the global level if closed; otherwise set it to true
- Default:
True
- --include-range-class-descendants, --no-range-class-descendants#
When handling range constraints, include all descendants of the range class instead of just the range class
- --indent <indent>#
If this is a positive number the resulting JSON will be pretty-printed with that indent level. Set to 0 to disable pretty-printing and return the most compact JSON representation
- Default:
4
- --title-from <title_from>#
Specify the slot from which JSON Schema ‘title’ annotations are generated.
- Options:
name | title
- -d, --include <include>#
Include LinkML Schema outside of imports mechanism. Helpful in including deprecated classes and slots in a separate YAML, and including it when necessary but not by default (e.g. in documentation or for backwards compatibility)
- -V, --version#
Show the version and exit.
- -f, --format <format>#
Output format
- Default:
'json'
- Options:
json
- --metadata, --no-metadata#
Include metadata in output
- Default:
True
- --useuris, --metauris#
Use class and slot URIs over model uris
- Default:
True
- -im, --importmap <importmap>#
Import mapping file
- --log_level <log_level>#
Logging level
- Default:
'WARNING'
- Options:
CRITICAL | ERROR | WARNING | INFO | DEBUG
- -v, --verbose#
Verbosity. Takes precedence over --log_level.
- --mergeimports, --no-mergeimports#
Merge imports into source file (default=mergeimports)
- --stacktrace, --no-stacktrace#
Print a stack trace when an error occurs
- Default:
False
Arguments
- YAMLFILE#
Required argument
Code#
- class linkml.generators.jsonschemagen.JsonSchemaGenerator(
    schema: str | typing.TextIO | linkml_runtime.linkml_model.meta.SchemaDefinition | linkml.utils.generator.Generator | pathlib.Path,
    schemaview: linkml_runtime.utils.schemaview.SchemaView | None = None,
    format: str | None = None,
    metadata: bool = True,
    useuris: bool | None = None,
    log_level: int | None = 30,
    mergeimports: bool | None = True,
    source_file_date: str | None = None,
    source_file_size: int | None = None,
    logger: logging.Logger | None = None,
    verbose: bool | None = None,
    output: str | None = None,
    namespaces: linkml_runtime.utils.namespaces.Namespaces | None = None,
    directory_output: bool = False,
    base_dir: str = None,
    metamodel_name_map: typing.Dict[str, str] = None,
    importmap: str | typing.Mapping[str, str] | None = None,
    emit_prefixes: typing.Set[str] = <factory>,
    metamodel: linkml.utils.schemaloader.SchemaLoader = None,
    stacktrace: bool = False,
    include: str | pathlib.Path | linkml_runtime.linkml_model.meta.SchemaDefinition | None = None,
    topClass: str | None = None,
    not_closed: bool | None = True,
    indent: int = 4,
    inline: bool = False,
    top_class: linkml_runtime.linkml_model.meta.ClassDefinitionName | str | None = None,
    include_range_class_descendants: bool = False,
    title_from: str = 'name',
    top_level_schema: linkml.generators.jsonschemagen.JsonSchema = None,
    include_null: bool = True,
    **_kwargs)
Generates JSONSchema documents from a LinkML SchemaDefinition
Each LinkML class generates a schema
Inheritance hierarchies are rolled down from ancestors
Composition not yet implemented
Enumerations treated as strings
Foreign key references are treated as semantics-free strings
This generator implements the following LifecycleMixin methods:
LifecycleMixin.before_generate_schema()
LifecycleMixin.after_generate_schema()
LifecycleMixin.before_generate_classes()
LifecycleMixin.before_generate_enums()
LifecycleMixin.before_generate_class_slots()
LifecycleMixin.before_generate_class()
LifecycleMixin.after_generate_class()
LifecycleMixin.before_generate_class_slot()
LifecycleMixin.after_generate_class_slot()
LifecycleMixin.before_generate_enum()
LifecycleMixin.after_generate_enum()
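For programmatic use, a minimal sketch along the following lines should work, assuming linkml is installed. generate_json_schema is a hypothetical wrapper. It relies on the top_class constructor argument listed in the signature above and on the serialize() method inherited from the generator base class.

```python
import json
from typing import Any, Dict, Optional


def generate_json_schema(schema_path: str, top_class: Optional[str] = None) -> Dict[str, Any]:
    """Generate JSON Schema for a LinkML YAML file and parse it into a dict."""
    # Imported lazily so that this module can be loaded without linkml installed.
    from linkml.generators.jsonschemagen import JsonSchemaGenerator

    generator = JsonSchemaGenerator(schema_path, top_class=top_class)
    # serialize() returns the JSON Schema as text.
    return json.loads(generator.serialize())
```

A call such as generate_json_schema("personinfo.yaml", top_class="Container") would then return the schema as a dictionary. The file name and the Container class are assumptions about the local copy of the personinfo schema.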