Validator
The linkml.validator package contains a new LinkML validation framework that is more flexible than the linkml.validators package. While that package still exists, it may become deprecated in the future.
- class linkml.validator.Validator(schema: str | dict | TextIO | Path | SchemaDefinition, validation_plugins: List[ValidationPlugin] | None = None, *, strict: bool = False)
A class for coordinating instance validation using configurable plugins
- Parameters:
schema – The schema to validate against. If a string or Path, the schema will be loaded from that location. Otherwise, a SchemaDefinition is required.
validation_plugins – A list of plugins that will be used to validate instances using the given schema. Each element should be an instance of a subclass of linkml.validator.plugins.ValidationPlugin. Defaults to None.
strict – If True, stop validating after the first validation problem is found. Defaults to False.
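As a rough illustration of the constructor described above, the following sketch builds a Validator with a single JSON Schema plugin. The helper name build_validator and the schema file name are hypothetical. The imports are deferred into the function body only so that the helper can be defined where linkml is not installed.

```python
from typing import Any


def build_validator(schema_path: str, *, closed: bool = False, strict: bool = False) -> Any:
    """Return a Validator that checks instances with the JSON Schema plugin.

    `build_validator` is a hypothetical helper, not part of linkml.
    """
    # Deferred imports: linkml is only required when the helper is called.
    from linkml.validator import Validator
    from linkml.validator.plugins import JsonschemaValidationPlugin

    return Validator(
        schema_path,
        validation_plugins=[JsonschemaValidationPlugin(closed=closed)],
        strict=strict,
    )


# Example call (assumes a schema file named "personinfo.yaml" exists):
# validator = build_validator("personinfo.yaml", closed=True, strict=True)
```

Passing strict=True makes the resulting validator stop at the first problem, which can be useful for quick pass/fail checks.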
- iter_results(instance: Any, target_class: str | None = None) → Iterator[ValidationResult]
Lazily yield validation results for the given instance
- Parameters:
instance – The instance to validate
target_class – Name of the class within the schema to validate against. If None, the class will be inferred from the schema by looking for a class with tree_root: true. Defaults to None.
- Returns:
Iterator over validation results
- Return type:
Iterator[ValidationResult]
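Because the results are yielded lazily, a caller can stop as soon as it has seen enough. The sketch below pulls only the first result from iter_results. The helper name first_problem is hypothetical, and the validator argument is assumed to be any object exposing the iter_results method documented above.

```python
from typing import Any, Optional


def first_problem(validator: Any, instance: Any, target_class: Optional[str] = None) -> Any:
    """Return the first validation result, or None if nothing was reported.

    Hypothetical helper: only one item is pulled from the iterator, so later
    results are never produced.
    """
    return next(iter(validator.iter_results(instance, target_class)), None)
```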
- iter_results_from_source(loader: Loader, target_class: str | None = None) → Iterator[ValidationResult]
Lazily yield validation results for the instances provided by a loader
- Parameters:
loader – An instance of a subclass of linkml.validator.loaders.Loader which provides the instances to validate
target_class – Name of the class within the schema to validate against. If None, the class will be inferred from the schema by looking for a class with tree_root: true. Defaults to None.
- Returns:
Iterator over validation results
- Return type:
Iterator[ValidationResult]
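For large data sources, the lazy iterator can be capped without validating everything. This is a minimal sketch using itertools.islice. The helper name first_n_results is hypothetical, and validator and loader are assumed to be objects matching the signature documented above.

```python
from itertools import islice
from typing import Any, List, Optional


def first_n_results(
    validator: Any, loader: Any, n: int, target_class: Optional[str] = None
) -> List[Any]:
    """Collect at most `n` validation results from a loader-backed source.

    Hypothetical helper: islice stops pulling from the iterator after `n` items.
    """
    results = validator.iter_results_from_source(loader, target_class)
    return list(islice(results, n))
```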
- validate(instance: Any, target_class: str | None = None) → ValidationReport
Validate the given instance
- Parameters:
instance – The instance to validate
target_class – Name of the class within the schema to validate against. If None, the class will be inferred from the schema by looking for a class with tree_root: true. Defaults to None.
- Returns:
A validation report
- Return type:
ValidationReport
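A report is typically inspected after validation. The sketch below assumes that a ValidationReport exposes a results list and that each result has a message attribute. Neither attribute is described in this section, so treat both as assumptions. The helper names are hypothetical.

```python
from typing import Any, List


def is_valid(report: Any) -> bool:
    """True when the report contains no validation results (assumed `results` list)."""
    return len(report.results) == 0


def problem_messages(report: Any) -> List[str]:
    """Return the message of every result in the report (assumed `message` attribute)."""
    return [result.message for result in report.results]


# Example use with a Validator instance named `validator` (not defined here):
# report = validator.validate({"name": "Ada"}, "Person")
# if not is_valid(report):
#     print("\n".join(problem_messages(report)))
```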
- validate_source(loader: Loader, target_class: str | None = None) → ValidationReport
Validate instances from a data source
- Parameters:
loader – An instance of a subclass of linkml.validator.loaders.Loader which provides the instances to validate
target_class – Name of the class within the schema to validate against. If None, the class will be inferred from the schema by looking for a class with tree_root: true. Defaults to None.
- Returns:
A validation report
- Return type:
ValidationReport
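The following sketch shows one way a loader might be combined with validate_source, using only the constructors and methods listed on this page. The helper name validate_csv and the file names are hypothetical, and the imports are deferred so the function can be defined without linkml installed.

```python
from typing import Any, Optional


def validate_csv(
    schema_path: str,
    csv_path: str,
    target_class: Optional[str] = None,
    *,
    skip_empty_rows: bool = True,
) -> Any:
    """Validate every row of a CSV file and return the validation report.

    `validate_csv` is a hypothetical helper, not part of linkml.
    """
    # Deferred imports: linkml is only required when the helper is called.
    from linkml.validator import Validator
    from linkml.validator.loaders import CsvLoader
    from linkml.validator.plugins import JsonschemaValidationPlugin

    validator = Validator(schema_path, validation_plugins=[JsonschemaValidationPlugin()])
    loader = CsvLoader(csv_path, skip_empty_rows=skip_empty_rows)
    return validator.validate_source(loader, target_class)


# Example call (both file names are placeholders):
# report = validate_csv("personinfo.yaml", "people.csv", "Person")
```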
- linkml.validator.validate(instance: Any, schema: str | dict | SchemaDefinition, target_class: str | None = None, *, strict: bool = False) → ValidationReport
Validate a data instance against a schema
This function provides a simple interface for basic validation of a single instance using a JSON Schema validator. To have more control over the type of validation performed, see the Validator class.
- Parameters:
instance – The instance to validate
schema – The schema used to validate the instance. If it is a string, it will be interpreted as a path, URL, or other loadable location. If it is a dict, it should be compatible with SchemaDefinition; otherwise it should be a SchemaDefinition instance.
target_class – Name of the class within the schema to validate against. If None, the class will be inferred from the schema by looking for a class with tree_root: true. Defaults to None.
strict – If True, validation will stop after the first validation error is found. Otherwise, all validation problems will be reported. Defaults to False.
- Raises:
ValueError – If a valid SchemaDefinition cannot be constructed from the schema parameter.
- Returns:
A validation report
- Return type:
ValidationReport
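For one-off checks, the module-level validate function can be wrapped so that invalid data raises an exception. This is a sketch under two assumptions: the report has a results list, and each result has a message attribute. The helper name require_valid is hypothetical.

```python
from typing import Any, Optional


def require_valid(instance: Any, schema: Any, target_class: Optional[str] = None) -> None:
    """Raise ValueError listing every problem if the instance is not valid.

    Hypothetical helper; `results` and `message` are assumed attribute names.
    """
    # Deferred import: linkml is only required when the helper is called.
    from linkml.validator import validate

    report = validate(instance, schema, target_class)
    if report.results:
        raise ValueError("; ".join(result.message for result in report.results))


# Example call (the schema file name is a placeholder):
# require_valid({"name": "Ada"}, "personinfo.yaml", "Person")
```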
- linkml.validator.validate_file(file: str | bytes | PathLike, schema: str | dict | SchemaDefinition, target_class: str | None = None, *, strict: bool = False) → ValidationReport
Validate instances loaded from a file against a schema
This function provides a simple interface for basic validation of instances loaded from a file using a JSON Schema validator. Loading is done according to the file’s extension. Accepted file extensions are: .csv, .tsv, .yaml, .yml, and .json. Individual rows of CSV and TSV files are treated as instances to validate. Each document within a YAML file is treated as an individual instance to validate. If the top level of a JSON file is an array, each element of the array is treated as an instance to validate. Otherwise, if the top level is an object, it is treated as a single instance to validate.
To have more control over the type of validation performed, see the Validator class.
- Parameters:
file – Path-like object of the file to be read
schema – The schema used to validate the instance. If it is a string, it will be interpreted as a path, URL, or other loadable location. If it is a dict, it should be compatible with SchemaDefinition; otherwise it should be a SchemaDefinition instance.
target_class – Name of the class within the schema to validate against. If None, the class will be inferred from the schema by looking for a class with tree_root: true. Defaults to None.
strict – If True, validation will stop after the first validation error is found. Otherwise, all validation problems will be reported. Defaults to False.
- Returns:
A validation report
- Return type:
ValidationReport
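The JSON rule described above (an array yields one instance per element, an object yields a single instance) can be sketched with the standard library alone. This is an illustrative stand-in, not the code linkml uses, and the function name iter_json_instances is hypothetical. The text does not say what happens for other top-level values, so rejecting them here is an assumption.

```python
import json
from typing import Any, Iterator


def iter_json_instances(text: str) -> Iterator[Any]:
    """Yield the instances contained in a JSON document (illustrative sketch)."""
    data = json.loads(text)
    if isinstance(data, list):
        # Top-level array: every element is its own instance.
        yield from data
    elif isinstance(data, dict):
        # Top-level object: the whole document is one instance.
        yield data
    else:
        # Assumption of this sketch: other top-level values are rejected.
        raise ValueError("top level of the JSON document must be an array or an object")
```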
Loaders
The linkml.validator.loaders package contains classes which are responsible for yielding data instances from a source. Instances of these classes are passed to linkml.validator.Validator.validate_source() and linkml.validator.Validator.iter_results_from_source().
- class linkml.validator.loaders.CsvLoader(source, *, skip_empty_rows: bool = False, index_slot_name: str | None = None)
A loader for instances serialized as CSV
- Parameters:
skip_empty_rows – If True, skip empty rows instead of yielding empty dicts. Defaults to False.
index_slot_name – If provided, iter_instances will yield one dict where all rows of the CSV file are collected into a list with index_slot_name as the key. If None, iter_instances will yield each row as a dict individually. Defaults to None.
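To make the effect of skip_empty_rows and index_slot_name concrete, here is a standard-library sketch that mimics the documented behaviour on in-memory text. It is not the loader's actual implementation. The function name is hypothetical, values are left as strings, and dropping empty cells (so that an empty row becomes an empty dict) is an assumption made to match the description above.

```python
import csv
import io
from typing import Any, Dict, Iterator, Optional


def iter_delimited_instances(
    text: str,
    *,
    delimiter: str = ",",
    skip_empty_rows: bool = False,
    index_slot_name: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Illustrative stand-in for the documented loader options (not linkml code)."""

    def rows() -> Iterator[Dict[str, Any]]:
        for raw in csv.DictReader(io.StringIO(text), delimiter=delimiter):
            # Assumption: empty cells are dropped, so an empty row becomes {}.
            row = {k: v for k, v in raw.items() if k is not None and v not in ("", None)}
            if skip_empty_rows and not row:
                continue
            yield row

    if index_slot_name is not None:
        # One dict that wraps every row in a list under the index slot.
        yield {index_slot_name: list(rows())}
    else:
        # Each row is yielded as its own instance.
        yield from rows()
```

Passing delimiter="\t" gives the corresponding behaviour for tab-separated text.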
- class linkml.validator.loaders.JsonLoader(source)
A loader for instances serialized as JSON
- Parameters:
source – Path or URL to JSON source
- class linkml.validator.loaders.Loader(source)
Abstract base class for instance data loaders.
Subclasses must implement the iter_instances method.
- Parameters:
source – Path or URL to load instances from
- class linkml.validator.loaders.TsvLoader(source, *, skip_empty_rows: bool = False, index_slot_name: str | None = None)
A loader for instances serialized as TSV
- Parameters:
skip_empty_rows – If True, skip empty rows instead of yielding empty dicts. Defaults to False.
index_slot_name – If provided, iter_instances will yield one dict where all rows of the TSV file are collected into a list with index_slot_name as the key. If None, iter_instances will yield each row as a dict individually. Defaults to None.
Plugins
The linkml.validator.plugins package contains classes that perform the actual validation work on data instances. Instances of these classes should be provided when constructing a linkml.validator.Validator instance.
- class linkml.validator.plugins.JsonschemaValidationPlugin(*, closed: bool = False, include_range_class_descendants: bool = True, json_schema_path: PathLike | None = None)
A validation plugin which validates instances using a JSON Schema validator.
- Parameters:
closed – If True, additional properties are not allowed on instances. Defaults to False.
include_range_class_descendants – If True, use an open world assumption and allow the range of a slot to be any descendant of the declared range. Note that if the range of a slot has a type designator, descendants will always be included. Defaults to True.
json_schema_path – If provided, JSON Schema will not be generated from the schema; instead, it will be read from this path. In this case the value of the closed argument is disregarded and the open- or closed-ness of the existing JSON Schema is taken as-is.
- class linkml.validator.plugins.PydanticValidationPlugin(closed: bool = False)
A validation plugin which validates instances using a Pydantic validator.
Note that this plugin provides less complete validation than JsonschemaValidationPlugin. Also, due to the nature of Pydantic, it will fail fast on errors and only report the first error found.
For general use cases, JsonschemaValidationPlugin is recommended. However, this plugin may be useful in some scenarios:
You are using it in a pipeline to ensure objects will be valid for loading into Pydantic.
You are exploring the relative capabilities of Pydantic and JSON Schema validation.
Pydantic is faster for your use case (to be tested).
- Parameters:
closed – If True, additional properties are not allowed on instances. Defaults to False.
CLI
linkml-validate
Validate data according to a LinkML Schema
linkml-validate [OPTIONS] [DATA_SOURCES]...
Options
- -s, --schema <schema>
Schema file to validate data against
- -C, --target-class <target_class>
Class within the schema to validate data against
- --config <config>
Validation configuration YAML file.
- --exit-on-first-failure
Exit after the first validation failure is found. If not specified, all validation failures are reported.
- --legacy-mode
Use legacy linkml-validate behavior.
- -m, --module <module>
[DEPRECATED: only used in legacy mode] Path to Python datamodel module
- -f, --input-format <input_format>
[DEPRECATED: only used in legacy mode] Input format. Inferred from input suffix if not specified
- Options:
yml | yaml | json | rdf | ttl | json-ld | csv | tsv
- -S, --index-slot <index_slot>
[DEPRECATED: only used in legacy mode] Top-level slot. Required for CSV dumping/loading
- --include-range-class-descendants, --no-range-class-descendants
[DEPRECATED: only used in legacy mode] When handling range constraints, include all descendants of the range class instead of just the range class
- -D, --include-context, --no-include-context
Include additional context when reporting validation errors.
- Default:
False
- -V, --version
Show the version and exit.
Arguments
- DATA_SOURCES
Optional argument(s)
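When the command is driven from a script, the argument list can be assembled programmatically. The sketch below uses only options listed above. The helper name build_linkml_validate_command and the file names are hypothetical, and actually running the command assumes linkml-validate is available on the PATH.

```python
import shlex
from typing import List, Optional, Sequence


def build_linkml_validate_command(
    schema: str,
    data_sources: Sequence[str],
    *,
    target_class: Optional[str] = None,
    exit_on_first_failure: bool = False,
) -> List[str]:
    """Assemble an argument list for linkml-validate (hypothetical helper)."""
    command = ["linkml-validate", "--schema", schema]
    if target_class is not None:
        command += ["--target-class", target_class]
    if exit_on_first_failure:
        command.append("--exit-on-first-failure")
    # Data sources are positional and come after the options.
    command.extend(data_sources)
    return command


# Example (file names are placeholders):
# command = build_linkml_validate_command("personinfo.yaml", ["data.yaml"], target_class="Person")
# print(shlex.join(command))
# The list can then be handed to subprocess.run(command).
```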