Schema Linter

When authoring a schema – especially when it is large or there are many authors – it is important to establish and adhere to best practices. For example, while providing a description for each schema element is not required, descriptions can help reduce miscommunication between schema authors. LinkML provides a configurable linter to identify violations of best practices or error-prone patterns.

Introduction

The linkml-lint command performs checks on a schema which – while valid – may not represent best practices or may indicate a likely mistake in the schema. These checks are referred to as rules. Without additional configuration, a default set of rules is used.

To lint a single schema file:

linkml-lint schema.yaml

To recursively lint a directory of schema files:

linkml-lint schemas

Configuration

The linkml-lint command can be configured with a YAML configuration file. The configuration file can be provided using the --config command line option:

linkml-lint --config myconfig.yaml myschema.yaml

Alternatively, if a file named .linkmllint.yaml exists in the current working directory when you run linkml-lint, that file is automatically loaded as the configuration file. If all or most of the schema files in a project can be checked with the same rules, storing the configuration at the root of the project in a .linkmllint.yaml file makes it more convenient to check them without passing the --config option.

Here is an example of a configuration file which includes all of the recommended rules and enables an additional rule named no_empty_title:

# Use all of the recommended rules and also enable the no_empty_title rule
extends: recommended
rules:
  no_empty_title:
    level: error

The extends field of the configuration file allows you to inherit from an existing configuration. Currently the only valid value for this field is recommended.

The rules field is a dictionary where each key is a rule name and the value is the configuration for that rule. Every rule has at least one configurable property: level. Each rule’s level can be set to disabled (the default), meaning the schema will not be checked with that rule, or to warning or error. If set to warning or error, the schema will be checked with that rule and violations will be reported. The distinction between warning and error is solely cosmetic and is designed to help you prioritize issues to fix. If you are unsure, there is no harm in only using the error level. Some rules have additional configuration beyond the level property, as described in the Rules section.

To use the recommended configuration set except for one of the rules, manually specify level: disabled for that rule:

# Use the recommended rule set except for the standard_naming rule
extends: recommended
rules:
  standard_naming:
    level: disabled

It is also acceptable to not extend the recommended set. Simply omit the extends field from your configuration. In that case, only the rules that your configuration enables are checked:

# Only the no_empty_title and standard_naming rules will be checked
rules:
  no_empty_title:
    level: error
  standard_naming:
    level: error

Rules

Rule names denoted with a star ⭐ are part of the recommended configuration set.

canonical_prefixes ⭐

Enforce canonical prefixes by verifying that the mappings defined in the schema’s prefixes slot agree with those provided by the prefixmaps package.

Additional Configuration

  • prefixmaps_contexts: The list of context names which will be loaded by the prefixmaps library to do the validation. The order of names is meaningful and will be preserved. See also: prefixmaps documentation. Default: [merged]
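As an illustration, the following configuration sketch sets prefixmaps_contexts alongside level. It assumes that additional options sit next to level under the rule name, and that obo is a context name known to the prefixmaps library; check the prefixmaps documentation for the names available in your installation.

```yaml
# Sketch: check prefixes against the "obo" context first, then "merged".
# "obo" is an assumed context name; the order of the list is preserved.
rules:
  canonical_prefixes:
    level: error
    prefixmaps_contexts:
      - obo
      - merged
```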

no_empty_title

Disallow empty titles on schema elements.

no_invalid_slot_usage ⭐

Disallow slot_usage definitions where the name of the slot does not refer to an existing slot of the class.
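A schema fragment along the following lines would be expected to trigger this rule. The class and slot names are hypothetical and the fragment omits the rest of the schema (id, prefixes, and so on).

```yaml
# Hypothetical schema fragment
classes:
  Person:
    slots:
      - name
    slot_usage:
      name:
        required: true   # fine: name is a slot of Person
      age:
        required: true   # reported: age is not a slot of Person
slots:
  name:
    range: string
```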

no_xsd_int_type ⭐

Disallow use of uri: xsd:int in type definitions. In nearly all cases, xsd:integer should be used instead.
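For example, in a schema fragment like the one below, the first type definition would be expected to be reported and the second would not. The type names are hypothetical.

```yaml
# Hypothetical schema fragment
types:
  count_flagged:
    uri: xsd:int       # reported by no_xsd_int_type
    base: int
  count_ok:
    uri: xsd:integer   # preferred in nearly all cases
    base: int
```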

permissible_values_format

Enforce consistent formatting of enum permissible values. This rule may conflict with the standard_naming rule, but it is more flexible.

Additional Configuration

  • format: The enforced format of enum permissible values. The special values “snake”, “uppersnake”, “camel”, and “kebab” are recognized; any other value is interpreted as a regular expression. Default: uppersnake.
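A minimal configuration sketch for this rule, assuming the format option is placed next to level under the rule name. Because the configuration does not extend the recommended set, standard_naming is not enabled and cannot conflict with it.

```yaml
# Sketch: require snake_case permissible values.
# Any value other than snake, uppersnake, camel, or kebab
# would be treated as a regular expression.
rules:
  permissible_values_format:
    level: error
    format: snake
```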

standard_naming ⭐

Enforce standard naming conventions: CamelCase for classes, snake_case for slots, CamelCase for enums, snake_case (default) or UPPER_SNAKE for permissible_values (see permissible_values_upper_case option). This rule may conflict with the permissible_values_format rule.

Additional Configuration

  • permissible_values_upper_case: If true, permissible values will be checked for UPPER_SNAKE, otherwise snake_case. Default: false.
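As a sketch, the following configuration keeps the recommended set but asks standard_naming to check permissible values for UPPER_SNAKE. It assumes the option is placed next to level under the rule name.

```yaml
# Sketch: recommended rules, with UPPER_SNAKE permissible values
extends: recommended
rules:
  standard_naming:
    level: error
    permissible_values_upper_case: true
```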

tree_root_class

Require a single class with tree_root: true and optionally verify that class’s name.

Additional Configuration

  • validate_existing_class_name: If true, in addition to validating that a tree_root: true ClassDefinition exists, the rule will also validate that it has the name provided by the root_class_name option. Default: false.

  • root_class_name: The name of the root class. Default: Container.
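The two options are meant to be used together. The sketch below assumes they are placed next to level under the rule name, and uses Dataset as a hypothetical root class name.

```yaml
# Sketch: require a tree_root class, and require it to be named Dataset.
# "Dataset" is a hypothetical name; the default is Container.
rules:
  tree_root_class:
    level: error
    validate_existing_class_name: true
    root_class_name: Dataset
```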

Reports

By default, if the linkml-lint command identifies violations of the configured rules it will print the files and issues to the terminal. This behavior can be changed with the --format and --output command line options.

The valid values for --format are terminal (the default), markdown, json, and tsv.

If the --output option is provided the report will be written to the specified file instead of the terminal.

For example, to generate a markdown report in a file named linter-results.md:

linkml-lint --format markdown --output linter-results.md myschema.yaml