Binding Validation Reference
This reference covers validation of binding constraints on nested objects—ensuring that fields within complex objects satisfy enum range constraints, with optional label validation.
Overview
Binding validation uses the BindingValidationPlugin to validate that:
- Fields within nested objects satisfy their binding range constraints
- (Optionally) Labels match the ontology's canonical labels
Bindings are essential when your data uses complex objects (like OntologyTerm with id and label) rather than simple CURIE strings.
CLI Usage
# Basic binding validation
linkml-term-validator validate-data data.yaml --schema schema.yaml
# With label validation (anti-hallucination)
linkml-term-validator validate-data data.yaml -s schema.yaml --labels
# With target class
linkml-term-validator validate-data data.yaml -s schema.yaml -t GeneAnnotation --labels
# With custom OAK configuration
linkml-term-validator validate-data data.yaml -s schema.yaml --oak-config oak_config.yaml --labels
CLI Options
| Option | Description |
|---|---|
--schema, -s |
Path to LinkML schema (required) |
--target-class, -t |
Target class for validation |
--labels |
Also validate labels against ontology |
--oak-adapter |
OAK adapter string (default: sqlite:obo:) |
--oak-config |
Path to OAK configuration file |
--cache-dir |
Directory for cache files (default: cache) |
--verbose / -v |
Enable verbose output |
Binding Syntax
Basic Binding
classes:
GeneAnnotation:
attributes:
go_term:
range: OntologyTerm
bindings:
- binds_value_of: id # Field to constrain
range: BiologicalProcessEnum # Enum defining valid values
Binding Properties
| Property | Required | Description |
|---|---|---|
binds_value_of |
Yes | The field path within the nested object |
range |
Yes | The enum (static or dynamic) defining allowed values |
obligation_level |
No | REQUIRED, RECOMMENDED, or OPTIONAL |
Obligation Levels
| Level | Description | Validation Behavior |
|---|---|---|
REQUIRED |
Must satisfy constraint | ERROR on violation |
RECOMMENDED |
Should satisfy constraint | WARN on violation |
OPTIONAL |
May satisfy constraint | No validation |
bindings:
- binds_value_of: id
range: BiologicalProcessEnum
obligation_level: REQUIRED # Default if not specified
Label Field Detection
The plugin needs to know which field contains the label to validate. There are three mechanisms:
1. Using implements (Recommended)
Use implements to declare the label field:
classes:
OntologyTerm:
attributes:
id:
identifier: true
name:
implements:
- rdfs:label # Declares this is the label field
2. Using slot_uri
Alternatively, use slot_uri to declare the field's semantic meaning:
classes:
OntologyTerm:
attributes:
id:
identifier: true
label:
slot_uri: rdfs:label # Also declares this is the label field
Supported Label Properties
| Property | Description |
|---|---|
rdfs:label |
Standard RDF label |
skos:prefLabel |
SKOS preferred label |
schema:name |
Schema.org name |
oboInOwl:hasExactSynonym |
OBO exact synonym |
3. Convention-Based (Fallback)
If no implements or slot_uri is declared, the plugin falls back to looking for a field named label:
classes:
OntologyTerm:
attributes:
id:
identifier: true
label: # Detected by convention
range: string
Nested Structure Validation
The plugin recursively validates all nesting levels, not just the top-level class. This is critical for real-world schemas.
Example: Deeply Nested Structure
Schema:
classes:
Study:
attributes:
samples:
range: Sample
multivalued: true
inlined_as_list: true
Sample:
attributes:
annotations:
range: Annotation
multivalued: true
inlined_as_list: true
Annotation:
attributes:
term:
range: OntologyTerm
bindings: # ← Binding at nested level
- binds_value_of: id
range: AnnotationTermEnum
Data:
samples:
- annotations:
- term:
id: GO:0007049 # ← Validated!
label: cell cycle
Error message with path:
ERROR: Value 'GO:9999999' not in enum 'AnnotationTermEnum'
path: samples[0].annotations[1].term
slot: term
field: id
Multivalued Slots
Both the parent slot and the nested objects can be multivalued:
classes:
Disease:
attributes:
affected_tissues:
range: TissueDescriptor
multivalued: true # Multiple tissues
inlined_as_list: true
bindings:
- binds_value_of: id
range: AnatomyEnum
Each item in the list is validated independently:
affected_tissues:
- id: UBERON:0000955 # brain - validated
label: brain
- id: UBERON:0000948 # heart - validated
label: heart
- id: PIZZA:MARGHERITA # INVALID - not anatomy
label: delicious
Complete Example
Schema
id: https://example.org/annotation-schema
name: annotation-schema
prefixes:
GO: http://purl.obolibrary.org/obo/GO_
rdfs: http://www.w3.org/2000/01/rdf-schema#
linkml: https://w3id.org/linkml/
classes:
GeneAnnotation:
attributes:
gene_id:
range: string
identifier: true
process:
range: GOTerm
bindings:
- binds_value_of: id
range: BiologicalProcessEnum
location:
range: GOTerm
bindings:
- binds_value_of: id
range: CellularComponentEnum
GOTerm:
attributes:
id:
identifier: true
label:
implements:
- rdfs:label
enums:
BiologicalProcessEnum:
reachable_from:
source_ontology: sqlite:obo:go
source_nodes:
- GO:0008150 # biological_process
relationship_types:
- rdfs:subClassOf
CellularComponentEnum:
reachable_from:
source_ontology: sqlite:obo:go
source_nodes:
- GO:0005575 # cellular_component
relationship_types:
- rdfs:subClassOf
Valid Data
gene_id: BRCA1
process:
id: GO:0007049 # cell cycle - is a biological process
label: cell cycle
location:
id: GO:0005634 # nucleus - is a cellular component
label: nucleus
Invalid Data
gene_id: BRCA1
process:
id: GO:0005634 # nucleus - WRONG! Not a process
label: nucleus
location:
id: GO:0007049 # cell cycle - WRONG! Not a component
label: cell cycle
Validation Commands
# Binding validation only
linkml-term-validator validate-data annotations.yaml -s schema.yaml -t GeneAnnotation
# Output: 2 binding errors (wrong enum values)
# With label validation
linkml-term-validator validate-data annotations.yaml -s schema.yaml -t GeneAnnotation --labels
# Also validates that labels match ontology
Python API
from linkml.validator import Validator
from linkml_runtime.loaders import yaml_loader
from linkml_term_validator.plugins import BindingValidationPlugin
# Create plugin with label validation
plugin = BindingValidationPlugin(
oak_adapter_string="sqlite:obo:",
validate_labels=True, # Enable label checking
cache_labels=True,
cache_dir="cache",
)
# Create validator
validator = Validator(
schema="schema.yaml",
validation_plugins=[plugin]
)
# Validate
loader = yaml_loader.YamlLoader()
report = validator.validate_source(
loader,
"data.yaml",
target_class="GeneAnnotation"
)
# Check results
for result in report.results:
print(f"{result.severity.name}: {result.message}")
Plugin Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
oak_adapter_string |
str |
"sqlite:obo:" |
Default OAK adapter |
oak_config_path |
str \| None |
None |
Path to OAK config file |
validate_labels |
bool |
False |
Also validate labels |
cache_labels |
bool |
True |
Enable file-based caching |
cache_dir |
str |
"cache" |
Cache directory |
Error Messages
Binding Violation
ERROR: Value 'GO:0005634' not in enum 'BiologicalProcessEnum'
path: process
slot: process
field: id
Label Mismatch (with --labels)
ERROR: Label mismatch for GO:0007049
Expected (from data): "Cell Cycle"
Found (from ontology): "cell cycle"
path: process.label
Nested Path Example
ERROR: Value 'GO:9999999' not in enum 'CellularComponentEnum'
path: samples[0].annotations[2].term
slot: term
field: id
Anti-Hallucination Use Case
When validating AI-generated data, enable label validation to catch hallucinated terms:
# AI might generate plausible-looking but wrong data:
# {
# "id": "GO:0007049",
# "label": "DNA repair" # WRONG! Actual label is "cell cycle"
# }
plugin = BindingValidationPlugin(validate_labels=True)
# This will catch the mismatch!
See Anti-Hallucination Guardrails for more details.
Combining with DynamicEnumPlugin
For comprehensive validation, use both plugins:
from linkml.validator import Validator
from linkml_term_validator.plugins import (
DynamicEnumPlugin,
BindingValidationPlugin,
)
plugins = [
DynamicEnumPlugin(), # Direct enum slots
BindingValidationPlugin(validate_labels=True), # Nested object bindings
]
validator = Validator(schema="schema.yaml", validation_plugins=plugins)
- DynamicEnumPlugin: Validates slots that directly use dynamic enum ranges
- BindingValidationPlugin: Validates fields within nested objects via bindings
Common Patterns
Reusable Term Class
classes:
Term:
attributes:
id:
identifier: true
label:
implements:
- rdfs:label
GeneAnnotation:
attributes:
process:
range: Term
bindings:
- binds_value_of: id
range: ProcessEnum
component:
range: Term
bindings:
- binds_value_of: id
range: ComponentEnum
Slot Usage Override
Override bindings in subclasses:
classes:
Annotation:
attributes:
term:
range: Term
GeneAnnotation:
is_a: Annotation
slot_usage:
term:
bindings:
- binds_value_of: id
range: GOTermEnum
See Also
- Bindings Explained - Conceptual overview
- Data Validation - Dynamic enum validation
- Anti-Hallucination Guardrails - AI validation
- Plugin Reference - Complete API reference