CLI Reference
Complete command-line interface documentation.
Main Command
linkml-reference-validator [OPTIONS] COMMAND [ARGS]...
Options
- --help - Show help message and exit
Commands
- lookup - Look up reference metadata (quick lookups)
- validate - Validate supporting text against references
- repair - Repair supporting text validation errors
- cache - Manage reference cache
lookup
Look up reference metadata and content. Useful for quick "what is this PMID?" lookups.
Usage
linkml-reference-validator lookup [OPTIONS] REFERENCE_ID [REFERENCE_ID...]
Arguments
- REFERENCE_ID (required) - One or more reference IDs (e.g., PMID:12345678, DOI:10.1234/example)
Options
- --format, -f [md|json|yaml|text] - Output format (default: md)
- --no-cache - Bypass disk cache and fetch fresh from source
- --download-files, -D - Download supplementary files from repositories (e.g., Zenodo)
- --cache-dir PATH, -c PATH - Directory for caching references (default: references_cache)
- --config PATH - Path to validation configuration file (.yaml)
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Basic lookup:
linkml-reference-validator lookup PMID:16888623
Multiple references:
linkml-reference-validator lookup PMID:16888623 PMID:33505029
JSON output:
linkml-reference-validator lookup PMID:16888623 --format json
YAML output:
linkml-reference-validator lookup PMID:16888623 --format yaml
Text output (human-readable):
linkml-reference-validator lookup PMID:16888623 --format text
Force fresh fetch (bypass cache):
linkml-reference-validator lookup PMID:16888623 --no-cache
Zenodo DOI with supplementary files:
linkml-reference-validator lookup DOI:10.5281/zenodo.7961621
Download supplementary files:
linkml-reference-validator lookup -D DOI:10.5281/zenodo.7961621
Output Format
Markdown (default):
---
reference_id: PMID:16888623
title: MUC1 oncoprotein blocks nuclear targeting...
authors:
- Raina D
- Ahmad R
journal: Cancer Research
year: '2006'
doi: 10.1158/0008-5472.CAN-06-0205
keywords:
- Adaptor Proteins, Signal Transducing/metabolism
- Cell Line, Tumor
content_type: abstract_only
---
# MUC1 oncoprotein blocks nuclear targeting...
**Authors:** Raina D, Ahmad R, ...
**Journal:** Cancer Research (2006)
**DOI:** [10.1158/...](https://doi.org/10.1158/...)
## Content
1. Cancer Res. 2006 Jul 1;66(13):6715-21...
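Scripts that consume the Markdown output often need only one field from the YAML front matter. The following is a minimal sketch, assuming the output was saved to a file and that scalar fields occupy a single line as in the sample above. `front_matter_field` is a hypothetical helper, not something the CLI provides.

```shell
# Hypothetical helper (not part of the CLI): print one scalar field from the
# YAML front matter of saved Markdown output.
# Assumed usage:
#   linkml-reference-validator lookup PMID:16888623 > ref.md
#   front_matter_field doi < ref.md
front_matter_field() {
    awk -v key="$1" '
        /^---[[:space:]]*$/ { fences++; next }   # count the --- delimiter lines
        fences >= 2 { exit }                      # stop once the front matter closes
        fences == 1 {
            prefix = key ": "
            if (index($0, prefix) == 1) {         # line starts with "key: "
                print substr($0, length(prefix) + 1)
                exit
            }
        }
    '
}
```

Values are returned exactly as written, so a quoted scalar such as the year keeps its quotes, and list fields such as authors are not handled by this sketch.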
Text format:
Reference: PMID:16888623
Title: MUC1 oncoprotein blocks nuclear targeting...
Authors: Raina D, Ahmad R, ...
Journal: Cancer Research (2006)
DOI: 10.1158/0008-5472.CAN-06-0205
Keywords: Adaptor Proteins, Signal Transducing/metabolism, Cell Line, Tumor, ...
Content type: abstract_only
--- Content ---
1. Cancer Res. 2006 Jul 1;66(13):6715-21...
With supplementary files (Zenodo):
Reference: DOI:10.5281/zenodo.7961621
Title: Gene Ontology Curators AI Workshop
Authors: Dickinson R, Carbon S, Mungall CJ
...
Content type: abstract_only
--- Supplementary Files (3) ---
- Dickinson_Varenna2022.pdf (1,975,995 bytes)
- workshop_slides.pptx (2,345,678 bytes)
- data_analysis.xlsx (123,456 bytes)
--- Content ---
...
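The supplementary file listing in the text format is regular enough to post-process. As a rough sketch, assuming the layout shown above (a `--- Supplementary Files (N) ---` header followed by `- name (size bytes)` lines), the hypothetical helper below prints each file name with its size as a plain integer.

```shell
# Hypothetical helper (not part of the CLI): list supplementary files from
# saved text-format output as "name<TAB>bytes".
# Assumed usage:
#   linkml-reference-validator lookup DOI:10.5281/zenodo.7961621 --format text > ref.txt
#   supplementary_files < ref.txt
supplementary_files() {
    awk '
        /^--- Supplementary Files/ { in_files = 1; next }   # section starts
        /^--- /                    { in_files = 0 }         # any other section ends it
        in_files && /^- .* \([0-9,]+ bytes\)$/ {
            size = $(NF - 1)                 # e.g. "(1,975,995"
            gsub(/[(,]/, "", size)           # strip "(" and thousands separators
            name = $0
            sub(/^- /, "", name)             # drop the bullet
            sub(/ \([0-9,]+ bytes\)$/, "", name)
            printf "%s\t%s\n", name, size
        }
    '
}
```

The tab-separated output can be summed or filtered with standard tools, for example to check total download size before running with -D.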
Exit Codes
- 0 - At least one reference fetched successfully
- 1 - All references failed to fetch
validate
Validate supporting text against references.
linkml-reference-validator validate COMMAND [ARGS]...
Subcommands
- text - Validate a single text quote
- text-file - Validate supporting text extracted from a text file via regex
- data - Validate supporting text in data files
validate text
Validate a single supporting text quote against a reference.
Usage
linkml-reference-validator validate text [OPTIONS] TEXT REFERENCE_ID
Arguments
- TEXT (required) - The supporting text to validate
- REFERENCE_ID (required) - Reference ID (e.g., PMID:12345678 or DOI:10.1234/example)
Options
- --title, -t TEXT - Expected title to validate against the reference title
- --cache-dir PATH - Directory for caching references (default: references_cache)
- --config PATH - Path to validation configuration file (.yaml)
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Basic validation:
linkml-reference-validator validate text \
"MUC1 oncoprotein blocks nuclear targeting" \
PMID:16888623
With custom cache directory:
linkml-reference-validator validate text \
"MUC1 oncoprotein blocks nuclear targeting" \
PMID:16888623 \
--cache-dir /path/to/cache
With verbose output:
linkml-reference-validator validate text \
"MUC1 oncoprotein blocks nuclear targeting" \
PMID:16888623 \
--verbose
With title check:
linkml-reference-validator validate text \
"Airway epithelial brushings" \
GEO:GSE67472 \
--title "Airway epithelial gene expression in asthma versus healthy controls"
With editorial notes:
linkml-reference-validator validate text \
'MUC1 [mucin 1] oncoprotein blocks nuclear targeting' \
PMID:16888623
With ellipsis:
linkml-reference-validator validate text \
"MUC1 oncoprotein ... nuclear targeting" \
PMID:16888623
With DOI:
linkml-reference-validator validate text \
"Nanometre-scale thermometry" \
DOI:10.1038/nature12373
Exit Codes
- 0 - Validation successful
- 1 - Validation failed
Output Format
Validating text against PMID:16888623...
Text: MUC1 oncoprotein blocks nuclear targeting
Result:
Valid: True
Message: Supporting text validated successfully in PMID:16888623
Matched text: MUC1 oncoprotein blocks nuclear targeting...
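The exit code is the most direct signal in scripts, but saved output can also be inspected after the fact. The sketch below assumes the `Key: value` layout shown above, possibly indented. `result_field` is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical helper (not part of the CLI): print the value of a
# "Key: value" line from saved `validate text` output, ignoring indentation.
# Assumed usage:
#   linkml-reference-validator validate text "..." PMID:16888623 > result.txt
#   result_field Valid < result.txt
result_field() {
    awk -v key="$1" '
        { sub(/^[[:space:]]+/, "") }              # drop leading indentation
        index($0, key ": ") == 1 {                # line starts with "Key: "
            print substr($0, length(key) + 3)     # text after "Key: "
            exit
        }
    '
}
```

Only the first matching line is printed, and a key that does not appear yields empty output.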
validate text-file
Validate supporting text in a text file by extracting quotes and references with a regex.
Usage
linkml-reference-validator validate text-file [OPTIONS] FILE_PATH
Arguments
- FILE_PATH (required) - Path to a text file (e.g., OBO, plain text)
Options
- --regex, -r TEXT (required) - Regular expression with capture groups for text and reference ID
- --text-group, -t INTEGER - Capture group number for supporting text (default: 1)
- --ref-group, -R INTEGER - Capture group number for reference ID (default: 2)
- --summary, -s - Show only summary statistics (skip per-line output)
- --cache-dir PATH, -c PATH - Directory for caching references (default: references_cache)
- --config PATH - Path to validation configuration file (.yaml)
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Validate OBO axiom annotations:
linkml-reference-validator validate text-file my_ontology.obo \
--regex 'ex:supporting_text="([^"]*)\[(\S+:\S+)\]"' \
--text-group 1 \
--ref-group 2
Summary only:
linkml-reference-validator validate text-file my_ontology.obo \
--regex 'ex:supporting_text="([^"]*)\[(\S+:\S+)\]"' \
--summary
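Before running a regex over a whole file, it can help to check which parts of a line land in group 1 (supporting text) and group 2 (reference ID). The sketch below approximates the pattern from the examples with `sed -E`. The sample line is hypothetical, `[^[:space:]]` stands in for `\S` because POSIX extended regular expressions lack that shorthand, and the validator's own regex engine may differ in edge cases.

```shell
# Hypothetical sample line; real annotations in an OBO file may be laid out differently.
line='ex:supporting_text="MUC1 oncoprotein blocks nuclear targeting[PMID:16888623]"'

# Print what capture groups 1 and 2 of the example pattern would hold.
# [^[:space:]] replaces \S, which sed -E does not portably support.
extract_groups() {
    sed -n -E 's/.*ex:supporting_text="([^"]*)\[([^[:space:]]+:[^[:space:]]+)\]".*/text=\1 ref=\2/p'
}

printf '%s\n' "$line" | extract_groups
# prints: text=MUC1 oncoprotein blocks nuclear targeting ref=PMID:16888623
```

Lines that do not match the pattern produce no output, which makes it easy to spot annotations the regex would miss.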
Exit Codes
- 0 - Validation successful
- 1 - Validation failed
validate data
Validate supporting text in data files against their cited references.
Usage
linkml-reference-validator validate data [OPTIONS] DATA_FILE
Arguments
- DATA_FILE (required) - Path to data file (YAML/JSON)
Options
- --schema PATH, -s PATH (required) - Path to LinkML schema file
- --target-class TEXT, -t TEXT - Target class to validate (optional)
- --cache-dir PATH, -c PATH - Directory for caching references (default: references_cache)
- --config PATH - Path to validation configuration file (.yaml)
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Basic validation:
linkml-reference-validator validate data \
data.yaml \
--schema schema.yaml
With target class:
linkml-reference-validator validate data \
data.yaml \
--schema schema.yaml \
--target-class Statement
With custom cache:
linkml-reference-validator validate data \
data.yaml \
--schema schema.yaml \
--cache-dir /path/to/cache
With verbose output:
linkml-reference-validator validate data \
data.yaml \
--schema schema.yaml \
--verbose
Exit Codes
- 0 - All validations passed
- 1 - One or more validations failed
Output Format
Success:
Validating data.yaml against schema schema.yaml
Cache directory: references_cache
Validation Summary:
Total checks: 3
All validations passed!
Failure:
Validating data.yaml against schema schema.yaml
Cache directory: references_cache
Validation Issues (2):
[ERROR] Text part not found as substring: 'MUC1 activates JAK-STAT'
Location: Statement
Validation Summary:
Total checks: 3
Issues found: 2
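When validation output is archived (for example as a CI log), the issue count can be recovered from the summary. This is a small sketch assuming the `Issues found: N` line shown in the failure output above and its absence on success. `issues_found` is a hypothetical helper.

```shell
# Hypothetical helper (not part of the CLI): print the number from the
# "Issues found:" line of saved `validate data` output, or 0 if the line
# is absent (as in the success output).
# Assumed usage:
#   linkml-reference-validator validate data data.yaml --schema schema.yaml > report.txt
#   issues_found < report.txt
issues_found() {
    awk '
        { sub(/^[[:space:]]+/, "") }             # tolerate indented lines
        /^Issues found: [0-9]+/ { n = $3 }       # third field is the count
        END { print n + 0 }                      # unset n prints as 0
    '
}
```

For pass/fail decisions the command's exit code remains the simpler check. Parsing is mainly useful for reporting.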
repair
Repair supporting text validation errors.
linkml-reference-validator repair COMMAND [ARGS]...
Subcommands
- text - Repair a single text quote
- data - Repair supporting text in data files
repair text
Attempt to repair a single supporting text quote.
Usage
linkml-reference-validator repair text [OPTIONS] TEXT REFERENCE_ID
Arguments
- TEXT (required) - The supporting text to repair
- REFERENCE_ID (required) - Reference ID (e.g., PMID:12345678 or DOI:10.1234/example)
Options
- --cache-dir PATH, -c PATH - Directory for caching references
- --config PATH - Path to configuration file (.yaml)
- --verbose, -v - Verbose output with detailed logging
- --auto-fix-threshold FLOAT, -a FLOAT - Minimum similarity for auto-fixes (default: 0.95)
- --help - Show help message
Examples
Repair character normalization:
linkml-reference-validator repair text \
"CO2 levels were measured" \
PMID:12345678
With verbose output:
linkml-reference-validator repair text \
"protein functions in cells" \
PMID:12345678 \
--verbose
Exit Codes
- 0 - Repair successful or already valid
- 1 - Could not repair
Output Format
Successful repair:
Attempting repair for PMID:12345678...
Text: CO2 levels were measured
Result:
✓ Repaired successfully
Original: CO2 levels were measured
Repaired: CO₂ levels were measured
Action: CHARACTER_NORMALIZATION (Character normalization fix)
Confidence: HIGH
Already valid:
Result:
✓ Text already valid - no repair needed
Could not repair:
Result:
✗ Could not repair: Flagged for removal - text not found in reference
Suggestion: REMOVAL
Confidence: VERY_LOW (12%)
repair data
Repair supporting text in data files.
Usage
linkml-reference-validator repair data [OPTIONS] DATA_FILE
Arguments
- DATA_FILE (required) - Path to data file (YAML)
Options
- --schema PATH, -s PATH (required) - Path to LinkML schema file
- --target-class TEXT, -t TEXT - Target class to validate
- --dry-run / --no-dry-run, -n / -N - Show changes without applying (default: dry-run)
- --auto-fix-threshold FLOAT, -a FLOAT - Minimum similarity for auto-fixes (default: 0.95)
- --output PATH, -o PATH - Output file path (default: overwrite with backup)
- --config PATH - Path to configuration file (.yaml)
- --cache-dir PATH, -c PATH - Directory for caching references
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Dry run (default):
linkml-reference-validator repair data \
disease.yaml \
--schema schema.yaml \
--dry-run
Apply repairs:
linkml-reference-validator repair data \
disease.yaml \
--schema schema.yaml \
--no-dry-run
Output to new file:
linkml-reference-validator repair data \
disease.yaml \
--schema schema.yaml \
--no-dry-run \
--output repaired.yaml
With configuration file:
linkml-reference-validator repair data \
disease.yaml \
--schema schema.yaml \
--config .linkml-reference-validator.yaml
Custom threshold:
linkml-reference-validator repair data \
disease.yaml \
--schema schema.yaml \
--auto-fix-threshold 0.98 \
--no-dry-run
Exit Codes
- 0 - Repair completed (may have suggestions)
- 1 - Repair completed but has removals or unverifiable items
Output Format
[DRY RUN] Repairing disease.yaml
Schema: schema.yaml
Auto-fix threshold: 0.95
Cache directory: references_cache
Found 5 evidence item(s) to process
============================================================
Repair Report
============================================================
HIGH CONFIDENCE FIXES (auto-applicable):
PMID:12345678 at evidence[0]:
Character normalization fix
'CO2 levels...' → 'CO₂ levels...'
SUGGESTED FIXES (review recommended):
PMID:23456789 at evidence[1]:
Inserted ellipsis between non-contiguous parts
RECOMMENDED REMOVALS (low confidence):
PMID:34567890 at evidence[2]:
Similarity: 8%
Snippet: 'Fabricated text...'
------------------------------------------------------------
Summary:
Total items: 5
Already valid: 2
Auto-fixes: 1
Suggestions: 1
Removals: 1
Unverifiable: 0
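The Summary block at the end of the report lends itself to machine-readable extraction, for instance to track counts across dry runs. The sketch below assumes the `Label: N` lines shown above. `repair_summary` is a hypothetical helper, and the key names it emits are this example's own convention.

```shell
# Hypothetical helper (not part of the CLI): convert the Summary block of a
# saved repair report into key=value lines,
# e.g. "Already valid: 2" becomes "already_valid=2".
# Assumed usage:
#   linkml-reference-validator repair data disease.yaml --schema schema.yaml > report.txt
#   repair_summary < report.txt
repair_summary() {
    awk '
        { sub(/^[[:space:]]+/, "") }               # tolerate indented lines
        /^Summary:/ { in_summary = 1; next }       # counts follow this line
        in_summary && /^[A-Za-z -]+: [0-9]+$/ {
            split($0, kv, ": ")                    # kv[1] = label, kv[2] = count
            key = tolower(kv[1])
            gsub(/[ -]/, "_", key)                 # spaces and hyphens to underscores
            print key "=" kv[2]
        }
    '
}
```

A non-zero `removals` or `unverifiable` count corresponds to the cases in which the command exits with code 1.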
Configuration File
Create .linkml-reference-validator.yaml for project-specific settings. Use
the validation section for reference fetching behavior and repair for
auto-fix settings.
validation:
  reference_prefix_map:
    geo: GEO
    NCBIGeo: GEO
repair:
  # Confidence thresholds
  auto_fix_threshold: 0.95
  suggest_threshold: 0.80
  removal_threshold: 0.50
  # Character mappings
  character_mappings:
    "+/-": "±"
    "CO2": "CO₂"
    "H2O": "H₂O"
  # References to skip
  skip_references:
    - "PMID:12345678"
  # References trusted despite low similarity
  trusted_low_similarity:
    - "PMID:98765432"
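To get a feel for what the character mappings in this example express, the substitutions can be imitated with `sed`. This is only an illustration of the key-to-value replacement. It does not consult the reference text, and how the tool itself decides when to apply a mapping is not shown here.

```shell
# Illustrative only (not how the tool is implemented): apply the three
# example mappings as plain left-to-right substitutions.
apply_character_mappings() {
    sed -e 's|+/-|±|g' -e 's|CO2|CO₂|g' -e 's|H2O|H₂O|g'
}

printf '%s\n' 'CO2 levels were measured' | apply_character_mappings
# prints: CO₂ levels were measured
```

The result matches the "Repaired" line in the repair text output example, which is the kind of fix these mappings are meant to enable.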
cache
Manage reference cache.
linkml-reference-validator cache COMMAND [ARGS]...
Subcommands
- reference - Cache a reference for offline use
- lookup - Show the cache path for a reference (or print file contents)
cache reference
Pre-fetch and cache a reference for offline use.
Usage
linkml-reference-validator cache reference [OPTIONS] REFERENCE_ID
Arguments
- REFERENCE_ID (required) - Reference ID (e.g., PMID:12345678 or DOI:10.1234/example)
Options
- --cache-dir PATH, -c PATH - Directory for caching references (default: references_cache)
- --config PATH - Path to validation configuration file (.yaml)
- --force, -f - Force re-fetch even if cached
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Cache a reference:
linkml-reference-validator cache reference PMID:16888623
Force refresh:
linkml-reference-validator cache reference \
PMID:16888623 \
--force
Custom cache directory:
linkml-reference-validator cache reference \
PMID:16888623 \
--cache-dir /path/to/cache
Cache a DOI:
linkml-reference-validator cache reference DOI:10.1038/nature12373
Output Format
Fetching PMID:16888623...
Successfully cached PMID:16888623
Title: MUC1 oncoprotein blocks nuclear targeting...
Authors: Raina D, Ahmad R, Joshi MD
Content type: abstract_only
Content length: 1523 characters
cache lookup
Show the cached file path for a reference, or print the cached file contents.
Usage
linkml-reference-validator cache lookup [OPTIONS] REFERENCE_ID
Arguments
- REFERENCE_ID (required) - Reference ID (e.g., PMID:12345678)
Options
- --content - Show file contents instead of just the path
- --no-cache - Bypass disk cache and fetch fresh from source
- --cache-dir PATH, -c PATH - Directory for caching references (default: references_cache)
- --config PATH - Path to validation configuration file (.yaml)
- --verbose, -v - Verbose output with detailed logging
- --help - Show help message
Examples
Show cache path:
linkml-reference-validator cache lookup PMID:16888623
Print cached content:
linkml-reference-validator cache lookup PMID:16888623 --content
Refresh then show path:
linkml-reference-validator cache lookup PMID:16888623 --no-cache
Reference ID Formats
PubMed (PMID)
PMID:12345678
PMID:9876543
- Numeric identifier only
- Fetches abstract and metadata from NCBI
PubMed Central (PMC)
PMC:3458566
PMC:7654321
- Numeric identifier only
- Fetches full-text when available
DOI (Digital Object Identifier)
DOI:10.1038/nature12373
DOI:10.1126/science.1234567
- Standard DOI format (10.prefix/suffix)
- Fetches metadata from Crossref API
- Abstract availability depends on publisher
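A script that collects reference IDs from mixed sources may want to sanity-check them before calling the CLI. The sketch below covers only the three formats described in this section. `reference_id_kind` is a hypothetical helper, and an `unknown` result does not mean the tool rejects the ID, since other prefixes (such as the GEO example earlier) also appear in this reference.

```shell
# Hypothetical helper (not part of the CLI): report which of the formats
# documented in this section a reference ID matches.
reference_id_kind() {
    case $1 in
        PMID: | PMID:*[!0-9]*) echo unknown ;;   # empty or non-numeric suffix
        PMID:*)                echo pmid ;;
        PMC: | PMC:*[!0-9]*)   echo unknown ;;
        PMC:*)                 echo pmc ;;
        DOI:10.?*/?*)          echo doi ;;       # 10.prefix/suffix
        *)                     echo unknown ;;
    esac
}

reference_id_kind PMID:12345678            # prints: pmid
reference_id_kind DOI:10.1038/nature12373  # prints: doi
```

The patterns are deliberately strict about the numeric suffix and loose about the DOI suffix, which mirrors how the formats are described above.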
Configuration Notes
- The CLI currently does not read environment variables for cache dir or NCBI API keys.
- Use --cache-dir or set cache_dir in .linkml-reference-validator.yaml.
- Set email in .linkml-reference-validator.yaml for NCBI requests.
Shell Integration
Exit Code Usage
if linkml-reference-validator validate text \
    "MUC1 oncoprotein blocks nuclear targeting" \
    PMID:16888623 > /dev/null 2>&1; then
    echo "✓ Valid"
else
    echo "✗ Invalid"
fi
Batch Processing
for pmid in PMID:111 PMID:222 PMID:333; do
    echo "Validating $pmid..."
    linkml-reference-validator validate text \
        "some text" \
        "$pmid"
done
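A batch loop is more useful in CI when it also reports an overall result. The following sketch wraps the same `validate text` call in a function that tallies failures using the documented exit codes. `validate_all` and the `VALIDATOR` override are hypothetical conveniences introduced here so the function can be exercised without network access.

```shell
# Hypothetical wrapper: validate one quote against several references,
# print a line per reference, and return non-zero if any validation failed.
# VALIDATOR is an assumed override for testing; it defaults to the real CLI.
validate_all() {
    quote=$1
    shift
    failures=0
    for ref in "$@"; do
        if ${VALIDATOR:-linkml-reference-validator} validate text "$quote" "$ref" \
            > /dev/null 2>&1; then
            echo "ok   $ref"
        else
            echo "FAIL $ref"
            failures=$((failures + 1))
        fi
    done
    [ "$failures" -eq 0 ]    # exit status reflects the whole batch
}

# Assumed usage:
#   validate_all "some text" PMID:111 PMID:222 PMID:333 || echo "some references failed"
```

Because the function's status summarizes the batch, it can be used directly in an `if` or with `||`, just like a single CLI call.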
Piping Output
# Save output to file
linkml-reference-validator validate text \
"..." PMID:12345678 \
> validation_result.txt
# Grep for specific info
linkml-reference-validator validate data \
data.yaml \
--schema schema.yaml \
| grep -F "[ERROR]"
Backward Compatibility
Old hyphenated commands still work but are deprecated:
# Old (deprecated but working)
linkml-reference-validator validate-text "..." PMID:123
linkml-reference-validator validate-data data.yaml --schema schema.yaml
linkml-reference-validator cache-reference PMID:123
# New (preferred)
linkml-reference-validator validate text "..." PMID:123
linkml-reference-validator validate data data.yaml --schema schema.yaml
linkml-reference-validator cache reference PMID:123
The old commands are hidden from --help but continue to function.
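Existing scripts can be moved to the new command names mechanically. The sketch below rewrites the three deprecated forms listed above when they directly follow the program name. `migrate_commands` is a hypothetical helper, and its output is best reviewed before replacing any file.

```shell
# Hypothetical helper: rewrite deprecated hyphenated subcommands read from
# stdin into the preferred two-word form. Only the three old names listed
# in this section are handled.
migrate_commands() {
    sed -E \
        -e 's/(linkml-reference-validator[[:space:]]+)validate-text([[:space:]]|$)/\1validate text\2/g' \
        -e 's/(linkml-reference-validator[[:space:]]+)validate-data([[:space:]]|$)/\1validate data\2/g' \
        -e 's/(linkml-reference-validator[[:space:]]+)cache-reference([[:space:]]|$)/\1cache reference\2/g'
}

# Assumed usage:
#   migrate_commands < old_script.sh > new_script.sh
```

Anchoring on the program name keeps options such as --cache-dir untouched, since only the subcommand position is rewritten.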
See Also
- Quickstart - Get started quickly
- Tutorial 1 - CLI examples
- Python API - Programmatic usage