LinkML Metamodel Mappings
The primary use case of LinkML-Map is mapping data to data, that is, transforming instance data that conforms to one schema into data that conforms to another. However, because the LinkML metamodel is itself expressed as a LinkML schema, it is also possible to map schemas to and from the metamodel.
NOTE: this workflow is not yet fully mature and is subject to change.
Creation of ad-hoc metamodel
Let's assume that we have data represented using models conforming to an ad-hoc metamodel with core constructs: Schema, Table, and Column. We can create a LinkML schema that represents this metamodel as follows:
import yaml
from linkml_runtime.linkml_model import SlotDefinition
from linkml_runtime.utils.schema_builder import SchemaBuilder

sb = SchemaBuilder()
sb.add_class(
    "Schema",
    tree_root=True,
    slots=[
        SlotDefinition("id", required=True, description="The name of the schema"),
        SlotDefinition("tables", range="Table", multivalued=True, inlined_as_list=True),
    ],
    use_attributes=True,
)
sb.add_class(
    "Table",
    slots=[
        SlotDefinition("id", required=True, description="The name of the table"),
        SlotDefinition("columns", range="Column", multivalued=True, inlined_as_list=True),
    ],
    use_attributes=True,
)
sb.add_class(
    "Column",
    slots=[
        SlotDefinition("id", required=True, description="The name of the column"),
        SlotDefinition("primary_key", range="boolean"),
        SlotDefinition("datatype", range="string"),
    ],
    use_attributes=True,
)
my_metamodel = sb.as_dict()
print(yaml.dump(sb.as_dict(), sort_keys=False))
Example schema conforming to ad-hoc metamodel
Now we'll make an example schema that conforms to this metamodel; this will be a fairly boring schema with a single table Person with three columns: id, name, and description:
my_schema = {
    "id": "my_schema",
    "tables": [
        {
            "id": "Person",
            "columns": [
                {
                    "id": "id",
                    "primary_key": True,
                    "datatype": "integer",
                },
                {
                    "id": "name",
                    "datatype": "string",
                },
                {
                    "id": "description",
                    "datatype": "string",
                },
            ],
        },
    ],
}
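To make "conforms to this metamodel" concrete, here is a minimal sketch that loads a nested dict into plain dataclasses mirroring Schema, Table, and Column. It does not use LinkML validation at all; `load_schema` is a hypothetical helper introduced only for illustration, and the example data is a shortened copy of the schema above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    id: str
    primary_key: Optional[bool] = None
    datatype: Optional[str] = None


@dataclass
class Table:
    id: str
    columns: List[Column] = field(default_factory=list)


@dataclass
class Schema:
    id: str
    tables: List[Table] = field(default_factory=list)


def load_schema(data: dict) -> Schema:
    """Hypothetical helper: build a Schema from a nested dict.

    A missing `id` or an unknown column key raises an exception.
    """
    tables = [
        Table(id=t["id"], columns=[Column(**c) for c in t.get("columns", [])])
        for t in data.get("tables", [])
    ]
    return Schema(id=data["id"], tables=tables)


example = {
    "id": "my_schema",
    "tables": [
        {
            "id": "Person",
            "columns": [
                {"id": "id", "primary_key": True, "datatype": "integer"},
                {"id": "name", "datatype": "string"},
                {"id": "description", "datatype": "string"},
            ],
        },
    ],
}
schema = load_schema(example)
print([c.id for c in schema.tables[0].columns])
```

If a column carried a key the metamodel does not define, `Column(**c)` would raise a `TypeError`, which is a rough stand-in for a conformance failure.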
Mapping to LinkML metamodel
Now we'll create mappings from the ad-hoc metamodel to the LinkML metamodel, where Schema maps to a LinkML SchemaDefinition, Table maps to a LinkML ClassDefinition, and Column maps to a LinkML SlotDefinition.
metamap = {
    "class_derivations": {
        "SchemaDefinition": {
            "populated_from": "Schema",
            "slot_derivations": {
                "name": {
                    "populated_from": "id",
                },
                "id": {
                    "expr": "'https://example.org/' + id",
                },
                "classes": {
                    "populated_from": "tables",
                    "dictionary_key": "name",
                    "cast_collection_as": "MultiValuedDict",
                },
            },
        },
        "ClassDefinition": {
            "populated_from": "Table",
            "slot_derivations": {
                "name": {
                    "populated_from": "id",
                },
                "attributes": {
                    "populated_from": "columns",
                    "dictionary_key": "name",
                    "cast_collection_as": "MultiValuedDict",
                },
            },
        },
        "SlotDefinition": {
            "populated_from": "Column",
            "slot_derivations": {
                "name": {
                    "populated_from": "id",
                },
                "identifier": {
                    "populated_from": "primary_key",
                },
                "range": {
                    "populated_from": "datatype",
                },
            },
        },
    }
}
from linkml_map.session import Session
session = Session()
session.set_source_schema(my_metamodel)
session.set_object_transformer(metamap)
session.transform(my_schema)
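To see what the transformation specification is meant to express, here is a hand-written plain-Python equivalent of the three class derivations. It is only a sketch of the intended result shape, not how linkml-map is implemented, and its output may differ in detail from what `session.transform` returns. The function names `derive_slot`, `derive_class`, and `derive_schema` are hypothetical. The sketch assumes that a collection cast with `dictionary_key` set to `name` becomes a dict keyed by each element's name, and it chooses to omit slots whose source value is absent.

```python
def derive_slot(column: dict) -> dict:
    """Column -> SlotDefinition-like dict."""
    slot = {
        "name": column["id"],
        "identifier": column.get("primary_key"),
        "range": column.get("datatype"),
    }
    # Assumption: drop slots with no source value rather than emitting None.
    return {k: v for k, v in slot.items() if v is not None}


def derive_class(table: dict) -> dict:
    """Table -> ClassDefinition-like dict, attributes keyed by name."""
    slots = [derive_slot(c) for c in table.get("columns", [])]
    return {
        "name": table["id"],
        "attributes": {s["name"]: s for s in slots},
    }


def derive_schema(schema: dict) -> dict:
    """Schema -> SchemaDefinition-like dict, classes keyed by name."""
    classes = [derive_class(t) for t in schema.get("tables", [])]
    return {
        "name": schema["id"],
        # Mirrors the expression 'https://example.org/' + id
        "id": "https://example.org/" + schema["id"],
        "classes": {c["name"]: c for c in classes},
    }


source = {
    "id": "my_schema",
    "tables": [
        {
            "id": "Person",
            "columns": [
                {"id": "id", "primary_key": True, "datatype": "integer"},
                {"id": "name", "datatype": "string"},
            ],
        },
    ],
}
result = derive_schema(source)
print(result["classes"]["Person"]["attributes"]["id"])
```

Writing the mapping out by hand shows why the declarative form is attractive: each function here corresponds to one entry under `class_derivations`, and each key in the returned dict corresponds to one entry under `slot_derivations`.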
Customizing the LinkML model
A different scenario is customizing the existing LinkML metamodel, in particular by adding constraints. For example:
- class and slot names MUST be alphanumeric with no spaces
- every element MUST have a definition
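The constraints listed above can be illustrated with a small plain-Python check over a schema expressed as a nested dict. This is only a sketch of the intended rules, not a customized metamodel. `check_schema` is a hypothetical helper; it applies the naming rule to class names and to the names of their attributes, reads "definition" as the `description` slot, and interprets "alphanumeric" as ASCII letters and digits only. All of these are assumptions.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
ALNUM = re.compile(r"[A-Za-z0-9]+")


def check_schema(schema: dict) -> list:
    """Return a list of human-readable constraint violations."""
    problems = []
    for cls_name, cls in schema.get("classes", {}).items():
        cls = cls or {}
        # Check the class itself, then each of its attributes.
        elements = [(cls_name, cls)]
        elements += list(cls.get("attributes", {}).items())
        for name, element in elements:
            if not ALNUM.fullmatch(name):
                problems.append(f"{name!r}: name is not alphanumeric")
            if not (element or {}).get("description"):
                problems.append(f"{name!r}: missing description")
    return problems


example = {
    "classes": {
        "Person": {
            "description": "A human being",
            "attributes": {
                "full name": {"description": "Name of the person"},
                "age": {},
            },
        },
    },
}
for problem in check_schema(example):
    print(problem)
```

In this example the attribute `full name` fails the naming rule and `age` fails the description rule, while the class `Person` passes both.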
TODO