LinkML-Map tutorial: Compiling to SQL
This tutorial walks through use of the SQL compiler
import yaml
Creating an example schema
We will use a LinkML SchemaBuilder object to progressively build up a schema, adding additional features as we go.
We'll start with a simple Person schema, with a few single valued scalar slots:
from linkml_runtime.utils.schema_builder import SchemaBuilder
from linkml_runtime.linkml_model import SlotDefinition
sb = SchemaBuilder()
sb.add_class("Person", slots=[SlotDefinition("family_name", range="string"),
SlotDefinition("given_name", range="string"),
SlotDefinition("age_in_years", range="integer"),
SlotDefinition("height_in_cm", range="float"),
])
sb.add_defaults()
print(yaml.dump(sb.as_dict(), sort_keys=False))
Creating a Transformer Session object
We will use a Session object which conveniently wraps a number of different capabilities;
the first of these capabilities is to map (transform) data objects from one schema to another
(implicit) schema using a transformer specification).
Our initial transformer specification will be a trivial isomorphic one that:
- maps the
Personclass to anIndividualclass - passes through
namefields as-is - renames measurement fields (
age_in_yearsandheight_in_cmtoageandheight)
from linkml_map.session import Session
session = Session()
session.set_source_schema(sb.as_dict())
# Transformer specification (in YAML)
session.set_object_transformer("""
class_derivations:
Individual:
populated_from: Person
slot_derivations:
family_name:
populated_from: family_name
given_name:
populated_from: given_name
age:
populated_from: age_in_years
height:
populated_from: height_in_cm
""")
Compiling target schema to SQL INSERTs
The LinkML-Map framework can infer the target schema from the mapping. We will take the inferred schema and write SQL INSERT statements for it.
from linkml_map.compiler.sql_compiler import SQLCompiler
compiler = SQLCompiler(source_schemaview=session.source_schemaview)
print(compiler.create_target_ddl(session.transformer_specification))
print(compiler.create_ddl(session.source_schemaview))
compiled = compiler.compile(session.transformer_specification)
print(compiled.serialization)
compiler.new_table_when_transforming = True
compiled = compiler.compile(session.transformer_specification)
print(compiled.serialization)
import duckdb
con = duckdb.connect(database=':memory:')