LinkML-Map tutorial: Compiling to SQL

This tutorial walks through use of the SQL compiler

import yaml

Creating an example schema

We will use a LinkML SchemaBuilder object to progressively build up a schema, adding additional features as we go.

We'll start with a simple Person schema, with a few single valued scalar slots:

from linkml_runtime.utils.schema_builder import SchemaBuilder
from linkml_runtime.linkml_model import SlotDefinition

sb = SchemaBuilder()
sb.add_class("Person", slots=[SlotDefinition("family_name", range="string"), 
                              SlotDefinition("given_name", range="string"),
                              SlotDefinition("age_in_years", range="integer"),
                              SlotDefinition("height_in_cm", range="float"),
                              ])
sb.add_defaults()
print(yaml.dump(sb.as_dict(), sort_keys=False))

Creating a Transformer Session object

We will use a Session object which conveniently wraps a number of different capabilities; the first of these capabilities is to map (transform) data objects from one schema to another (implicit) schema using a transformer specification).

Our initial transformer specification will be a trivial isomorphic one that:

  • maps the Person class to an Individual class
  • passes through name fields as-is
  • renames measurement fields (age_in_years and height_in_cm to age and height)
from linkml_map.session import Session

session = Session()
session.set_source_schema(sb.as_dict())
# Transformer specification (in YAML)
session.set_object_transformer("""
class_derivations:
  Individual:
    populated_from: Person
    slot_derivations:
      family_name:
        populated_from: family_name
      given_name:
        populated_from: given_name
      age:
        populated_from: age_in_years
      height:
        populated_from: height_in_cm
""")

Compiling target schema to SQL INSERTs

The LinkML-Map framework can infer the target schema from the mapping. We will take the inferred schema and write SQL INSERT statements for it.

from linkml_map.compiler.sql_compiler import SQLCompiler

compiler = SQLCompiler(source_schemaview=session.source_schemaview)
print(compiler.create_target_ddl(session.transformer_specification))
CREATE TABLE IF NOT EXISTS Individual (
  family_name TEXT,
  given_name TEXT,
  age INTEGER,
  height REAL
);

print(compiler.create_ddl(session.source_schemaview))
CREATE TABLE IF NOT EXISTS Person (
  family_name TEXT,
  given_name TEXT,
  age_in_years INTEGER,
  height_in_cm REAL
);

compiled = compiler.compile(session.transformer_specification)
print(compiled.serialization)
INSERT INTO Individual SELECT 
  family_name AS family_name, 
  given_name AS given_name, 
  age AS age_in_years, 
  height AS height_in_cm FROM Individual;

compiler.new_table_when_transforming = True
compiled = compiler.compile(session.transformer_specification)
print(compiled.serialization)
CREATE TABLE IF NOT EXISTS Individual 
  family_name AS family_name, 
  given_name AS given_name, 
  age AS age_in_years, 
  height AS height_in_cm); FROM Individual;

import duckdb
con = duckdb.connect(database=':memory:')