Part 9: Working with SQL databases

LinkML allows you to work with SQL/Relational databases in a number of different ways.

  • SQL Schemas (DDL) can be generated directly from LinkML schemas

  • Optional Object Relational Mapping (ORM) layers can be generated

    • Currently only Python / SQL Alchemy ORM layers are supported

  • SQL Backends can be used directly to store and retrieve data

    • No coding required

    • Currently only SQLite is supported

Converting LinkML Schemas to DDL

Example Schema

We will use the example from the previous tutorial (examples/tutorial/tutorial07/personinfo.yaml).

personinfo.yaml:

id: https://w3id.org/linkml/examples/personinfo
name: personinfo
prefixes:
  linkml: https://w3id.org/linkml/
  schema: http://schema.org/
  personinfo: https://w3id.org/linkml/examples/personinfo/
  ORCID: https://orcid.org/
imports:
  - linkml:types
default_range: string

classes:
  Person:
    class_uri: schema:Person
    slots:   ## specified as a list
     - id
     - full_name
     - aliases
     - phone
     - age
    id_prefixes:
      - ORCID
  Container:
    attributes:
      persons:
        multivalued: true
        inlined_as_list: true
        range: Person

# slots are first-class entities in the metamodel
# declaring them here allows them to be reused elsewhere
slots:
  id:
    identifier: true
  full_name:
    required: true
    description:
      name of the person
    slot_uri: schema:name
  aliases:
    multivalued: true
    description:
      other names for the person
  phone:
    pattern: "^[\\d\\(\\)\\-]+$"
    slot_uri: schema:telephone
  age:
    range: integer
    minimum_value: 0
    maximum_value: 200

data.yaml:

persons:
  - id: ORCID:1234
    full_name: Clark Kent
    age: 33
    phone: 555-555-5555
  - id: ORCID:4567
    full_name: Lois Lane
    age: 34

Generating SQL CREATE TABLE statements

Use the gen-sqltables command to make a SQL schema:

gen-sqltables personinfo.yaml

Outputs:

-- # Class: Person
--     * Slot: id
--     * Slot: full_name Description: name of the person
--     * Slot: phone
--     * Slot: age
--     * Slot: Container_id Description: Autocreated FK slot
-- # Class: Container
--     * Slot: id
-- # Class: Person_aliases
--     * Slot: Person_id Description: Autocreated FK slot
--     * Slot: aliases Description: other names for the person

CREATE TABLE "Container" (
	id INTEGER NOT NULL,
	PRIMARY KEY (id)
);
CREATE INDEX "ix_Container_id" ON "Container" (id);

CREATE TABLE "Person" (
	id TEXT NOT NULL,
	full_name TEXT NOT NULL,
	phone TEXT,
	age INTEGER,
	"Container_id" INTEGER,
	PRIMARY KEY (id),
	FOREIGN KEY("Container_id") REFERENCES "Container" (id)
);
CREATE INDEX "ix_Person_id" ON "Person" (id);

CREATE TABLE "Person_aliases" (
	"Person_id" TEXT,
	aliases TEXT,
	PRIMARY KEY ("Person_id", aliases),
	FOREIGN KEY("Person_id") REFERENCES "Person" (id)
);
CREATE INDEX "ix_Person_aliases_aliases" ON "Person_aliases" (aliases);
CREATE INDEX "ix_Person_aliases_Person_id" ON "Person_aliases" ("Person_id");

Using a SQL database as a backend for data

You can store any data that has a LinkML schema in a database using the linkml-sqldb command, and the dump subcommand:

linkml-sqldb dump -s personinfo.yaml --db persons.db data.yaml

This will create a SQLite database persons.db (you don’t have to worry about creating the schema, this is handled automatically)

Data can be retrieved from the database using the load subcommand:

linkml-sqldb load -s personinfo.yaml --db persons.db -o data_out.yaml

This will export the data from the database back into YAML format.

Alternatively, data can be queried directly from the database via sqlite3, e. g. the following command

sqlite3 persons.db "SELECT * FROM Person"

or alternatively using the sqlite3 module from Python (if you don’t have sqlite3 installed)

python -m sqlite3 persons.db "SELECT * FROM Person"

The coummand will result in:

ORCID:1234|Clark Kent|555-555-5555|33|1
ORCID:4567|Lois Lane||34|1

Currently the sqldb command doesn’t allow complex querying. For that you need to either write SQL as above, or use code as in an ORM layer.

Object Relational Mapping (ORMs)

Object Relational Mapping (ORM) layers provide a bridge between an Object-Oriented (OO) representation of database and a Relational Database. Examples of ORMs are:

  • Hibernate (Java)

  • SQL Alchemy (Python)

  • ActiveRecord (Ruby)

The LinkML metamodel has aspects of OO modeling, so ORMs can be useful here. Note that ORMs can be divisive among developers, with some believe ORMs to be add unnecessary complexity, and others finding them indispensable. Use of ORMs is completely optional with LinkML, but if you are using a relational database management system (RDBMS) you may find them useful.

Currently the only ORM directly supported is SQL Alchemy.

SQL Alchemy (advanced)

SQL Alchemy is a SQL framework for python. It has a core layer, and an ORM layer.

You can generate SQL Alchemy classes using:

gen-sqla personinfo.yaml

See the SQLTables generator docs for more details.