Part 9: Working with SQL databases¶
LinkML allows you to work with SQL/Relational databases in a number of different ways.
SQL Schemas (DDL) can be generated directly from LinkML schemas
Optional Object Relational Mapping (ORM) layers can be generated
Currently only Python / SQL Alchemy ORM layers are supported
SQL Backends can be used directly to store and retrieve data
No coding required
Currently only SQLite is supported
Converting LinkML Schemas to DDL¶
Example Schema¶
We will use the example from the previous tutorial (examples/tutorial/tutorial07/personinfo.yaml).
personinfo.yaml:
id: https://w3id.org/linkml/examples/personinfo
name: personinfo
prefixes:
linkml: https://w3id.org/linkml/
schema: http://schema.org/
personinfo: https://w3id.org/linkml/examples/personinfo/
ORCID: https://orcid.org/
imports:
- linkml:types
default_range: string
classes:
Person:
class_uri: schema:Person
slots: ## specified as a list
- id
- full_name
- aliases
- phone
- age
id_prefixes:
- ORCID
Container:
attributes:
persons:
multivalued: true
inlined_as_list: true
range: Person
# slots are first-class entities in the metamodel
# declaring them here allows them to be reused elsewhere
slots:
id:
identifier: true
full_name:
required: true
description:
name of the person
slot_uri: schema:name
aliases:
multivalued: true
description:
other names for the person
phone:
pattern: "^[\\d\\(\\)\\-]+$"
slot_uri: schema:telephone
age:
range: integer
minimum_value: 0
maximum_value: 200
data.yaml:
persons:
- id: ORCID:1234
full_name: Clark Kent
age: 33
phone: 555-555-5555
- id: ORCID:4567
full_name: Lois Lane
age: 34
Generating SQL CREATE TABLE statements¶
Use the gen-sqltables command to make a SQL schema:
gen-sqltables personinfo.yaml
Outputs:
-- # Class: Person
-- * Slot: id
-- * Slot: full_name Description: name of the person
-- * Slot: phone
-- * Slot: age
-- * Slot: Container_id Description: Autocreated FK slot
-- # Class: Container
-- * Slot: id
-- # Class: Person_aliases
-- * Slot: Person_id Description: Autocreated FK slot
-- * Slot: aliases Description: other names for the person
CREATE TABLE "Container" (
id INTEGER NOT NULL,
PRIMARY KEY (id)
);
CREATE INDEX "ix_Container_id" ON "Container" (id);
CREATE TABLE "Person" (
id TEXT NOT NULL,
full_name TEXT NOT NULL,
phone TEXT,
age INTEGER,
"Container_id" INTEGER,
PRIMARY KEY (id),
FOREIGN KEY("Container_id") REFERENCES "Container" (id)
);
CREATE INDEX "ix_Person_id" ON "Person" (id);
CREATE TABLE "Person_aliases" (
"Person_id" TEXT,
aliases TEXT,
PRIMARY KEY ("Person_id", aliases),
FOREIGN KEY("Person_id") REFERENCES "Person" (id)
);
CREATE INDEX "ix_Person_aliases_aliases" ON "Person_aliases" (aliases);
CREATE INDEX "ix_Person_aliases_Person_id" ON "Person_aliases" ("Person_id");
Using a SQL database as a backend for data¶
You can store any data that has a LinkML schema in a database using the linkml-sqldb command, and the dump subcommand:
linkml-sqldb dump -s personinfo.yaml --db persons.db data.yaml
This will create a SQLite database persons.db (you don’t have to worry about creating the schema, this is handled automatically)
Data can be retrieved from the database using the load subcommand:
linkml-sqldb load -s personinfo.yaml --db persons.db -o data_out.yaml
This will export the data from the database back into YAML format.
Alternatively, data can be queried directly from the database via sqlite3, e. g. the following command
sqlite3 persons.db "SELECT * FROM Person"
or alternatively using the sqlite3 module from Python (if you don’t have sqlite3 installed)
python -m sqlite3 persons.db "SELECT * FROM Person"
The coummand will result in:
ORCID:1234|Clark Kent|555-555-5555|33|1
ORCID:4567|Lois Lane||34|1
Currently the sqldb command doesn’t allow complex querying. For that you need to either write SQL as above, or use code as in an ORM layer.
Object Relational Mapping (ORMs)¶
Object Relational Mapping (ORM) layers provide a bridge between an Object-Oriented (OO) representation of database and a Relational Database. Examples of ORMs are:
Hibernate (Java)
SQL Alchemy (Python)
ActiveRecord (Ruby)
The LinkML metamodel has aspects of OO modeling, so ORMs can be useful here. Note that ORMs can be divisive among developers, with some believe ORMs to be add unnecessary complexity, and others finding them indispensable. Use of ORMs is completely optional with LinkML, but if you are using a relational database management system (RDBMS) you may find them useful.
Currently the only ORM directly supported is SQL Alchemy.
SQL Alchemy (advanced)¶
SQL Alchemy is a SQL framework for python. It has a core layer, and an ORM layer.
You can generate SQL Alchemy classes using:
gen-sqla personinfo.yaml
See the SQLTables generator docs for more details.