linkml_store.api.stores.duckdb package
Adapter for DuckDB embedded database.
Handles have the form:
duckdb:///<path>
for a file-based database
duckdb:///:memory:
for an in-memory database
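The two handle forms can be told apart with plain string handling. The sketch below is illustrative only: `parse_duckdb_handle` is a hypothetical helper, not part of linkml_store, and it assumes that everything after the `duckdb:///` prefix is the path.

```python
from typing import Optional

PREFIX = "duckdb:///"


def parse_duckdb_handle(handle: str) -> Optional[str]:
    """Return the path encoded in a handle, or None for an in-memory database.

    Hypothetical helper for illustration; not the adapter's own parser.
    """
    if not handle.startswith(PREFIX):
        raise ValueError(f"not a duckdb handle: {handle!r}")
    path = handle[len(PREFIX):]
    # ":memory:" marks an in-memory database rather than a file
    return None if path == ":memory:" else path


print(parse_duckdb_handle("duckdb:///tmp/test.db"))  # tmp/test.db
print(parse_duckdb_handle("duckdb:///:memory:"))     # None
```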
- class DuckDBCollection(*args, **kwargs)
Bases:
Collection
- insert(objs, **kwargs)
Add one or more objects to the collection.
>>> from linkml_store import Client
>>> client = Client()
>>> db = client.attach_database("duckdb", alias="test")
>>> collection = db.create_collection("Person")
>>> objs = [{"id": "P1", "name": "John", "age_in_years": 30}, {"id": "P2", "name": "Alice", "age_in_years": 25}]
>>> collection.insert(objs)
- Parameters:
objs (Union[Dict[str, Any], BaseModel, Type, List[Union[Dict[str, Any], BaseModel, Type]]])
kwargs
- Returns:
- delete(objs, **kwargs)
Delete one or more objects from the collection.
First let’s set up a collection:
>>> from linkml_store import Client
>>> client = Client()
>>> db = client.attach_database("duckdb", alias="test")
>>> collection = db.create_collection("Person")
>>> objs = [{"id": "P1", "name": "John", "age_in_years": 30}, {"id": "P2", "name": "Alice", "age_in_years": 25}]
>>> collection.insert(objs)
>>> collection.find({}).num_rows
2
Now let’s delete an object:
>>> collection.delete(objs[0])
>>> collection.find({}).num_rows
1
Deleting the same object again should have no effect:
>>> collection.delete(objs[0])
>>> collection.find({}).num_rows
1
- Parameters:
objs (Union[Dict[str, Any], BaseModel, Type, List[Union[Dict[str, Any], BaseModel, Type]]])
kwargs
- Return type:
Optional[int]
- Returns:
- delete_where(where=None, missing_ok=True, **kwargs)
Delete objects that match a query.
First let’s set up a collection:
>>> from linkml_store import Client
>>> client = Client()
>>> db = client.attach_database("duckdb", alias="test")
>>> collection = db.create_collection("Person")
>>> objs = [{"id": "P1", "name": "John", "age_in_years": 30}, {"id": "P2", "name": "Alice", "age_in_years": 25}]
>>> collection.insert(objs)
Now let’s delete an object:
>>> collection.delete_where({"id": "P1"})
>>> collection.find({}).num_rows
1
Match everything:
>>> collection.delete_where({})
>>> collection.find({}).num_rows
0
- Parameters:
where (Optional[Dict[str, Any]]) – where conditions
missing_ok – if True, do not raise an error if the collection does not exist
kwargs
- Return type:
Optional[int]
- Returns:
number of objects deleted (or -1 if unsupported)
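To make the where-dictionary semantics concrete, here is a minimal sketch of how a dictionary of conditions could be turned into a parameterized DELETE that reports the number of rows removed. It is not the adapter's implementation: it uses the standard-library `sqlite3` module as a stand-in for DuckDB, the free function `delete_where` is hypothetical, and treating each key as an equality test joined with AND is an assumption.

```python
import sqlite3
from typing import Any, Dict, Optional


def delete_where(con: sqlite3.Connection, table: str,
                 where: Optional[Dict[str, Any]] = None) -> int:
    """Delete rows whose columns equal the given values; return the row count.

    Hypothetical helper; sqlite3 stands in for DuckDB here.
    """
    where = where or {}
    sql = f'DELETE FROM "{table}"'
    if where:
        # One equality test per key, joined with AND; values are bound as parameters
        sql += " WHERE " + " AND ".join(f'"{col}" = ?' for col in where)
    cur = con.execute(sql, list(where.values()))
    return cur.rowcount


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Person (id TEXT, name TEXT, age_in_years INTEGER)")
con.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                [("P1", "John", 30), ("P2", "Alice", 25)])

deleted_one = delete_where(con, "Person", {"id": "P1"})  # removes only P1
deleted_rest = delete_where(con, "Person", {})           # empty dict matches everything
remaining = con.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
```

An empty dictionary produces no WHERE clause at all, which is why it matches every row.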
- query_facets(where=None, facet_columns=None, facet_limit=100, **kwargs)
Run a query to get facet counts for one or more columns.
This method takes optional where conditions and a list of column names. It generates and executes a facet count query for each specified column and returns the results as a dictionary keyed by column name, with the facet counts for that column as the values.
The facet count query is generated by modifying the original query’s WHERE clause to exclude conditions directly related to the facet column. This allows for counting the occurrences of each unique value in the facet column while still applying the other filtering conditions.
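The WHERE-clause rewriting described above can be sketched as follows. This is an illustration under stated assumptions, not the adapter's code: `sqlite3` from the standard library stands in for DuckDB, `facet_counts` is a hypothetical function, conditions are simple equality tests, and the table and column names are invented for the example.

```python
import sqlite3
from typing import Any, Dict, List


def facet_counts(con: sqlite3.Connection, table: str, where: Dict[str, Any],
                 facet_columns: List[str], facet_limit: int = 100) -> Dict[str, Dict[Any, int]]:
    """Count values per facet column, ignoring conditions on that same column."""
    results: Dict[str, Dict[Any, int]] = {}
    for col in facet_columns:
        # Drop conditions on the facet column itself; keep all other filters
        conds = {k: v for k, v in where.items() if k != col}
        sql = f'SELECT "{col}", COUNT(*) FROM "{table}"'
        if conds:
            sql += " WHERE " + " AND ".join(f'"{k}" = ?' for k in conds)
        sql += f' GROUP BY "{col}" ORDER BY COUNT(*) DESC, "{col}" LIMIT ?'
        rows = con.execute(sql, [*conds.values(), facet_limit]).fetchall()
        results[col] = dict(rows)
    return results


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Person (id TEXT, occupation TEXT, city TEXT)")
con.executemany("INSERT INTO Person VALUES (?, ?, ?)", [
    ("P1", "engineer", "Paris"),
    ("P2", "engineer", "Oslo"),
    ("P3", "teacher", "Paris"),
    ("P4", "teacher", "Paris"),
])

facets = facet_counts(con, "Person", {"city": "Paris"}, ["city", "occupation"])
# The city facet ignores the city filter, so Oslo is still counted;
# the occupation facet is restricted to rows where city is Paris.
```

Excluding the facet column's own condition is what lets a user interface show alternative values for a filter that is already applied.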
- Parameters:
where – where conditions for the base query
facet_columns (Optional[List[str]]) – A list of column names to get facet counts for.
facet_limit
- Return type:
Dict[str, Dict[str, int]]
- Returns:
A dictionary where keys are column names and values hold the facet counts for each unique value in the respective column.
- class DuckDBDatabase(handle=None, recreate_if_exists=False, **kwargs)
Bases:
Database
An adapter for DuckDB databases.
Note that this adapter does not use a LinkML relational model transformation or a SQLAlchemy ORM layer. Instead, it attempts to map each collection (each of which is typed by a LinkML class) to a single DuckDB table. New tables are not created for nested references, and linking tables are not created for many-to-many relationships.
The native DuckDB ARRAY type is used to store multivalued attributes, and DuckDB JSON types are used for nested inlined objects.
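As a rough illustration of the single-table mapping, the sketch below derives a CREATE TABLE statement from one example object, using an array column for a list value and a JSON column for a nested dictionary. The functions `duckdb_type` and `create_table_sql` are hypothetical, and the specific type names chosen (`VARCHAR`, `BIGINT`, `DOUBLE`, `BOOLEAN`, the `TYPE[]` array syntax) are assumptions for the example, not the adapter's actual mapping table.

```python
from typing import Any, Dict


def duckdb_type(value: Any) -> str:
    """Pick a column type for a Python value (assumed mapping, for illustration)."""
    # bool must be tested before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, list):
        # Multivalued attribute: an array of the first element's type
        elem = duckdb_type(value[0]) if value else "VARCHAR"
        return f"{elem}[]"
    if isinstance(value, dict):
        # Nested inlined object: stored as JSON rather than in its own table
        return "JSON"
    return "VARCHAR"


def create_table_sql(name: str, obj: Dict[str, Any]) -> str:
    """Build one CREATE TABLE statement covering every attribute of obj."""
    cols = ", ".join(f'"{key}" {duckdb_type(val)}' for key, val in obj.items())
    return f'CREATE TABLE "{name}" ({cols})'


person = {
    "id": "P1",
    "name": "John",
    "age_in_years": 30,
    "aliases": ["Johnny"],
    "address": {"city": "Paris"},
}
ddl = create_table_sql("Person", person)
print(ddl)
```

The point of the design is that one object shape yields exactly one table: no child table for `address` and no linking table for `aliases`.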
- collection_class
alias of
DuckDBCollection
- property engine: Engine
- drop(missing_ok=True, **kwargs)
Drop the database and all collections.
>>> from pathlib import Path
>>> from linkml_store.api.client import Client
>>> client = Client()
>>> path = Path("/tmp/test.db")
>>> path.parent.mkdir(exist_ok=True, parents=True)
>>> db = client.attach_database(f"duckdb:///{path}")
>>> db.store({"persons": [{"id": "P1", "name": "John", "age_in_years": 30}]})
>>> coll = db.get_collection("persons")
>>> coll.find({}).num_rows
1
>>> db.drop()
>>> db = client.attach_database("duckdb:///tmp/test.db", alias="test")
>>> coll = db.get_collection("persons")
>>> coll.find({}).num_rows
0
- Parameters:
kwargs – additional arguments
- query(query, **kwargs)
Run a query against the database.
Examples
>>> from linkml_store.api.client import Client
>>> from linkml_store.api.queries import Query
>>> client = Client()
>>> db = client.attach_database("duckdb", alias="test")
>>> collection = db.create_collection("Person")
>>> collection.insert([{"id": "P1", "name": "John"}, {"id": "P2", "name": "Alice"}])
>>> query = Query(from_table="Person", where_clause={"name": "John"})
>>> result = db.query(query)
>>> len(result.rows)
1
>>> result.rows[0]["id"]
'P1'
- Parameters:
query (Query)
kwargs
- Return type:
QueryResult
- init_collections()
Initialize collections.
TODO: Not typically called directly; consider making this private.
- induce_schema_view()
Induce a schema view from a schema definition.
>>> from linkml_store.api.client import Client
>>> from linkml_store.api.queries import Query
>>> client = Client()
>>> db = client.attach_database("duckdb", alias="test")
>>> collection = db.create_collection("Person")
>>> collection.insert([{"id": "P1", "name": "John", "age_in_years": 25},
...                    {"id": "P2", "name": "Alice", "age_in_years": 25}])
>>> schema_view = db.induce_schema_view()
>>> cd = schema_view.get_class("Person")
>>> cd.attributes["id"].range
'string'
>>> cd.attributes["age_in_years"].range
'integer'
- Return type:
SchemaView
- Returns:
A schema view
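The induce_schema_view example shows attribute ranges such as 'string' and 'integer' being derived from inserted data. A rough sketch of that kind of inference, working on plain Python objects, might look like the following. The functions `python_to_range` and `induce_ranges` are hypothetical, the first-non-null-value-wins rule is an assumption, and the real method may well inspect the database tables rather than the objects.

```python
from typing import Any, Dict, Iterable


def python_to_range(value: Any) -> str:
    """Map a Python value to a LinkML built-in type name (assumed mapping)."""
    # bool must be tested before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "string"


def induce_ranges(objs: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Infer one range per attribute from the first non-null value seen."""
    ranges: Dict[str, str] = {}
    for obj in objs:
        for key, value in obj.items():
            if value is None:
                continue  # a null value carries no type information
            ranges.setdefault(key, python_to_range(value))
    return ranges


people = [
    {"id": "P1", "name": "John", "age_in_years": 25},
    {"id": "P2", "name": "Alice", "age_in_years": 25},
]
ranges = induce_ranges(people)
print(ranges)
```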
- export_database(location, target_format=None, **kwargs)
Export a database to a file or location.
>>> from linkml_store.api.client import Client
>>> from linkml_store.utils.format_utils import Format
>>> client = Client()
>>> db = client.attach_database("duckdb", alias="test")
>>> db.import_database("tests/input/iris.csv", Format.CSV, collection_name="iris")
>>> db.export_database("/tmp/iris.yaml", Format.YAML)
- Parameters:
location (str) – location of the file
target_format (Union[str, Format, None]) – target format
kwargs – additional arguments
Submodules
- linkml_store.api.stores.duckdb.duckdb_collection module
- linkml_store.api.stores.duckdb.duckdb_database module
DuckDBDatabase
DuckDBDatabase.collection_class
DuckDBDatabase.__init__()
DuckDBDatabase.engine
DuckDBDatabase.commit()
DuckDBDatabase.close()
DuckDBDatabase.drop()
DuckDBDatabase.query()
DuckDBDatabase.init_collections()
DuckDBDatabase.induce_schema_view()
DuckDBDatabase.export_database()
DuckDBDatabase.import_database()
- linkml_store.api.stores.duckdb.mappings module