How to index GO-CAMs with LinkML-Store

[1]:
import pandas as pd
import yaml

path = "input/gocam-models.yaml"
[2]:
# Read every YAML document in the file; the context manager closes the file handle
with open(path) as f:
    models = list(yaml.safe_load_all(f))

Creating a client and attaching to a database

First we will create a client as normal:

[3]:
from linkml_store import Client

client = Client()

Next we’ll attach to a MongoDB instance. This assumes you already have one running.

We will make a database called “gocams” and recreate it if it already exists.

(Note for people running this notebook locally: if you happen to have a database with this name in your current MongoDB instance, it will be deleted!)

[4]:
db = client.attach_database("mongodb://localhost:27017/gocams", "gocams", recreate_if_exists=True)

Creating a collection

We’ll create a simple test collection. A collection in linkml-store maps directly to a MongoDB collection.

[5]:
collection = db.create_collection("main", recreate_if_exists=True)

Inserting objects into the store

We’ll use the standard insert method to insert the GO-CAMs into the collection. At this stage there is no explicit schema.

[6]:
collection.insert(models)

Check contents

We can check the number of rows in the collection, to ensure everything was inserted correctly:

[7]:
collection.find({}, limit=1).num_rows
[7]:
793
[8]:
assert collection.find({}, limit=1).num_rows == len(models)
[9]:
qr = collection.find({"taxon": "NCBITaxon:6239"}, limit=3)
qr.rows_dataframe
[9]:
id title taxon status comments activities objects
0 gomodel:568b0f9600000284 Antibacterial innate immune response in the in... NCBITaxon:6239 production [Automated change 2023-03-16: RO:0002212 repla... [{'id': 'gomodel:568b0f9600000284/57ec3a7e0000... [{'id': 'WB:WBGene00006599', 'label': 'tpa-1 C...
1 gomodel:5b528b1100000489 XBP-1 is a cell-nonautonomous regulator of str... NCBITaxon:6239 production [Automated change 2023-03-16: RO:0002213 repla... [{'id': 'gomodel:5b528b1100000489/5b528b110000... [{'id': 'WB:WBGene00006959', 'label': 'xbp-1 C...
2 gomodel:5b91dbd100002057 Antifungal innate immune response in the hypod... NCBITaxon:6239 production NaN [{'id': 'gomodel:5b91dbd100002057/5b91dbd10000... [{'id': 'WB:WBGene00010700', 'label': 'nipi-3 ...

Next we’ll query on a nested field, using path notation to find models that have an activity enabled by a specific gene (WB:WBGene00006575):

[10]:
qr = collection.find({"activities.enabled_by": "WB:WBGene00006575"}, limit=3)
qr.rows_dataframe
[10]:
id title taxon status comments activities objects
0 gomodel:568b0f9600000284 Antibacterial innate immune response in the in... NCBITaxon:6239 production [Automated change 2023-03-16: RO:0002212 repla... [{'id': 'gomodel:568b0f9600000284/57ec3a7e0000... [{'id': 'WB:WBGene00006599', 'label': 'tpa-1 C...
1 gomodel:5b91dbd100002057 Antifungal innate immune response in the hypod... NCBITaxon:6239 production NaN [{'id': 'gomodel:5b91dbd100002057/5b91dbd10000... [{'id': 'WB:WBGene00010700', 'label': 'nipi-3 ...

The query returns two models, each containing an activity enabled by WB:WBGene00006575.
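To make the dotted path concrete: `activities.enabled_by` reaches into each element of the `activities` list, and a model matches if any of its activities carries the requested value. The sketch below mimics that behaviour in plain Python over hand-made dictionaries. `get_path_values` and `matches` are hypothetical helpers written for illustration; they are not part of linkml-store or MongoDB, and the toy models are not real GO-CAMs.

```python
from typing import Any, Iterator


def get_path_values(obj: Any, path: str) -> Iterator[Any]:
    """Yield every value reachable via a dotted path, descending into lists."""
    if path == "":
        # End of the path: a list-valued leaf contributes each of its elements.
        if isinstance(obj, list):
            yield from obj
        else:
            yield obj
        return
    key, _, rest = path.partition(".")
    if isinstance(obj, list):
        # A list is transparent: apply the same path to each element.
        for item in obj:
            yield from get_path_values(item, path)
    elif isinstance(obj, dict) and key in obj:
        yield from get_path_values(obj[key], rest)


def matches(doc: dict, path: str, value: Any) -> bool:
    """True if any value found at `path` equals `value`."""
    return any(v == value for v in get_path_values(doc, path))


# Toy data (hypothetical, only the nesting mirrors the GO-CAM documents)
toy_models = [
    {"id": "model:1", "activities": [{"enabled_by": "WB:WBGene00006575"},
                                     {"enabled_by": "WB:WBGene00006599"}]},
    {"id": "model:2", "activities": [{"enabled_by": "MGI:MGI:109200"}]},
    {"id": "model:3"},  # no activities at all
]

hits = [m["id"] for m in toy_models
        if matches(m, "activities.enabled_by", "WB:WBGene00006575")]
print(hits)
```

A model without an `activities` field simply yields no values at that path, so it never matches.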

Query faceting

We will now demonstrate faceted queries, allowing us to count the number of instances of different categorical values or categorical value combinations.

First we’ll facet on taxon, and then on the combination of taxon and status:

[11]:
collection.query_facets({}, facet_columns=["taxon"])
[11]:
{'taxon': [('NCBITaxon:9606', 541),
  ('NCBITaxon:10090', 185),
  ('NCBITaxon:4896', 15),
  ('NCBITaxon:7955', 14),
  ('NCBITaxon:7227', 13),
  ('NCBITaxon:559292', 6),
  ('NCBITaxon:9823', 4),
  ('NCBITaxon:6239', 4),
  ('NCBITaxon:5074', 1),
  ('NCBITaxon:1735992', 1),
  ('NCBITaxon:229533', 1),
  ('NCBITaxon:1403190', 1),
  ('NCBITaxon:8355', 1),
  ('NCBITaxon:425011', 1),
  ('NCBITaxon:28576', 1),
  ('NCBITaxon:602072', 1),
  ('NCBITaxon:8364', 1),
  ('NCBITaxon:227321', 1),
  ('NCBITaxon:99287', 1)]}
[12]:
collection.query_facets({}, facet_columns=[("taxon", "status")])
[12]:
{('taxon',
  'status'): [({'taxon': 'NCBITaxon:9606', 'status': 'production'},
   541), ({'taxon': 'NCBITaxon:10090',
    'status': 'production'}, 185), ({'taxon': 'NCBITaxon:4896', 'status': 'production'},
   15), ({'taxon': 'NCBITaxon:7955', 'status': 'production'},
   14), ({'taxon': 'NCBITaxon:7227',
    'status': 'production'}, 13), ({'taxon': 'NCBITaxon:559292', 'status': 'production'},
   6), ({'taxon': 'NCBITaxon:6239', 'status': 'production'},
   4), ({'taxon': 'NCBITaxon:9823',
    'status': 'production'}, 4), ({'taxon': 'NCBITaxon:227321', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:8364', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:1735992',
    'status': 'production'}, 1), ({'taxon': 'NCBITaxon:8355', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:602072', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:229533',
    'status': 'production'}, 1), ({'taxon': 'NCBITaxon:1403190', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:425011', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:5074',
    'status': 'production'}, 1), ({'taxon': 'NCBITaxon:28576', 'status': 'production'},
   1), ({'taxon': 'NCBITaxon:99287', 'status': 'production'}, 1)]}
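Conceptually, a facet is a count of rows per distinct value, and a combined facet is a count per distinct tuple of values. The following plain-Python sketch shows that idea with `collections.Counter`; `facet_counts` is a hypothetical helper, not the linkml-store implementation, and the toy rows (including the "development" status) are made up for illustration.

```python
from collections import Counter

# Toy rows (hypothetical); only the field names follow the GO-CAM models
toy_models = [
    {"taxon": "NCBITaxon:9606", "status": "production"},
    {"taxon": "NCBITaxon:9606", "status": "production"},
    {"taxon": "NCBITaxon:10090", "status": "production"},
    {"taxon": "NCBITaxon:6239", "status": "development"},
]


def facet_counts(rows, columns):
    """Count rows per value (str column) or per value combination (tuple of columns)."""
    if isinstance(columns, str):
        counter = Counter(row.get(columns) for row in rows)
    else:
        counter = Counter(tuple(row.get(c) for c in columns) for row in rows)
    # most_common() sorts by descending count, like the output above
    return counter.most_common()


single = facet_counts(toy_models, "taxon")
combined = facet_counts(toy_models, ("taxon", "status"))
print(single)
print(combined)
```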

We can also facet by molecular function term, using path notation to reach into the nested activities. We’ll restrict this to the top 20:

[13]:
collection.query_facets({}, facet_columns=["activities.molecular_function.term"], facet_limit=20)

[13]:
{'activities.molecular_function.term': [(['GO:0004930',
    'GO:0019706',
    'GO:0004674'],
   2),
  (['GO:0048018', 'GO:0038023', 'GO:0004252', 'GO:0004252'], 2),
  (['GO:0008083', 'GO:0016167', 'GO:0004714'], 2),
  (['GO:0004714', 'GO:0004713', 'GO:0048018'], 2),
  (['GO:0005179', 'GO:0016500', 'GO:0005125'], 2),
  (['GO:0061630', 'GO:0004879', 'GO:0030374'], 2),
  (['GO:0140311', 'GO:0003700', 'GO:0140311', 'GO:1990756'], 1),
  (['GO:0004674', 'GO:0061665', 'GO:1990931', 'GO:0070139', 'GO:0004674'], 1),
  (['GO:0060090',
    'GO:0004674',
    'GO:0060090',
    'GO:0060090',
    'GO:0061630',
    'GO:0060090',
    'GO:0061630'],
   1),
  (['GO:0061630', 'GO:0170011', 'GO:0061630', 'GO:0061630'], 1),
  (['GO:0140463', 'GO:0140463', 'GO:0004674', 'GO:0004674', 'GO:0003887'], 1),
  (['GO:0038023', 'GO:0060090', 'GO:0060090', 'GO:0140311'], 1),
  (['GO:0003846', 'GO:0004144', 'GO:0005488'], 1),
  (['GO:0019706', 'GO:0140693', 'GO:0003690'], 1),
  (['GO:0004810', 'GO:0004521', 'GO:0004549', 'GO:1904678'], 1),
  (['GO:0003700', 'GO:0004715', 'GO:0005125', 'GO:0004900'], 1),
  (['GO:0004888', 'GO:0030674', 'GO:0004713', 'GO:0048018'], 1),
  (['GO:0003674',
    'GO:0003674',
    'GO:0000511',
    'GO:0003674',
    'GO:0140683',
    'GO:0140463',
    'GO:0003674',
    'GO:0003674',
    'GO:0030674',
    'GO:0008428',
    'GO:0003674',
    'GO:0004525',
    'GO:0003674',
    'GO:0003968',
    'GO:0032129',
    'GO:0031078',
    'GO:0000511',
    'GO:0140566',
    'GO:0031078',
    'GO:0003724',
    'GO:0016891',
    'GO:0140750',
    'GO:0046974',
    'GO:0140683',
    'GO:0042393',
    'GO:0003674',
    'GO:0005515',
    'GO:0060090',
    'GO:0140566'],
   1),
  (['GO:0004714',
    'GO:0005125',
    'GO:0008201',
    'GO:0005125',
    'GO:0005125',
    'GO:0004714'],
   1),
  (['GO:0061891', 'GO:0061891', 'GO:0005262', 'GO:0030674'], 1)]}
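Note that each key in this output is the complete list of terms for a model, so the counts are for identical lists rather than for individual GO terms. If per-term counts are wanted, one option is to post-process the facet result. The sketch below does this for three entries copied from the output above; `models_per_term` is a hypothetical helper, and because the input was truncated with `facet_limit=20`, counts derived this way are only lower bounds.

```python
from collections import Counter

# Three (terms, count) entries copied from the facet output above
facet_result = {
    "activities.molecular_function.term": [
        (["GO:0004930", "GO:0019706", "GO:0004674"], 2),
        (["GO:0048018", "GO:0038023", "GO:0004252", "GO:0004252"], 2),
        (["GO:0019706", "GO:0140693", "GO:0003690"], 1),
    ]
}


def models_per_term(facet_rows):
    """Turn (list_of_terms, n_models) pairs into a per-term model count."""
    counts = Counter()
    for terms, n_models in facet_rows:
        # set() so a term repeated within one model is counted once per model
        for term in set(terms):
            counts[term] += n_models
    return counts


per_term = models_per_term(facet_result["activities.molecular_function.term"])
print(per_term.most_common(3))
```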

Semantic Search

Let’s query based on text criteria:

[18]:
qr = collection.search("pathways involving cell death")
qr.rows_dataframe[0:5]
[18]:
score id title taxon status activities objects comments
0 0.821816 gomodel:64e7eefa00001233 Extrinsic apoptotic signaling pathway via deat... NCBITaxon:10090 production [{'id': 'gomodel:64e7eefa00001233/64e7eefa0000... [{'id': 'GO:0035591', 'label': 'signaling adap... NaN
1 0.814937 gomodel:62b4ffe300000240 Perforin maturation leading to granzyme-mediat... NCBITaxon:10090 production [{'id': 'gomodel:62b4ffe300000240/62b4ffe30000... [{'id': 'GO:0005509', 'label': 'calcium ion bi... [Automated change 2022-09-22: GO:0005887 repla...
2 0.810594 gomodel:62b4ffe300000335 Perforin maturation leading to granzyme-mediat... NCBITaxon:9606 production [{'id': 'gomodel:62b4ffe300000335/62b4ffe30000... [{'id': 'GO:0140375', 'label': 'immune recepto... [Automated change 2022-09-22: GO:0005887 repla...
3 0.809383 gomodel:663d668500001246 Pyroptotic cell death mediated by GSDMD and NI... NCBITaxon:9606 production [{'id': 'gomodel:663d668500001246/663d66850000... [{'id': 'GO:0004197', 'label': 'cysteine-type ... NaN
4 0.805333 gomodel:62b4ffe300001804 Cleavage and inactivation of PARP1 by CASP3 an... NCBITaxon:9606 production [{'id': 'gomodel:62b4ffe300001804/62b4ffe30000... [{'id': 'GO:0097200', 'label': 'cysteine-type ... [Automated change 2023-03-16: RO:0002212 repla...

Let’s inspect the top-ranked result, which is a (score, row) tuple:

[19]:
qr.ranked_rows[0]
[19]:
(0.8218155512474502,
 {'id': 'gomodel:64e7eefa00001233',
  'title': 'Extrinsic apoptotic signaling pathway via death domain receptors 1(Mouse)',
  'taxon': 'NCBITaxon:10090',
  'status': 'production',
  'activities': [{'id': 'gomodel:64e7eefa00001233/64e7eefa00001250',
    'enabled_by': 'MGI:MGI:109200',
    'molecular_function': {'evidence': [{'term': 'ECO:0000266',
       'reference': 'PMID:8565075',
       'with_objects': ['UniProtKB:Q15628'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'provenances': [],
     'term': 'GO:0035591'},
    'occurs_in': {'evidence': [], 'term': 'GO:0005829'},
    'part_of': {'evidence': [{'term': 'ECO:0000266',
       'reference': 'PMID:8565075',
       'with_objects': ['UniProtKB:Q15628'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'term': 'GO:1900119'},
    'causal_associations': [{'evidence': [],
      'predicate': 'RO:0002413',
      'downstream_activity': 'gomodel:64e7eefa00001233/64e7eefa00001506'},
     {'evidence': [{'term': 'ECO:0000314',
        'reference': 'PMID:8565075',
        'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
          'date': '2023-09-14'}]}],
      'predicate': 'RO:0002413',
      'downstream_activity': 'gomodel:64e7eefa00001233/64e7eefa00001241'}]},
   {'id': 'gomodel:64e7eefa00001233/64e7eefa00001249',
    'enabled_by': 'MGI:MGI:1261423',
    'molecular_function': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:9837723',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'provenances': [],
     'term': 'GO:0008656'},
    'occurs_in': {'evidence': [], 'term': 'GO:0005829'},
    'part_of': {'evidence': [{'term': 'ECO:0000315',
       'reference': 'PMID:9729047',
       'with_objects': ['MGI:MGI:2159369'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]},
      {'term': 'ECO:0000314',
       'reference': 'PMID:9837723',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'term': 'GO:1900119'}},
   {'id': 'gomodel:64e7eefa00001233/64e7eefa00001505',
    'enabled_by': 'MGI:MGI:104798',
    'molecular_function': {'evidence': [],
     'provenances': [],
     'term': 'GO:0048018'},
    'occurs_in': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:8409382',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-13'}]}],
     'term': 'GO:0005886'},
    'part_of': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:8409382',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-13'}]}],
     'term': 'GO:1900119'},
    'causal_associations': [{'evidence': [{'term': 'ECO:0000314',
        'reference': 'PMID:1647956',
        'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
          'date': '2023-09-14'}]}],
      'predicate': 'RO:0002629',
      'downstream_activity': 'gomodel:64e7eefa00001233/64e7eefa00001503'}]},
   {'id': 'gomodel:64e7eefa00001233/64e7eefa00001241',
    'enabled_by': 'MGI:MGI:109324',
    'molecular_function': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:8565075',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'provenances': [],
     'term': 'GO:0035591'},
    'occurs_in': {'evidence': [], 'term': 'GO:0005829'},
    'part_of': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:8565075',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]},
      {'term': 'ECO:0000315',
       'reference': 'PMID:12887920',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'term': 'GO:1900119'},
    'causal_associations': [{'evidence': [],
      'predicate': 'RO:0002629',
      'downstream_activity': 'gomodel:64e7eefa00001233/64e7eefa00001249'}]},
   {'id': 'gomodel:64e7eefa00001233/64e7eefa00001503',
    'enabled_by': 'MGI:MGI:1314884',
    'molecular_function': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:1647956',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'provenances': [],
     'term': 'GO:0005031'},
    'occurs_in': {'evidence': [{'term': 'ECO:0000314',
       'reference': 'PMID:1647956',
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'term': 'GO:0005886'},
    'part_of': {'evidence': [{'term': 'ECO:0000316',
       'reference': 'PMID:10702415',
       'with_objects': ['MGI:MGI:103290'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]},
      {'term': 'ECO:0000315',
       'reference': 'PMID:10678933',
       'with_objects': ['MGI:MGI:1857468'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'term': 'GO:1900119'},
    'causal_associations': [{'evidence': [{'term': 'ECO:0000266',
        'reference': 'PMID:8565075',
        'with_objects': ['UniProtKB:Q15628'],
        'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
          'date': '2023-09-14'}]}],
      'predicate': 'RO:0002413',
      'downstream_activity': 'gomodel:64e7eefa00001233/64e7eefa00001250'}]},
   {'id': 'gomodel:64e7eefa00001233/64e7eefa00001506',
    'enabled_by': 'MGI:MGI:108212',
    'molecular_function': {'evidence': [{'term': 'ECO:0000315',
       'reference': 'PMID:25015821',
       'with_objects': ['MGI:MGI:6272324'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'provenances': [],
     'term': 'GO:0004672'},
    'occurs_in': {'evidence': [], 'term': 'GO:0005829'},
    'part_of': {'evidence': [{'term': 'ECO:0000315',
       'reference': 'PMID:26555174',
       'with_objects': ['MGI:MGI:6272324'],
       'provenances': [{'contributor': 'https://orcid.org/0000-0001-7476-6306',
         'date': '2023-09-14'}]}],
     'term': 'GO:1900119'},
    'causal_associations': [{'evidence': [],
      'predicate': 'RO:0002413',
      'downstream_activity': 'gomodel:64e7eefa00001233/64e7eefa00001241'}]}],
  'objects': [{'id': 'GO:0035591', 'label': 'signaling adaptor activity'},
   {'id': 'GO:0008656',
    'label': 'cysteine-type endopeptidase activator activity involved in apoptotic process'},
   {'id': 'GO:0005829', 'label': 'cytosol'},
   {'id': 'GO:1900119',
    'label': 'positive regulation of execution phase of apoptosis'},
   {'id': 'MGI:MGI:109324', 'label': 'Fadd Mmus'},
   {'id': 'MGI:MGI:1261423', 'label': 'Casp8 Mmus'},
   {'id': 'MGI:MGI:109200', 'label': 'Tradd Mmus'},
   {'id': 'MGI:MGI:1314884', 'label': 'Tnfrsf1a Mmus'},
   {'id': 'GO:0005031', 'label': 'tumor necrosis factor receptor activity'},
   {'id': 'MGI:MGI:104798', 'label': 'Tnf Mmus'},
   {'id': 'GO:0048018', 'label': 'receptor ligand activity'},
   {'id': 'GO:0004672', 'label': 'protein kinase activity'},
   {'id': 'MGI:MGI:108212', 'label': 'Ripk1 Mmus'},
   {'id': 'GO:0005886', 'label': 'plasma membrane'},
   {'id': 'ECO:0000314',
    'label': 'direct assay evidence used in manual assertion'},
   {'id': 'ECO:0000316',
    'label': 'genetic interaction evidence used in manual assertion'},
   {'id': 'ECO:0000315',
    'label': 'mutant phenotype evidence used in manual assertion'},
   {'id': 'ECO:0000266',
    'label': 'sequence orthology evidence used in manual assertion'},
   {'id': 'GO:0008625',
    'label': 'extrinsic apoptotic signaling pathway via death domain receptors'},
   {'id': 'GO:0097194', 'label': 'execution phase of apoptosis'},
   {'id': 'ECO:0000304',
    'label': 'author statement supported by traceable reference used in manual assertion'}]})
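The full record is hard to read at a glance. Since each ranked row is a `(score, row)` tuple and the `objects` list maps identifiers to labels, a short summary can be built from it. The sketch below works on a trimmed copy of the tuple shown above (only two of the activities and four of the objects are kept); `summarize` is a hypothetical helper.

```python
# Trimmed copy of the (score, row) tuple shown above
ranked_row = (
    0.8218155512474502,
    {
        "id": "gomodel:64e7eefa00001233",
        "title": "Extrinsic apoptotic signaling pathway via death domain receptors 1(Mouse)",
        "taxon": "NCBITaxon:10090",
        "activities": [
            {"enabled_by": "MGI:MGI:109200",
             "molecular_function": {"term": "GO:0035591"}},
            {"enabled_by": "MGI:MGI:1261423",
             "molecular_function": {"term": "GO:0008656"}},
        ],
        "objects": [
            {"id": "GO:0035591", "label": "signaling adaptor activity"},
            {"id": "GO:0008656",
             "label": "cysteine-type endopeptidase activator activity involved in apoptotic process"},
            {"id": "MGI:MGI:109200", "label": "Tradd Mmus"},
            {"id": "MGI:MGI:1261423", "label": "Casp8 Mmus"},
        ],
    },
)


def summarize(ranked):
    """Return one header line plus one 'gene -> function' line per activity."""
    score, row = ranked
    # Lookup table from identifier to human-readable label
    labels = {o["id"]: o["label"] for o in row.get("objects", [])}
    lines = [f"{row['id']} (score {score:.3f}): {row['title']}"]
    for activity in row.get("activities", []):
        gene = activity.get("enabled_by")
        term = activity.get("molecular_function", {}).get("term")
        # Fall back to the raw identifier when no label is available
        lines.append(f"  {labels.get(gene, gene)} -> {labels.get(term, term)}")
    return lines


summary_lines = summarize(ranked_row)
print("\n".join(summary_lines))
```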

We can combine semantic search with queries:

[20]:
qr = collection.search("cell death pathways", where={"taxon": "NCBITaxon:10090"})
qr.rows_dataframe[0:5]
[20]:
score id title taxon status activities objects comments
0 0.829536 gomodel:64e7eefa00001233 Extrinsic apoptotic signaling pathway via deat... NCBITaxon:10090 production [{'id': 'gomodel:64e7eefa00001233/64e7eefa0000... [{'id': 'GO:0035591', 'label': 'signaling adap... NaN
1 0.822032 gomodel:62b4ffe300000240 Perforin maturation leading to granzyme-mediat... NCBITaxon:10090 production [{'id': 'gomodel:62b4ffe300000240/62b4ffe30000... [{'id': 'GO:0005509', 'label': 'calcium ion bi... [Automated change 2022-09-22: GO:0005887 repla...
2 0.805320 gomodel:645d887900001077 Cell type specific, p53-independent mitotic G2... NCBITaxon:10090 production [{'id': 'gomodel:645d887900001077/645d88790000... [{'id': 'GO:0004674', 'label': 'protein serine... NaN
3 0.801173 gomodel:5ce58dde00001215 Mouse-Aatf-antiapoptosis NCBITaxon:10090 production [{'id': 'gomodel:5ce58dde00001215/5ce58dde0000... [{'id': 'MGI:MGI:87986', 'label': 'Akt1 Mmus'}... [Automated change 2023-03-16: RO:0002213 repla...
4 0.795273 gomodel:6516135700000211 Tumor necrosis factor-mediated signaling pathw... NCBITaxon:10090 production [{'id': 'gomodel:6516135700000211/651613570000... [{'id': 'GO:0005125', 'label': 'cytokine activ... NaN
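As a rough mental model of combining a structured filter with ranked search, the sketch below first keeps only the rows that satisfy the `where` clause and then ranks them by similarity to the query. It assumes the scores are similarities between vector representations; the notebook does not say which indexer is used, so the three-dimensional vectors, the `cosine` function and the `search` helper here are all hypothetical and are not how linkml-store is necessarily implemented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical index: (row, vector) pairs; a real indexer would derive
# the vectors from the text of each model.
toy_index = [
    ({"id": "model:a", "taxon": "NCBITaxon:10090"}, [0.9, 0.1, 0.0]),
    ({"id": "model:b", "taxon": "NCBITaxon:9606"}, [1.0, 0.0, 0.0]),
    ({"id": "model:c", "taxon": "NCBITaxon:10090"}, [0.0, 1.0, 0.0]),
]


def search(query_vec, where=None, limit=5):
    """Filter rows by exact-match criteria, then rank by similarity."""
    where = where or {}
    candidates = [
        (row, vec) for row, vec in toy_index
        if all(row.get(k) == v for k, v in where.items())
    ]
    ranked = sorted(
        ((cosine(query_vec, vec), row) for row, vec in candidates),
        key=lambda pair: pair[0],  # sort on the score only
        reverse=True,
    )
    return ranked[:limit]


results = search([1.0, 0.0, 0.0], where={"taxon": "NCBITaxon:10090"})
for score, row in results:
    print(f"{score:.3f} {row['id']}")
```

In this toy example `model:b` would be the best match for the query vector, but it is excluded because its taxon does not satisfy the filter.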

Validation

Next we will demonstrate validation over a whole collection.

Validation currently depends on a LinkML schema. We have previously copied this schema alongside the input data (input/gocam-models-schema.yaml). We will load the schema into the database object:

[21]:
db.load_schema_view("input/gocam-models-schema.yaml")

Quick sanity check to ensure that worked:

[22]:
list(db.schema_view.all_classes())[0:10]
[22]:
['Model',
 'Activity',
 'EvidenceItem',
 'Association',
 'CausalAssociation',
 'TermAssociation',
 'MolecularFunctionAssociation',
 'BiologicalProcessAssociation',
 'CellularAnatomicalEntityAssociation',
 'MoleculeAssociation']
[23]:
collection.metadata.type = "Model"
[24]:
from linkml_runtime.dumpers import yaml_dumper
for r in db.iter_validate_database():
    # known issue - https://github.com/monarch-initiative/GO-CAM-store/issues/97
    if "is not of type 'integer'" in r.message:
        continue
    print(r.message[0:100])
    print(r)
    raise ValueError("Unexpected validation error")
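The loop above stops at the first unexpected error. When exploring a new dataset it can be more convenient to triage all messages first and decide afterwards. The sketch below shows that pattern on stand-in objects; it assumes only the `.message` attribute used in the loop above, and the example messages and the `triage` helper are hypothetical.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-ins for validation results (hypothetical messages); only the
# `.message` attribute is assumed.
toy_results = [
    SimpleNamespace(message="'2023' is not of type 'integer'"),
    SimpleNamespace(message="'title' is a required property"),
    SimpleNamespace(message="'7' is not of type 'integer'"),
]

# Message fragments that correspond to known, tolerated issues
KNOWN_ISSUES = ("is not of type 'integer'",)


def triage(results, known=KNOWN_ISSUES):
    """Split results into known and unexpected, without raising."""
    summary = Counter()
    unexpected = []
    for r in results:
        if any(fragment in r.message for fragment in known):
            summary["known"] += 1
        else:
            summary["unexpected"] += 1
            unexpected.append(r.message[:100])  # truncate long messages
    return summary, unexpected


summary, unexpected = triage(toy_results)
print(dict(summary))
print(unexpected)
```

With real data, `toy_results` would be replaced by the iterator returned from `db.iter_validate_database()`.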

Command Line Usage

We can also use the command line for all of the above operations.

For example, faceted queries:

[26]:
!linkml-store -d mongodb://localhost:27017/gocams -c main fq -S taxon
{
  "taxon": {
    "NCBITaxon:9606": 541,
    "NCBITaxon:10090": 185,
    "NCBITaxon:4896": 15,
    "NCBITaxon:7955": 14,
    "NCBITaxon:7227": 13,
    "NCBITaxon:559292": 6,
    "NCBITaxon:6239": 4,
    "NCBITaxon:9823": 4,
    "NCBITaxon:227321": 1,
    "NCBITaxon:8355": 1,
    "NCBITaxon:1403190": 1,
    "NCBITaxon:1735992": 1,
    "NCBITaxon:229533": 1,
    "NCBITaxon:5074": 1,
    "NCBITaxon:602072": 1,
    "NCBITaxon:425011": 1,
    "NCBITaxon:28576": 1,
    "NCBITaxon:99287": 1,
    "NCBITaxon:8364": 1
  }
}
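Because the command prints JSON by default, its output is easy to post-process. As a small sketch, the snippet below parses the first three entries of the output above and expresses each count as a percentage of the 793 models in the collection; how the text is captured from the command line (for example via a file or a pipe) is left open here.

```python
import json

# First three entries of the JSON printed above
cli_output = """
{
  "taxon": {
    "NCBITaxon:9606": 541,
    "NCBITaxon:10090": 185,
    "NCBITaxon:4896": 15
  }
}
"""

TOTAL_MODELS = 793  # number of models reported earlier in the notebook

facets = json.loads(cli_output)
# Percentage of all models per taxon, rounded to one decimal place
shares = {
    taxon: round(100 * count / TOTAL_MODELS, 1)
    for taxon, count in facets["taxon"].items()
}
for taxon, pct in shares.items():
    print(f"{taxon}: {pct}%")
```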
[29]:
!linkml-store -d mongodb://localhost:27017/gocams -c main fq -S activities.enabled_by,taxon -O yaml

activities.enabled_by:
  MGI:MGI:109482: 91
  MGI:MGI:98973: 55
  UniProtKB:P42345: 53
  UniProtKB:Q8N884: 43
  UniProtKB:P57764: 37
  UniProtKB:Q9UHD2: 31
  MGI:MGI:109349: 30
  UniProtKB:Q9HB90: 25
  UniProtKB:Q13315: 23
  UniProtKB:Q13501: 23
  MGI:MGI:95294: 23
  UniProtKB:P29466: 22
  UniProtKB:Q7L523: 22
  UniProtKB:Q86WV6: 22
  UniProtKB:P09874: 21
  UniProtKB:P62877: 21
  UniProtKB:Q15382: 19
  UniProtKB:Q7Z434: 17
  UniProtKB:O43318: 17
  UniProtKB:Q9Y4K3: 16
  UniProtKB:P62753: 16
  MGI:MGI:2686159: 16
  UniProtKB:P23443: 16
  UniProtKB:Q9UBS0: 15
  UniProtKB:P23458: 15
  UniProtKB:Q96P20: 15
  UniProtKB:P49662: 14
  UniProtKB:Q13541: 14
  UniProtKB:P31749: 14
  UniProtKB:P49959: 14
  UniProtKB:Q04206: 14
  UniProtKB:Q14653: 14
  MGI:MGI:1916396: 14
  MGI:MGI:3647519: 13
  MGI:MGI:98907: 13
  UniProtKB:Q9C000: 13
  MGI:MGI:1916142: 13
  UniProtKB:Q92993: 13
  MGI:MGI:97365: 13
  UniProtKB:Q9UBF6: 12
  UniProtKB:O60934: 12
  MGI:MGI:95410: 12
  UniProtKB:P25963: 12
  UniProtKB:P40763: 12
  UniProtKB:Q8N6T7: 12
  MGI:MGI:95654: 12
  UniProtKB:O75385: 12
  UniProtKB:P68400: 11
  UniProtKB:O43914: 11
  UniProtKB:P43405: 10
  MGI:MGI:95797: 10
  MGI:MGI:1924809: 10
  UniProtKB:Q8N122: 10
  UniProtKB:Q93034: 10
  UniProtKB:P42229: 9
  UniProtKB:O60674: 9
  UniProtKB:O94874: 9
  UniProtKB:Q99836: 9
  UniProtKB:P42224: 9
  UniProtKB:Q13131: 9
  UniProtKB:O00206: 9
  UniProtKB:P55210: 8
  UniProtKB:P05198: 8
  UniProtKB:Q7NWF2: 8
  UniProtKB:Q9NZC2: 8
  UniProtKB:O14920: 8
  UniProtKB:Q6IAA8: 8
  MGI:MGI:107700: 8
  UniProtKB:Q14790: 8
  MGI:MGI:95805: 8
  UniProtKB:P06493: 8
  MGI:MGI:96433: 8
  UniProtKB:P42574: 8
  UniProtKB:P0DP23: 8
  UniProtKB:Q13188: 8
  MGI:MGI:1351352: 8
  UniProtKB:Q13535: 7
  UniProtKB:Q5T4S7: 7
  UniProtKB:Q13616: 7
  UniProtKB:Q9H4B6: 7
  MGI:MGI:87916: 7
  UniProtKB:P59044: 7
  UniProtKB:A1A4Y4: 7
  UniProtKB:P50542: 7
  UniProtKB:P19838: 7
  MGI:MGI:1354954: 7
  MGI:MGI:1096335: 7
  UniProtKB:O95786: 7
  MGI:MGI:98797: 7
  MGI:MGI:96544: 7
  UniProtKB:Q09472: 7
  UniProtKB:Q16531: 7
  PomBase:SPAC664.01c: 7
  SGD:S000002674: 7
  UniProtKB:Q13619: 7
  MGI:MGI:97565: 7
  MGI:MGI:103202: 7
  UniProtKB:P14222: 7
  MGI:MGI:97592: 7
  MGI:MGI:97564: 7
taxon:
  NCBITaxon:9606: 541
  NCBITaxon:10090: 185
  NCBITaxon:4896: 15
  NCBITaxon:7955: 14
  NCBITaxon:7227: 13
  NCBITaxon:559292: 6
  NCBITaxon:6239: 4
  NCBITaxon:9823: 4
  NCBITaxon:28576: 1
  NCBITaxon:1403190: 1
  NCBITaxon:8355: 1
  NCBITaxon:5074: 1
  NCBITaxon:1735992: 1
  NCBITaxon:229533: 1
  NCBITaxon:99287: 1
  NCBITaxon:8364: 1
  NCBITaxon:227321: 1
  NCBITaxon:602072: 1
  NCBITaxon:425011: 1

[30]:
!linkml-store -d mongodb://localhost:27017/gocams -c main fq -S taxon+activities.molecular_function.term
{
  "taxon+activities.molecular_function.term": {
    "('NCBITaxon:9606', 'GO:0004674')": 280,
    "('NCBITaxon:9606', 'GO:0061630')": 167,
    "('NCBITaxon:9606', 'GO:0030674')": 88,
    "('NCBITaxon:9606', 'GO:0003700')": 86,
    "('NCBITaxon:10090', 'GO:0003674')": 82,
    "('NCBITaxon:9606', 'GO:0060090')": 76,
    "('NCBITaxon:9606', 'GO:0005125')": 73,
    "('NCBITaxon:9606', 'GO:0004197')": 68,
    "('NCBITaxon:9606', 'GO:0140311')": 65,
    "('NCBITaxon:9606', 'GO:0035591')": 64,
    "('NCBITaxon:9606', 'GO:1990756')": 54,
    "('NCBITaxon:9606', 'GO:0004713')": 54,
    "('NCBITaxon:9606', 'GO:0043539')": 51,
    "('NCBITaxon:10090', 'GO:0048018')": 48,
    "('NCBITaxon:4896', 'GO:0003674')": 41,
    "('NCBITaxon:9606', 'GO:0003674')": 39,
    "('NCBITaxon:7955', 'GO:0003674')": 38,
    "('NCBITaxon:9606', 'GO:0043495')": 37,
    "('NCBITaxon:9606', 'GO:0022829')": 37,
    "('NCBITaxon:9606', 'GO:0140693')": 37,
    "('NCBITaxon:9606', 'GO:0048018')": 32,
    "('NCBITaxon:9606', 'GO:0005515')": 31,
    "('NCBITaxon:10090', 'GO:0004713')": 31,
    "('NCBITaxon:9606', 'GO:0140463')": 28,
    "('NCBITaxon:9606', 'GO:0038187')": 28,
    "('NCBITaxon:9606', 'GO:0160072')": 27,
    "('NCBITaxon:10090', 'GO:0004855')": 26,
    "('NCBITaxon:9606', 'GO:0038023')": 23,
    "('NCBITaxon:9606', 'GO:0015026')": 23,
    "('NCBITaxon:9606', 'GO:0004888')": 23,
    "('NCBITaxon:9606', 'GO:0003735')": 23,
    "('NCBITaxon:9606', 'GO:0005546')": 21,
    "('NCBITaxon:9606', 'GO:0019706')": 20,
    "('NCBITaxon:9606', 'GO:0004843')": 19,
    "('NCBITaxon:9606', 'GO:0140767')": 18,
    "('NCBITaxon:9606', 'GO:0000981')": 18,
    "('NCBITaxon:10090', 'GO:0008253')": 18,
    "('NCBITaxon:10090', 'GO:0035591')": 18,
    "('NCBITaxon:9606', 'GO:0003924')": 18,
    "('NCBITaxon:9606', 'GO:0061733')": 17,
    "('NCBITaxon:9606', 'GO:0003690')": 17,
    "('NCBITaxon:9606', 'GO:0005179')": 17,
    "('NCBITaxon:9606', 'GO:0030371')": 17,
    "('NCBITaxon:10090', 'GO:0003925')": 17,
    "('NCBITaxon:9606', 'GO:0061501')": 17,
    "('NCBITaxon:9606', 'GO:0005096')": 17,
    "('NCBITaxon:9606', 'GO:0140313')": 15,
    "('NCBITaxon:10090', 'GO:0004674')": 15,
    "('NCBITaxon:9606', 'GO:0004252')": 14,
    "('NCBITaxon:9606', 'GO:0003712')": 14,
    "('NCBITaxon:10090', 'GO:0004614')": 14,
    "('NCBITaxon:10090', 'GO:0051997')": 13,
    "('NCBITaxon:9606', 'GO:0003925')": 13,
    "('NCBITaxon:10090', 'GO:0004846')": 13,
    "('NCBITaxon:10090', 'GO:0008331')": 13,
    "('NCBITaxon:10090', 'GO:0033971')": 13,
    "('NCBITaxon:10090', 'GO:0004731')": 13,
    "('NCBITaxon:10090', 'GO:0004854')": 13,
    "('NCBITaxon:9606', 'GO:0019003')": 12,
    "('NCBITaxon:9606', 'GO:0008320')": 12,
    "('NCBITaxon:9606', 'GO:0001228')": 12,
    "('NCBITaxon:9606', 'GO:0061507')": 12,
    "('NCBITaxon:9606', 'GO:0003743')": 12,
    "('NCBITaxon:10090', 'GO:0005085')": 11,
    "('NCBITaxon:9606', 'GO:0004707')": 11,
    "('NCBITaxon:9606', 'GO:0034979')": 11,
    "('NCBITaxon:9606', 'GO:0008384')": 11,
    "('NCBITaxon:10090', 'GO:0001228')": 11,
    "('NCBITaxon:9606', 'GO:0005525')": 11,
    "('NCBITaxon:9606', 'GO:0004715')": 11,
    "('NCBITaxon:10090', 'GO:0030674')": 11,
    "('NCBITaxon:10090', 'GO:0003876')": 10,
    "('NCBITaxon:9606', 'GO:0061891')": 10,
    "('NCBITaxon:10090', 'GO:0004714')": 10,
    "('NCBITaxon:10090', 'GO:0004347')": 10,
    "('NCBITaxon:10090', 'GO:0003938')": 10,
    "('NCBITaxon:9606', 'GO:0140036')": 10,
    "('NCBITaxon:9606', 'GO:0004842')": 10,
    "('NCBITaxon:10090', 'GO:0008184')": 10,
    "('NCBITaxon:9606', 'GO:0140608')": 10,
    "('NCBITaxon:9606', 'GO:0016301')": 10,
    "('NCBITaxon:10090', 'GO:0005068')": 9,
    "('NCBITaxon:9606', 'GO:0005085')": 9,
    "('NCBITaxon:9606', 'GO:0061666')": 9,
    "('NCBITaxon:10090', 'GO:0005245')": 9,
    "('NCBITaxon:9606', 'GO:0004693')": 9,
    "('NCBITaxon:9606', 'GO:0003723')": 9,
    "('NCBITaxon:9606', 'GO:0004679')": 9,
    "('NCBITaxon:4896', 'GO:0004674')": 9,
    "('NCBITaxon:4896', 'GO:0140463')": 9,
    "('NCBITaxon:10090', 'GO:0004708')": 9,
    "('NCBITaxon:10090', 'GO:0004707')": 9,
    "('NCBITaxon:9606', 'GO:0004222')": 9,
    "('NCBITaxon:9606', 'GO:0061631')": 8,
    "('NCBITaxon:10090', 'GO:0038024')": 8,
    "('NCBITaxon:9606', 'GO:0061608')": 8,
    "('NCBITaxon:9606', 'GO:0005516')": 8,
    "('NCBITaxon:559292', 'GO:0061630')": 8,
    "('NCBITaxon:4896', 'GO:0005515')": 8,
    "('NCBITaxon:9606', 'GO:0004879')": 8
  }
}
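In the combined facet, each key is the string form of a Python tuple, which is awkward to work with directly. One way to recover the parts is `ast.literal_eval`, as in the sketch below, which pivots four entries copied from the output above into a term-by-taxon table. The `pivot` helper is hypothetical, and the approach assumes the keys keep this tuple-like format.

```python
import ast
from collections import defaultdict

# Four entries copied from the combined facet output above
cli_facets = {
    "taxon+activities.molecular_function.term": {
        "('NCBITaxon:9606', 'GO:0004674')": 280,
        "('NCBITaxon:9606', 'GO:0061630')": 167,
        "('NCBITaxon:10090', 'GO:0003674')": 82,
        "('NCBITaxon:10090', 'GO:0004674')": 15,
    }
}


def pivot(combined):
    """Map each GO term to a {taxon: count} dictionary."""
    table = defaultdict(dict)
    for key, count in combined.items():
        # literal_eval safely parses the stringified tuple back into a tuple
        taxon, term = ast.literal_eval(key)
        table[term][taxon] = count
    return dict(table)


by_term = pivot(cli_facets["taxon+activities.molecular_function.term"])
for term, per_taxon in by_term.items():
    print(term, per_taxon)
```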