Database clients (indra.databases)

This module implements a number of clients for accessing and using resources from biomedical entity databases and other third-party web services that INDRA uses. Many of these clients load their data from resource files in the indra.resources module, and in many cases they also provide access to web service endpoints.

identifiers.org mappings and URLs (indra.databases.identifiers)

indra.databases.identifiers.ensure_chebi_prefix(chebi_id)[source]

Return a valid CHEBI ID that has the appropriate CHEBI: prefix.

indra.databases.identifiers.ensure_chembl_prefix(chembl_id)[source]

Return a valid CHEMBL ID that has the appropriate CHEMBL prefix.

indra.databases.identifiers.ensure_prefix(db_ns, db_id, with_colon=True)[source]

Return a valid ID that has the given namespace embedded.

This is useful for namespaces such as CHEBI, GO or BTO that require the namespace to be part of the ID. Note that this function always ensures that the given db_ns is embedded in the ID and can handle the case when it’s already present.

Parameters:
  • db_ns (str) – A namespace.

  • db_id (str) – An ID within that namespace which should have the namespace as a prefix in it.

  • with_colon (Optional[bool]) – If True, the namespace prefix is followed by a colon in the ID (e.g., CHEBI:12345). Otherwise, no colon is added (e.g., CHEMBL1234). Default: True
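The behavior described above can be sketched in a few lines. This is an illustrative re-implementation, assuming the check is a plain string-prefix comparison; the name ensure_prefix_sketch is hypothetical and is not part of INDRA.

```python
def ensure_prefix_sketch(db_ns, db_id, with_colon=True):
    """Illustrative stand-in for ensure_prefix (hypothetical name)."""
    if db_id is None:
        return None
    prefix = db_ns + (':' if with_colon else '')
    # Leave the ID untouched if the namespace is already embedded.
    if db_id.startswith(prefix):
        return db_id
    return prefix + db_id


print(ensure_prefix_sketch('CHEBI', '12345'))
print(ensure_prefix_sketch('CHEBI', 'CHEBI:12345'))
print(ensure_prefix_sketch('CHEMBL', '1234', with_colon=False))
```

Calling the sketch twice on the same ID gives the same result, which is the property the description above relies on.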

indra.databases.identifiers.ensure_prefix_if_needed(db_ns, db_id)[source]

Return an ID ensuring a namespace prefix if known to be needed.

Parameters:
  • db_ns (str) – The namespace associated with the identifier.

  • db_id (str) – The original identifier.

Return type:

str

Returns:

The identifier with namespace embedded if needed.

indra.databases.identifiers.get_identifiers_ns(db_name)[source]

Map an INDRA namespace to an identifiers.org namespace when possible.

Example: this can be used to map ‘UP’ to ‘uniprot’.

Parameters:

db_name (str) – An INDRA namespace to map to identifiers.org

Returns:

An identifiers.org namespace or None if not available.

Return type:

str or None

indra.databases.identifiers.get_identifiers_url(db_name, db_id)[source]

Return an identifiers.org URL for a given database name and ID.

Parameters:
  • db_name (str) – An internal database name: HGNC, UP, CHEBI, etc.

  • db_id (str) – An identifier in the given database.

Returns:

url – An identifiers.org URL corresponding to the given database name and ID.

Return type:

str

indra.databases.identifiers.get_ns_from_identifiers(identifiers_ns)[source]

Return a namespace compatible with INDRA from an identifiers namespace.

For example, this function can be used to map ‘uniprot’ to ‘UP’.

Parameters:

identifiers_ns (str) – An identifiers.org standard namespace.

Returns:

The namespace compatible with INDRA’s internal representation or None if the given namespace isn’t an identifiers.org standard.

Return type:

str or None

indra.databases.identifiers.get_ns_id_from_identifiers(identifiers_ns, identifiers_id)[source]

Return a namespace/ID pair compatible with INDRA from identifiers.

Parameters:
  • identifiers_ns (str) – An identifiers.org standard namespace.

  • identifiers_id (str) – An identifiers.org standard ID in the given namespace.

Returns:

A namespace and ID that are valid in INDRA db_refs.

Return type:

(str, str)

indra.databases.identifiers.get_url_prefix(db_name)[source]

Return the URL prefix for a given namespace.

indra.databases.identifiers.namespace_embedded(db_ns)[source]

Return true if this namespace requires IDs to have namespace embedded.

This function first maps the given namespace to an identifiers.org namespace and then checks the registry to see if namespaces need to be embedded in IDs. If yes, it returns True. If not, or if the namespace can’t be mapped to identifiers.org, it returns False.

Parameters:

db_ns (str) – The namespace to check.

Return type:

bool

Returns:

True if the namespace is known to be embedded in IDs of this namespace. False if unknown or known not to be embedded.
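The two-step lookup described above (map the namespace, then consult the registry) might look roughly like this. Both dictionaries are toy stand-ins invented for this sketch, as are the key name namespace_embedded and the function name; the real registry is loaded from INDRA's resource files.

```python
# Toy mapping from INDRA namespaces to identifiers.org namespaces (illustrative).
TOY_NS_MAP = {'CHEBI': 'chebi', 'UP': 'uniprot'}

# Toy registry; the key name 'namespace_embedded' is an assumption of this sketch.
TOY_REGISTRY = {
    'chebi': {'namespace_embedded': True},
    'uniprot': {'namespace_embedded': False},
}


def namespace_embedded_sketch(db_ns):
    """Return True only if the namespace is known to be embedded in IDs."""
    identifiers_ns = TOY_NS_MAP.get(db_ns)
    if identifiers_ns is None:
        # Unmappable namespaces are treated as "not embedded".
        return False
    entry = TOY_REGISTRY.get(identifiers_ns, {})
    return bool(entry.get('namespace_embedded', False))


for ns in ('CHEBI', 'UP', 'NOT_A_NAMESPACE'):
    print(ns, namespace_embedded_sketch(ns))
```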

indra.databases.identifiers.parse_identifiers_url(url)[source]

Retrieve database name and ID given the URL.

Parameters:

url (str) – An identifiers.org URL to parse.

Returns:

  • db_name (str) – An internal database name: HGNC, UP, CHEBI, etc. corresponding to the given URL.

  • db_id (str) – An identifier in the database.
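As a rough illustration of what parsing such a URL involves, the following sketch splits an identifiers.org URL into a namespace and an ID. It is a simplified assumption-laden version: the mapping table is a toy, the function name is hypothetical, namespaces with embedded prefixes are not handled, and returning (None, None) on failure is a choice made for this sketch.

```python
from urllib.parse import urlparse

# Toy mapping from identifiers.org prefixes to INDRA namespaces (illustrative).
TOY_IDENTIFIERS_TO_INDRA = {'uniprot': 'UP', 'hgnc': 'HGNC'}


def parse_identifiers_url_sketch(url):
    """Return (db_name, db_id) for a simple identifiers.org URL."""
    parsed = urlparse(url)
    if parsed.netloc != 'identifiers.org':
        return None, None
    path = parsed.path.lstrip('/')
    # Accept "prefix:id" as well as a slash-separated "prefix/id" layout.
    for sep in (':', '/'):
        if sep in path:
            prefix, db_id = path.split(sep, 1)
            break
    else:
        return None, None
    db_name = TOY_IDENTIFIERS_TO_INDRA.get(prefix.lower())
    if db_name is None:
        return None, None
    return db_name, db_id


print(parse_identifiers_url_sketch('https://identifiers.org/uniprot:P15056'))
```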

Bioregistry mappings and URLs (indra.databases.bioregistry_client)

This module implements a client for using namespace and identifiers information from the Bioregistry (bioregistry.io).

indra.databases.bioregistry_client.ensure_prefix_if_needed(db_ns, db_id)[source]

Return an ID ensuring a namespace prefix if known to be needed.

indra.databases.bioregistry_client.get_bioregistry_curie(db_ns, db_id)[source]

Return the Bioregistry CURIE for the given INDRA namespace and ID.

indra.databases.bioregistry_client.get_bioregistry_prefix(db_ns)[source]

Return the prefix for the given INDRA namespace in Bioregistry.

indra.databases.bioregistry_client.get_bioregistry_url(db_ns, db_id)[source]

Return the Bioregistry URL for the given INDRA namespace and ID.

indra.databases.bioregistry_client.get_ns_from_bioregistry(bioregistry_prefix)[source]

Return the INDRA namespace for the given Bioregistry prefix.

indra.databases.bioregistry_client.get_ns_id_from_bioregistry(bioregistry_prefix, bioregistry_id)[source]

Return the INDRA namespace and ID for a Bioregistry prefix and ID.

indra.databases.bioregistry_client.get_ns_id_from_bioregistry_curie(bioregistry_curie)[source]

Return the INDRA namespace and ID for a Bioregistry CURIE.
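A CURIE is a prefix and a local identifier joined by a colon, so converting between CURIEs and namespace/ID pairs is mostly string handling plus a prefix mapping. The sketch below assumes a toy mapping and hypothetical function names; how the real client treats unknown prefixes is not specified here, so returning None is only this sketch's choice.

```python
# Toy mapping between INDRA namespaces and Bioregistry prefixes (illustrative).
TOY_INDRA_TO_BIOREGISTRY = {'UP': 'uniprot', 'HGNC': 'hgnc'}
TOY_BIOREGISTRY_TO_INDRA = {v: k for k, v in TOY_INDRA_TO_BIOREGISTRY.items()}


def curie_sketch(db_ns, db_id):
    """Build a CURIE from an INDRA namespace and ID, or None if unmapped."""
    prefix = TOY_INDRA_TO_BIOREGISTRY.get(db_ns)
    if prefix is None:
        return None
    return f'{prefix}:{db_id}'


def ns_id_from_curie_sketch(curie):
    """Split a CURIE on its first colon and map the prefix back."""
    prefix, sep, identifier = curie.partition(':')
    db_ns = TOY_BIOREGISTRY_TO_INDRA.get(prefix)
    if not sep or db_ns is None:
        return None, None
    return db_ns, identifier


curie = curie_sketch('UP', 'P15056')
print(curie, ns_id_from_curie_sketch(curie))
```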

HGNC client (indra.databases.hgnc_client)

indra.databases.hgnc_client.get_current_hgnc_id(hgnc_name)[source]

Return HGNC ID(s) corresponding to a current or outdated HGNC symbol.

Parameters:

hgnc_name (str) – The HGNC symbol to be converted, possibly an outdated symbol.

Returns:

If there is a single HGNC ID corresponding to the given current or outdated HGNC symbol, that ID is returned as a string. If the symbol is outdated and maps to multiple current IDs, a list of these IDs is returned. If the given name doesn’t correspond to either a current or an outdated HGNC symbol, None is returned.

Return type:

str or list of str or None
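Because the return value can take three shapes (a string, a list of strings, or None), callers may find it convenient to normalize it before further processing. A minimal sketch, in which as_id_list is a hypothetical helper and the IDs are placeholders rather than real lookup results:

```python
def as_id_list(result):
    """Normalize a str / list of str / None result into a list of str."""
    if result is None:
        return []
    if isinstance(result, str):
        return [result]
    return list(result)


# Placeholder values standing in for the three possible return shapes.
for result in ('1097', ['100', '200'], None):
    print(as_id_list(result))
```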

indra.databases.hgnc_client.get_ensembl_id(hgnc_id)[source]

Return the Ensembl ID corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.

Returns:

ensembl_id – The Ensembl ID corresponding to the given HGNC ID.

Return type:

str

indra.databases.hgnc_client.get_entrez_id(hgnc_id)[source]

Return the Entrez ID corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.

Returns:

entrez_id – The Entrez ID corresponding to the given HGNC ID.

Return type:

str

indra.databases.hgnc_client.get_enzymes(hgnc_id)[source]

Return the EC codes corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted.

Return type:

Set[str]

Returns:

A set of EC codes

indra.databases.hgnc_client.get_gene_type(hgnc_id)[source]

Return the locus type of the gene with the given HGNC ID.

See more under Locus type at https://www.genenames.org/help/symbol-report/#!/#tocAnchor-1-2

Parameters:

hgnc_id (str) – The HGNC ID of the gene to get the locus type of.

Return type:

Optional[str]

Returns:

The locus type of the given gene.

indra.databases.hgnc_client.get_hgnc_entry(hgnc_id)[source]

Return the HGNC entry for the given HGNC ID from the web service.

Parameters:

hgnc_id (str) – The HGNC ID to be converted.

Returns:

xml_tree – The XML ElementTree corresponding to the entry for the given HGNC ID.

Return type:

ElementTree

indra.databases.hgnc_client.get_hgnc_from_ensembl(ensembl_id)[source]

Return the HGNC ID corresponding to the given Ensembl ID.

Parameters:

ensembl_id (str) – The Ensembl ID to be converted, a number passed as a string.

Returns:

hgnc_id – The HGNC ID corresponding to the given Ensembl ID.

Return type:

str

indra.databases.hgnc_client.get_hgnc_from_entrez(entrez_id)[source]

Return the HGNC ID corresponding to the given Entrez ID.

Parameters:

entrez_id (str) – The Entrez ID to be converted, a number passed as a string.

Returns:

hgnc_id – The HGNC ID corresponding to the given Entrez ID.

Return type:

str

indra.databases.hgnc_client.get_hgnc_from_mouse(mgi_id)[source]

Return the HGNC ID corresponding to the given MGI mouse gene ID.

Parameters:

mgi_id (str) – The MGI ID to be converted. Example: “2444934”

Returns:

hgnc_id – The HGNC ID corresponding to the given MGI ID.

Return type:

str

indra.databases.hgnc_client.get_hgnc_from_rat(rgd_id)[source]

Return the HGNC ID corresponding to the given RGD rat gene ID.

Parameters:

rgd_id (str) – The RGD ID to be converted. Example: “1564928”

Returns:

hgnc_id – The HGNC ID corresponding to the given RGD ID.

Return type:

str

indra.databases.hgnc_client.get_hgnc_id(hgnc_name)[source]

Return the HGNC ID corresponding to the given HGNC symbol.

Parameters:

hgnc_name (str) – The HGNC symbol to be converted. Example: BRAF

Returns:

hgnc_id – The HGNC ID corresponding to the given HGNC symbol.

Return type:

str

indra.databases.hgnc_client.get_hgnc_id_from_mgi_name(mgi_name)[source]

Return a HGNC ID for the human gene homologous to the given mouse gene.

The mouse gene name provided as input is assumed to be an MGI official symbol.

Parameters:

mgi_name (str) – The MGI symbol of a mouse gene.

Return type:

Optional[str]

Returns:

The HGNC ID of the corresponding human gene or None if not available.

indra.databases.hgnc_client.get_hgnc_name(hgnc_id)[source]

Return the HGNC symbol corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted.

Returns:

hgnc_name – The HGNC symbol corresponding to the given HGNC ID.

Return type:

str

indra.databases.hgnc_client.get_hgnc_name_from_mgi_name(mgi_name)[source]

Return a HGNC name for the human gene homologous to the given mouse gene.

The mouse gene name provided as input is assumed to be an MGI official symbol.

Parameters:

mgi_name (str) – The MGI symbol of a mouse gene.

Return type:

Optional[str]

Returns:

The HGNC symbol of the corresponding human gene or None if not available.

indra.databases.hgnc_client.get_hgncs_from_enzyme(ec_code)[source]

Return the HGNC IDs associated with a given enzyme.

Parameters:

ec_code (str) – The EC code (e.g., 2.4.1.228)

Return type:

Set[str]

Returns:

A set of HGNC identifiers

indra.databases.hgnc_client.get_mouse_id(hgnc_id)[source]

Return the MGI mouse ID corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted. Example: “1097”

Returns:

mgi_id – The MGI ID corresponding to the given HGNC ID.

Return type:

str

indra.databases.hgnc_client.get_rat_id(hgnc_id)[source]

Return the RGD rat ID corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted. Example: “1097”

Returns:

rgd_id – The RGD ID corresponding to the given HGNC ID.

Return type:

str

indra.databases.hgnc_client.get_uniprot_id(hgnc_id)[source]

Return the UniProt ID corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.

Returns:

uniprot_id – The UniProt ID corresponding to the given HGNC ID.

Return type:

str

indra.databases.hgnc_client.is_kinase(gene_name)[source]

Return True if the given gene name is a kinase.

Parameters:

gene_name (str) – The HGNC gene symbol corresponding to the protein.

Returns:

True if the given gene name corresponds to a kinase, False otherwise.

Return type:

bool

indra.databases.hgnc_client.is_phosphatase(gene_name)[source]

Return True if the given gene name is a phosphatase.

Parameters:

gene_name (str) – The HGNC gene symbol corresponding to the protein.

Returns:

True if the given gene name corresponds to a phosphatase, False otherwise.

Return type:

bool

indra.databases.hgnc_client.is_transcription_factor(gene_name)[source]

Return True if the given gene name is a transcription factor.

Parameters:

gene_name (str) – The HGNC gene symbol corresponding to the protein.

Returns:

True if the given gene name corresponds to a transcription factor, False otherwise.

Return type:

bool

UniProt client (indra.databases.uniprot_client)

See also https://protmapper.readthedocs.io/en/latest/modules/uniprot.html whose functions this module imports and exposes.

ChEBI client (indra.databases.chebi_client)

indra.databases.chebi_client.get_chebi_entry_from_web(chebi_id)[source]

Return a ChEBI entry corresponding to a given ChEBI ID using a REST API.

Parameters:

chebi_id (str) – The ChEBI ID whose entry is to be returned.

Returns:

A dictionary containing the ChEBI entry data. If the lookup fails, None is returned.

Return type:

dict

indra.databases.chebi_client.get_chebi_id_from_cas(cas_id)[source]

Return a ChEBI ID corresponding to the given CAS ID.

Parameters:

cas_id (str) – The CAS ID to be converted.

Returns:

chebi_id – The ChEBI ID corresponding to the given CAS ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_chebi_id_from_chembl(chembl_id)[source]

Return a ChEBI ID from a given ChEMBL ID.

Parameters:

chembl_id (str) – ChEMBL ID to be converted.

Returns:

chebi_id – ChEBI ID corresponding to the given ChEMBL ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_chebi_id_from_hmdb(hmdb_id)[source]

Return the ChEBI ID corresponding to an HMDB ID.

Parameters:

hmdb_id (str) – An HMDB ID.

Returns:

The ChEBI ID that the given HMDB ID maps to or None if no mapping was found.

Return type:

str

indra.databases.chebi_client.get_chebi_id_from_name(chebi_name)[source]

Return a ChEBI ID corresponding to the given ChEBI name.

Parameters:

chebi_name (str) – The ChEBI name whose ID is to be returned.

Returns:

chebi_id – The ID corresponding to the given ChEBI name. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_chebi_id_from_pubchem(pubchem_id)[source]

Return the ChEBI ID corresponding to a given PubChem ID.

Parameters:

pubchem_id (str) – PubChem ID to be converted.

Returns:

chebi_id – ChEBI ID corresponding to the given PubChem ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_chebi_name_from_id(chebi_id, offline=True)[source]

Return a ChEBI name corresponding to the given ChEBI ID.

Parameters:
  • chebi_id (str) – The ChEBI ID whose name is to be returned.

  • offline (Optional[bool]) – If False, the ChEBI web service is invoked in case a name mapping could not be found in the local resource file. Default: True

Returns:

chebi_name – The name corresponding to the given ChEBI ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_chebi_name_from_id_web(chebi_id)[source]

Return a ChEBI name corresponding to a given ChEBI ID using a REST API.

Parameters:

chebi_id (str) – The ChEBI ID whose name is to be returned.

Returns:

chebi_name – The name corresponding to the given ChEBI ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_chembl_id(chebi_id)[source]

Return a ChEMBL ID from a given ChEBI ID.

Parameters:

chebi_id (str) – ChEBI ID to be converted.

Returns:

chembl_id – ChEMBL ID corresponding to the given ChEBI ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_inchi_key(chebi_id)[source]

Return an InChIKey corresponding to a given ChEBI ID using a REST API.

Parameters:

chebi_id (str) – The ChEBI ID whose InChIKey is to be returned.

Returns:

The InChIKey corresponding to the given ChEBI ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_primary_id(chebi_id)[source]

Return the primary ID corresponding to a ChEBI ID.

Note that if the provided ID is a primary ID, it is returned unchanged.

Parameters:

chebi_id (str) – The ChEBI ID that should be mapped to its primary equivalent.

Returns:

The primary ChEBI ID or None if the provided ID is neither primary nor a secondary ID with a primary mapping.

Return type:

str or None

indra.databases.chebi_client.get_pubchem_id(chebi_id)[source]

Return the PubChem ID corresponding to a given ChEBI ID.

Parameters:

chebi_id (str) – ChEBI ID to be converted.

Returns:

pubchem_id – PubChem ID corresponding to the given ChEBI ID. If the lookup fails, None is returned.

Return type:

str

indra.databases.chebi_client.get_specific_id(chebi_ids)[source]

Return the most specific ID in a list based on the hierarchy.

Parameters:

chebi_ids (list of str) – A list of ChEBI IDs some of which may be hierarchically related.

Returns:

The first ChEBI ID which is at the most specific level in the hierarchy with respect to the input list.

Return type:

str
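One possible reading of "most specific with respect to the input list" is: the first ID that has no descendant among the other listed IDs. The sketch below implements that reading over a toy hierarchy. The IDs are placeholders, the parent table is invented, and the function names are hypothetical; the actual INDRA implementation may use a different procedure.

```python
# Toy child -> parents table; IDs are placeholders, not real ChEBI entries.
TOY_PARENTS = {
    'CHEBI:C': {'CHEBI:B'},
    'CHEBI:B': {'CHEBI:A'},
    'CHEBI:A': set(),
    'CHEBI:X': set(),
}


def ancestors(term):
    """Collect all transitive parents of a term in the toy hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        for parent in TOY_PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific_sketch(ids):
    """Return the first ID that no other listed ID descends from."""
    for candidate in ids:
        others = (other for other in ids if other != candidate)
        if not any(candidate in ancestors(other) for other in others):
            return candidate
    return None


print(most_specific_sketch(['CHEBI:A', 'CHEBI:C', 'CHEBI:B']))
```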

Cell type context client (indra.databases.context_client)

indra.databases.context_client.get_mutations(gene_names, cell_types)[source]

Return protein amino acid changes in given genes and cell types.

Parameters:
  • gene_names (list) – HGNC gene symbols for which mutations are queried.

  • cell_types (list) –

    List of cell type names in which mutations are queried. The cell type names follow the CCLE database conventions.

    Example: LOXIMVI_SKIN, BT20_BREAST

Returns:

res – A dictionary keyed by cell line, which contains another dictionary that is keyed by gene name, with a list of amino acid substitutions as values.

Return type:

dict[dict[list]]

indra.databases.context_client.get_protein_expression(gene_names, cell_types)[source]

Return the protein expression levels of genes in cell types.

Parameters:
  • gene_names (list) – HGNC gene symbols for which expression levels are queried.

  • cell_types (list) –

    List of cell type names in which expression levels are queried. The cell type names follow the CCLE database conventions.

    Example: LOXIMVI_SKIN, BT20_BREAST

Returns:

res – A dictionary keyed by cell line, which contains another dictionary that is keyed by gene name, with estimated protein amounts as values.

Return type:

dict[dict[float]]

NDEx client (indra.databases.ndex_client)

indra.databases.ndex_client.create_network(cx_str, ndex_cred=None, private=True)[source]

Creates a new NDEx network of the assembled CX model.

To upload the assembled CX model to NDEx, you need to have a registered account on NDEx (http://ndexbio.org/) and have the ndex python package installed. The uploaded network is private by default.

Parameters:

ndex_cred (dict) – A dictionary with the following entries: ‘user’: the NDEx user name, and ‘password’: the NDEx password.

Returns:

network_id – The UUID of the NDEx network that was created by uploading the assembled CX model.

Return type:

str

indra.databases.ndex_client.get_default_ndex_cred(ndex_cred)[source]

Gets the NDEx credentials from the dict, or tries the environment if None
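The fallback described above can be sketched as follows. The environment variable names NDEX_USERNAME and NDEX_PASSWORD are assumptions of this sketch, as is the function name and the choice to return None when credentials are incomplete; the env argument exists only so the example can run without touching the real environment.

```python
import os


def default_ndex_cred_sketch(ndex_cred=None, env=None):
    """Use the given credentials dict, else fall back to the environment."""
    if ndex_cred:
        return ndex_cred
    env = os.environ if env is None else env
    # Variable names are assumed for illustration.
    user = env.get('NDEX_USERNAME')
    password = env.get('NDEX_PASSWORD')
    if user is None or password is None:
        return None
    return {'user': user, 'password': password}


fake_env = {'NDEX_USERNAME': 'alice', 'NDEX_PASSWORD': 'secret'}
print(default_ndex_cred_sketch(None, env=fake_env))
```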

indra.databases.ndex_client.send_request(ndex_service_url, params, is_json=True, use_get=False)[source]

Send a request to the NDEx server.

Parameters:
  • ndex_service_url (str) – The URL of the service to use for the request.

  • params (dict) – A dictionary of parameters to send with the request. Parameter keys differ based on the type of request.

  • is_json (bool) – True if the response is in json format, otherwise it is assumed to be text. Default: True

  • use_get (bool) – True if the request needs to use GET instead of POST.

Returns:

res – Depending on the type of service and the is_json parameter, this function either returns a text string or a json dict.

Return type:

str or dict

indra.databases.ndex_client.set_style(network_id, ndex_cred=None, template_id=None)[source]

Set the style of the network to a given template network’s style.

Parameters:
  • network_id (str) – The UUID of the NDEx network whose style is to be changed.

  • ndex_cred (dict) – A dictionary of NDEx credentials.

  • template_id (Optional[str]) – The UUID of the NDEx network whose style is used on the network specified in the first argument.

indra.databases.ndex_client.update_network(cx_str, network_id, ndex_cred=None)[source]

Update an existing CX network on NDEx with new CX content.

Parameters:
  • cx_str (str) – String containing the CX content.

  • network_id (str) – UUID of the network on NDEx.

  • ndex_cred (dict) – A dictionary with the following entries: ‘user’: the NDEx user name, and ‘password’: the NDEx password.

cBio portal client (indra.databases.cbio_client)

This is a client for the cBioPortal web service, with documentation at https://docs.cbioportal.org/web-api-and-clients/ and Swagger definition at https://www.cbioportal.org/api/v2/api-docs. Note that the client implements direct requests to the API instead of adding an additional dependency to do so.

indra.databases.cbio_client.get_cancer_studies(study_filter=None)[source]

Return a list of cancer study identifiers, optionally filtered.

There are typically multiple studies for a given type of cancer and a filter can be used to constrain the returned list.

Parameters:

study_filter (Optional[str]) – A string used to filter the study IDs to return. Example: “paad”

Returns:

study_ids – A list of study IDs. For instance “paad” as a filter would result in a list of study IDs with paad in their name like “paad_icgc”, “paad_tcga”, etc.

Return type:

list[str]

indra.databases.cbio_client.get_cancer_types(cancer_filter=None)[source]

Return a list of cancer types, optionally filtered.

Parameters:

cancer_filter (Optional[str]) – A string used to filter cancer types. Its value is the name or part of the name of a type of cancer. Example: “melanoma”, “pancreatic”, “non-small cell lung”

Returns:

type_ids – A list of cancer types matching the filter. Example: for cancer_filter=”pancreatic”, the result includes “panet” (neuro-endocrine) and “paad” (adenocarcinoma)

Return type:

list[str]

indra.databases.cbio_client.get_case_lists(study_id)[source]

Return a list of the case set ids for a particular study.

In v2 of the API these are called sample lists.

Parameters:

study_id (str) – The ID of the cBio study. Example: ‘cellline_ccle_broad’ or ‘paad_icgc’

Returns:

case_set_ids – A list of case set IDs, e.g., [‘cellline_ccle_broad_all’, ‘cellline_ccle_broad_cna’, …]

Return type:

list[str]

indra.databases.cbio_client.get_ccle_cna(gene_list, cell_lines=None)[source]

Return a dict of CNAs in given genes and cell lines from CCLE.

CNA values correspond to the following alterations

-2 = homozygous deletion

-1 = hemizygous deletion

0 = neutral / no change

1 = gain

2 = high level amplification

Parameters:
  • gene_list (list[str]) – A list of HGNC gene symbols to get CNAs for.

  • cell_lines (Optional[list[str]]) – A list of CCLE cell line names to get CNAs for.

Returns:

profile_data – A dict keyed by case, where each value is a dict keyed by gene with an integer CNA value.

Return type:

dict[dict[int]]
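To make the integer codes easier to read, the nested result can be translated into the labels listed above. This is a small sketch over made-up data; label_cna is a hypothetical helper and the sample values are illustrative, not real CCLE measurements.

```python
# Labels taken from the list of CNA values above.
CNA_LABELS = {
    -2: 'homozygous deletion',
    -1: 'hemizygous deletion',
    0: 'neutral / no change',
    1: 'gain',
    2: 'high level amplification',
}


def label_cna(profile_data):
    """Replace integer CNA codes with labels in a {case: {gene: int}} dict."""
    return {
        case: {gene: CNA_LABELS.get(value, 'unknown')
               for gene, value in genes.items()}
        for case, genes in profile_data.items()
    }


# Illustrative input only.
sample = {'LOXIMVI_SKIN': {'BRAF': 1, 'PTEN': -2}}
print(label_cna(sample))
```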

indra.databases.cbio_client.get_ccle_lines_for_mutation(gene, amino_acid_change)[source]

Return cell lines with a given point mutation in a given gene.

Checks which cell lines in CCLE have a particular point mutation in a given gene and returns their names in a list.

Parameters:
  • gene (str) – The HGNC symbol of the mutated gene in whose product the amino acid change occurs. Example: “BRAF”

  • amino_acid_change (str) – The amino acid change of interest. Example: “V600E”

Returns:

cell_lines – A list of CCLE cell lines in which the given mutation occurs.

Return type:

list

indra.databases.cbio_client.get_ccle_mrna(gene_list, cell_lines=None)[source]

Return a dict of mRNA amounts in given genes and cell lines from CCLE.

Parameters:
  • gene_list (list[str]) – A list of HGNC gene symbols to get mRNA amounts for.

  • cell_lines (Optional[list[str]]) – A list of CCLE cell line names to get mRNA amounts for.

Returns:

mrna_amounts – A dict keyed by cell line, where each value is a dict keyed by gene with a float mRNA amount.

Return type:

dict[dict[float]]

indra.databases.cbio_client.get_ccle_mutations(gene_list, cell_lines, mutation_type=None)[source]

Return a dict of mutations in given genes and cell lines from CCLE.

This is a specialized call to get_mutations tailored to CCLE cell lines.

Parameters:
  • gene_list (list[str]) – A list of HGNC gene symbols to get mutations in

  • cell_lines (list[str]) – A list of CCLE cell line names to get mutations for.

  • mutation_type (Optional[str]) – The type of mutation to filter to. mutation_type can be one of: missense, nonsense, frame_shift_ins, frame_shift_del, splice_site

Returns:

mutations – The result from cBioPortal as a dict in the format {cell_line : {gene : [mutation1, mutation2, …] }}

Example: {‘LOXIMVI_SKIN’: {‘BRAF’: [‘V600E’, ‘I208V’]}, ‘SKMEL30_SKIN’: {‘BRAF’: [‘D287H’, ‘E275K’]}}

Return type:

dict
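Given a result in the {cell_line : {gene : [mutations]}} format shown above, finding the cell lines that carry a particular amino acid change is a simple filter. The helper name lines_with_mutation is hypothetical, and the input reuses the example dict from the description rather than a live query.

```python
def lines_with_mutation(mutations, gene, amino_acid_change):
    """Return sorted cell lines whose mutation list for gene has the change."""
    return sorted(
        cell_line
        for cell_line, genes in mutations.items()
        if amino_acid_change in genes.get(gene, [])
    )


# The example result from the description above.
example = {
    'LOXIMVI_SKIN': {'BRAF': ['V600E', 'I208V']},
    'SKMEL30_SKIN': {'BRAF': ['D287H', 'E275K']},
}
print(lines_with_mutation(example, 'BRAF', 'V600E'))
```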

indra.databases.cbio_client.get_genetic_profiles(study_id, profile_filter=None)[source]

Return all the genetic profiles (data sets) for a given study.

Genetic profiles are different types of data for a given study. For instance the study ‘cellline_ccle_broad’ has profiles such as ‘cellline_ccle_broad_mutations’ for mutations, ‘cellline_ccle_broad_CNA’ for copy number alterations, etc.

NOTE: In the v2 API, the genetic profiles are called molecular profiles.

Parameters:
  • study_id (str) – The ID of the cBio study. Example: ‘paad_icgc’

  • profile_filter (Optional[str]) – A string used to filter the profiles to return. Will be one of: MUTATION, MUTATION_EXTENDED, COPY_NUMBER_ALTERATION, MRNA_EXPRESSION, METHYLATION. The genetic profiles can include “mutation”, “CNA”, “rppa”, “methylation”, etc. The filter is case insensitive.

Returns:

genetic_profiles – A list of genetic profiles available for the given study.

Return type:

list[str]

indra.databases.cbio_client.get_mutations(study_id, gene_list=None, mutation_type=None, case_id=None)[source]

Return mutations as a list of genes and list of amino acid changes.

Parameters:
  • study_id (str) – The ID of the cBio study. Example: ‘cellline_ccle_broad’ or ‘paad_icgc’

  • gene_list (list[str]) – A list of genes with their HGNC symbols. Example: [‘BRAF’, ‘KRAS’]

  • mutation_type (Optional[str]) – The type of mutation to filter to. mutation_type can be one of: missense, nonsense, frame_shift_ins, frame_shift_del, splice_site

  • case_id (Optional[str]) – The case ID within the study to filter to.

Returns:

mutations – A dict with entries for each gene symbol and another list with entries for each corresponding amino acid change.

Return type:

dict

indra.databases.cbio_client.get_num_sequenced(study_id)[source]

Return number of sequenced tumors for given study.

This is useful for calculating mutation statistics in terms of the prevalence of certain mutations within a type of cancer.

Parameters:

study_id (str) – The ID of the cBio study. Example: ‘paad_icgc’

Returns:

num_case – The number of sequenced tumors in the given study

Return type:

int
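As a worked example of the prevalence calculation mentioned above, the fraction of tumors carrying a mutation is the number of mutated cases divided by the number of sequenced tumors. The helper name and the numbers below are illustrative only.

```python
def mutation_prevalence(num_mutated, num_sequenced):
    """Return the fraction of sequenced tumors that carry the mutation."""
    if num_sequenced <= 0:
        raise ValueError('num_sequenced must be positive')
    return num_mutated / num_sequenced


# Illustrative numbers: 25 mutated cases out of 100 sequenced tumors.
print(mutation_prevalence(25, 100))
```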

indra.databases.cbio_client.get_profile_data(study_id, gene_list, profile_filter, case_set_filter=None)[source]

Return dict of cases and genes and their respective values.

Parameters:
  • study_id (str) – The ID of the cBio study. Example: ‘cellline_ccle_broad’ or ‘paad_icgc’

  • gene_list (list[str]) – A list of genes with their HGNC symbols. Example: [‘BRAF’, ‘KRAS’]

  • profile_filter (str) – A string used to filter the profiles to return. Will be one of: MUTATION, MUTATION_EXTENDED, COPY_NUMBER_ALTERATION, MRNA_EXPRESSION, METHYLATION.

  • case_set_filter (Optional[str]) – A string that specifies which case_set_id to use, based on a complete or partial match. If not provided, will look for study_id + ‘_all’

Returns:

profile_data – A dict keyed to cases (cell lines if using CCLE) in turn containing a dict keyed by genes, with values corresponding to the given profile (e.g., CNA, mutations).

Return type:

dict[dict[int]]

ChEMBL client (indra.databases.chembl_client)

indra.databases.chembl_client.activities_by_target(activities)[source]

Return lists of activities in a dict keyed by ChEMBL target ID.

Parameters:

activities (list) – response from a query returning activities for a drug

Returns:

targ_act_dict – dictionary keyed to ChEMBL target ids with lists of activity ids

Return type:

dict
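The grouping described here can be sketched with a defaultdict. The field names target_chembl_id and activity_id are assumptions about the shape of each activity record, the function name is hypothetical, and the sample records are invented for illustration.

```python
from collections import defaultdict


def group_by_target(activities):
    """Group activity IDs by target ID (field names are assumed)."""
    grouped = defaultdict(list)
    for activity in activities:
        grouped[activity['target_chembl_id']].append(activity['activity_id'])
    return dict(grouped)


# Invented records for illustration.
sample = [
    {'target_chembl_id': 'T1', 'activity_id': 1},
    {'target_chembl_id': 'T2', 'activity_id': 2},
    {'target_chembl_id': 'T1', 'activity_id': 3},
]
print(group_by_target(sample))
```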

indra.databases.chembl_client.get_chembl_id(nlm_mesh)[source]

Get ChEMBL ID from NLM MESH

Parameters:

nlm_mesh (str)

Returns:

chembl_id

Return type:

str

indra.databases.chembl_client.get_chembl_name(chembl_id)[source]

Return a standard ChEMBL name from an ID if available in the local resource.

Parameters:

chembl_id (str) – The ChEMBL ID to get the name for.

Returns:

The corresponding ChEMBL name or None if not available.

Return type:

str or None

indra.databases.chembl_client.get_drug_inhibition_stmts(drug)[source]

Query ChEMBL for kinetics data given a drug as an Agent and return statements.

Parameters:

drug (Agent) – Agent representing drug with MESH or CHEBI grounding

Returns:

stmts – INDRA statements generated by querying ChEMBL for all kinetics data of a drug interacting with protein targets

Return type:

list of INDRA statements

indra.databases.chembl_client.get_evidence(assay)[source]

Given an activity, return an INDRA Evidence object.

Parameters:

assay (dict) – an activity from the activities list returned by a query to the API

Returns:

ev – an Evidence object containing the kinetics of the

Return type:

Evidence

indra.databases.chembl_client.get_kinetics(assay)[source]

Given an activity, return its kinetics values.

Parameters:

assay (dict) – an activity from the activities list returned by a query to the API

Returns:

kin – dictionary of values with units keyed to value types ‘IC50’, ‘EC50’, ‘INH’, ‘Potency’, ‘Kd’

Return type:

dict
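To illustrate the shape of the result, the sketch below builds a dictionary keyed by value type from a single activity record. The field names standard_type, standard_value and standard_units are assumptions about the record layout, the function name is hypothetical, and values are kept as plain (number, unit string) pairs, which may differ from how the real function represents units.

```python
def kinetics_sketch(activity):
    """Return {value_type: (value, units)} or None if no usable value."""
    try:
        value = float(activity['standard_value'])
    except (KeyError, TypeError, ValueError):
        # Missing or non-numeric values cannot be interpreted.
        return None
    return {activity.get('standard_type'): (value, activity.get('standard_units'))}


# Invented record for illustration.
record = {'standard_type': 'IC50', 'standard_value': '50.0', 'standard_units': 'nM'}
print(kinetics_sketch(record))
```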

indra.databases.chembl_client.get_mesh_id(nlm_mesh)[source]

Get MESH ID from NLM MESH

Parameters:

nlm_mesh (str)

Returns:

mesh_id

Return type:

str

indra.databases.chembl_client.get_pcid(mesh_id)[source]

Get PC ID from MESH ID

Parameters:

mesh_id (str)

Returns:

pcid

Return type:

str

indra.databases.chembl_client.get_pmid(doc_id)[source]

Get PMID from document_chembl_id

Parameters:

doc_id (str)

Returns:

pmid

Return type:

str

indra.databases.chembl_client.get_protein_targets_only(target_chembl_ids)[source]

Given list of ChEMBL target ids, return dict of SINGLE PROTEIN targets

Parameters:

target_chembl_ids (list) – list of chembl_ids as strings

Returns:

protein_targets – dictionary keyed to ChEMBL target ids with lists of activity ids

Return type:

dict

indra.databases.chembl_client.get_target_chemblid(target_upid)[source]

Get ChEMBL ID from UniProt upid

Parameters:

target_upid (str)

Returns:

target_chembl_id

Return type:

str

indra.databases.chembl_client.query_target(target_chembl_id)[source]

Query ChEMBL API target by id

Parameters:

target_chembl_id (str)

Returns:

target – dict parsed from json that is unique for the target

Return type:

dict

indra.databases.chembl_client.send_query(query_dict)[source]

Query ChEMBL API

Parameters:

query_dict (dict) – ‘query’ : string of the endpoint to query ‘params’ : dict of params for the query

Returns:

js – dict parsed from json that is unique to the submitted query

Return type:

dict

LINCS client (indra.databases.lincs_client)

class indra.databases.lincs_client.LincsClient[source]

Client for querying LINCS small molecules and proteins.

get_protein_refs(hms_lincs_id)[source]

Get the refs for a protein from the LINCS protein metadata.

Parameters:

hms_lincs_id (str) – The HMS LINCS ID for the protein

Returns:

A dictionary of protein references.

Return type:

dict

get_small_molecule_name(hms_lincs_id)[source]

Get the name of a small molecule from the LINCS sm metadata.

Parameters:

hms_lincs_id (str) – The HMS LINCS ID of the small molecule.

Returns:

The name of the small molecule.

Return type:

str

get_small_molecule_refs(hms_lincs_id)[source]

Get the id refs of a small molecule from the LINCS sm metadata.

Parameters:

hms_lincs_id (str) – The HMS LINCS ID of the small molecule.

Returns:

A dictionary of references.

Return type:

dict

indra.databases.lincs_client.get_drug_target_data()[source]

Load the csv into a list of dicts containing the LINCS drug target data.

Returns:

data – A list of dicts, each keyed based on the header of the csv, with values as the corresponding column values.

Return type:

list[dict]

indra.databases.lincs_client.load_lincs_csv(url)[source]

Helper function to turn csv rows into dicts.
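The list-of-dicts shape returned by the two functions above can be sketched with the standard library alone. This sketch reads from an in-memory string instead of a URL, and the column names are made up; rows_to_dicts is a hypothetical stand-in, not the helper documented here.

```python
import csv
import io


def rows_to_dicts(csv_text):
    """Turn CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Made-up content; the real LINCS files have different columns
sample = 'id,name\n1,alpha\n2,beta\n'
data = rows_to_dicts(sample)
print(data)
```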

MeSH client (indra.databases.mesh_client)

indra.databases.mesh_client.get_db_mapping(mesh_id)[source]

Return mapping to another name space for a MeSH ID, if it exists.

Parameters:

mesh_id (str) – The MeSH ID whose mapping is to be returned.

Returns:

A tuple consisting of a DB namespace and ID for the mapping or None if not available.

Return type:

tuple or None

indra.databases.mesh_client.get_go_id(mesh_id)[source]

Return a GO ID corresponding to the given MeSH ID.

Parameters:

mesh_id (str) – MeSH ID to map to GO

Returns:

The GO ID corresponding to the given MeSH ID, or None if not available.

Return type:

str

indra.databases.mesh_client.get_mesh_id_from_db_id(db_ns, db_id)[source]

Return a MeSH ID mapped from another namespace and ID.

Parameters:
  • db_ns (str) – A namespace corresponding to db_id.

  • db_id (str) – An ID in the given namespace.

Returns:

The MeSH ID corresponding to the given namespace and ID if available, otherwise None.

Return type:

str or None

indra.databases.mesh_client.get_mesh_id_from_go_id(go_id)[source]

Return a MeSH ID corresponding to the given GO ID.

Parameters:

go_id (str) – GO ID to map to MeSH

Returns:

The MeSH ID corresponding to the given GO ID, or None if not available.

Return type:

str

indra.databases.mesh_client.get_mesh_id_name(mesh_term, offline=False)[source]

Get the MESH ID and name for the given MESH term.

Uses the mappings table in indra/resources; if the MESH term is not listed there, falls back on the NLM REST API.

Parameters:
  • mesh_term (str) – MESH Descriptor or Concept name, e.g. ‘Breast Cancer’.

  • offline (bool) – If True, the NLM REST API is not queried when the given MESH term is not contained in INDRA’s internal MESH mappings file. Default is False (allows REST API queries).

Returns:

Returns a 2-tuple of the form (id, name) with the ID of the descriptor corresponding to the MESH label, and the descriptor name (which may not exactly match the name provided as an argument if it is a Concept name). If the query failed, or no descriptor corresponding to the name was found, returns a tuple of (None, None).

Return type:

tuple of strs
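The local-first lookup with an optional web fallback described above can be sketched as follows. The table contents, the injected web_lookup callable and the function names are assumptions for illustration; no network access is involved.

```python
# Toy stand-in for the mappings table in indra/resources
LOCAL_NAME_TO_ID = {'Breast Neoplasms': 'D001943'}


def lookup_id_name(mesh_term, offline=False, web_lookup=None):
    """Local-first lookup with an optional web fallback (hypothetical)."""
    mesh_id = LOCAL_NAME_TO_ID.get(mesh_term)
    if mesh_id is not None:
        return mesh_id, mesh_term
    if offline or web_lookup is None:
        return None, None
    return web_lookup(mesh_term)


def fake_web_lookup(term):
    # Stands in for a REST call; maps a Concept name to its Descriptor
    if term == 'Breast Cancer':
        return 'D001943', 'Breast Neoplasms'
    return None, None


print(lookup_id_name('Breast Cancer', web_lookup=fake_web_lookup))
print(lookup_id_name('Breast Cancer', offline=True,
                     web_lookup=fake_web_lookup))
```

With offline=True the fallback is skipped, so a term missing from the local table yields (None, None).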

indra.databases.mesh_client.get_mesh_id_name_from_web(mesh_term)[source]

Get the MESH ID and name for the given MESH term using the NLM REST API.

Parameters:

mesh_term (str) – MESH Descriptor or Concept name, e.g. ‘Breast Cancer’.

Returns:

Returns a 2-tuple of the form (id, name) with the ID of the descriptor corresponding to the MESH label, and the descriptor name (which may not exactly match the name provided as an argument if it is a Concept name). If the query failed, or no descriptor corresponding to the name was found, returns a tuple of (None, None).

Return type:

tuple of strs

indra.databases.mesh_client.get_mesh_name(mesh_id, offline=False)[source]

Get the MESH label for the given MESH ID.

Uses the mappings table in indra/resources; if the MESH ID is not listed there, falls back on the NLM REST API.

Parameters:
  • mesh_id (str) – MESH Identifier, e.g. ‘D003094’.

  • offline (bool) – If True, the NLM REST API is not queried when the given MESH ID is not contained in INDRA’s internal MESH mappings file. Default is False (allows REST API queries).

Returns:

Label for the MESH ID, or None if the query failed or no label was found.

Return type:

str

indra.databases.mesh_client.get_mesh_name_from_web(mesh_id)[source]

Get the MESH label for the given MESH ID using the NLM REST API.

Parameters:

mesh_id (str) – MESH Identifier, e.g. ‘D003094’.

Returns:

Label for the MESH ID, or None if the query failed or no label was found.

Return type:

str

indra.databases.mesh_client.get_mesh_tree_numbers(mesh_id)[source]

Return MeSH tree IDs associated with a MeSH ID from the resource file.

This function can handle supplementary concepts by first mapping them to primary terms and then collecting all the tree numbers for the mapped primary terms.

Parameters:

mesh_id (str) – The MeSH ID whose tree IDs should be returned.

Returns:

A list of MeSH tree IDs.

Return type:

list[str]
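The handling of supplementary concepts described above can be sketched with two toy tables. All IDs and tree numbers below are made up, and tree_numbers is a hypothetical stand-in for the documented function.

```python
# Toy data: IDs and tree numbers are made up for illustration
TREE_NUMBERS = {'D000001': ['C01.100', 'C02.200'],
                'D000002': ['D08.811.100']}
SUPP_TO_PRIMARY = {'C000001': ['D000001', 'D000002']}


def tree_numbers(mesh_id):
    """Collect tree numbers, resolving supplementary concepts first."""
    if mesh_id in SUPP_TO_PRIMARY:
        primary_ids = SUPP_TO_PRIMARY[mesh_id]
    else:
        primary_ids = [mesh_id]
    numbers = []
    for primary_id in primary_ids:
        numbers.extend(TREE_NUMBERS.get(primary_id, []))
    return numbers


print(tree_numbers('C000001'))
```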

indra.databases.mesh_client.get_mesh_tree_numbers_from_web(mesh_id)[source]

Return MeSH tree IDs associated with a MeSH ID from the web.

Parameters:

mesh_id (str) – The MeSH ID whose tree IDs should be returned.

Returns:

A list of MeSH tree IDs.

Return type:

list[str]

indra.databases.mesh_client.get_primary_mappings(db_id)[source]

Return the list of primary terms a supplementary term is mapped to.

See https://www.nlm.nih.gov/mesh/xml_data_elements.html#HeadingMappedTo.

Parameters:

db_id (str) – A supplementary MeSH ID.

Return type:

List[str]

Returns:

The list of primary MeSH terms that the supplementary concept is heading-mapped to.

indra.databases.mesh_client.has_tree_prefix(mesh_id, tree_prefix)[source]

Return True if the given MeSH ID has the given tree prefix.

indra.databases.mesh_client.is_disease(mesh_id)[source]

Return True if the given MeSH ID is a disease.

indra.databases.mesh_client.is_enzyme(mesh_id)[source]

Return True if the given MeSH ID is an enzyme.

indra.databases.mesh_client.is_molecular(mesh_id)[source]

Return True if the given MeSH ID is a chemical or drug (incl protein).

indra.databases.mesh_client.is_protein(mesh_id)[source]

Return True if the given MeSH ID is a protein.
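The category checks above can plausibly be expressed as prefix tests on tree numbers. The sketch below assumes the standard MeSH tree layout, in which C covers diseases and D covers chemicals and drugs; the exact prefixes used by the documented functions are not stated here, and has_prefix is a hypothetical helper operating on made-up tree numbers.

```python
def has_prefix(tree_numbers, tree_prefix):
    """Return True if any tree number starts with the given prefix."""
    return any(tn.startswith(tree_prefix) for tn in tree_numbers)


# Made-up tree numbers for a single term
numbers = ['D08.811.277', 'D12.776.100']

print(has_prefix(numbers, 'D'))  # chemical or drug branch (assumed prefix)
print(has_prefix(numbers, 'C'))  # disease branch (assumed prefix)
```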

GO client (indra.databases.go_client)

A client to the Gene Ontology.

indra.databases.go_client.get_go_id_from_label(label)[source]

Get ID corresponding to a given GO label.

Parameters:

label (str) – The GO label to get the ID for.

Returns:

Identifier corresponding to the GO label, starts with GO:.

Return type:

str

indra.databases.go_client.get_go_id_from_label_or_synonym(label)[source]

Get ID corresponding to a given GO label or synonym.

Parameters:

label (str) – The GO label or synonym to get the ID for.

Returns:

Identifier corresponding to the GO label or synonym, starts with GO:.

Return type:

str

indra.databases.go_client.get_go_label(go_id)[source]

Get label corresponding to a given GO identifier.

Parameters:

go_id (str) – The GO identifier. Should include the GO: prefix, e.g., GO:1903793 (positive regulation of anion transport).

Returns:

Label corresponding to the GO ID.

Return type:

str

indra.databases.go_client.get_namespace(go_id)[source]

Return the GO namespace associated with a GO ID.

Parameters:

go_id (str) – The GO ID to get the namespace for

Return type:

Optional[str]

Returns:

The GO namespace for the given ID. This is one of molecular_function, biological_process or cellular_component. If the GO ID is not available as an entry, None is returned.

indra.databases.go_client.get_primary_id(go_id)[source]

Get primary ID corresponding to an alternative/deprecated GO ID.

Parameters:

go_id (str) – The GO ID to get the primary ID for.

Returns:

Primary identifier corresponding to the given ID.

Return type:

str

indra.databases.go_client.get_valid_location(loc)[source]

Return a valid GO label based on an ID, label or synonym.

The rationale behind this function is that many sources produce cellular locations that are arbitrarily either GO IDs (sometimes without the prefix and sometimes outdated) or labels or synonyms. This function handles all these cases and returns a valid GO label in case one is available, otherwise None.

Parameters:

loc (str) – The location that needs to be canonicalized.

Returns:

The valid location string if available, otherwise None.

Return type:

str or None
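The canonicalization described above (IDs with or without prefix, outdated IDs, labels and synonyms) can be sketched with small lookup tables. The tables below are toy data, the outdated ID is made up, and valid_location is a hypothetical stand-in whose lookup order is an assumption.

```python
# Toy stand-ins for GO lookup tables
LABELS = {'GO:0005634': 'nucleus'}
ALT_TO_PRIMARY = {'GO:9999999': 'GO:0005634'}  # made-up outdated ID
SYNONYMS = {'cell nucleus': 'GO:0005634'}
LABEL_TO_ID = {label: go_id for go_id, label in LABELS.items()}


def valid_location(loc):
    """Return a canonical label for an ID, label or synonym, else None."""
    if not loc:
        return None
    # Accept IDs with or without the GO: prefix
    candidate = loc if loc.startswith('GO:') else 'GO:' + loc
    # Map an outdated ID to its primary ID if one is known
    candidate = ALT_TO_PRIMARY.get(candidate, candidate)
    if candidate in LABELS:
        return LABELS[candidate]
    # Otherwise treat the input as a label, then as a synonym
    if loc in LABEL_TO_ID:
        return loc
    go_id = SYNONYMS.get(loc)
    return LABELS.get(go_id) if go_id else None


for loc in ['GO:0005634', '0005634', 'GO:9999999', 'cell nucleus', 'nowhere']:
    print(loc, '->', valid_location(loc))
```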

PubChem client (indra.databases.pubchem_client)

indra.databases.pubchem_client.get_inchi_key(pubchem_cid)[source]

Return the InChIKey for a given PubChem CID.

Parameters:

pubchem_cid (str) – The PubChem CID whose InChIKey should be returned.

Returns:

The InChIKey corresponding to the PubChem CID.

Return type:

str

indra.databases.pubchem_client.get_json_record(pubchem_cid)[source]

Return the JSON record of a given PubChem CID.

Parameters:

pubchem_cid (str) – The PubChem CID whose record should be returned.

Returns:

The record deserialized from JSON.

Return type:

dict

indra.databases.pubchem_client.get_mesh_id(pubchem_cid)[source]

Return the MeSH ID for a given PubChem CID.

Parameters:

pubchem_cid (str) – The PubChem CID whose MeSH ID should be returned.

Return type:

Optional[str]

Returns:

The MeSH ID corresponding to the PubChem CID or None if not available.

indra.databases.pubchem_client.get_pmids(pubchem_cid)[source]

Return depositor provided PMIDs for a given PubChem CID.

Note that this information can also be obtained via PubMed at https://www.ncbi.nlm.nih.gov/sites/entrez?LinkName=pccompound_pubmed&db=pccompound&cmd=Link&from_uid=<pubchem_cid>.

Parameters:

pubchem_cid (str) – The PubChem CID whose PMIDs will be returned.

Return type:

List[str]

Returns:

PMIDs corresponding to the given PubChem CID. If none present, or the query fails, an empty list is returned.

indra.databases.pubchem_client.get_preferred_compound_ids(pubchem_cid)[source]

Return a list of preferred CIDs for a given PubChem CID.

Parameters:

pubchem_cid (str) – The PubChem CID whose preferred CIDs should be returned.

Returns:

The list of preferred CIDs for the given CID. If there are no preferred CIDs for the given CID then an empty list is returned.

Return type:

list of str

miRBase client (indra.databases.mirbase_client)

A client to miRBase.

indra.databases.mirbase_client.get_hgnc_id_from_mirbase_id(mirbase_id)[source]

Return the HGNC ID corresponding to the given miRBase ID.

Parameters:

mirbase_id (str) – The miRBase ID to be converted. Example: “MI0000060”

Returns:

hgnc_id – The HGNC ID corresponding to the given miRBase ID.

Return type:

str

indra.databases.mirbase_client.get_mirbase_id_from_hgnc_id(hgnc_id)[source]

Return the miRBase ID corresponding to the given HGNC ID.

Parameters:

hgnc_id (str) – An HGNC identifier to convert to miRBase, if it is indeed an miRNA. Example: “31476”

Returns:

mirbase_id – The miRBase ID corresponding to the given HGNC ID.

Return type:

str

indra.databases.mirbase_client.get_mirbase_id_from_hgnc_symbol(hgnc_symbol)[source]

Return the miRBase ID corresponding to the given HGNC gene symbol.

Parameters:

hgnc_symbol (str) – An HGNC gene symbol to convert to miRBase, if it is indeed an miRNA. Example: “MIR19B2”

Returns:

mirbase_id – The miRBase ID corresponding to the given HGNC gene symbol.

Return type:

str

indra.databases.mirbase_client.get_mirbase_id_from_mirbase_name(mirbase_name)[source]

Return the miRBase identifier corresponding to the given miRBase name.

Parameters:

mirbase_name (str) – The miRBase name to be converted. Example: “hsa-mir-19b-2”

Returns:

mirbase_id – The miRBase ID corresponding to the given miRBase name.

Return type:

str

indra.databases.mirbase_client.get_mirbase_name_from_mirbase_id(mirbase_id)[source]

Return the miRBase name corresponding to the given miRBase ID.

Parameters:

mirbase_id (str) – The miRBase ID to be converted. Example: “MI0000060”

Returns:

mirbase_name – The miRBase name corresponding to the given miRBase ID.

Return type:

str

Experimental Factor Ontology (EFO) client (indra.databases.efo_client)

A client to EFO.

indra.databases.efo_client.get_efo_id_from_efo_name(efo_name)[source]

Return the EFO identifier corresponding to the given EFO name.

Parameters:

efo_name (str) – The EFO name to be converted. Example: “gum cancer”

Returns:

efo_id – The EFO identifier corresponding to the given EFO name.

Return type:

str

indra.databases.efo_client.get_efo_name_from_efo_id(efo_id)[source]

Return the EFO name corresponding to the given EFO ID.

Parameters:

efo_id (str) – The EFO identifier to be converted. Example: “0005557”

Returns:

efo_name – The EFO name corresponding to the given EFO identifier.

Return type:

str

Human Phenotype Ontology (HP) client (indra.databases.hp_client)

A client to HP.

indra.databases.hp_client.get_hp_id_from_hp_name(hp_name)[source]

Return the HP identifier corresponding to the given HP name.

Parameters:

hp_name (str) – The HP name to be converted. Example: “Nocturia”

Returns:

hp_id – The HP identifier corresponding to the given HP name.

Return type:

str

indra.databases.hp_client.get_hp_name_from_hp_id(hp_id)[source]

Return the HP name corresponding to the given HP ID.

Parameters:

hp_id (str) – The HP identifier to be converted. Example: “HP:0000017”

Returns:

hp_name – The HP name corresponding to the given HP identifier.

Return type:

str

Disease Ontology (DOID) client (indra.databases.doid_client)

A client to the Disease Ontology.

indra.databases.doid_client.get_doid_id_from_doid_alt_id(doid_alt_id)[source]

Return the identifier corresponding to the given Disease Ontology alt id.

Parameters:

doid_alt_id (str) – The Disease Ontology alt id to be converted. Example: “DOID:267”

Returns:

doid_id – The Disease Ontology identifier corresponding to the given alt id.

Return type:

str

indra.databases.doid_client.get_doid_id_from_doid_name(doid_name)[source]

Return the identifier corresponding to the given Disease Ontology name.

Parameters:

doid_name (str) – The Disease Ontology name to be converted. Example: “Nocturia”

Returns:

doid_id – The Disease Ontology identifier corresponding to the given name.

Return type:

str

indra.databases.doid_client.get_doid_name_from_doid_id(doid_id)[source]

Return the name corresponding to the given Disease Ontology ID.

Parameters:

doid_id (str) – The Disease Ontology identifier to be converted. Example: “DOID:0000017”

Returns:

doid_name – The DOID name corresponding to the given DOID identifier.

Return type:

str

Infectious Disease Ontology client (indra.databases.ido_client)

A client to the Infectious Disease Ontology (IDO).

indra.databases.ido_client.get_ido_id_from_ido_name(ido_name)[source]

Return the IDO identifier corresponding to the given IDO name.

Parameters:

ido_name (str) – The IDO name to be converted. Example: “parasite role”

Return type:

Optional[str]

Returns:

The IDO identifier corresponding to the given IDO name.

indra.databases.ido_client.get_ido_name_from_ido_id(ido_id)[source]

Return the IDO name corresponding to the given IDO ID.

Parameters:

ido_id (str) – The IDO identifier to be converted. Example: “0000403”

Return type:

Optional[str]

Returns:

The IDO name corresponding to the given IDO identifier.

Taxonomy client (indra.databases.taxonomy_client)

Client to access the Entrez Taxonomy web service.

indra.databases.taxonomy_client.get_taxonomy_id(name)[source]

Return the taxonomy ID corresponding to a taxonomy name.

Parameters:

name (str) – The name of the taxonomy entry. Example: “Severe acute respiratory syndrome coronavirus 2”

Returns:

The taxonomy ID corresponding to the given name or None if not available.

Return type:

str or None

DrugBank client (indra.databases.drugbank_client)

Client for interacting with DrugBank entries.

indra.databases.drugbank_client.get_chebi_id(drugbank_id)[source]

Return a mapping for a DrugBank ID to CHEBI.

Parameters:

drugbank_id (str) – DrugBank ID to map.

Returns:

The ID mapped to CHEBI or None if not available.

Return type:

str or None

indra.databases.drugbank_client.get_chembl_id(drugbank_id)[source]

Return a mapping for a DrugBank ID to CHEMBL.

Parameters:

drugbank_id (str) – DrugBank ID to map.

Returns:

The ID mapped to CHEMBL or None if not available.

Return type:

str or None

indra.databases.drugbank_client.get_db_mapping(drugbank_id, db_ns)[source]

Return a mapping for a DrugBank ID to a given name space.

Parameters:
  • drugbank_id (str) – DrugBank ID to map.

  • db_ns (str) – The database name space to map to.

Returns:

The ID mapped to the given name space or None if not available.

Return type:

str or None

indra.databases.drugbank_client.get_drugbank_id_from_chebi_id(chebi_id)[source]

Return DrugBank ID from a CHEBI ID.

Parameters:

chebi_id (str) – CHEBI ID to map.

Returns:

The mapped DrugBank ID or None if not available.

Return type:

str or None

indra.databases.drugbank_client.get_drugbank_id_from_chembl_id(chembl_id)[source]

Return DrugBank ID from a CHEMBL ID.

Parameters:

chembl_id (str) – CHEMBL ID to map.

Returns:

The mapped DrugBank ID or None if not available.

Return type:

str or None

indra.databases.drugbank_client.get_drugbank_id_from_db_id(db_ns, db_id)[source]

Return DrugBank ID from a database name space and ID.

Parameters:
  • db_ns (str) – Database name space.

  • db_id (str) – Database ID.

Returns:

The mapped DrugBank ID or None if not available.

Return type:

str or None

indra.databases.drugbank_client.get_drugbank_name(drugbank_id)[source]

Return the DrugBank standard name for a given DrugBank ID.

Parameters:

drugbank_id (str) – DrugBank ID to get the name for

Returns:

The name corresponding to the given DrugBank ID or None if not available.

Return type:

str or None

Enzyme Class client (indra.databases.ec_client)

A client to EC-code via an ontology client.

indra.databases.ec_client.get_id_from_name(name)[source]

Return the enzyme class code corresponding to the given enzyme class name.

Parameters:

name (str) – The enzyme name to be converted. Example: “Alcohol dehydrogenase”

Return type:

Optional[str]

Returns:

The enzyme class code corresponding to the given enzyme class name.

>>> from indra.databases import ec_client
>>> ec_client.get_id_from_name("Alcohol dehydrogenase")
'1.1.1.1'

indra.databases.ec_client.get_name_from_id(ec_code)[source]

Return the enzyme name corresponding to the given enzyme class code.

Parameters:

ec_code (str) – The enzyme class code to be converted. Example: “1.1.1.1”

Return type:

Optional[str]

Returns:

The enzyme class name corresponding to the given enzyme class code.

>>> from indra.databases import ec_client
>>> ec_client.get_name_from_id("1.1.1.1")
'Alcohol dehydrogenase'

OBO client (indra.databases.obo_client)

A client for OBO-sourced identifier mappings.

class indra.databases.obo_client.OboClient(prefix)[source]

A base client for data that’s been grabbed via OBO.

static entries_from_graph(obo_graph, prefix, remove_prefix=False, allowed_synonyms=None, allowed_external_ns=None)[source]

Return processed entries from an OBO graph.

classmethod update_resource(directory, url, prefix, *args, remove_prefix=False, allowed_synonyms=None, allowed_external_ns=None, force=False)[source]

Write the OBO information to files in the given directory.

class indra.databases.obo_client.OntologyClient(prefix)[source]

A base client class for OBO and OWL ontologies.

get_id_from_alt_id(db_alt_id)[source]

Return the canonical database id corresponding to the alt id.

Parameters:

db_alt_id (str) – The alt id to be converted.

Return type:

Optional[str]

Returns:

The ID corresponding to the given alt id.

get_id_from_name(db_name)[source]

Return the database identifier corresponding to the given name.

Parameters:

db_name (str) – The name to be converted.

Return type:

Optional[str]

Returns:

The ID corresponding to the given name.

get_id_from_name_or_synonym(txt)[source]

Return the database id corresponding to the given name or synonym.

Note that because of the way the OboClient is constructed, ambiguous synonyms are filtered out. Further, this function prioritizes names over synonyms (i.e., it first looks up the ID by name, and only if that fails, it attempts a synonym-based lookup). Overall, these mappings are guaranteed to be many-to-one.

Parameters:

txt (str) – The name or synonym to be converted.

Return type:

Optional[str]

Returns:

The ID corresponding to the given name or synonym.
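The lookup order and the filtering of ambiguous synonyms described for get_id_from_name_or_synonym can be sketched as follows. The entries, the table-building code and the function name are assumptions for illustration and do not reflect how the client stores its data.

```python
from collections import defaultdict

# Toy entries; IDs, names and synonyms are made up
ENTRIES = [
    {'id': '0001', 'name': 'alpha', 'synonyms': ['a', 'shared']},
    {'id': '0002', 'name': 'beta', 'synonyms': ['b', 'shared', 'alpha']},
]

name_to_id = {entry['name']: entry['id'] for entry in ENTRIES}

# Keep only synonyms that point to exactly one ID
synonym_ids = defaultdict(set)
for entry in ENTRIES:
    for synonym in entry['synonyms']:
        synonym_ids[synonym].add(entry['id'])
synonym_to_id = {syn: next(iter(ids)) for syn, ids in synonym_ids.items()
                 if len(ids) == 1}


def id_from_name_or_synonym(txt):
    """Look up by name first, then by unambiguous synonym."""
    db_id = name_to_id.get(txt)
    if db_id is not None:
        return db_id
    return synonym_to_id.get(txt)


print(id_from_name_or_synonym('alpha'))   # name wins over the synonym
print(id_from_name_or_synonym('shared'))  # ambiguous synonym was dropped
```

Because each text maps to at most one ID while several texts can map to the same ID, the resulting mapping is many-to-one.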

get_name_from_id(db_id)[source]

Return the database name corresponding to the given database ID.

Parameters:

db_id (str) – The ID to be converted.

Return type:

Optional[str]

Returns:

The name corresponding to the given ID.

get_parents(db_id)[source]

Return the isa relationships corresponding to a given ID.

Parameters:

db_id (str) – The ID whose isa relationships should be returned

Return type:

List[str]

Returns:

The IDs of the terms that are in the given relation with the given ID.

get_relation(db_id, rel_type)[source]

Return the relationships corresponding to a given ID.

Parameters:
  • db_id (str) – The ID whose relationships should be returned

  • rel_type (str) – The type of relationships to get, e.g., is_a, part_of

Return type:

List[str]

Returns:

The IDs of the terms that are in the given relation with the given ID.

get_relations(db_id)[source]

Return all relationships corresponding to a given ID.

Parameters:

db_id (str) – The ID whose relationships should be returned

Return type:

Mapping[str, List[str]]

Returns:

A dict keyed by relation type with each entry a list of IDs of the terms that are in the given relation with the given ID.
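One way to picture how get_relations, get_relation and get_parents relate is as three views on the same table. The sketch below uses a toy relation table with made-up IDs; the layering of the three functions is an assumption, not a description of the class internals.

```python
# Toy relation table; IDs and relation types are illustrative
RELATIONS = {
    'X:1': {'is_a': ['X:0'], 'part_of': ['X:9']},
    'X:0': {},
}


def get_relations(db_id):
    """Return a dict of relation type -> list of target IDs."""
    return RELATIONS.get(db_id, {})


def get_relation(db_id, rel_type):
    """Return the target IDs for one relation type."""
    return get_relations(db_id).get(rel_type, [])


def get_parents(db_id):
    """Return the targets of the is_a relation."""
    return get_relation(db_id, 'is_a')


print(get_relations('X:1'))
print(get_parents('X:1'))
```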

class indra.databases.obo_client.PyOboClient(prefix)[source]

A base client for data that’s been grabbed via PyOBO.

classmethod update_by_prefix(prefix, include_relations=False, predicate=None, indra_prefix=None)[source]

Update the JSON data by looking up the ontology through PyOBO.

OWL client (indra.databases.owl_client)

A client for OWL-sourced identifier mappings.

class indra.databases.owl_client.OwlClient(prefix)[source]

A base client for data that’s been grabbed via OWL.

static entry_from_term(term, prefix, remove_prefix=False, allowed_external_ns=None)[source]

Create a data dictionary from a Pronto term.

Return type:

Mapping[str, Any]

Biolookup client (indra.databases.biolookup_client)

A client to the Biolookup web service available at http://biolookup.io/.

indra.databases.biolookup_client.get_name(db_ns, db_id)[source]

Return the name of the entry with the given namespace and ID in the Biolookup web service.

Parameters:
  • db_ns (str) – The database namespace.

  • db_id (str) – The database ID.

Return type:

Dict

Returns:

The name of the entry.

indra.databases.biolookup_client.lookup(db_ns, db_id)[source]

Look up a namespace and corresponding ID in the Biolookup web service.

Parameters:
  • db_ns (str) – The database namespace.

  • db_id (str) – The database ID.

Return type:

dict

Returns:

A dictionary containing the results of the lookup.

indra.databases.biolookup_client.lookup_curie(curie)[source]

Look up a CURIE in the Biolookup web service.

Parameters:

curie (str) – The CURIE to look up.

Return type:

Dict

Returns:

A dictionary containing the results of the lookup.
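Since lookup takes a namespace and an ID while lookup_curie takes a single CURIE, it can help to see how the two forms convert into each other. The helpers below are hypothetical and only perform string handling; they make no web request and assume the common prefix:identifier form of a CURIE.

```python
def to_curie(db_ns, db_id):
    """Join a namespace and an ID into a CURIE (hypothetical helper)."""
    return '%s:%s' % (db_ns, db_id)


def split_curie(curie):
    """Split a CURIE on its first colon into (namespace, ID)."""
    db_ns, db_id = curie.split(':', 1)
    return db_ns, db_id


print(to_curie('doid', '14330'))
print(split_curie('doid:14330'))
```

Splitting on the first colon only keeps any further colons inside the identifier part.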

MONDO client (indra.databases.mondo_client)

A client to the Monarch Disease Ontology (MONDO).

indra.databases.mondo_client.get_id_from_alt_id(mondo_alt_id)[source]

Return the identifier corresponding to the given MONDO alt id.

Parameters:

mondo_alt_id (str) – The MONDO alt id to be converted. Example: “0024812”

Return type:

Optional[str]

Returns:

The MONDO identifier corresponding to the given alt id.

>>> from indra.databases import mondo_client
>>> mondo_client.get_id_from_alt_id('0018220')
'0002413'

indra.databases.mondo_client.get_id_from_name(mondo_name)[source]

Return the identifier corresponding to the given MONDO name.

Parameters:

mondo_name (str) – The MONDO name to be converted. Example: “tenosynovial giant cell tumor, localized type”

Return type:

Optional[str]

Returns:

The MONDO identifier corresponding to the given name.

indra.databases.mondo_client.get_name_from_id(mondo_id)[source]

Return the name corresponding to the given MONDO ID.

Parameters:

mondo_id (str) – The MONDO identifier to be converted. Example: “0002399”

Return type:

Optional[str]

Returns:

The MONDO name corresponding to the given MONDO identifier.

MGI client (indra.databases.mgi_client)

A client for accessing MGI mouse gene data.

indra.databases.mgi_client.get_ensembl_id(mgi_id)[source]

Return the Ensembl ID for an MGI ID.

Parameters:

mgi_id (str) – An MGI ID, without prefix.

Return type:

Optional[str]

Returns:

The Ensembl ID corresponding to the MGI ID, or None if not available.

indra.databases.mgi_client.get_id_from_name(name)[source]

Return an MGI ID from an MGI gene symbol.

Parameters:

name (str) – The MGI gene symbol whose ID will be returned.

Return type:

Optional[str]

Returns:

The MGI ID (without prefix) or None if not available.

indra.databases.mgi_client.get_id_from_name_synonym(name_synonym)[source]

Return an MGI ID from an MGI gene symbol or synonym.

If the given name or synonym is the official symbol of a gene, its ID is returned. If the input is a synonym, it can correspond to one or more genes. If there is a single gene whose synonym matches the input, the ID is returned as a string. If multiple genes share the given synonym, their IDs are returned in a list. If the input doesn’t match any names or synonyms, None is returned.

Parameters:

name_synonym (str) – The MGI gene symbol or synonym whose ID will be returned.

Return type:

Union[None, str, List[str]]

Returns:

The MGI ID (without prefix) of a single gene, a list of MGI IDs, or None.
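Because the return value can be None, a string or a list, callers usually need to branch on its type. The sketch below mimics the documented behavior with toy tables (all symbols, synonyms and IDs are made up) and adds a hypothetical as_list helper that normalizes the three shapes.

```python
# Toy data; symbols, synonyms and IDs are made up
NAME_TO_ID = {'Gene1': '100', 'Gene2': '200', 'Gene3': '300'}
SYNONYM_TO_IDS = {'g-one': ['100'], 'shared': ['200', '300']}


def id_from_name_synonym(name_synonym):
    """Return a str, a list of str, or None, as described above."""
    if name_synonym in NAME_TO_ID:
        return NAME_TO_ID[name_synonym]
    ids = SYNONYM_TO_IDS.get(name_synonym)
    if not ids:
        return None
    return ids[0] if len(ids) == 1 else ids


def as_list(result):
    """Normalize the three possible return shapes to a list."""
    if result is None:
        return []
    return [result] if isinstance(result, str) else list(result)


for query in ['Gene1', 'g-one', 'shared', 'missing']:
    print(query, '->', as_list(id_from_name_synonym(query)))
```

The same pattern applies to any function with this Union[None, str, List[str]] return type.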

indra.databases.mgi_client.get_name_from_id(mgi_id)[source]

Return the MGI gene symbol for a given MGI ID.

Parameters:

mgi_id (str) – The MGI ID (without prefix) whose symbol will be returned.

Return type:

Optional[str]

Returns:

The MGI symbol for the given ID or None if not available.

indra.databases.mgi_client.get_synonyms(mgi_id)[source]

Return the synonyms for an MGI ID.

Parameters:

mgi_id (str) – An MGI ID, without prefix.

Return type:

List[str]

Returns:

The list of synonyms corresponding to the MGI ID, or an empty list if not available.

RGD client (indra.databases.rgd_client)

A client for accessing RGD rat gene data.

indra.databases.rgd_client.get_ensembl_id(rgd_id)[source]

Return the Ensembl ID for an RGD ID.

Parameters:

rgd_id (str) – An RGD ID, without prefix.

Return type:

Optional[str]

Returns:

The Ensembl ID corresponding to the RGD ID, or None if not available.

indra.databases.rgd_client.get_id_from_name(name)[source]

Return an RGD ID from an RGD gene symbol.

Parameters:

name (str) – The RGD gene symbol whose ID will be returned.

Return type:

Optional[str]

Returns:

The RGD ID (without prefix) or None if not available.

indra.databases.rgd_client.get_id_from_name_synonym(name_synonym)[source]

Return an RGD ID from an RGD gene symbol or synonym.

If the given name or synonym is the official symbol of a gene, its ID is returned. If the input is a synonym, it can correspond to one or more genes. If there is a single gene whose synonym matches the input, the ID is returned as a string. If multiple genes share the given synonym, their IDs are returned in a list. If the input doesn’t match any names or synonyms, None is returned.

Parameters:

name_synonym (str) – The RGD gene symbol or synonym whose ID will be returned.

Return type:

Union[None, str, List[str]]

Returns:

The RGD ID (without prefix) of a single gene, a list of RGD IDs, or None.

indra.databases.rgd_client.get_name_from_id(rgd_id)[source]

Return the RGD gene symbol for a given RGD ID.

Parameters:

rgd_id (str) – The RGD ID (without prefix) whose symbol will be returned.

Return type:

Optional[str]

Returns:

The RGD symbol for the given ID or None if not available.

indra.databases.rgd_client.get_synonyms(rgd_id)[source]

Return the synonyms for an RGD ID.

Parameters:

rgd_id (str) – An RGD ID, without prefix.

Return type:

List[str]

Returns:

The list of synonyms corresponding to the RGD ID, or an empty list if not available.