Database clients (indra.databases)

HGNC client (indra.databases.hgnc_client)

indra.databases.hgnc_client.get_entrez_id(hgnc_id)[source]

Return the Entrez ID corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.
Returns:entrez_id – The Entrez ID corresponding to the given HGNC ID.
Return type:str
indra.databases.hgnc_client.get_hgnc_entry(hgnc_id)[source]

Return the HGNC entry for the given HGNC ID from the web service.

Parameters:hgnc_id (str) – The HGNC ID whose entry is to be retrieved.
Returns:xml_tree – The XML ElementTree corresponding to the entry for the given HGNC ID.
Return type:ElementTree
indra.databases.hgnc_client.get_hgnc_from_entrez(entrez_id)[source]

Return the HGNC ID corresponding to the given Entrez ID.

Parameters:entrez_id (str) – The Entrez ID to be converted, a number passed as a string.
Returns:hgnc_id – The HGNC ID corresponding to the given Entrez ID.
Return type:str
indra.databases.hgnc_client.get_hgnc_id(hgnc_name)[source]

Return the HGNC ID corresponding to the given HGNC symbol.

Parameters:hgnc_name (str) – The HGNC symbol to be converted. Example: BRAF
Returns:hgnc_id – The HGNC ID corresponding to the given HGNC symbol.
Return type:str
indra.databases.hgnc_client.get_hgnc_name(hgnc_id)[source]

Return the HGNC symbol corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted.
Returns:hgnc_name – The HGNC symbol corresponding to the given HGNC ID.
Return type:str
indra.databases.hgnc_client.get_uniprot_id(hgnc_id)[source]

Return the UniProt ID corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.
Returns:uniprot_id – The UniProt ID corresponding to the given HGNC ID.
Return type:str

UniProt client (indra.databases.uniprot_client)

indra.databases.uniprot_client.get_family_members(family_name, human_only=True)[source]

Return the HGNC gene symbols which are the members of a given family.

Parameters:
  • family_name (str) – Family name to be queried.
  • human_only (bool) – If True, only human proteins in the family will be returned. Default: True
Returns:

gene_names – The HGNC gene symbols corresponding to the given family.

Return type:

list

indra.databases.uniprot_client.get_gene_name(protein_id, web_fallback=True)[source]

Return the gene name for the given UniProt ID.

This is an alternative to get_hgnc_name and is useful when the HGNC name is not available (for instance, when the organism is not Homo sapiens).

Parameters:
  • protein_id (str) – UniProt ID to be mapped.
  • web_fallback (Optional[bool]) – If True and the offline lookup fails, the UniProt web service is used to do the query.
Returns:

gene_name – The gene name corresponding to the given UniProt ID.

Return type:

str

indra.databases.uniprot_client.get_id_from_mnemonic(uniprot_mnemonic)[source]

Return the UniProt ID for the given UniProt mnemonic.

Parameters:uniprot_mnemonic (str) – UniProt mnemonic to be mapped.
Returns:uniprot_id – The UniProt ID corresponding to the given UniProt mnemonic.
Return type:str
indra.databases.uniprot_client.get_mnemonic(protein_id, web_fallback=False)[source]

Return the UniProt mnemonic for the given UniProt ID.

Parameters:
  • protein_id (str) – UniProt ID to be mapped.
  • web_fallback (Optional[bool]) – If True and the offline lookup fails, the UniProt web service is used to do the query.
Returns:

mnemonic – The UniProt mnemonic corresponding to the given UniProt ID.

Return type:

str

indra.databases.uniprot_client.get_primary_id(protein_id)[source]

Return a primary entry corresponding to the UniProt ID.

Parameters:protein_id (str) – The UniProt ID to map to primary.
Returns:primary_id – If the given ID is primary, it is returned as is. Otherwise the primary IDs are looked up. If there are multiple primary IDs, the first human one is returned. If there are no human primary IDs, the first primary ID found is returned.
Return type:str
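The selection rule described for get_primary_id can be sketched as follows. This is an illustration of the documented rule only, not INDRA's implementation: the function pick_primary, the mapping secondary_to_primary, the set human_ids, and all identifiers in the example are hypothetical placeholders.

```python
from typing import Dict, List, Set


def pick_primary(
    protein_id: str,
    secondary_to_primary: Dict[str, List[str]],
    human_ids: Set[str],
) -> str:
    """Apply the documented rule for choosing a primary accession."""
    primaries = secondary_to_primary.get(protein_id)
    if not primaries:
        # No secondary mapping: treat the ID as already primary.
        return protein_id
    for candidate in primaries:
        if candidate in human_ids:
            # Prefer the first human primary ID.
            return candidate
    # No human primary ID: fall back to the first one found.
    return primaries[0]


# Placeholder identifiers, not real UniProt accessions.
mapping = {"SEC1": ["PRIM_MOUSE", "PRIM_HUMAN"], "SEC2": ["PRIM_RAT", "PRIM_MOUSE"]}
humans = {"PRIM_HUMAN"}
print(pick_primary("SEC1", mapping, humans))
print(pick_primary("SEC2", mapping, humans))
```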
indra.databases.uniprot_client.is_human(protein_id)[source]

Return True if the given protein id corresponds to a human protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a human protein, otherwise False.
Return type:bool
indra.databases.uniprot_client.is_secondary(protein_id)[source]

Return True if the UniProt ID corresponds to a secondary accession.

Parameters:protein_id (str) – The UniProt ID to check.
Returns:True if the UniProt ID is a secondary accession, False otherwise.
Return type:bool
indra.databases.uniprot_client.query_protein(protein_id)[source]

Return the UniProt entry as an RDF graph for the given UniProt ID.

Parameters:protein_id (str) – UniProt ID to be queried.
Returns:g – The RDF graph corresponding to the UniProt entry.
Return type:rdflib.Graph
indra.databases.uniprot_client.verify_location(protein_id, residue, location)[source]

Return True if the residue is at the given location in the UniProt sequence.

Parameters:
  • protein_id (str) – UniProt ID of the protein whose sequence is used as reference.
  • residue (str) – A single character amino acid symbol (Y, S, T, V, etc.)
  • location (str) – The location on the protein sequence (starting at 1) at which the residue should be checked against the reference sequence.
Returns:

True if the given residue is at the given position in the sequence corresponding to the given UniProt ID, otherwise False.

Return type:

bool
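Because location is a 1-based position passed as a string, the check involves a conversion and an off-by-one adjustment. The sketch below shows that logic on a plain sequence string. The function residue_matches and the toy sequence are hypothetical; how INDRA obtains the reference sequence and handles invalid input is not shown here and is assumed.

```python
def residue_matches(sequence: str, residue: str, location: str) -> bool:
    """Check a residue at a 1-based position given as a string."""
    try:
        index = int(location) - 1  # convert 1-based position to 0-based index
    except (TypeError, ValueError):
        return False  # assumption: non-numeric locations simply fail the check
    if index < 0 or index >= len(sequence):
        return False  # position falls outside the sequence
    return sequence[index] == residue


toy_sequence = "MSTY"  # made-up sequence for illustration only
print(residue_matches(toy_sequence, "S", "2"))
print(residue_matches(toy_sequence, "S", "3"))
```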

indra.databases.uniprot_client.verify_modification(protein_id, residue, location=None)[source]

Return True if the residue at the given location has a known modification.

Parameters:
  • protein_id (str) – UniProt ID of the protein whose sequence is used as reference.
  • residue (str) – A single character amino acid symbol (Y, S, T, V, etc.)
  • location (Optional[str]) – The location on the protein sequence (starting at 1) at which the modification is checked.
Returns:

True if the given residue is reported to be modified at the given position in the sequence corresponding to the given UniProt ID, otherwise False. If location is not given, we only check if there is any residue of the given type that is modified.

Return type:

bool
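The effect of the optional location argument can be sketched with a set of known modified sites. This is a hedged illustration only: the function is_modified and the sites collection are hypothetical, and the real client derives modification data from UniProt rather than from a hand-written set.

```python
from typing import Optional, Set, Tuple


def is_modified(
    sites: Set[Tuple[str, int]],
    residue: str,
    location: Optional[str] = None,
) -> bool:
    """Check for a modified residue, optionally at a specific position."""
    if location is None:
        # Without a location, any modified residue of this type counts.
        return any(res == residue for res, _ in sites)
    # With a location, both residue type and 1-based position must match.
    return (residue, int(location)) in sites


# Made-up modification sites for illustration only.
known_sites = {("S", 10), ("T", 25)}
print(is_modified(known_sites, "S"))
print(is_modified(known_sites, "S", "25"))
```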

ChEBI client (indra.databases.chebi_client)

indra.databases.chebi_client.get_chebi_id_from_pubchem(pubchem_id)[source]

Return the ChEBI ID corresponding to a given PubChem ID.

Parameters:pubchem_id (str) – PubChem ID to be converted.
Returns:chebi_id – ChEBI ID corresponding to the given PubChem ID. If the lookup fails, None is returned.
Return type:str
indra.databases.chebi_client.get_pubchem_id(chebi_id)[source]

Return the PubChem ID corresponding to a given ChEBI ID.

Parameters:chebi_id (str) – ChEBI ID to be converted.
Returns:pubchem_id – PubChem ID corresponding to the given ChEBI ID. If the lookup fails, None is returned.
Return type:str

BioGRID client (indra.databases.biogrid_client)

indra.databases.biogrid_client.get_publications(gene_names, save_json_name=None)[source]

Return evidence publications for interaction between the given genes.

Parameters:
  • gene_names (list[str]) – A list of gene names (HGNC symbols) to query interactions between. Currently supports exactly two genes only.
  • save_json_name (Optional[str]) – A file name to save the raw BioGRID web service output in. By default, the raw output is not saved.
Returns:

publications – A list of Publication objects that provide evidence for interactions between the given list of genes.

Return type:

list[Publication]

Cell type context client (indra.databases.context_client)

indra.databases.context_client.get_mutations(gene_names, cell_types)[source]

Return the mutation status of genes in cell types.

Parameters:
  • gene_names (list) – HGNC gene symbols for which mutations are queried.
  • cell_types (list) –

    List of cell type names in which mutations are queried. The cell type names follow the CCLE database conventions.

    Example: LOXIMVI_SKIN, BT20_BREAST

Returns:

res – A JSON string containing the mutation status of the given genes in the given cell types as returned by the NDEx web service.

Return type:

str

indra.databases.context_client.get_protein_expression(gene_names, cell_types)[source]

Return the protein expression levels of genes in cell types.

Parameters:
  • gene_names (list) – HGNC gene symbols for which expression levels are queried.
  • cell_types (list) –

    List of cell type names in which expression levels are queried. The cell type names follow the CCLE database conventions.

    Example: LOXIMVI_SKIN, BT20_BREAST

Returns:

res – A JSON string containing the predicted protein expression levels of the given genes in the given cell types as returned by the NDEx web service.

Return type:

str

Network relevance client (indra.databases.relevance_client)

indra.databases.relevance_client.get_heat_kernel(network_id)[source]

Return the identifier of a heat kernel calculated for a given network.

Parameters:network_id (str) – The UUID of the network in NDEx.
Returns:kernel_id – The identifier of the heat kernel calculated for the given network.
Return type:str
indra.databases.relevance_client.get_relevant_nodes(network_id, query_nodes)[source]

Return a set of network nodes relevant to a given query set.

A heat diffusion algorithm is used on a pre-computed heat kernel for the given network which starts from the given query nodes. The nodes in the network are ranked according to heat score which is a measure of relevance with respect to the query nodes.

Parameters:
  • network_id (str) – The UUID of the network in NDEx.
  • query_nodes (list[str]) – A list of node names with respect to which relevance is queried.
Returns:

ranked_entities – A list containing pairs of node names and their relevance scores.

Return type:

list[(str, float)]
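The ranking step described above can be sketched with a small pre-computed kernel. This is an illustration under stated assumptions, not the NDEx service's algorithm: the function rank_by_heat is hypothetical, the kernel values are made up, and kernel[node][source] is assumed to hold the heat reaching node from one unit of heat placed at source.

```python
from typing import Dict, List, Tuple


def rank_by_heat(
    kernel: Dict[str, Dict[str, float]],
    query_nodes: List[str],
) -> List[Tuple[str, float]]:
    """Rank nodes by the total heat they receive from the query nodes."""
    heat = {}
    for node, row in kernel.items():
        # Sum the contributions diffusing in from each query node.
        heat[node] = sum(row.get(query, 0.0) for query in query_nodes)
    # Highest heat first, matching the (name, score) pair format.
    return sorted(heat.items(), key=lambda pair: pair[1], reverse=True)


# Made-up kernel for a three-node network.
toy_kernel = {
    "A": {"A": 0.6, "B": 0.3, "C": 0.1},
    "B": {"A": 0.3, "B": 0.5, "C": 0.2},
    "C": {"A": 0.1, "B": 0.2, "C": 0.7},
}
for name, score in rank_by_heat(toy_kernel, ["A"]):
    print(name, score)
```

Nodes closer to the query set receive more heat and so appear earlier in the list.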

NDEx client (indra.databases.ndex_client)

indra.databases.ndex_client.send_request(ndex_service_url, params, is_json=True, use_get=False)[source]

Send a request to the NDEx server.

Parameters:
  • ndex_service_url (str) – The URL of the service to use for the request.
  • params (dict) – A dictionary of parameters to send with the request. Parameter keys differ based on the type of request.
  • is_json (bool) – True if the response is in JSON format, otherwise it is assumed to be text. Default: True
  • use_get (bool) – True if the request needs to use GET instead of POST.
Returns:

res – Depending on the type of service and the is_json parameter, this function returns either a text string or a dict parsed from JSON.

Return type:

str or dict
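The roles of use_get and is_json can be sketched with the standard library alone. This is not INDRA's implementation: the helpers build_request and decode_response and the example URL are hypothetical, and sending POST parameters as a JSON body is an assumption. No network request is made.

```python
import json
import urllib.parse
import urllib.request


def build_request(url: str, params: dict, use_get: bool = False) -> urllib.request.Request:
    """Prepare a GET or POST request without sending it."""
    if use_get:
        # GET: parameters travel in the query string.
        query = urllib.parse.urlencode(params)
        return urllib.request.Request(f"{url}?{query}", method="GET")
    # POST: parameters travel in the body (JSON encoding is an assumption).
    body = json.dumps(params).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def decode_response(raw: bytes, is_json: bool = True):
    """Return a dict for JSON responses, otherwise the plain text."""
    text = raw.decode("utf-8")
    return json.loads(text) if is_json else text


example_url = "https://example.org/service"  # placeholder, not an NDEx endpoint
get_request = build_request(example_url, {"a": 1}, use_get=True)
post_request = build_request(example_url, {"a": 1})
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```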