Database clients (indra.databases)

HGNC client (indra.databases.hgnc_client)

indra.databases.hgnc_client.get_entrez_id(hgnc_id)[source]

Return the Entrez ID corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.
Returns:entrez_id – The Entrez ID corresponding to the given HGNC ID.
Return type:str
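The difference between an HGNC ID and an HGNC symbol is easy to miss, so the following minimal sketch illustrates it with a hand-written table instead of the real client. The names TOY_HGNC and toy_get_entrez_id are hypothetical stand-ins, and returning None for an unknown ID is an assumption of this sketch rather than documented behavior.

```python
# Toy stand-in for the HGNC lookup table; the real client ships its own data.
TOY_HGNC = {
    "1097": {"symbol": "BRAF", "entrez": "673"},
}


def toy_get_entrez_id(hgnc_id):
    """Return the Entrez ID for an HGNC ID, or None if the ID is unknown."""
    entry = TOY_HGNC.get(hgnc_id)
    return entry["entrez"] if entry else None


by_id = toy_get_entrez_id("1097")      # HGNC ID: a number passed as a string
by_symbol = toy_get_entrez_id("BRAF")  # a gene symbol is not an HGNC ID
print(by_id, by_symbol)
```

To go from a symbol such as BRAF to an ID first, get_hgnc_id is the documented entry point.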
indra.databases.hgnc_client.get_hgnc_entry[source]

Return the HGNC entry for the given HGNC ID from the web service.

Parameters:hgnc_id (str) – The HGNC ID to be converted.
Returns:xml_tree – The XML ElementTree corresponding to the entry for the given HGNC ID.
Return type:ElementTree
indra.databases.hgnc_client.get_hgnc_from_entrez(entrez_id)[source]

Return the HGNC ID corresponding to the given Entrez ID.

Parameters:entrez_id (str) – The Entrez ID to be converted, a number passed as a string.
Returns:hgnc_id – The HGNC ID corresponding to the given Entrez ID.
Return type:str
indra.databases.hgnc_client.get_hgnc_from_mouse(mgi_id)[source]

Return the HGNC ID corresponding to the given MGI mouse gene ID.

Parameters:mgi_id (str) – The MGI ID to be converted. Example: “2444934”
Returns:hgnc_id – The HGNC ID corresponding to the given MGI ID.
Return type:str
indra.databases.hgnc_client.get_hgnc_from_rat(rgd_id)[source]

Return the HGNC ID corresponding to the given RGD rat gene ID.

Parameters:rgd_id (str) – The RGD ID to be converted. Example: “1564928”
Returns:hgnc_id – The HGNC ID corresponding to the given RGD ID.
Return type:str
indra.databases.hgnc_client.get_hgnc_id(hgnc_name)[source]

Return the HGNC ID corresponding to the given HGNC symbol.

Parameters:hgnc_name (str) – The HGNC symbol to be converted. Example: BRAF
Returns:hgnc_id – The HGNC ID corresponding to the given HGNC symbol.
Return type:str
indra.databases.hgnc_client.get_hgnc_name(hgnc_id)[source]

Return the HGNC symbol corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted.
Returns:hgnc_name – The HGNC symbol corresponding to the given HGNC ID.
Return type:str
indra.databases.hgnc_client.get_mouse_id(hgnc_id)[source]

Return the MGI mouse ID corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted. Example: “”
Returns:mgi_id – The MGI ID corresponding to the given HGNC ID.
Return type:str
indra.databases.hgnc_client.get_rat_id(hgnc_id)[source]

Return the RGD rat ID corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted. Example: “”
Returns:rgd_id – The RGD ID corresponding to the given HGNC ID.
Return type:str
indra.databases.hgnc_client.get_uniprot_id(hgnc_id)[source]

Return the UniProt ID corresponding to the given HGNC ID.

Parameters:hgnc_id (str) – The HGNC ID to be converted. Note that the HGNC ID is a number that is passed as a string. It is not the same as the HGNC gene symbol.
Returns:uniprot_id – The UniProt ID corresponding to the given HGNC ID.
Return type:str

Uniprot client (indra.databases.uniprot_client)

indra.databases.uniprot_client.get_family_members(family_name, human_only=True)[source]

Return the HGNC gene symbols which are the members of a given family.

Parameters:
  • family_name (str) – Family name to be queried.
  • human_only (bool) – If True, only human proteins in the family will be returned. Default: True
Returns:

gene_names – The HGNC gene symbols corresponding to the given family.

Return type:

list

indra.databases.uniprot_client.get_function(protein_id)[source]

Return the function description of a given protein.

Parameters:protein_id (str) – The UniProt ID of the protein.
Returns:The function description of the protein.
Return type:str
indra.databases.uniprot_client.get_gene_name(protein_id, web_fallback=True)[source]

Return the gene name for the given UniProt ID.

This is an alternative to get_hgnc_name and is useful when the HGNC name is not available (for instance, when the organism is not Homo sapiens).

Parameters:
  • protein_id (str) – UniProt ID to be mapped.
  • web_fallback (Optional[bool]) – If True and the offline lookup fails, the UniProt web service is used to do the query.
Returns:

gene_name – The gene name corresponding to the given UniProt ID.

Return type:

str

indra.databases.uniprot_client.get_gene_synonyms(protein_id)[source]

Return a list of synonyms for the gene corresponding to a protein.

Note that synonyms here also include the official gene name as returned by get_gene_name.

Parameters:protein_id (str) – The UniProt ID of the protein to query
Returns:synonyms – The list of synonyms of the gene corresponding to the protein
Return type:list[str]
indra.databases.uniprot_client.get_id_from_mgi(mgi_id)[source]

Return the UniProt ID given the MGI ID of a mouse protein.

Parameters:mgi_id (str) – The MGI ID of the mouse protein.
Returns:up_id – The UniProt ID of the mouse protein.
Return type:str
indra.databases.uniprot_client.get_id_from_mnemonic(uniprot_mnemonic)[source]

Return the UniProt ID for the given UniProt mnemonic.

Parameters:uniprot_mnemonic (str) – UniProt mnemonic to be mapped.
Returns:uniprot_id – The UniProt ID corresponding to the given UniProt mnemonic.
Return type:str
indra.databases.uniprot_client.get_id_from_rgd(rgd_id)[source]

Return the UniProt ID given the RGD ID of a rat protein.

Parameters:rgd_id (str) – The RGD ID of the rat protein.
Returns:up_id – The UniProt ID of the rat protein.
Return type:str
indra.databases.uniprot_client.get_length(protein_id)[source]

Return the length (number of amino acids) of a protein.

Parameters:protein_id (str) – UniProt ID of a protein.
Returns:length – The length of the protein in amino acids.
Return type:int
indra.databases.uniprot_client.get_mgi_id(protein_id)[source]

Return the MGI ID given the protein id of a mouse protein.

Parameters:protein_id (str) – UniProt ID of the mouse protein
Returns:mgi_id – MGI ID of the mouse protein
Return type:str
indra.databases.uniprot_client.get_mnemonic(protein_id, web_fallback=False)[source]

Return the UniProt mnemonic for the given UniProt ID.

Parameters:
  • protein_id (str) – UniProt ID to be mapped.
  • web_fallback (Optional[bool]) – If True and the offline lookup fails, the UniProt web service is used to do the query.
Returns:

mnemonic – The UniProt mnemonic corresponding to the given UniProt ID.

Return type:

str

indra.databases.uniprot_client.get_mouse_id(human_protein_id)[source]

Return the mouse UniProt ID given a human UniProt ID.

Parameters:human_protein_id (str) – The UniProt ID of a human protein.
Returns:mouse_protein_id – The UniProt ID of a mouse protein orthologous to the given human protein
Return type:str
indra.databases.uniprot_client.get_primary_id(protein_id)[source]

Return a primary entry corresponding to the UniProt ID.

Parameters:protein_id (str) – The UniProt ID to map to primary.
Returns:primary_id – If the given ID is primary, it is returned as is. Otherwise the primary IDs are looked up. If there are multiple primary IDs then the first human one is returned. If there are no human primary IDs then the first primary found is returned.
Return type:str
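The selection rule described for get_primary_id can be sketched as follows. This is only an illustration of the rule as stated above, not the client's implementation: the function choose_primary, the mapping secondary_to_primary, the set human_ids, and all IDs in the example are hypothetical.

```python
def choose_primary(protein_id, secondary_to_primary, human_ids):
    """Pick a primary ID following the rule described for get_primary_id.

    secondary_to_primary maps a secondary ID to a list of primary IDs;
    human_ids is the set of primary IDs that belong to human proteins.
    Both are hypothetical inputs used only for this sketch.
    """
    primaries = secondary_to_primary.get(protein_id)
    if not primaries:
        # Not listed as secondary, so the ID is treated as primary already.
        return protein_id
    for candidate in primaries:
        if candidate in human_ids:
            # The first human primary ID wins.
            return candidate
    # No human primary ID: fall back to the first primary found.
    return primaries[0]


mapping = {"SEC1": ["PRIM_MOUSE", "PRIM_HUMAN"], "SEC2": ["PRIM_RAT"]}
humans = {"PRIM_HUMAN"}
print(choose_primary("SEC1", mapping, humans))
print(choose_primary("SEC2", mapping, humans))
```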
indra.databases.uniprot_client.get_protein_synonyms(protein_id)[source]

Return a list of synonyms for a protein.

Note that this function returns protein synonyms as provided by UniProt. The get_gene_synonyms function returns synonyms given for the gene corresponding to the protein, and get_synonyms returns both.

Parameters:protein_id (str) – The UniProt ID of the protein to query
Returns:synonyms – The list of synonyms of the protein
Return type:list[str]
indra.databases.uniprot_client.get_rat_id(human_protein_id)[source]

Return the rat UniProt ID given a human UniProt ID.

Parameters:human_protein_id (str) – The UniProt ID of a human protein.
Returns:rat_protein_id – The UniProt ID of a rat protein orthologous to the given human protein
Return type:str
indra.databases.uniprot_client.get_rgd_id(protein_id)[source]

Return the RGD ID given the protein id of a rat protein.

Parameters:protein_id (str) – UniProt ID of the rat protein
Returns:rgd_id – RGD ID of the rat protein
Return type:str
indra.databases.uniprot_client.get_synonyms(protein_id)[source]

Return synonyms for a protein and its associated gene.

Parameters:protein_id (str) – The UniProt ID of the protein to query
Returns:synonyms – The list of synonyms of the protein and its associated gene.
Return type:list[str]
indra.databases.uniprot_client.is_human(protein_id)[source]

Return True if the given protein id corresponds to a human protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a human protein, otherwise False.
Return type:bool
indra.databases.uniprot_client.is_mouse(protein_id)[source]

Return True if the given protein id corresponds to a mouse protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a mouse protein, otherwise False.
Return type:bool
indra.databases.uniprot_client.is_rat(protein_id)[source]

Return True if the given protein id corresponds to a rat protein.

Parameters:protein_id (str) – UniProt ID of the protein
Returns:True if the protein_id corresponds to a rat protein, otherwise False.
Return type:bool
indra.databases.uniprot_client.is_secondary(protein_id)[source]

Return True if the UniProt ID corresponds to a secondary accession.

Parameters:protein_id (str) – The UniProt ID to check.
Returns:True if it is a secondary accession entry, False otherwise.
Return type:bool
indra.databases.uniprot_client.query_protein[source]

Return the UniProt entry as an RDF graph for the given UniProt ID.

Parameters:protein_id (str) – UniProt ID to be queried.
Returns:g – The RDF graph corresponding to the UniProt entry.
Return type:rdflib.Graph
indra.databases.uniprot_client.verify_location(protein_id, residue, location)[source]

Return True if the residue is at the given location in the UP sequence.

Parameters:
  • protein_id (str) – UniProt ID of the protein whose sequence is used as reference.
  • residue (str) – A single character amino acid symbol (Y, S, T, V, etc.)
  • location (str) – The location on the protein sequence (starting at 1) at which the residue should be checked against the reference sequence.
Returns:

True if the given residue is at the given position in the sequence corresponding to the given UniProt ID, otherwise False.
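Because location is a string and counts from 1, an off-by-one error is easy to make. The sketch below shows the check on a plain sequence string; the helper residue_at and the short sequence are invented for illustration, and treating out-of-range or non-numeric locations as False is an assumption of this sketch.

```python
def residue_at(sequence, residue, location):
    """Return True if `residue` sits at 1-based `location` in `sequence`."""
    try:
        position = int(location)  # the location arrives as a string
    except ValueError:
        return False
    if position < 1 or position > len(sequence):
        return False
    # Convert the 1-based position to a 0-based Python index.
    return sequence[position - 1] == residue


toy_sequence = "MAALSGG"  # invented sequence, not a real protein
print(residue_at(toy_sequence, "S", "5"))
print(residue_at(toy_sequence, "S", "4"))
```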

indra.databases.uniprot_client.verify_modification(protein_id, residue, location=None)[source]

Return True if the residue at the given location has a known modification.

Parameters:
  • protein_id (str) – UniProt ID of the protein whose sequence is used as reference.
  • residue (str) – A single character amino acid symbol (Y, S, T, V, etc.)
  • location (Optional[str]) – The location on the protein sequence (starting at 1) at which the modification is checked.
Returns:

True if the given residue is reported to be modified at the given position in the sequence corresponding to the given UniProt ID, otherwise False. If location is not given, we only check if there is any residue of the given type that is modified.
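The effect of the optional location argument can be sketched with a small set of annotated sites. The helper has_modification and the sites below are hypothetical; the real function obtains modification annotations from UniProt.

```python
def has_modification(modified_sites, residue, location=None):
    """Check a set of (residue, position) pairs for a modified residue.

    With no location, any modified residue of the given type counts.
    """
    if location is None:
        return any(res == residue for res, _ in modified_sites)
    return (residue, int(location)) in modified_sites


toy_sites = {("S", 5), ("T", 9)}  # invented annotations
print(has_modification(toy_sites, "S", "5"))
print(has_modification(toy_sites, "Y"))
```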

ChEBI client (indra.databases.chebi_client)

indra.databases.chebi_client.get_chebi_id_from_cas(cas_id)[source]

Return a ChEBI ID corresponding to the given CAS ID.

Parameters:cas_id (str) – The CAS ID to be converted.
Returns:chebi_id – The ChEBI ID corresponding to the given CAS ID. If the lookup fails, None is returned.
Return type:str
indra.databases.chebi_client.get_chebi_id_from_pubchem(pubchem_id)[source]

Return the ChEBI ID corresponding to a given PubChem ID.

Parameters:pubchem_id (str) – PubChem ID to be converted.
Returns:chebi_id – ChEBI ID corresponding to the given PubChem ID. If the lookup fails, None is returned.
Return type:str
indra.databases.chebi_client.get_chembl_id(chebi_id)[source]

Return a ChEMBL ID from a given ChEBI ID.

Parameters:chebi_id (str) – ChEBI ID to be converted.
Returns:chembl_id – ChEMBL ID corresponding to the given ChEBI ID. If the lookup fails, None is returned.
Return type:str
indra.databases.chebi_client.get_pubchem_id(chebi_id)[source]

Return the PubChem ID corresponding to a given ChEBI ID.

Parameters:chebi_id (str) – ChEBI ID to be converted.
Returns:pubchem_id – PubChem ID corresponding to the given ChEBI ID. If the lookup fails, None is returned.
Return type:str

BioGRID client (indra.databases.biogrid_client)

indra.databases.biogrid_client.get_publications(gene_names, save_json_name=None)[source]

Return evidence publications for interaction between the given genes.

Parameters:
  • gene_names (list[str]) – A list of gene names (HGNC symbols) to query interactions between. Currently supports exactly two genes only.
  • save_json_name (Optional[str]) – A file name to save the raw BioGRID web service output in. By default, the raw output is not saved.
Returns:

publications – A list of Publication objects that provide evidence for interactions between the given list of genes.

Return type:

list[Publication]

Cell type context client (indra.databases.context_client)

indra.databases.context_client.get_mutations(gene_names, cell_types)[source]

Return protein amino acid changes in given genes and cell types.

Parameters:
  • gene_names (list) – HGNC gene symbols for which mutations are queried.
  • cell_types (list) –

    List of cell type names in which mutations are queried. The cell type names follow the CCLE database conventions.

    Example: LOXIMVI_SKIN, BT20_BREAST

Returns:

res – A dictionary keyed by cell line, which contains another dictionary that is keyed by gene name, with a list of amino acid substitutions as values.

Return type:

dict[dict[list]]

indra.databases.context_client.get_protein_expression(gene_names, cell_types)[source]

Return the protein expression levels of genes in cell types.

Parameters:
  • gene_names (list) – HGNC gene symbols for which expression levels are queried.
  • cell_types (list) –

    List of cell type names in which expression levels are queried. The cell type names follow the CCLE database conventions.

    Example: LOXIMVI_SKIN, BT20_BREAST

Returns:

res – A dictionary keyed by cell line, which contains another dictionary that is keyed by gene name, with estimated protein amounts as values.

Return type:

dict[dict[float]]

Network relevance client (indra.databases.relevance_client)

indra.databases.relevance_client.get_heat_kernel(network_id)[source]

Return the identifier of a heat kernel calculated for a given network.

Parameters:network_id (str) – The UUID of the network in NDEx.
Returns:kernel_id – The identifier of the heat kernel calculated for the given network.
Return type:str
indra.databases.relevance_client.get_relevant_nodes(network_id, query_nodes)[source]

Return a set of network nodes relevant to a given query set.

A heat diffusion algorithm is used on a pre-computed heat kernel for the given network which starts from the given query nodes. The nodes in the network are ranked according to heat score which is a measure of relevance with respect to the query nodes.

Parameters:
  • network_id (str) – The UUID of the network in NDEx.
  • query_nodes (list[str]) – A list of node names with respect to which relevance is queried.
Returns:

ranked_entities – A list containing pairs of node names and their relevance scores.

Return type:

list[(str, float)]
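As a rough illustration of ranking by heat score, the sketch below multiplies a small kernel matrix by an indicator vector of the query nodes and sorts the result. The function rank_by_heat, the node names, and the kernel values are all invented; the actual service may compute and scale scores differently.

```python
def rank_by_heat(nodes, kernel, query_nodes):
    """Rank nodes by the heat they receive from the query nodes.

    kernel[i][j] is the heat that node j passes to node i (toy values).
    """
    query = [1.0 if name in query_nodes else 0.0 for name in nodes]
    heat = [
        sum(k_ij * q_j for k_ij, q_j in zip(row, query))
        for row in kernel
    ]
    # Highest heat first, as (node name, score) pairs.
    return sorted(zip(nodes, heat), key=lambda pair: pair[1], reverse=True)


toy_nodes = ["A", "B", "C"]
toy_kernel = [
    [0.6, 0.3, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7],
]
ranked = rank_by_heat(toy_nodes, toy_kernel, {"A"})
print(ranked)
```

The result has the same shape as the documented return value: a list of (node name, score) pairs.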

NDEx client (indra.databases.ndex_client)

indra.databases.ndex_client.create_network(cx_str, ndex_cred=None, private=True)[source]

Creates a new NDEx network of the assembled CX model.

To upload the assembled CX model to NDEx, you need to have a registered account on NDEx (http://ndexbio.org/) and have the ndex python package installed. The uploaded network is private by default.

Parameters:ndex_cred (dict) – A dictionary with the following entries: ‘user’: NDEx user name ‘password’: NDEx password
Returns:network_id – The UUID of the NDEx network that was created by uploading the assembled CX model.
Return type:str
indra.databases.ndex_client.get_default_ndex_cred(ndex_cred)[source]

Get the NDEx credentials from the given dict, or try the environment if it is None.

indra.databases.ndex_client.send_request(ndex_service_url, params, is_json=True, use_get=False)[source]

Send a request to the NDEx server.

Parameters:
  • ndex_service_url (str) – The URL of the service to use for the request.
  • params (dict) – A dictionary of parameters to send with the request. Parameter keys differ based on the type of request.
  • is_json (bool) – True if the response is in json format, otherwise it is assumed to be text. Default: True
  • use_get (bool) – True if the request needs to use GET instead of POST.
Returns:

res – Depending on the type of service and the is_json parameter, this function either returns a text string or a json dict.

Return type:

str

indra.databases.ndex_client.update_network(cx_str, network_id, ndex_cred=None)[source]

Update an existing CX network on NDEx with new CX content.

Parameters:
  • cx_str (str) – String containing the CX content.
  • network_id (str) – UUID of the network on NDEx.
  • ndex_cred (dict) – A dictionary with the following entries: ‘user’: NDEx user name ‘password’: NDEx password

cBio portal client (indra.databases.cbio_client)

indra.databases.cbio_client.get_cancer_studies(study_filter=None)[source]

Return a list of cancer study identifiers, optionally filtered.

There are typically multiple studies for a given type of cancer and a filter can be used to constrain the returned list.

Parameters:study_filter (Optional[str]) – A string used to filter the study IDs to return. Example: “paad”
Returns:study_ids – A list of study IDs. For instance “paad” as a filter would result in a list of study IDs with paad in their name like “paad_icgc”, “paad_tcga”, etc.
Return type:list[str]
indra.databases.cbio_client.get_cancer_types(cancer_filter=None)[source]

Return a list of cancer types, optionally filtered.

Parameters:cancer_filter (Optional[str]) – A string used to filter cancer types. Its value is the name or part of the name of a type of cancer. Example: “melanoma”, “pancreatic”, “non-small cell lung”
Returns:type_ids – A list of cancer types matching the filter. Example: for cancer_filter=”pancreatic”, the result includes “panet” (neuro-endocrine) and “paad” (adenocarcinoma)
Return type:list[str]
indra.databases.cbio_client.get_case_lists(study_id)[source]

Return a list of the case set ids for a particular study.

Note that “case_list_id” is the same thing as “case_set_id”. Within the data, this string is referred to as a “case_list_id”; within API calls it is referred to as a “case_set_id”. The documentation does not make this explicitly clear.

Parameters:study_id (str) – The ID of the cBio study. Example: ‘cellline_ccle_broad’ or ‘paad_icgc’
Returns:case_set_ids – A list of the case set IDs available for the given study.
Return type:list[str]
indra.databases.cbio_client.get_ccle_cna(gene_list, cell_lines)[source]

Return a dict of CNAs in given genes and cell lines from CCLE.

CNA values correspond to the following alterations

-2 = homozygous deletion

-1 = hemizygous deletion

0 = neutral / no change

1 = gain

2 = high level amplification

Parameters:
  • gene_list (list[str]) – A list of HGNC gene symbols to get CNAs in.
  • cell_lines (list[str]) – A list of CCLE cell line names to get CNAs for.
Returns:

profile_data – A dict keyed to cases containing a dict keyed to genes containing int

Return type:

dict[dict[int]]
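The integer codes listed above can be turned into readable labels with a small lookup. This sketch assumes a result shaped like the documented return value; the helper describe_cna and the values in toy_profile are invented for illustration and are not real CCLE data.

```python
# CNA codes and their meanings, as listed in the description above.
CNA_LABELS = {
    -2: "homozygous deletion",
    -1: "hemizygous deletion",
    0: "neutral / no change",
    1: "gain",
    2: "high level amplification",
}


def describe_cna(profile_data):
    """Replace integer CNA codes with labels in a {case: {gene: int}} dict."""
    return {
        case: {gene: CNA_LABELS.get(value, "unknown")
               for gene, value in genes.items()}
        for case, genes in profile_data.items()
    }


toy_profile = {"LOXIMVI_SKIN": {"BRAF": 1, "PTEN": -2}}  # invented values
labeled = describe_cna(toy_profile)
print(labeled)
```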

indra.databases.cbio_client.get_ccle_lines_for_mutation(gene, amino_acid_change)[source]

Return cell lines with a given point mutation in a given gene.

Checks which cell lines in CCLE have a particular point mutation in a given gene and return their names in a list.

Parameters:
  • gene (str) – The HGNC symbol of the mutated gene in whose product the amino acid change occurs. Example: “BRAF”
  • amino_acid_change (str) – The amino acid change of interest. Example: “V600E”
Returns:

cell_lines – A list of CCLE cell lines in which the given mutation occurs.

Return type:

list

indra.databases.cbio_client.get_ccle_mrna(gene_list, cell_lines)[source]

Return a dict of mRNA amounts in given genes and cell lines from CCLE.

Parameters:
  • gene_list (list[str]) – A list of HGNC gene symbols to get mRNA amounts for.
  • cell_lines (list[str]) – A list of CCLE cell line names to get mRNA amounts for.
Returns:

mrna_amounts – A dict keyed to cell lines containing a dict keyed to genes containing float

Return type:

dict[dict[float]]

indra.databases.cbio_client.get_ccle_mutations(gene_list, cell_lines, mutation_type=None)[source]

Return a dict of mutations in given genes and cell lines from CCLE.

This is a specialized call to get_mutations tailored to CCLE cell lines.

Parameters:
  • gene_list (list[str]) – A list of HGNC gene symbols to get mutations in
  • cell_lines (list[str]) – A list of CCLE cell line names to get mutations for.
  • mutation_type (Optional[str]) – The type of mutation to filter to. mutation_type can be one of: missense, nonsense, frame_shift_ins, frame_shift_del, splice_site
Returns:

mutations – The result from cBioPortal as a dict in the format {cell_line : {gene : [mutation1, mutation2, …] }}

Example: {‘LOXIMVI_SKIN’: {‘BRAF’: [‘V600E’, ‘I208V’]}, ‘SKMEL30_SKIN’: {‘BRAF’: [‘D287H’, ‘E275K’]}}

Return type:

dict
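A result in this nested format is straightforward to search. The sketch below uses the example dictionary shown above and a hypothetical helper, lines_with_mutation, to list the cell lines that carry a given amino acid change.

```python
# Example result copied from the description above.
mutations = {
    'LOXIMVI_SKIN': {'BRAF': ['V600E', 'I208V']},
    'SKMEL30_SKIN': {'BRAF': ['D287H', 'E275K']},
}


def lines_with_mutation(mutations, gene, change):
    """Return the sorted cell lines in which `gene` carries `change`."""
    return sorted(
        line for line, genes in mutations.items()
        if change in genes.get(gene, [])
    )


hits = lines_with_mutation(mutations, 'BRAF', 'V600E')
print(hits)
```

For this specific question, get_ccle_lines_for_mutation is the documented shortcut.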

indra.databases.cbio_client.get_genetic_profiles(study_id, profile_filter=None)[source]

Return all the genetic profiles (data sets) for a given study.

Genetic profiles are different types of data for a given study. For instance the study ‘cellline_ccle_broad’ has profiles such as ‘cellline_ccle_broad_mutations’ for mutations, ‘cellline_ccle_broad_CNA’ for copy number alterations, etc.

Parameters:
  • study_id (str) – The ID of the cBio study. Example: ‘paad_icgc’
  • profile_filter (Optional[str]) – A string used to filter the profiles to return. Will be one of: MUTATION, MUTATION_EXTENDED, COPY_NUMBER_ALTERATION, MRNA_EXPRESSION, METHYLATION. The genetic profiles can include “mutation”, “CNA”, “rppa”, “methylation”, etc.
Returns:

genetic_profiles – A list of genetic profiles available for the given study.

Return type:

list[str]

indra.databases.cbio_client.get_mutations(study_id, gene_list, mutation_type=None, case_id=None)[source]

Return mutations as a list of genes and list of amino acid changes.

Parameters:
  • study_id (str) – The ID of the cBio study. Example: ‘cellline_ccle_broad’ or ‘paad_icgc’
  • gene_list (list[str]) – A list of genes with their HGNC symbols. Example: [‘BRAF’, ‘KRAS’]
  • mutation_type (Optional[str]) – The type of mutation to filter to. mutation_type can be one of: missense, nonsense, frame_shift_ins, frame_shift_del, splice_site
  • case_id (Optional[str]) – The case ID within the study to filter to.
Returns:

mutations – A tuple of two lists, the first one containing a list of genes, and the second one a list of amino acid changes in those genes.

Return type:

tuple[list]
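If the two returned lists are index-aligned (an assumption of this sketch; the description above does not state it explicitly), they can be regrouped into a per-gene dictionary. The helper group_changes and the sample lists are illustrative only.

```python
from collections import defaultdict


def group_changes(genes, changes):
    """Group amino acid changes by gene, assuming the lists are parallel."""
    grouped = defaultdict(list)
    for gene, change in zip(genes, changes):
        grouped[gene].append(change)
    return dict(grouped)


# Invented sample in the (genes, changes) tuple shape described above.
toy_result = (['BRAF', 'BRAF', 'KRAS'], ['V600E', 'I208V', 'G12D'])
by_gene = group_changes(*toy_result)
print(by_gene)
```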

indra.databases.cbio_client.get_num_sequenced(study_id)[source]

Return number of sequenced tumors for given study.

This is useful for calculating mutation statistics in terms of the prevalence of certain mutations within a type of cancer.

Parameters:study_id (str) – The ID of the cBio study. Example: ‘paad_icgc’
Returns:num_case – The number of sequenced tumors in the given study
Return type:int
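The prevalence calculation mentioned above is a simple ratio. In this sketch the counts are made-up numbers and mutation_prevalence is a hypothetical helper; in practice the denominator would come from get_num_sequenced.

```python
def mutation_prevalence(num_mutated, num_sequenced):
    """Return the fraction of sequenced tumors that carry a mutation."""
    if num_sequenced <= 0:
        raise ValueError("num_sequenced must be positive")
    return num_mutated / num_sequenced


# Invented counts: 30 mutated cases out of 120 sequenced tumors.
prevalence = mutation_prevalence(30, 120)
print(prevalence)
```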
indra.databases.cbio_client.get_profile_data(study_id, gene_list, profile_filter, case_set_filter=None)[source]

Return dict of cases and genes and their respective values.

Parameters:
  • study_id (str) – The ID of the cBio study. Example: ‘cellline_ccle_broad’ or ‘paad_icgc’
  • gene_list (list[str]) – A list of genes with their HGNC symbols. Example: [‘BRAF’, ‘KRAS’]
  • profile_filter (str) – A string used to filter the profiles to return. Will be one of: MUTATION, MUTATION_EXTENDED, COPY_NUMBER_ALTERATION, MRNA_EXPRESSION, METHYLATION.
  • case_set_filter (Optional[str]) – A string that specifies which case_set_id to use, based on a complete or partial match. If not provided, will look for study_id + ‘_all’
Returns:

profile_data – A dict keyed to cases containing a dict keyed to genes containing int

Return type:

dict[dict[int]]
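The way case_set_filter selects a case set can be pictured with the sketch below. It is one possible reading of "complete or partial match", not the client's actual code; pick_case_set and the case set IDs in the example are hypothetical.

```python
def pick_case_set(study_id, case_set_ids, case_set_filter=None):
    """Return the first case set ID that contains the wanted string."""
    # Without a filter, the documented default is study_id + '_all'.
    wanted = case_set_filter if case_set_filter is not None else study_id + "_all"
    for case_set_id in case_set_ids:
        if wanted in case_set_id:  # substring test covers full and partial matches
            return case_set_id
    return None


toy_ids = ["paad_icgc_sequenced", "paad_icgc_all"]  # hypothetical IDs
print(pick_case_set("paad_icgc", toy_ids))
print(pick_case_set("paad_icgc", toy_ids, case_set_filter="sequenced"))
```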

indra.databases.cbio_client.send_request[source]

Return a data frame from a web service request to cBio portal.

Sends a web service request to the cBio portal with arguments given in the dictionary data and returns a Pandas data frame on success.

More information about the service here: http://www.cbioportal.org/web_api.jsp

Parameters:kwargs (dict) – A dict of parameters for the query. Entries map directly to web service calls with the exception of the optional ‘skiprows’ entry, whose value is used as the number of rows to skip when reading the result data frame.
Returns:df – Response from cBioPortal as a Pandas DataFrame.
Return type:pandas.DataFrame

ChEMBL client (indra.databases.chembl_client)

indra.databases.chembl_client.activities_by_target(activities)[source]

Get back lists of activities in a dict keyed by ChEMBL target id

Parameters:activities (list) – response from a query returning activities for a drug
Returns:targ_act_dict – dictionary keyed to ChEMBL target ids with lists of activity ids
Return type:dict
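The grouping described here can be sketched with a defaultdict. The field names target_chembl_id and activity_id are assumed to match the keys of ChEMBL activity records, and the records and helper below are invented placeholders, so treat this as an illustration rather than the client's code.

```python
from collections import defaultdict


def group_activities_by_target(activities):
    """Map each target ID to the list of activity IDs recorded for it."""
    grouped = defaultdict(list)
    for activity in activities:
        # Key names are an assumption about the activity record layout.
        grouped[activity["target_chembl_id"]].append(activity["activity_id"])
    return dict(grouped)


toy_activities = [  # placeholder records, not real ChEMBL data
    {"target_chembl_id": "TARGET_1", "activity_id": 101},
    {"target_chembl_id": "TARGET_2", "activity_id": 102},
    {"target_chembl_id": "TARGET_1", "activity_id": 103},
]
targ_act = group_activities_by_target(toy_activities)
print(targ_act)
```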
indra.databases.chembl_client.get_chembl_id(nlm_mesh)[source]

Get ChEMBL ID from NLM MESH

Parameters:nlm_mesh (str) –
Returns:chembl_id
Return type:str
indra.databases.chembl_client.get_drug_inhibition_stmts(drug)[source]

Query ChEMBL for kinetics data, given a drug as an Agent, and return INDRA Statements.

Parameters:drug (Agent) – Agent representing drug with MESH or CHEBI grounding
Returns:stmts – INDRA statements generated by querying ChEMBL for all kinetics data of a drug interacting with protein targets
Return type:list of INDRA statements
indra.databases.chembl_client.get_evidence(assay)[source]

Given an activity, return an INDRA Evidence object.

Parameters:assay (dict) – an activity from the activities list returned by a query to the API
Returns:ev – an Evidence object containing the kinetics of the
Return type:Evidence
indra.databases.chembl_client.get_kinetics(assay)[source]

Given an activity, return its kinetics values.

Parameters:assay (dict) – an activity from the activities list returned by a query to the API
Returns:kin – dictionary of values with units keyed to value types ‘IC50’, ‘EC50’, ‘INH’, ‘Potency’, ‘Kd’
Return type:dict
indra.databases.chembl_client.get_mesh_id(nlm_mesh)[source]

Get MESH ID from NLM MESH

Parameters:nlm_mesh (str) –
Returns:mesh_id
Return type:str
indra.databases.chembl_client.get_pcid(mesh_id)[source]

Get PC ID from MESH ID

Parameters:mesh_id (str) –
Returns:pcid
Return type:str
indra.databases.chembl_client.get_pmid(doc_id)[source]

Get PMID from document_chembl_id

Parameters:doc_id (str) –
Returns:pmid
Return type:str
indra.databases.chembl_client.get_protein_targets_only(target_chembl_ids)[source]

Given list of ChEMBL target ids, return dict of SINGLE PROTEIN targets

Parameters:target_chembl_ids (list) – list of chembl_ids as strings
Returns:protein_targets – dictionary keyed to ChEMBL target ids with lists of activity ids
Return type:dict
indra.databases.chembl_client.get_target_chemblid(target_upid)[source]

Get ChEMBL ID from UniProt upid

Parameters:target_upid (str) –
Returns:target_chembl_id
Return type:str
indra.databases.chembl_client.query_target(target_chembl_id)[source]

Query ChEMBL API target by id

Parameters:target_chembl_id (str) –
Returns:target – dict parsed from json that is unique for the target
Return type:dict
indra.databases.chembl_client.send_query(query_dict)[source]

Query ChEMBL API

Parameters:query_dict (dict) – ‘query’ : string of the endpoint to query ‘params’ : dict of params for the query
Returns:js – dict parsed from json that is unique to the submitted query
Return type:dict

LINCS client (indra.databases.lincs_client)

indra.databases.lincs_client.get_drug_target_data()[source]

Load the csv into a list of dicts containing the LINCS drug target data.

Returns:data – A list of dicts, each keyed based on the header of the csv, with values as the corresponding column values.
Return type:list[dict]
class indra.databases.lincs_client.LincsClient[source]

Client for querying LINCS small molecules and proteins.

get_protein_refs(hms_lincs_id)[source]

Get the refs for a protein from the LINCS protein metadata.

Parameters:hms_lincs_id (str) – The HMS LINCS ID for the protein
Returns:A dictionary of protein references.
Return type:dict
get_small_molecule_name(hms_lincs_id)[source]

Get the name of a small molecule from the LINCS sm metadata.

Parameters:hms_lincs_id (str) – The HMS LINCS ID of the small molecule.
Returns:The name of the small molecule.
Return type:str
get_small_molecule_refs(hms_lincs_id)[source]

Get the id refs of a small molecule from the LINCS sm metadata.

Parameters:hms_lincs_id (str) – The HMS LINCS ID of the small molecule.
Returns:A dictionary of references.
Return type:dict
indra.databases.lincs_client.load_lincs_csv(url)[source]

Helper function to turn csv rows into dicts.
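Turning CSV rows into dicts keyed by the header can be done with the standard library, as in this sketch. It reads from an in-memory string rather than a URL, and the helper rows_to_dicts, the column names, and the values are invented for illustration.

```python
import csv
import io


def rows_to_dicts(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


toy_csv = "small_molecule,target\nmol_a,prot_x\nmol_b,prot_y\n"  # invented data
rows = rows_to_dicts(toy_csv)
print(rows)
```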

MeSH client (indra.databases.mesh_client)

indra.databases.mesh_client.get_mesh_name(mesh_id, offline=False)[source]

Get the MESH label for the given MESH ID.

Uses the mappings table in indra/resources; if the MESH ID is not listed there, falls back on the NLM REST API.

Parameters:
  • mesh_id (str) – MESH Identifier, e.g. ‘D003094’.
  • offline (bool) – Whether to allow queries to the NLM REST API if the given MESH ID is not contained in INDRA’s internal MESH mappings file. Default is False (allows REST API queries).
Returns:

Label for the MESH ID, or None if the query failed or no label was found.

Return type:

str
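The lookup order described above (local table first, web service only if allowed) can be sketched as follows. The function lookup_label, the table contents, and the injected web_lookup callable are hypothetical; the label strings are placeholders, not real MESH labels.

```python
def lookup_label(mesh_id, local_table, web_lookup, offline=False):
    """Look up a label locally, falling back to web_lookup unless offline."""
    label = local_table.get(mesh_id)
    if label is not None:
        return label
    if offline:
        # Offline mode: never call the web service.
        return None
    return web_lookup(mesh_id)


web_calls = []


def fake_web_lookup(mesh_id):
    """Stand-in for a REST call; records the ID and returns a placeholder."""
    web_calls.append(mesh_id)
    return "label from web"


toy_table = {"D003094": "label from local table"}
print(lookup_label("D003094", toy_table, fake_web_lookup))
print(lookup_label("D000000", toy_table, fake_web_lookup))
print(lookup_label("D000000", toy_table, fake_web_lookup, offline=True))
```

Passing the fallback in as a callable keeps the sketch testable without network access.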

indra.databases.mesh_client.get_mesh_name_from_web[source]

Get the MESH label for the given MESH ID using the NLM REST API.

Parameters:mesh_id (str) – MESH Identifier, e.g. ‘D003094’.
Returns:Label for the MESH ID, or None if the query failed or no label was found.
Return type:str

GO client (indra.databases.go_client)

indra.databases.go_client.get_go_label(go_id)[source]

Get label corresponding to a given GO identifier.

Parameters:go_id (str) – The GO identifier. Should include the GO: prefix, e.g., GO:1903793 (positive regulation of anion transport).
Returns:Label corresponding to the GO ID.
Return type:str
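Since the identifier should carry the GO: prefix, a caller may want to normalize IDs before the lookup. The helper normalize_go_id below is a hypothetical convenience, not part of the client; it assumes GO identifiers consist of the prefix followed by seven digits.

```python
import re

GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")
BARE_ID_PATTERN = re.compile(r"^\d{7}$")


def normalize_go_id(go_id):
    """Return a GO ID with the GO: prefix, adding it to bare 7-digit IDs."""
    if GO_ID_PATTERN.match(go_id):
        return go_id
    if BARE_ID_PATTERN.match(go_id):
        return "GO:" + go_id
    raise ValueError("not a recognizable GO identifier: %r" % go_id)


print(normalize_go_id("GO:1903793"))
print(normalize_go_id("1903793"))
```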
indra.databases.go_client.load_go_graph(go_fname)[source]

Load the GO data from an OWL file and parse into an RDF graph.

Parameters:go_fname (str) – Path to the GO OWL file. Can be downloaded from http://geneontology.org/ontology/go.owl.
Returns:RDF graph containing GO data.
Return type:rdflib.Graph
indra.databases.go_client.update_id_mappings(g)[source]

Compile all ID->label mappings and save to a TSV file.

Parameters:g (rdflib.Graph) – RDF graph containing GO data.