BioPAX (indra.sources.biopax)
This module processes BioPAX content into INDRA Statements. It uses the pybiopax package (https://github.com/indralab/pybiopax) to process OWL files or strings, or to obtain BioPAX content by querying the PathwayCommons web service. The module has been tested with BioPAX content from PathwayCommons (https://www.pathwaycommons.org/archives/PC2/v12/). BioPAX from other sources may not adhere to the same conventions and could cause processing issues, though these can typically be addressed with minor changes to the processor’s logic.
BioPAX API (indra.sources.biopax.api)
- indra.sources.biopax.api.process_model(model)
Returns a BiopaxProcessor for a BioPAX model object.
- Parameters
model (pybiopax.biopax.model.BioPaxModel) – A BioPAX model object, as produced by pybiopax.
- Returns
bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.
- Return type
BiopaxProcessor
- indra.sources.biopax.api.process_owl(owl_filename, encoding=None)
Returns a BiopaxProcessor for a BioPAX OWL file.
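- Parameters
owl_filename (str) – The name of the OWL file to process.
encoding (Optional[str]) – The encoding to use when reading the OWL file. Default: None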
- Returns
bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.
- Return type
BiopaxProcessor
- indra.sources.biopax.api.process_owl_gz(owl_gz_filename)
Returns a BiopaxProcessor for a gzipped BioPAX OWL file.
- Parameters
owl_gz_filename (str) – The name of the gzipped OWL file to process.
- Returns
bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.
- Return type
BiopaxProcessor
- indra.sources.biopax.api.process_owl_str(owl_str)
Returns a BiopaxProcessor for a BioPAX OWL string.
- Parameters
owl_str (str) – The string content of an OWL file to process.
- Returns
bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.
- Return type
BiopaxProcessor
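As a minimal sketch of how the file-based entry points above might be combined, the helper below picks a loader by file extension. The names `pick_loader_name` and `load_processor` are hypothetical and not part of INDRA; only `process_owl` and `process_owl_gz` come from the API documented here. The INDRA import is deferred so that the selection logic can be tried without INDRA installed.

```python
from pathlib import Path


def pick_loader_name(path):
    """Return the name of the API function suited to the given file path."""
    # Gzipped OWL files go to process_owl_gz; everything else to process_owl.
    if Path(path).suffix.lower() == ".gz":
        return "process_owl_gz"
    return "process_owl"


def load_processor(path):
    """Return a BiopaxProcessor for a plain or gzipped OWL file."""
    # Deferred import: requires INDRA (and pybiopax) to be installed.
    from indra.sources.biopax import api

    loader = getattr(api, pick_loader_name(path))
    return loader(str(path))
```

With INDRA installed, `bp = load_processor("pathway.owl.gz")` would be expected to return a processor whose `model` attribute holds the parsed BioPAX model; the file name here is a placeholder.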
- indra.sources.biopax.api.process_pc_neighborhood(gene_names, neighbor_limit=1, database_filter=None)
Returns a BiopaxProcessor for a PathwayCommons neighborhood query.
The neighborhood query finds the neighborhood around a set of source genes.
http://www.pathwaycommons.org/pc2/#graph
http://www.pathwaycommons.org/pc2/#graph_kind
- Parameters
gene_names (list) – A list of HGNC gene symbols to search the neighborhood of. Examples: ['BRAF'], ['BRAF', 'MAP2K1']
neighbor_limit (Optional[int]) – The number of steps to limit the size of the neighborhood around the gene names being queried. Default: 1
database_filter (Optional[list]) – A list of database identifiers to which the query is restricted. Examples: ['reactome'], ['biogrid', 'pid', 'psp']. If not given, all databases are used in the query. For a full list of databases see http://www.pathwaycommons.org/pc2/datasources
- Returns
A BiopaxProcessor containing the obtained BioPAX model in its model attribute and a list of extracted INDRA Statements from the model in its statements attribute.
- Return type
BiopaxProcessor
- indra.sources.biopax.api.process_pc_pathsbetween(gene_names, neighbor_limit=1, database_filter=None, block_size=None)
Returns a BiopaxProcessor for a PathwayCommons paths-between query.
The paths-between query finds the paths between a set of genes. Here source gene names are given in a single list and all directions of paths between these genes are considered.
http://www.pathwaycommons.org/pc2/#graph
http://www.pathwaycommons.org/pc2/#graph_kind
- Parameters
gene_names (list) – A list of HGNC gene symbols to search for paths between. Examples: ['BRAF', 'MAP2K1']
neighbor_limit (Optional[int]) – The number of steps to limit the length of the paths between the gene names being queried. Default: 1
database_filter (Optional[list]) – A list of database identifiers to which the query is restricted. Examples: ['reactome'], ['biogrid', 'pid', 'psp']. If not given, all databases are used in the query. For a full list of databases see http://www.pathwaycommons.org/pc2/datasources
block_size (Optional[int]) – Large paths-between queries (above ~60 genes) can fail on the server side. In this case, the query can be replaced by a series of smaller paths-between and paths-from-to queries, each of which contains block_size genes.
- Returns
bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.
- Return type
BiopaxProcessor
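The block_size idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not INDRA's implementation: the helper names `split_into_blocks` and `plan_block_queries` are hypothetical, and the plan assumes that within-block paths are covered by paths-between queries while cross-block paths are covered by paths-from-to queries in both directions.

```python
from itertools import permutations


def split_into_blocks(gene_names, block_size):
    """Split a gene list into consecutive blocks of at most block_size genes."""
    if block_size < 1:
        raise ValueError("block_size must be a positive integer")
    return [gene_names[i:i + block_size]
            for i in range(0, len(gene_names), block_size)]


def plan_block_queries(gene_names, block_size):
    """Plan the smaller queries that together cover one large query.

    Returns a list of ("pathsbetween", block) and
    ("pathsfromto", source_block, target_block) tuples.
    """
    blocks = split_into_blocks(gene_names, block_size)
    # Paths among the genes inside each block.
    plan = [("pathsbetween", block) for block in blocks]
    # Paths across blocks; both orders are listed because paths-from-to
    # queries are directed.
    plan += [("pathsfromto", src, tgt) for src, tgt in permutations(blocks, 2)]
    return plan
```

Note that the number of cross-block queries grows quadratically with the number of blocks, so a very small block_size trades one failing query for many small ones.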
- indra.sources.biopax.api.process_pc_pathsfromto(source_genes, target_genes, neighbor_limit=1, database_filter=None)
Returns a BiopaxProcessor for a PathwayCommons paths-from-to query.
The paths-from-to query finds the paths from a set of source genes to a set of target genes.
http://www.pathwaycommons.org/pc2/#graph
http://www.pathwaycommons.org/pc2/#graph_kind
- Parameters
source_genes (list) – A list of HGNC gene symbols that are the sources of paths being searched for. Examples: ['BRAF', 'RAF1', 'ARAF']
target_genes (list) – A list of HGNC gene symbols that are the targets of paths being searched for. Examples: ['MAP2K1', 'MAP2K2']
neighbor_limit (Optional[int]) – The number of steps to limit the length of the paths between the source genes and target genes being queried. Default: 1
database_filter (Optional[list]) – A list of database identifiers to which the query is restricted. Examples: ['reactome'], ['biogrid', 'pid', 'psp']. If not given, all databases are used in the query. For a full list of databases see http://www.pathwaycommons.org/pc2/datasources
- Returns
bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.
- Return type
BiopaxProcessor
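The three PathwayCommons query functions above share most of their arguments. The sketch below shows one way to assemble the keyword arguments for each of them from the documented signatures. The helper names `build_pc_query` and `run_pc_query` are hypothetical; the INDRA import is deferred, and actually running a query needs INDRA installed and network access to PathwayCommons.

```python
def build_pc_query(kind, genes, targets=None, neighbor_limit=1,
                   database_filter=None):
    """Return (function_name, kwargs) for one of the three query types."""
    common = {"neighbor_limit": neighbor_limit,
              "database_filter": database_filter}
    if kind == "neighborhood":
        return "process_pc_neighborhood", {"gene_names": genes, **common}
    if kind == "pathsbetween":
        return "process_pc_pathsbetween", {"gene_names": genes, **common}
    if kind == "pathsfromto":
        if not targets:
            raise ValueError("a paths-from-to query needs target genes")
        return "process_pc_pathsfromto", {"source_genes": genes,
                                          "target_genes": targets, **common}
    raise ValueError(f"unknown query kind: {kind}")


def run_pc_query(kind, genes, **options):
    """Run the query against PathwayCommons via INDRA."""
    # Deferred import: requires INDRA to be installed.
    from indra.sources.biopax import api

    name, kwargs = build_pc_query(kind, genes, **options)
    return getattr(api, name)(**kwargs)
```

For example, `run_pc_query("pathsfromto", ["BRAF", "RAF1", "ARAF"], targets=["MAP2K1", "MAP2K2"])` would correspond to the paths-from-to example genes listed above.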
BioPAX Processor (indra.sources.biopax.processor)
- class indra.sources.biopax.processor.BiopaxProcessor(model, use_conversion_level_evidence=False)
The BiopaxProcessor extracts INDRA Statements from a BioPAX model.
The BiopaxProcessor uses pattern searches in a BioPAX OWL model to extract mechanisms from which it constructs INDRA Statements.
- Parameters
model (pybiopax.biopax.model.BioPaxModel) – A BioPAX model object, as produced by pybiopax.
- model
A BioPAX model object which is queried to extract INDRA Statements
- Type
pybiopax.biopax.model.BioPaxModel
- statements
A list of INDRA Statements that were extracted from the model.
- Type
list[indra.statements.Statement]
- eliminate_exact_duplicates()
Eliminate Statements that were extracted multiple times.
Because of the way the patterns are implemented, they can sometimes yield the same Statement information multiple times, producing redundant Statements that do not come from independent underlying entries. This method filters out such duplicates.
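The idea of exact-duplicate elimination can be illustrated with a generic, order-preserving filter. This is a sketch under stated assumptions rather than the method's actual code: the function `drop_exact_duplicates` and the tuple records are hypothetical stand-ins for INDRA Statements and their evidence.

```python
def drop_exact_duplicates(items, key):
    """Return items with later exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


# Stand-in records: (statement type, enzyme, substrate, source entry ID).
extracted = [
    ("Phosphorylation", "BRAF", "MAP2K1", "reaction_1"),
    ("Phosphorylation", "BRAF", "MAP2K1", "reaction_1"),  # same entry again
    ("Phosphorylation", "BRAF", "MAP2K1", "reaction_2"),  # independent entry
]
unique = drop_exact_duplicates(extracted, key=lambda rec: rec)
```

Including the source entry in the key is what keeps the third record: it carries the same mechanism but comes from an independent underlying entry, so it is not an exact duplicate.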
- feature_delta(from_pe, to_pe)
Return gained and lost modifications and any activity change.
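Conceptually, gained and lost modifications can be computed as set differences between the features of two entity states. The sketch below assumes modifications are represented as hashable tuples and activity as a boolean flag; both representations and the function names are hypothetical and do not reflect how the processor stores BioPAX features.

```python
def modification_delta(from_mods, to_mods):
    """Return (gained, lost) modification sets between two entity states."""
    from_mods, to_mods = set(from_mods), set(to_mods)
    gained = to_mods - from_mods  # present only in the resulting state
    lost = from_mods - to_mods    # present only in the starting state
    return gained, lost


def activity_change(from_active, to_active):
    """Return 'active', 'inactive', or None if the activity is unchanged."""
    if to_active and not from_active:
        return "active"
    if from_active and not to_active:
        return "inactive"
    return None
```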
- static find_matching_entities(left_entities, right_entities)
Find matching entities between two lists of entities.
- static find_matching_left_right(conversion)
Find matching entities on the left and right of a conversion.
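One plausible way to think about matching entities across the two sides of a conversion is to pair those that share an underlying reference. The sketch below assumes exactly that criterion and uses plain dictionaries with a hypothetical "ref" key; the actual matching rules used by the processor are not described here and may differ.

```python
def match_by_reference(left_entities, right_entities):
    """Pair up entities from two lists that share a reference identifier."""
    matches = []
    for left in left_entities:
        for right in right_entities:
            if left["ref"] == right["ref"]:
                matches.append((left, right))
    return matches


# Stand-in entities for the left and right side of a phosphorylation.
left = [{"name": "MAP2K1", "ref": "ref:MAP2K1"},
        {"name": "ATP", "ref": "ref:ATP"}]
right = [{"name": "MAP2K1-p", "ref": "ref:MAP2K1"},
         {"name": "ADP", "ref": "ref:ADP"}]
pairs = match_by_reference(left, right)
```

Here only the two MAP2K1 states are paired, which is the kind of left/right pair a feature comparison would then be applied to.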
- get_regulate_activities()
Get Activation/Inhibition INDRA Statements from the BioPAX model.
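Once Statements have been extracted, the Activation and Inhibition ones can be picked out of a mixed list by class name. The helper `select_regulate_activities` below is a hypothetical convenience, not part of the processor, and the small classes are stand-ins so that the sketch runs without INDRA installed.

```python
def select_regulate_activities(statements):
    """Return the statements whose class is named Activation or Inhibition."""
    wanted = {"Activation", "Inhibition"}
    return [stmt for stmt in statements if type(stmt).__name__ in wanted]


# Stand-in classes; with INDRA, the items of bp.statements would be used.
class Activation:
    pass


class Inhibition:
    pass


class Phosphorylation:
    pass


mixed = [Activation(), Phosphorylation(), Inhibition()]
regulations = select_regulate_activities(mixed)
```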