BioPAX (indra.sources.biopax)

This module processes BioPAX content into INDRA Statements. It uses the pybiopax package (https://github.com/indralab/pybiopax) to process OWL files or strings, or to obtain BioPAX content by querying the PathwayCommons web service. The module has been tested with BioPAX content from PathwayCommons (https://www.pathwaycommons.org/archives/PC2/v12/). BioPAX from other sources may not adhere to the same conventions and could cause processing issues, though these can typically be addressed with minor changes to the processor’s logic.

BioPAX API (indra.sources.biopax.api)

indra.sources.biopax.api.process_model(model)

Returns a BiopaxProcessor for a BioPAX model object.

Parameters

model (pybiopax.biopax.BioPaxModel) – A BioPAX model object.

Returns

bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.

Return type

BiopaxProcessor

indra.sources.biopax.api.process_owl(owl_filename, encoding=None)

Returns a BiopaxProcessor for a BioPAX OWL file.

Parameters
  • owl_filename (str) – The name of the OWL file to process.

  • encoding (Optional[str]) – The encoding type to be passed to pybiopax.model_from_owl_file().

Returns

bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.

Return type

BiopaxProcessor

indra.sources.biopax.api.process_owl_gz(owl_gz_filename)

Returns a BiopaxProcessor for a gzipped BioPAX OWL file.

Parameters

owl_gz_filename (str) – The name of the gzipped OWL file to process.

Returns

bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.

Return type

BiopaxProcessor
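As an illustration, the two file-based entry points above could be wrapped in a small helper that picks one based on the file extension. This is a minimal sketch and not part of INDRA: the helper name load_biopax and the extension check are assumptions made for the example, and the import is deferred so that the function can be defined without INDRA installed.

```python
def load_biopax(path, encoding=None):
    """Return a BiopaxProcessor for a plain or gzipped OWL file.

    Hypothetical convenience wrapper, not an INDRA function.
    """
    # Deferred import: INDRA is only needed once the helper is called.
    from indra.sources import biopax

    if path.endswith('.gz'):
        # process_owl_gz takes only the file name
        return biopax.process_owl_gz(path)
    # process_owl accepts an optional encoding that is passed on to pybiopax
    return biopax.process_owl(path, encoding=encoding)
```

The returned processor can then be used like any other BiopaxProcessor, for example by reading its statements attribute.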

indra.sources.biopax.api.process_owl_str(owl_str)

Returns a BiopaxProcessor for a BioPAX OWL string.

Parameters

owl_str (str) – The string content of an OWL file to process.

Returns

bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.

Return type

BiopaxProcessor

indra.sources.biopax.api.process_pc_neighborhood(gene_names, neighbor_limit=1, database_filter=None)

Returns a BiopaxProcessor for a PathwayCommons neighborhood query.

The neighborhood query finds the neighborhood around a set of source genes.

http://www.pathwaycommons.org/pc2/#graph

http://www.pathwaycommons.org/pc2/#graph_kind

Parameters
  • gene_names (list) – A list of HGNC gene symbols to search the neighborhood of. Examples: ['BRAF'], ['BRAF', 'MAP2K1']

  • neighbor_limit (Optional[int]) – The number of steps to limit the size of the neighborhood around the gene names being queried. Default: 1

  • database_filter (Optional[list]) – A list of database identifiers to which the query is restricted. Examples: ['reactome'], ['biogrid', 'pid', 'psp']. If not given, all databases are used in the query. For a full list of databases see http://www.pathwaycommons.org/pc2/datasources

Returns

A BiopaxProcessor containing the obtained BioPAX model in its model attribute and a list of extracted INDRA Statements from the model in its statements attribute.

Return type

BiopaxProcessor
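A neighborhood query could be issued along the following lines. This is a sketch under stated assumptions: the wrapper name neighborhood_statements is hypothetical, running it requires INDRA to be installed, and it needs network access to the PathwayCommons web service. The argument names mirror the signature documented above.

```python
def neighborhood_statements(gene_names, databases=None):
    """Return the INDRA Statements found around the given genes.

    Hypothetical wrapper around process_pc_neighborhood.
    """
    # Deferred import: INDRA is only needed once the function is called.
    from indra.sources import biopax

    bp = biopax.process_pc_neighborhood(
        gene_names,
        neighbor_limit=1,           # documented default: one step
        database_filter=databases,  # None means all databases are queried
    )
    # The processor exposes the extracted Statements as a list
    return bp.statements
```

For instance, neighborhood_statements(['BRAF'], databases=['reactome']) would restrict the query to Reactome content around BRAF.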

indra.sources.biopax.api.process_pc_pathsbetween(gene_names, neighbor_limit=1, database_filter=None, block_size=None)

Returns a BiopaxProcessor for a PathwayCommons paths-between query.

The paths-between query finds the paths between a set of genes. Here source gene names are given in a single list and all directions of paths between these genes are considered.

http://www.pathwaycommons.org/pc2/#graph

http://www.pathwaycommons.org/pc2/#graph_kind

Parameters
  • gene_names (list) – A list of HGNC gene symbols to search for paths between. Examples: ['BRAF', 'MAP2K1']

  • neighbor_limit (Optional[int]) – The number of steps to limit the length of the paths between the gene names being queried. Default: 1

  • database_filter (Optional[list]) – A list of database identifiers to which the query is restricted. Examples: ['reactome'], ['biogrid', 'pid', 'psp']. If not given, all databases are used in the query. For a full list of databases see http://www.pathwaycommons.org/pc2/datasources

  • block_size (Optional[int]) – Large paths-between queries (above ~60 genes) can fail on the server side. In this case, the query can be replaced by a series of smaller paths-between and paths-from-to queries, each of which contains block_size genes.

Returns

bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.

Return type

BiopaxProcessor
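The effect of block_size can be pictured as a query plan. The sketch below is illustrative only and makes no network calls. The function name plan_block_queries is hypothetical, and the decomposition (paths-between within each block, paths-from-to between distinct blocks in both directions) is an assumption about how such a split could work rather than a statement of INDRA's exact behavior.

```python
from itertools import product


def plan_block_queries(gene_names, block_size):
    """List the smaller queries that could replace one large query."""
    # Split the gene list into consecutive blocks of at most block_size genes
    blocks = [gene_names[i:i + block_size]
              for i in range(0, len(gene_names), block_size)]
    queries = []
    # Visit every ordered pair of blocks, including each block with itself
    for i, j in product(range(len(blocks)), repeat=2):
        if i == j:
            # Paths among the genes of a single block
            queries.append(('pathsbetween', blocks[i]))
        else:
            # Directed paths from one block to another
            queries.append(('pathsfromto', blocks[i], blocks[j]))
    return queries
```

With n blocks the plan contains n * n queries, so a smaller block_size trades fewer genes per request for more requests.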

indra.sources.biopax.api.process_pc_pathsfromto(source_genes, target_genes, neighbor_limit=1, database_filter=None)

Returns a BiopaxProcessor for a PathwayCommons paths-from-to query.

The paths-from-to query finds the paths from a set of source genes to a set of target genes.

http://www.pathwaycommons.org/pc2/#graph

http://www.pathwaycommons.org/pc2/#graph_kind

Parameters
  • source_genes (list) – A list of HGNC gene symbols that are the sources of paths being searched for. Examples: ['BRAF', 'RAF1', 'ARAF']

  • target_genes (list) – A list of HGNC gene symbols that are the targets of paths being searched for. Examples: ['MAP2K1', 'MAP2K2']

  • neighbor_limit (Optional[int]) – The number of steps to limit the length of the paths between the source genes and target genes being queried. Default: 1

  • database_filter (Optional[list]) – A list of database identifiers to which the query is restricted. Examples: ['reactome'], ['biogrid', 'pid', 'psp']. If not given, all databases are used in the query. For a full list of databases see http://www.pathwaycommons.org/pc2/datasources

Returns

bp – A BiopaxProcessor containing the obtained BioPAX model in bp.model.

Return type

BiopaxProcessor

BioPAX Processor (indra.sources.biopax.processor)

class indra.sources.biopax.processor.BiopaxProcessor(model, use_conversion_level_evidence=False)

The BiopaxProcessor extracts INDRA Statements from a BioPAX model.

The BiopaxProcessor uses pattern searches in a BioPAX OWL model to extract mechanisms from which it constructs INDRA Statements.

Parameters

model (pybiopax.biopax.BioPaxModel) – A BioPAX model object.

model

A BioPAX model object, which is queried to extract INDRA Statements.

Type

pybiopax.biopax.BioPaxModel

statements

A list of INDRA Statements that were extracted from the model.

Type

list[indra.statements.Statement]
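Because statements is a plain Python list, standard tools can summarize it. The sketch below counts Statements by class name. The helper name count_by_type is hypothetical, and it relies only on each item being an instance of some class, so it can be tried without INDRA installed.

```python
from collections import Counter


def count_by_type(statements):
    """Count the items of a statement list by their class name."""
    # type(stmt).__name__ gives e.g. 'Phosphorylation' for such an instance
    return Counter(type(stmt).__name__ for stmt in statements)
```

Called as count_by_type(bp.statements), this gives a quick overview of which kinds of mechanisms were extracted from a model.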

eliminate_exact_duplicates()

Eliminate Statements that were extracted multiple times.

Because of the way the patterns are implemented, they can sometimes yield the same Statement information multiple times, producing redundant Statements that do not come from independent underlying entries. This method filters out such duplicates.
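The general idea of exact-duplicate filtering can be sketched independently of INDRA. In the example below, the function name drop_exact_duplicates and the key argument are assumptions made for illustration; for Statements, a content-based hash would be a natural choice of key. It is not a description of the method's actual implementation.

```python
def drop_exact_duplicates(items, key):
    """Keep the first item for each distinct key, preserving order."""
    unique = {}
    for item in items:
        # setdefault stores the item only if its key has not been seen yet
        unique.setdefault(key(item), item)
    # Dictionaries preserve insertion order (Python 3.7+)
    return list(unique.values())
```

Items that share a key are treated as the same extraction, so only one copy survives.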

feature_delta(from_pe, to_pe)

Return gained and lost modifications and any activity change.
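Conceptually, gained and lost modifications are set differences between the features of two entity states. The sketch below shows that idea only. The function name modification_delta and the tuple representation of a modification are hypothetical, the activity change is left out, and none of this reflects how the method handles actual BioPAX objects.

```python
def modification_delta(from_mods, to_mods):
    """Return (gained, lost) modifications between two states."""
    from_set, to_set = set(from_mods), set(to_mods)
    # Present after the change but not before
    gained = to_set - from_set
    # Present before the change but not after
    lost = from_set - to_set
    return gained, lost
```

For example, a state that acquires one additional phosphorylation site has that site in gained and an empty lost set.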

static find_matching_entities(left_entities, right_entities)

Find matching entities between two lists of entities.

static find_matching_left_right(conversion)

Find matching entities on the left and right of a conversion.

get_activity_modification()

Extract INDRA ActiveForm statements from the BioPAX model.

get_conversions()

Extract Conversion INDRA Statements from the BioPAX model.

get_gap_gef()

Extract Gap and Gef INDRA Statements.

get_modifications()

Extract INDRA Modification Statements from the BioPAX model.

get_regulate_activities()

Get Activation/Inhibition INDRA Statements from the BioPAX model.

get_regulate_amounts()

Extract INDRA RegulateAmount Statements from the BioPAX model.

static mod_condition_from_mod_feature(mf)

Extract the type and position of a modification from a ModificationFeature object, in the INDRA format.

save_model(file_name)

Save the BioPAX model object in an OWL file.

Parameters

file_name (str) – The name of the OWL file to save the model in.