REACH (indra.reach)

REACH API (indra.reach.reach_api)

indra.reach.reach_api.process_json_file(file_name, citation=None)[source]

Return a ReachProcessor by processing the given REACH JSON file.

The REACH parser produces its output in this JSON format. This function is useful when that output has been saved to a file and needs to be processed. For more information on the format, see: https://github.com/clulab/reach

Parameters:
  • file_name (str) – The name of the JSON file to be processed.
  • citation (Optional[str]) – A PubMed ID passed to be used in the evidence for the extracted INDRA Statements. Default: None
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor
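
The following is a minimal sketch of reading statements from a saved REACH output file, assuming INDRA is installed and exposes the module path documented here. The helper name load_statements and its process argument are hypothetical; the argument exists only so the helper can be tried with a stand-in instead of the real function.

```python
def load_statements(file_name, citation=None, process=None):
    """Return the INDRA Statements extracted from a saved REACH JSON file.

    `process` is a hypothetical hook for this sketch. When it is omitted,
    the documented process_json_file function is used.
    """
    if process is None:
        # Imported lazily, so INDRA is only required when no stand-in is given.
        from indra.reach.reach_api import process_json_file as process
    rp = process(file_name, citation=citation)
    # Defensive guard (an assumption of this sketch): treat a missing
    # processor as "no statements" rather than failing on rp.statements.
    return [] if rp is None else list(rp.statements)
```

Calling load_statements('reach_output.json', citation='27168024') would then return a plain list, where the file name is a placeholder for a file you have saved yourself.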

indra.reach.reach_api.process_json_str(json_str, citation=None)[source]

Return a ReachProcessor by processing the given REACH JSON string.

The REACH parser produces its output in this JSON format. For more information on the format, see: https://github.com/clulab/reach

Parameters:
  • json_str (str) – The JSON string to be processed.
  • citation (Optional[str]) – A PubMed ID passed to be used in the evidence for the extracted INDRA Statements. Default: None
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor

indra.reach.reach_api.process_nxml_file(file_name, citation=None, offline=False)[source]

Return a ReachProcessor by processing the given NXML file.

NXML is the format used by PubMed Central for papers in the open access subset.

Parameters:
  • file_name (str) – The name of the NXML file to be processed.
  • citation (Optional[str]) – A PubMed ID passed to be used in the evidence for the extracted INDRA Statements. Default: None
  • offline (Optional[bool]) – If set to True, the REACH system is run offline. Otherwise (by default) the web service is called. Default: False
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor

indra.reach.reach_api.process_nxml_str(nxml_str, citation=None, offline=False)[source]

Return a ReachProcessor by processing the given NXML string.

NXML is the format used by PubMed Central for papers in the open access subset.

Parameters:
  • nxml_str (str) – The NXML string to be processed.
  • citation (Optional[str]) – A PubMed ID passed to be used in the evidence for the extracted INDRA Statements. Default: None
  • offline (Optional[bool]) – If set to True, the REACH system is run offline. Otherwise (by default) the web service is called. Default: False
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor

indra.reach.reach_api.process_pmc(pmc_id, offline=False)[source]

Return a ReachProcessor by processing a paper with a given PMC id.

Uses the PMC client to obtain the full text. If it’s not available, None is returned.

Parameters:
  • pmc_id (str) – The ID of a PubMed Central article. The string may start with PMC but passing just the ID also works. Examples: 3717945, PMC3717945. See https://www.ncbi.nlm.nih.gov/pmc/
  • offline (Optional[bool]) – If set to True, the REACH system is run offline. Otherwise (by default) the web service is called. Default: False
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor
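
To illustrate the two accepted forms of pmc_id, here is a small sketch of how an ID with or without the PMC prefix could be reduced to a single form. The function strip_pmc_prefix is hypothetical and is not INDRA's implementation; process_pmc accepts both forms directly, so this step is only needed if your own code has to compare or store IDs consistently.

```python
def strip_pmc_prefix(pmc_id):
    """Return the numeric part of a PMC ID given with or without the prefix."""
    pmc_id = pmc_id.strip()
    # Accept the prefix in any letter case, e.g. 'PMC3717945' or 'pmc3717945'.
    if pmc_id.upper().startswith('PMC'):
        pmc_id = pmc_id[3:]
    # Anything other than digits at this point is not a usable ID.
    if not pmc_id.isdigit():
        raise ValueError('Not a PMC ID: %r' % pmc_id)
    return pmc_id
```

With this helper, both strip_pmc_prefix('PMC3717945') and strip_pmc_prefix('3717945') return '3717945'.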

indra.reach.reach_api.process_pubmed_abstract(pubmed_id, offline=False)[source]

Return a ReachProcessor by processing an abstract with a given PubMed ID.

Uses the PubMed client to get the abstract. If that fails, None is returned.

Parameters:
  • pubmed_id (str) – The ID of a PubMed article. The string may start with PMID but passing just the ID also works. Examples: 27168024, PMID27168024. See https://www.ncbi.nlm.nih.gov/pubmed/
  • offline (Optional[bool]) – If set to True, the REACH system is run offline. Otherwise (by default) the web service is called. Default: False
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor
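
Because None is returned when the abstract cannot be retrieved, callers that process several IDs need to check the result before reading rp.statements. The sketch below shows one way to do that, assuming INDRA is installed at the module path documented here. The helper name collect_abstract_statements and its process argument are hypothetical.

```python
def collect_abstract_statements(pubmed_ids, process=None):
    """Map each PubMed ID to the statements extracted from its abstract.

    `process` is a hypothetical hook for this sketch. When it is omitted,
    the documented process_pubmed_abstract function is used.
    """
    if process is None:
        # Imported lazily, so INDRA is only required when no stand-in is given.
        from indra.reach.reach_api import process_pubmed_abstract as process
    statements_by_id = {}
    for pubmed_id in pubmed_ids:
        rp = process(pubmed_id)
        if rp is None:
            # The abstract could not be retrieved; leave this ID out.
            continue
        statements_by_id[pubmed_id] = list(rp.statements)
    return statements_by_id
```

IDs for which no processor was returned are simply absent from the result, so the caller can tell them apart from abstracts that yielded an empty list of statements.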

indra.reach.reach_api.process_text(text, citation=None, offline=False)[source]

Return a ReachProcessor by processing the given text.

Parameters:
  • text (str) – The text to be processed.
  • citation (Optional[str]) – A PubMed ID passed to be used in the evidence for the extracted INDRA Statements. This is used when the text to be processed comes from a publication that is not otherwise identified. Default: None
  • offline (Optional[bool]) – If set to True, the REACH system is run offline. Otherwise (by default) the web service is called. Default: False
Returns:

rp – A ReachProcessor containing the extracted INDRA Statements in rp.statements.

Return type:

ReachProcessor

REACH Processor (indra.reach.processor)

class indra.reach.processor.ReachProcessor(json_dict, pmid=None)[source]

The ReachProcessor extracts INDRA Statements from REACH parser output.

Parameters:
  • json_dict (dict) – A JSON dictionary containing the REACH extractions.
  • pmid (Optional[str]) – The PubMed ID associated with the extractions. This can be passed in case the PMID cannot be determined from the extractions alone.

tree

objectpath.Tree – The objectpath Tree object representing the extractions.

statements

list[indra.statements.Statement] – A list of INDRA Statements that were extracted by the processor.

citation

str – The PubMed ID associated with the extractions.

all_events

dict[str, str] – The frame IDs of all events by type in the REACH extraction.

get_activation()[source]

Extract INDRA Activation Statements.

get_all_events()[source]

Gather all event IDs in the REACH output by type.

These IDs are stored in the self.all_events dict.

get_complexes()[source]

Extract INDRA Complex Statements.

get_modifications()[source]

Extract Modification INDRA Statements.

get_regulate_amounts()[source]

Extract RegulateAmount INDRA Statements.

get_translocation()[source]

Extract INDRA Translocation Statements.
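
The extraction methods listed above each collect one kind of Statement. As a rough sketch, a processor constructed by hand could be driven as shown below. The helper run_extractors and the ordering in EXTRACTORS are hypothetical, and the sketch assumes that each method adds its results to rp.statements. Processors returned by the API functions already contain their statements, so this applies only when a ReachProcessor is built directly.

```python
# Names of the extraction methods documented for ReachProcessor.
# The order used here is an arbitrary choice for this sketch.
EXTRACTORS = (
    'get_modifications',
    'get_complexes',
    'get_activation',
    'get_translocation',
    'get_regulate_amounts',
)


def run_extractors(rp, names=EXTRACTORS):
    """Call each named extraction method on rp and return rp.statements."""
    for name in names:
        # Look the method up by name and call it without arguments.
        getattr(rp, name)()
    return rp.statements
```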

print_event_statistics()[source]

Print the number of events in the REACH output by type.
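
If the counts are needed as data rather than printed output, they can be derived from all_events. The sketch below assumes that each value in the mapping is a collection of frame IDs for that event type, which this page does not confirm. The function name count_events_by_type and the event type names in the usage note are hypothetical.

```python
def count_events_by_type(all_events):
    """Return a dict mapping each event type to its number of frame IDs.

    Assumes every value in all_events is a sized collection of frame IDs.
    """
    return {
        event_type: len(frame_ids)
        for event_type, frame_ids in all_events.items()
    }
```

For a mapping such as {'type_a': ['f1', 'f2'], 'type_b': ['f3']}, this returns {'type_a': 2, 'type_b': 1}.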

REACH reader (indra.reach.reach_reader)

class indra.reach.reach_reader.ReachReader[source]

The ReachReader wraps a singleton instance of the REACH reader.

This allows calling the reader many times without having to wait for it to start up each time.

api_ruler

org.clulab.reach.apis.ApiRuler – An instance of the REACH ApiRuler class (Java object).

get_api_ruler()[source]

Return the existing reader if it exists or launch a new one.

Returns:

api_ruler – An instance of the REACH ApiRuler class (Java object).

Return type:

org.clulab.reach.apis.ApiRuler
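
The "return the existing reader or launch a new one" behavior is a lazy initialization pattern. The sketch below shows that pattern in plain Python. It is not the ReachReader implementation: the class LazyReader and its factory argument are hypothetical stand-ins for whatever actually starts the Java-based reader.

```python
class LazyReader:
    """Create an expensive object on first use and reuse it afterwards."""

    def __init__(self, factory):
        # `factory` is a hypothetical zero-argument callable that starts
        # the underlying reader; nothing is started at construction time.
        self._factory = factory
        self.api_ruler = None

    def get_api_ruler(self):
        """Return the existing reader if it exists or launch a new one."""
        if self.api_ruler is None:
            # First call: pay the start-up cost once.
            self.api_ruler = self._factory()
        return self.api_ruler
```

Repeated calls to get_api_ruler return the same object, so the start-up cost is paid only on the first call.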