ISI (`indra.sources.isi`)

This module provides an input interface and processor to the ISI reading system.

The reader is set up to run within a Docker container. For the ISI reader to run, set the Docker memory and swap space to the maximum.

ISI API (`indra.sources.isi.api`)

indra.sources.isi.api.process_json_file(file_path, pmid=None, extra_annotations=None, add_grounding=True, molecular_complexes_only=False)

Extracts statements from the given ISI output file.

Parameters:

file_path (str) – The ISI output file from which to extract statements
pmid (int) – The PMID of the document being preprocessed, or None if not specified
extra_annotations (dict) – Extra annotations to be added to each statement from this document (can be the empty dictionary)
add_grounding (Optional[bool]) – If True, the grounding of the extracted Statements is mapped. Default: True
molecular_complexes_only (Optional[bool]) – If True, only Complex statements between molecular entities are retained after grounding.
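
A minimal usage sketch for `process_json_file`, assuming INDRA is installed and an ISI output file already exists on disk. The helper names `build_annotations` and `statements_from_isi_json` and the annotation key `source_file` are hypothetical. The return value is not documented above; the sketch assumes the function returns an `IsiProcessor`, as the other API functions do.

```python
import os


def build_annotations(file_path):
    """Build an extra_annotations dict recording the originating file.

    The 'source_file' key is a hypothetical choice for illustration.
    """
    return {'source_file': os.path.basename(file_path)}


def statements_from_isi_json(file_path, pmid=None):
    """Extract INDRA Statements from a single ISI output file."""
    # Imported lazily so this sketch can be loaded without INDRA installed.
    from indra.sources.isi import api as isi_api

    ip = isi_api.process_json_file(
        file_path,
        pmid=pmid,
        extra_annotations=build_annotations(file_path),
        add_grounding=True,
        molecular_complexes_only=False,
    )
    # Assumption: the call returns an IsiProcessor whose `statements`
    # attribute holds the extracted Statements.
    return ip.statements
```

Calling `statements_from_isi_json(path, pmid=...)` would then return the list of extracted Statements for that file, each carrying the extra annotation.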

indra.sources.isi.api.process_nxml(nxml_filename, pmid=None, extra_annotations=None, **kwargs)

Process an NXML file using the ISI reader.

First converts NXML to plain text and preprocesses it, then runs the ISI reader, and processes the output to extract INDRA Statements.

Parameters:

nxml_filename (str) – The NXML file to process
pmid (Optional[str]) – The PMID of this NXML file, to be added to the Evidence object of the extracted INDRA Statements
extra_annotations (Optional[dict]) – Additional annotations to add to the Evidence object of all extracted INDRA Statements. An extra annotation keyed ‘interaction’ is ignored because the processor uses that key to store the corresponding raw ISI output.
num_processes (Optional[int]) – Number of processes to parallelize over
cleanup (Optional[bool]) – If True, the temporary folders created for preprocessed reading input and output are removed. Default: True
add_grounding (Optional[bool]) – If True, the grounding of the extracted Statements is mapped.
molecular_complexes_only (Optional[bool]) – If True, only Complex statements between molecular entities are retained after grounding.

Returns:

ip – A processor containing extracted Statements

Return type:

indra.sources.isi.processor.IsiProcessor
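
Because an extra annotation keyed ‘interaction’ is ignored, it can help to drop that key explicitly before calling `process_nxml`, so the caller is not surprised when it goes missing. The sketch below assumes INDRA is installed and the ISI Docker container is available. The helper names `drop_reserved_annotations` and `read_nxml` are hypothetical.

```python
def drop_reserved_annotations(extra_annotations):
    """Return a copy of extra_annotations without the 'interaction' key."""
    if not extra_annotations:
        return {}
    return {key: value for key, value in extra_annotations.items()
            if key != 'interaction'}


def read_nxml(nxml_filename, pmid=None, extra_annotations=None):
    """Run the ISI reader on one NXML file and return the processor."""
    # Imported lazily so this sketch can be loaded without INDRA installed.
    from indra.sources.isi import api as isi_api

    return isi_api.process_nxml(
        nxml_filename,
        pmid=pmid,
        extra_annotations=drop_reserved_annotations(extra_annotations),
        # The remaining keyword arguments are the ones documented above.
        num_processes=1,
        cleanup=True,
    )
```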

indra.sources.isi.api.process_output_folder(folder_path, pmids=None, extra_annotations=None, add_grounding=True, molecular_complexes_only=False)

Recursively extracts statements from all ISI output files in the given directory and subdirectories.

Parameters:

folder_path (str) – The directory to traverse
pmids (Optional[str]) – PMID mapping to be added to the Evidence of the extracted INDRA Statements
extra_annotations (Optional[dict]) – Additional annotations to add to the Evidence object of all extracted INDRA Statements. An extra annotation keyed ‘interaction’ is ignored because the processor uses that key to store the corresponding raw ISI output.
add_grounding (Optional[bool]) – If True, the grounding of the extracted Statements is mapped. Default: True
molecular_complexes_only (Optional[bool]) – If True, only Complex statements between molecular entities are retained after grounding.
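
To preview which files a recursive traversal of an output folder would visit, a directory walk like the one below can be used. This is an illustration of the traversal idea only, not INDRA's implementation. The helper name `find_isi_output_files` is hypothetical, and the `.json` extension is an assumption about how the output files are named.

```python
import os


def find_isi_output_files(folder_path, extension='.json'):
    """Return sorted paths of all files under folder_path with `extension`."""
    matches = []
    # os.walk visits folder_path and every subdirectory beneath it.
    for dirpath, _dirnames, filenames in os.walk(folder_path):
        for name in filenames:
            if name.endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

The resulting paths could be inspected before handing the same folder to `process_output_folder`.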

indra.sources.isi.api.process_preprocessed(isi_preprocessor, num_processes=1, output_dir=None, cleanup=True, add_grounding=True, molecular_complexes_only=False)

Process a directory of abstracts and/or papers preprocessed using the specified IsiPreprocessor, to produce a list of extracted INDRA statements.

Parameters:

isi_preprocessor (indra.sources.isi.preprocessor.IsiPreprocessor) – Preprocessor object that has already preprocessed the documents we want to read and process with the ISI reader
num_processes (Optional[int]) – Number of processes to parallelize over
output_dir (Optional[str]) – The directory into which to put reader output; if omitted or None, uses a temporary directory.
cleanup (Optional[bool]) – If True, the temporary folders created for preprocessed reading input and output are removed. Default: True
add_grounding (Optional[bool]) – If True, the grounding of the extracted Statements is mapped. Default: True
molecular_complexes_only (Optional[bool]) – If True, only Complex statements between molecular entities are retained after grounding.

Returns:

ip – A processor containing extracted statements

Return type:

indra.sources.isi.processor.IsiProcessor
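
A sketch of preprocessing several texts and reading them in one batch with `process_preprocessed`. It assumes INDRA is installed and the ISI Docker container is available. The `IsiPreprocessor` constructor argument and its `preprocess_plain_text_string(text, pmid, extra_annotations)` method are not documented in this section and are used here as assumptions. The helper names `clean_documents` and `read_documents` are hypothetical.

```python
import tempfile


def clean_documents(texts_by_pmid):
    """Return (pmid, text) pairs, stripped, with empty texts dropped."""
    docs = []
    for pmid, text in texts_by_pmid.items():
        text = text.strip()
        if text:
            docs.append((pmid, text))
    return docs


def read_documents(texts_by_pmid, num_processes=1):
    """Preprocess each text, then run the ISI reader over all of them."""
    # Imported lazily so this sketch can be loaded without INDRA installed.
    from indra.sources.isi import api as isi_api
    from indra.sources.isi.preprocessor import IsiPreprocessor

    # Directory that will hold the preprocessed reader input.
    pp_dir = tempfile.mkdtemp('isi_pp')
    # Assumption: the preprocessor is constructed with its working directory.
    preprocessor = IsiPreprocessor(pp_dir)
    for pmid, text in clean_documents(texts_by_pmid):
        # Assumption: positional arguments are (text, pmid, extra_annotations).
        preprocessor.preprocess_plain_text_string(text, pmid, {})

    return isi_api.process_preprocessed(
        preprocessor,
        num_processes=num_processes,
        output_dir=None,  # let the reader output go to a temporary directory
        cleanup=True,
    )
```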

indra.sources.isi.api.process_text(text, pmid=None, **kwargs)

Process a string using the ISI reader and extract INDRA statements.

Parameters:

text (str) – A text string to process
pmid (Optional[str]) – The PMID associated with this text (or None if not specified)
num_processes (Optional[int]) – Number of processes to parallelize over
cleanup (Optional[bool]) – If True, the temporary folders created for preprocessed reading input and output are removed. Default: True
add_grounding (Optional[bool]) – If True, the grounding of the extracted Statements is mapped.
molecular_complexes_only (Optional[bool]) – If True, only Complex statements between molecular entities are retained after grounding.

Returns:

ip – A processor containing statements

Return type:

indra.sources.isi.processor.IsiProcessor
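
A minimal sketch of reading a single string with `process_text` and summarizing the result by Statement type. It assumes INDRA is installed and the ISI Docker container is available. The helper names `count_statement_types` and `read_sentence` are hypothetical.

```python
from collections import Counter


def count_statement_types(statements):
    """Count statements by class name, e.g. how many are Complex."""
    return Counter(type(stmt).__name__ for stmt in statements)


def read_sentence(text, pmid=None):
    """Run the ISI reader on a string and tally the extracted Statements."""
    # Imported lazily so this sketch can be loaded without INDRA installed.
    from indra.sources.isi import api as isi_api

    ip = isi_api.process_text(text, pmid=pmid, num_processes=1, cleanup=True)
    return count_statement_types(ip.statements)
```

The tally is a quick way to see the effect of `molecular_complexes_only`, since with that option set only Complex statements should remain.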

ISI Processor (`indra.sources.isi.processor`)

class indra.sources.isi.processor.IsiProcessor(reader_output, pmid=None, extra_annotations=None, add_grounding=False)

Processes the output of the ISI reader.

Parameters:

reader_output (json) – The JSON output of the ISI reader, as a parsed JSON object.
pmid (Optional[str]) – The PMID to assign to the extracted Statements
extra_annotations (Optional[dict]) – Annotations to be included with each extracted Statement
add_grounding (Optional[bool]) – If True, Gilda is used as a service to ground the Agents in the extracted Statements.

verbs

The verbs that have appeared in the processed ISI output

Type: set[str]

statements

Extracted statements

Type: list[indra.statements.Statement]

get_statements(): Process reader output to produce INDRA Statements.

retain_molecular_complexes(): Filter the statements to Complexes between molecular entities.
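
The processor can also be driven directly, which makes the order of the two methods explicit: extract first, then filter. This sketch assumes INDRA is installed and that the ISI output was saved as a JSON file. The helper names `load_reader_output` and `process_reader_output` are hypothetical.

```python
import json


def load_reader_output(file_path):
    """Parse an ISI output file into a JSON object (dicts and lists)."""
    with open(file_path, 'r') as fh:
        return json.load(fh)


def process_reader_output(file_path, pmid=None, complexes_only=False):
    """Build an IsiProcessor from a saved output file and extract Statements."""
    # Imported lazily so this sketch can be loaded without INDRA installed.
    from indra.sources.isi.processor import IsiProcessor

    ip = IsiProcessor(
        load_reader_output(file_path),
        pmid=pmid,
        extra_annotations={},
        add_grounding=False,
    )
    # Populate ip.statements from the reader output.
    ip.get_statements()
    if complexes_only:
        # Keep only Complexes between molecular entities.
        ip.retain_molecular_complexes()
    return ip
```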
