TRIPS (`indra.sources.trips`)

TRIPS API (`indra.sources.trips.api`)

indra.sources.trips.api.process_text(text, save_xml_name='trips_output.xml', save_xml_pretty=True, offline=False, service_endpoint='drum', service_host=None)

Return a TripsProcessor by processing text.

Parameters

text (str) – The text to be processed.
save_xml_name (Optional[str]) – The name of the file to save the returned TRIPS extraction knowledge base XML. Default: trips_output.xml
save_xml_pretty (Optional[bool]) – If True, the saved XML is pretty-printed. Some third-party tools require non-pretty-printed XMLs which can be obtained by setting this to False. Default: True
offline (Optional[bool]) – If True, offline reading is used with a local instance of DRUM, if available. Default: False
service_endpoint (Optional[str]) – Selects the TRIPS/DRUM web service endpoint to use: either “drum” (default) or “drum-dev”, a nightly build.
service_host (Optional[str]) – Address of a service host different from the public IHMC server (e.g., a locally running service).

Returns

tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.

Return type

TripsProcessor
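
As a rough illustration of how the returned processor might be used, the sketch below tallies the extracted Statements by type. The helper names `count_statement_types` and `read_and_count` are hypothetical, the call to `process_text` assumes that INDRA is installed and the TRIPS web service is reachable, and the `None` check is a defensive assumption rather than documented behavior.

```python
from collections import Counter


def count_statement_types(tp):
    """Count the Statements on a processor by their class name."""
    return Counter(type(stmt).__name__ for stmt in tp.statements)


def read_and_count(text):
    """Process text with TRIPS and summarize the extracted Statements."""
    # Imported lazily so the helper above can be used without INDRA installed.
    from indra.sources.trips.api import process_text

    tp = process_text(text, save_xml_name='trips_output.xml')
    # Assumption: guard against a failed reading that yields no processor.
    if tp is None:
        return Counter()
    return count_statement_types(tp)
```

A call such as `read_and_count('MEK phosphorylates ERK.')` would contact the web service, so it is best tried on short inputs first.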

indra.sources.trips.api.process_xml(xml_string)

Return a TripsProcessor by processing a TRIPS EKB XML string.

Parameters: xml_string (str) – A TRIPS extraction knowledge base (EKB) string to be processed. See http://trips.ihmc.us/parser/api.html for details.
Returns: tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.
Return type: TripsProcessor

indra.sources.trips.api.process_xml_file(file_name)

Return a TripsProcessor by processing a TRIPS EKB XML file.

Parameters: file_name (str) – Path to a TRIPS extraction knowledge base (EKB) file to be processed.
Returns: tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.
Return type: TripsProcessor
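
When several EKB files have already been saved, the Statements from all of them can be gathered in one pass. The following is a minimal sketch, assuming INDRA is installed; `collect_statements` and its `process_file` argument are hypothetical names introduced here so the loop can also be exercised with a stand-in function.

```python
def collect_statements(file_names, process_file=None):
    """Return the Statements extracted from a list of EKB XML files."""
    if process_file is None:
        # Default to the documented INDRA function; imported lazily.
        from indra.sources.trips.api import process_xml_file
        process_file = process_xml_file

    statements = []
    for file_name in file_names:
        tp = process_file(file_name)
        # Assumption: skip files for which no processor was produced.
        if tp is not None:
            statements.extend(tp.statements)
    return statements
```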

TRIPS Processor (`indra.sources.trips.processor`)

class indra.sources.trips.processor.TripsProcessor(xml_string)

The TripsProcessor extracts INDRA Statements from a TRIPS XML.

For more details on the TRIPS EKB XML format, see http://trips.ihmc.us/parser/cgi/drum

Parameters: xml_string (str) – A TRIPS extraction knowledge base (EKB) in XML format as a string.

tree

An ElementTree object representation of the TRIPS EKB XML.

Type: xml.etree.ElementTree.Element

statements

A list of INDRA Statements that were extracted from the EKB.

Type: list[indra.statements.Statement]

doc_id

The PubMed ID of the paper that the extractions are from.

Type: str

sentences

A dict of all sentences in the EKB, keyed by their IDs.

Type: dict[str, str]

paragraphs

A dict of all paragraphs in the EKB, keyed by their IDs.

Type: dict[str, str]

par_to_sec

A map from paragraph IDs to their associated section types.

Type: dict[str, str]

extracted_events

A list of Event elements that have been extracted as INDRA Statements.

Type: list[xml.etree.ElementTree.Element]
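
To make the shape of these attributes more concrete, the sketch below builds ID-keyed dictionaries from a small XML string using only the standard library. The sample document and its element and attribute names (`sentence`, `paragraph`, `EVENT`, `id`) are simplified assumptions for illustration, not a specification of the EKB format, and `index_by_id` is a hypothetical helper rather than the processor's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EKB; real TRIPS output has more structure.
SAMPLE_EKB = """<ekb>
  <input>
    <paragraphs>
      <paragraph id="paragraph1">MEK phosphorylates ERK.</paragraph>
    </paragraphs>
    <sentences>
      <sentence id="1" pid="paragraph1">MEK phosphorylates ERK.</sentence>
    </sentences>
  </input>
  <EVENT id="V1"><type>ONT::PHOSPHORYLATION</type></EVENT>
</ekb>"""


def index_by_id(tree, tag):
    """Map the id attribute of every element named `tag` to its text."""
    return {el.attrib['id']: (el.text or '') for el in tree.iter(tag)}


tree = ET.fromstring(SAMPLE_EKB)           # an xml.etree.ElementTree.Element
sentences = index_by_id(tree, 'sentence')
paragraphs = index_by_id(tree, 'paragraph')
events = tree.findall('EVENT')             # direct children of the root only
```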

get_activations(): Extract direct Activation INDRA Statements.

get_activations_causal(): Extract causal Activation INDRA Statements.

get_activations_stimulate(): Extract Activation INDRA Statements via stimulation.

get_active_forms(): Extract ActiveForm INDRA Statements.

get_active_forms_state(): Extract ActiveForm INDRA Statements.

get_agents(with_coords=False)

Return list of INDRA Agents corresponding to TERMs in the EKB.

This is meant to be used when entities (e.g., “phosphorylated ERK”), rather than events, need to be extracted from processed natural language. These entities, with their respective states, are represented as INDRA Agents.

Parameters: with_coords (Optional[bool]) – If True, the coordinates of the agent are also returned in the result as a tuple. Default: False.
Returns: agents – List of INDRA Agents extracted from EKB.
Return type: list[indra.statements.Agent]

get_all_events()

Make a list of all events in the TRIPS EKB.

The events are stored in self.all_events.

get_complexes(): Extract Complex INDRA Statements.

get_degradations(): Extract Degradation INDRA Statements.

get_modifications(): Extract all types of Modification INDRA Statements.

get_modifications_indirect(): Extract indirect Modification INDRA Statements.

get_regulate_amounts(): Extract Increase/DecreaseAmount Statements.

get_syntheses(): Extract IncreaseAmount INDRA Statements.

get_term_agents(with_coords=False)

Return dict of INDRA Agents keyed by corresponding TERMs in the EKB.

This is meant to be used when entities (e.g., “phosphorylated ERK”), rather than events, need to be extracted from processed natural language. These entities, with their respective states, are represented as INDRA Agents. Each key of the dictionary is the ID assigned by TRIPS to the TERM that the Agent was extracted from.

Parameters: with_coords (Optional[bool]) – If True, the coordinates of the agent are also returned in the dictionary as a tuple. Default: False.
Returns: agents – Dict of INDRA Agents extracted from EKB.
Return type: dict[str, indra.statements.Agent]
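
One possible use of the TERM-keyed dictionary is to find every TERM that resolved to the same entity. The sketch below assumes the default `with_coords=False`, so that each value is an Agent with a `name` attribute; `index_agents_by_name` is a hypothetical helper, and skipping `None` values is a defensive assumption.

```python
def index_agents_by_name(term_agents):
    """Group TERM IDs by the name of the Agent extracted from them."""
    by_name = {}
    for term_id, agent in term_agents.items():
        # Assumption: tolerate TERMs for which no Agent could be built.
        if agent is None:
            continue
        by_name.setdefault(agent.name, []).append(term_id)
    return by_name
```

With a processor `tp` in hand, this would be called as `index_agents_by_name(tp.get_term_agents())`.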

TRIPS Web-service Client (`indra.sources.trips.client`)

indra.sources.trips.client.get_xml(html, content_tag='ekb', fail_if_empty=False)

Extract the content XML from the HTML output of the TRIPS web service.

Parameters

html (str) – The HTML output from the TRIPS web service.
content_tag (str) – The xml tag used to label the content. Default is ‘ekb’.
fail_if_empty (bool) – If True, and if the xml content found is an empty string, raise an exception. Default is False.

Returns

The extraction knowledge base (e.g., EKB) XML that contains the event and term extractions.
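
The idea of pulling a tagged XML block out of a larger HTML page can be sketched with a regular expression. This is an illustrative stand-in, not the library's implementation: `extract_tagged_xml` is a hypothetical name, it returns only the first matching block, and raising `ValueError` on empty content is an assumption about what "raise an exception" might look like.

```python
import re


def extract_tagged_xml(html, content_tag='ekb', fail_if_empty=False):
    """Return the first <content_tag>...</content_tag> block found in html."""
    tag = re.escape(content_tag)
    # DOTALL lets '.' span newlines; '.*?' stops at the first closing tag.
    pattern = r'<{tag}\b[^>]*>.*?</{tag}>'.format(tag=tag)
    match = re.search(pattern, html, flags=re.DOTALL)
    xml = match.group(0) if match else ''
    if fail_if_empty and not xml:
        raise ValueError('No <%s> content found in HTML.' % content_tag)
    return xml
```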

indra.sources.trips.client.save_xml(xml_str, file_name, pretty=True)

Save the TRIPS EKB XML in a file.

Parameters

xml_str (str) – The TRIPS EKB XML string to be saved.
file_name (str) – The name of the file to save the result in.
pretty (Optional[bool]) – If True, the XML is pretty printed.
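
The effect of the `pretty` flag can be illustrated with the standard library's `xml.dom.minidom`. This is a minimal sketch of the general technique, not the library's own code; `write_xml` is a hypothetical name.

```python
import xml.dom.minidom


def write_xml(xml_str, file_name, pretty=True):
    """Write an XML string to a file, optionally re-indented."""
    if pretty:
        # Parse and re-serialize with two-space indentation.
        xml_str = xml.dom.minidom.parseString(xml_str).toprettyxml(indent='  ')
    with open(file_name, 'w', encoding='utf-8') as fh:
        fh.write(xml_str)
```

With `pretty=False` the string is written unchanged, which is the form that tools expecting non-pretty-printed XML would need.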

indra.sources.trips.client.send_query(text, service_endpoint='drum', query_args=None, service_host=None)

Send a query to the TRIPS web service.

Parameters

text (str) – The text to be processed.
service_endpoint (Optional[str]) – Selects the TRIPS/DRUM web service endpoint to use. The choices are “drum” (default), “drum-dev” (a nightly build), and “cwms” (for more general knowledge extraction).
query_args (Optional[dict]) – A dictionary of arguments to be passed with the query.
service_host (Optional[str]) – The server’s base URL under which service_endpoint is an endpoint. By default, IHMC’s public server is used.

Returns

html – The HTML result returned by the web service.

Return type

str
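
The client functions can be chained by hand when finer control over each step is wanted. The sketch below sends text, extracts the EKB XML, and hands it to `process_xml`; it assumes INDRA is installed and the web service is reachable. The function `text_to_processor` and its `client` and `api` arguments are hypothetical, introduced so that stand-in modules can be supplied, and returning `None` for an empty response is an assumption.

```python
def text_to_processor(text, service_endpoint='drum', fail_if_empty=True,
                      client=None, api=None):
    """Run text through send_query, get_xml and process_xml in sequence."""
    # Fall back to the real INDRA modules when no stand-ins are given.
    if client is None:
        from indra.sources.trips import client
    if api is None:
        from indra.sources.trips import api

    html = client.send_query(text, service_endpoint=service_endpoint)
    # Assumption: an empty response means the query did not succeed.
    if not html:
        return None
    xml = client.get_xml(html, fail_if_empty=fail_if_empty)
    return api.process_xml(xml)
```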

TRIPS/DRUM Local Reader (`indra.sources.trips.drum_reader`)

class indra.sources.trips.drum_reader.DrumReader(**kwargs)

Agent which processes text through a local TRIPS/DRUM instance.

This class is implemented as a communicative agent which sends and receives KQML messages through a socket. It sends text (ideally in small blocks, such as one sentence at a time) to the running DRUM instance and receives extraction knowledge base (EKB) XML responses asynchronously through the socket.

To install DRUM and its dependencies locally, follow the instructions at https://github.com/wdebeaum/drum and then run drum/bin/trips-drum -nouser to start DRUM without a GUI.

Once DRUM is running, this class can be instantiated as dr = DrumReader(), at which point it attempts to connect to DRUM via the socket. Use dr.read_text(text) to send text for reading. In another usage mode, dr.read_pmc(pmcid) can be used to read a full open-access PMC paper. Receiving responses is started with dr.start(), which waits for responses from the reader and returns when all responses have been received. Once finished, the list of EKB XML extractions can be accessed via dr.extractions.

Parameters

run_drum (Optional[bool]) – If True, the DRUM reading system is launched as a subprocess for reading. If False, DRUM is expected to be running independently. Default: False
drum_system (Optional[subprocess.Popen]) – A handle to the subprocess of a running DRUM system instance. This can be passed in if the instance is to be reused rather than restarted. Default: None
**kwargs – All other keyword arguments are passed through to the DrumReader KQML module’s constructor.

extractions

A list of EKB XML extractions corresponding to the input text list.

Type: list[str]

drum_system

A subprocess handle that points to a running instance of the DRUM reading system. In case the DRUM system is running independently, this is None.

Type: subprocess.Popen

read_pmc(pmcid)

Read a given PMC article.

Parameters: pmcid (str) – The PMC ID of the article to read. Note that only articles in the open-access subset of PMC will work.

read_text(text)

Read a given text phrase.

Parameters: text (str) – The text to read. Typically a sentence or a paragraph.

receive_reply(msg, content): Handle replies with reading results.
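
The usage pattern described for this class (instantiate, send text, start, collect extractions) can be sketched as follows. It assumes a local DRUM instance is already running and INDRA is installed. `read_sentences` and its `reader` argument are hypothetical, and catching `SystemExit` around `start()` is a defensive assumption in case the underlying agent exits that way once reading finishes.

```python
def read_sentences(sentences, reader=None):
    """Send sentences to a DRUM reader and return the EKB XML extractions."""
    if reader is None:
        # Connects to the running DRUM instance via the socket.
        from indra.sources.trips.drum_reader import DrumReader
        reader = DrumReader()

    # Queue the text in small blocks, one sentence at a time.
    for sentence in sentences:
        reader.read_text(sentence)

    try:
        reader.start()  # waits until all responses have been received
    except SystemExit:
        # Assumption: treat an exit raised on completion as a normal finish.
        pass
    return list(reader.extractions)
```

Each returned string could then be passed to indra.sources.trips.api.process_xml to obtain a TripsProcessor.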
