TRIPS (indra.sources.trips)

TRIPS API (indra.sources.trips.api)

indra.sources.trips.api.process_text(text, save_xml_name='trips_output.xml', save_xml_pretty=True, offline=False, service_endpoint='drum', service_host=None)[source]

Return a TripsProcessor by processing text.

Parameters
  • text (str) – The text to be processed.

  • save_xml_name (Optional[str]) – The name of the file to save the returned TRIPS extraction knowledge base XML. Default: trips_output.xml

  • save_xml_pretty (Optional[bool]) – If True, the saved XML is pretty-printed. Some third-party tools require non-pretty-printed XMLs which can be obtained by setting this to False. Default: True

  • offline (Optional[bool]) – If True, offline reading is used with a local instance of DRUM, if available. Default: False

  • service_endpoint (Optional[str]) – Selects the TRIPS/DRUM web service endpoint to use: either “drum” (default) or “drum-dev”, a nightly build.

  • service_host (Optional[str]) – Address of a service host different from the public IHMC server (e.g., a locally running service).

Returns

tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.

Return type

TripsProcessor
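As a minimal sketch of how process_text might be called, the hypothetical helper below wraps it and returns the extracted Statements. The helper name, its xml_path argument, and the process_text hook (which lets a stand-in be substituted for testing) are assumptions made for illustration; only the keyword arguments passed to process_text come from the signature above.

```python
def read_text_with_trips(text, xml_path="trips_output.xml", process_text=None):
    """Return the INDRA Statements that TRIPS/DRUM extracts from ``text``.

    ``process_text`` is a hypothetical hook for substituting a stand-in;
    by default the INDRA function documented above is used.
    """
    if process_text is None:
        # Imported lazily so this sketch can be loaded without INDRA installed.
        from indra.sources.trips.api import process_text
    tp = process_text(text, save_xml_name=xml_path, save_xml_pretty=True,
                      service_endpoint="drum")
    # Defensive assumption: treat a missing processor as "no Statements".
    return [] if tp is None else list(tp.statements)
```

Calling read_text_with_trips("MEK phosphorylates ERK.") with the defaults requires INDRA to be installed and the web service to be reachable.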

indra.sources.trips.api.process_xml(xml_string)[source]

Return a TripsProcessor by processing a TRIPS EKB XML string.

Parameters

xml_string (str) – A TRIPS extraction knowledge base (EKB) string to be processed. See http://trips.ihmc.us/parser/api.html

Returns

tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.

Return type

TripsProcessor

indra.sources.trips.api.process_xml_file(file_name)[source]

Return a TripsProcessor by processing a TRIPS EKB XML file.

Parameters

file_name (str) – Path to a TRIPS extraction knowledge base (EKB) file to be processed.

Returns

tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.

Return type

TripsProcessor

TRIPS Processor (indra.sources.trips.processor)

class indra.sources.trips.processor.TripsProcessor(xml_string)[source]

The TripsProcessor extracts INDRA Statements from a TRIPS XML.

For more details on the TRIPS EKB XML format, see http://trips.ihmc.us/parser/cgi/drum

Parameters

xml_string (str) – A TRIPS extraction knowledge base (EKB) in XML format as a string.

tree

An ElementTree object representation of the TRIPS EKB XML.

Type

xml.etree.ElementTree.Element

statements

A list of INDRA Statements that were extracted from the EKB.

Type

list[indra.statements.Statement]

doc_id

The PubMed ID of the paper that the extractions are from.

Type

str

sentences

A dict of all sentences in the EKB, keyed by their IDs.

Type

dict[str, str]

paragraphs

A dict of all paragraphs in the EKB, keyed by their IDs.

Type

dict[str, str]

par_to_sec

A map from paragraph IDs to their associated section types.

Type

dict[str, str]

extracted_events

A list of Event elements that have been extracted as INDRA Statements.

Type

list[xml.etree.ElementTree.Element]
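To illustrate what the tree attribute holds, the sketch below parses a toy EKB string with xml.etree.ElementTree and lists its EVENT and TERM elements. The XML fragment, the summarize_ekb helper, and the child element names used inside each TERM and EVENT are hypothetical simplifications; real TRIPS output is considerably richer, and this is not how TripsProcessor itself extracts Statements.

```python
import xml.etree.ElementTree as ET

# A hypothetical, heavily simplified EKB fragment used only for illustration.
TOY_EKB = """<ekb id="example">
  <TERM id="T1"><type>ONT::GENE-PROTEIN</type><name>MEK</name></TERM>
  <TERM id="T2"><type>ONT::GENE-PROTEIN</type><name>ERK</name></TERM>
  <EVENT id="E1"><type>ONT::PHOSPHORYLATION</type></EVENT>
</ekb>"""


def summarize_ekb(xml_string):
    """Return ({event_id: type}, {term_id: name}) for a toy EKB string."""
    # ET.fromstring returns the root xml.etree.ElementTree.Element.
    tree = ET.fromstring(xml_string)
    events = {e.attrib["id"]: e.findtext("type") for e in tree.findall("EVENT")}
    terms = {t.attrib["id"]: t.findtext("name") for t in tree.findall("TERM")}
    return events, terms
```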

get_activations()[source]

Extract direct Activation INDRA Statements.

get_activations_causal()[source]

Extract causal Activation INDRA Statements.

get_activations_stimulate()[source]

Extract Activation INDRA Statements via stimulation.

get_active_forms()[source]

Extract ActiveForm INDRA Statements.

get_active_forms_state()[source]

Extract ActiveForm INDRA Statements.

get_agents()[source]

Return list of INDRA Agents corresponding to TERMs in the EKB.

This is meant to be used when entities (e.g., “phosphorylated ERK”), rather than events, need to be extracted from processed natural language. These entities with their respective states are represented as INDRA Agents.

Returns

agents – List of INDRA Agents extracted from EKB.

Return type

list[indra.statements.Agent]

get_all_events()[source]

Make a list of all events in the TRIPS EKB.

The events are stored in self.all_events.

get_complexes()[source]

Extract Complex INDRA Statements.

get_degradations()[source]

Extract Degradation INDRA Statements.

get_modifications()[source]

Extract all types of Modification INDRA Statements.

get_modifications_indirect()[source]

Extract indirect Modification INDRA Statements.

get_regulate_amounts()[source]

Extract Increase/DecreaseAmount Statements.

get_syntheses()[source]

Extract IncreaseAmount INDRA Statements.

get_term_agents()[source]

Return dict of INDRA Agents keyed by corresponding TERMs in the EKB.

This is meant to be used when entities (e.g., “phosphorylated ERK”), rather than events, need to be extracted from processed natural language. These entities with their respective states are represented as INDRA Agents. Further, each key of the dictionary corresponds to the ID assigned by TRIPS to the given TERM that the Agent was extracted from.

Returns

agents – Dict of INDRA Agents extracted from EKB.

Return type

dict[str, indra.statements.Agent]

TRIPS Web-service Client (indra.sources.trips.client)

indra.sources.trips.client.get_xml(html, content_tag='ekb', fail_if_empty=False)[source]

Extract the content XML from the HTML output of the TRIPS web service.

Parameters
  • html (str) – The HTML output from the TRIPS web service.

  • content_tag (str) – The XML tag used to label the content. Default is ‘ekb’.

  • fail_if_empty (bool) – If True and the XML content found is an empty string, an exception is raised. Default is False.

Returns

The extraction knowledge base (EKB) XML that contains the event and term extractions.
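The idea of pulling tagged content out of an HTML response can be sketched with a regular expression. The function below is a simplified illustration, not INDRA's implementation: the name extract_tagged_xml is hypothetical, it returns the matched element verbatim, and the choice of ValueError for the empty case is an assumption, since the documentation above does not name the exception type.

```python
import re


def extract_tagged_xml(html, content_tag="ekb", fail_if_empty=False):
    """Return the first <content_tag>...</content_tag> element found in html."""
    # Match the opening tag (with optional attributes), the content, and
    # the closing tag; DOTALL lets the content span multiple lines.
    pattern = r"<{0}\b[^>]*>(.*?)</{0}>".format(re.escape(content_tag))
    match = re.search(pattern, html, re.DOTALL)
    inner = match.group(1).strip() if match else ""
    if fail_if_empty and not inner:
        # Assumption: ValueError is used here purely for illustration.
        raise ValueError("No <%s> content found in the response." % content_tag)
    return match.group(0) if match else ""
```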

indra.sources.trips.client.save_xml(xml_str, file_name, pretty=True)[source]

Save the TRIPS EKB XML in a file.

Parameters
  • xml_str (str) – The TRIPS EKB XML string to be saved.

  • file_name (str) – The name of the file to save the result in.

  • pretty (Optional[bool]) – If True, the XML is pretty printed.
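The effect of the pretty flag can be sketched with the standard library. The helper below is a hypothetical stand-in named write_xml, not the INDRA function; it assumes that pretty-printing means re-indenting the XML with xml.dom.minidom, which may differ from what save_xml does internally.

```python
import xml.dom.minidom


def write_xml(xml_str, file_name, pretty=True):
    """Write xml_str to file_name, optionally re-indented for readability."""
    if pretty:
        # toprettyxml adds an XML declaration, line breaks, and indentation.
        xml_str = xml.dom.minidom.parseString(xml_str).toprettyxml(indent="  ")
    with open(file_name, "w", encoding="utf-8") as fh:
        fh.write(xml_str)
```

With pretty=False the string is written unchanged, which is the form that some third-party tools expect.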

indra.sources.trips.client.send_query(text, service_endpoint='drum', query_args=None, service_host=None)[source]

Send a query to the TRIPS web service.

Parameters
  • text (str) – The text to be processed.

  • service_endpoint (Optional[str]) – Selects the TRIPS/DRUM web service endpoint to use. The choices are “drum” (default), “drum-dev” (a nightly build), and “cwms” (for more general knowledge extraction).

  • query_args (Optional[dict]) – A dictionary of arguments to be passed with the query.

  • service_host (Optional[str]) – The server’s base URL under which service_endpoint is an endpoint. By default, IHMC’s public server is used.

Returns

html – The HTML result returned by the web service.

Return type

str
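The three client functions can be chained into a small pipeline: send the query, extract the EKB from the returned HTML, and optionally save it. The sketch below shows one way this might look; fetch_ekb and its client argument (a hook for substituting a stand-in module) are hypothetical, and treating an empty response as a failed request is an assumption.

```python
def fetch_ekb(text, file_name=None, client=None):
    """Send text to the TRIPS web service and return the EKB XML string."""
    if client is None:
        # Imported lazily so this sketch can be loaded without INDRA installed.
        from indra.sources.trips import client
    html = client.send_query(text, service_endpoint="drum")
    if not html:
        # Assumption: an empty response means the request failed.
        return None
    xml = client.get_xml(html)
    if file_name:
        client.save_xml(xml, file_name, pretty=True)
    return xml
```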

TRIPS/DRUM Local Reader (indra.sources.trips.drum_reader)

class indra.sources.trips.drum_reader.DrumReader(**kwargs)[source]

Agent which processes text through a local TRIPS/DRUM instance.

This class is implemented as a communicative agent that sends and receives KQML messages through a socket. It sends text (ideally in small blocks, such as one sentence at a time) to the running DRUM instance and receives extraction knowledge base (EKB) XML responses asynchronously through the socket.

The instructions at https://github.com/wdebeaum/drum describe how to install DRUM and its dependencies locally. Once installed, run drum/bin/trips-drum -nouser to run DRUM without a GUI. Once DRUM is running, this class can be instantiated as dr = DrumReader(), at which point it attempts to connect to DRUM via the socket.

Use dr.read_text(text) to send text for reading. In another usage mode, dr.read_pmc(pmcid) can be used to read a full open-access PMC paper. To start receiving responses, call dr.start(), which waits for responses from the reader and returns once all responses have been received. Once finished, the list of EKB XML extractions can be accessed via dr.extractions.

Parameters
  • run_drum (Optional[bool]) – If True, the DRUM reading system is launched as a subprocess for reading. If False, DRUM is expected to be running independently. Default: False

  • drum_system (Optional[subprocess.Popen]) – A handle to the subprocess of a running DRUM system instance. This can be passed in if the instance is to be reused rather than restarted. Default: None

  • **kwargs – All other keyword arguments are passed through to the DrumReader KQML module’s constructor.

extractions

A list of EKB XML extractions corresponding to the input text list.

Type

list[str]

drum_system

A subprocess handle that points to a running instance of the DRUM reading system. In case the DRUM system is running independently, this is None.

Type

subprocess.Popen
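The read-then-collect workflow from the class description can be sketched as follows. The helper read_with_local_drum and its reader argument (a hook for supplying an existing or stand-in reader) are hypothetical; the default path assumes that a DRUM instance is already running and that DrumReader() can connect to it.

```python
def read_with_local_drum(sentences, reader=None):
    """Send each sentence to DRUM and return the list of EKB XML strings."""
    if reader is None:
        # Imported lazily so this sketch can be loaded without INDRA installed.
        from indra.sources.trips.drum_reader import DrumReader
        reader = DrumReader()
    for sentence in sentences:
        # Small blocks, ideally one sentence at a time, are sent for reading.
        reader.read_text(sentence)
    # Waits for responses and returns once all have been received.
    reader.start()
    return list(reader.extractions)
```

Each returned string could then be passed to indra.sources.trips.api.process_xml to obtain a TripsProcessor.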

read_pmc(pmcid)[source]

Read a given PMC article.

Parameters

pmcid (str) – The PMC ID of the article to read. Note that only articles in the open-access subset of PMC will work.

read_text(text)[source]

Read a given text phrase.

Parameters

text (str) – The text to read. Typically a sentence or a paragraph.

receive_reply(msg, content)[source]

Handle replies with reading results.