TRIPS (indra.sources.trips)

TRIPS API (indra.sources.trips.api)

indra.sources.trips.api.process_text(text, save_xml_name='trips_output.xml', save_xml_pretty=True, offline=False, service_endpoint='drum')

Return a TripsProcessor by processing text.

Parameters:
  • text (str) – The text to be processed.
  • save_xml_name (Optional[str]) – The name of the file to save the returned TRIPS extraction knowledge base XML. Default: trips_output.xml
  • save_xml_pretty (Optional[bool]) – If True, the saved XML is pretty-printed. Some third-party tools require XML that is not pretty-printed, which can be obtained by setting this to False. Default: True
  • offline (Optional[bool]) – If True, offline reading is used with a local instance of DRUM, if available. Default: False
  • service_endpoint (Optional[str]) – Selects the TRIPS/DRUM web service endpoint to use: either “drum” (default) or “drum-dev”, a nightly build.
Returns:

tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.

Return type:

TripsProcessor
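
A minimal usage sketch, assuming INDRA is installed and the TRIPS web service is reachable. count_statement_types and read_and_count are hypothetical helper names, not part of INDRA; only process_text and tp.statements come from the description above.

```python
from collections import Counter


def count_statement_types(statements):
    """Count statements by their class name."""
    return Counter(type(stmt).__name__ for stmt in statements)


def read_and_count(text, service_endpoint='drum'):
    """Send text to the TRIPS web service and summarize what came back."""
    # Imported lazily so the helper above can be used without INDRA installed.
    from indra.sources.trips.api import process_text

    tp = process_text(text, service_endpoint=service_endpoint)
    return count_statement_types(tp.statements)
```

Calling read_and_count('MEK phosphorylates ERK.') would perform a live web-service request, so it is left out of the block itself.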

indra.sources.trips.api.process_xml(xml_string)

Return a TripsProcessor by processing a TRIPS EKB XML string.

Parameters: xml_string (str) – A TRIPS extraction knowledge base (EKB) string to be processed. See http://trips.ihmc.us/parser/api.html
Returns: tp – A TripsProcessor containing the extracted INDRA Statements in tp.statements.
Return type: TripsProcessor
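
One plausible use of process_xml is to re-process an EKB that an earlier process_text call saved to disk, without querying the web service again. The sketch below assumes INDRA is installed; read_ekb_file and process_saved_ekb are hypothetical helper names, and the default path simply mirrors the documented save_xml_name default.

```python
def read_ekb_file(path):
    """Read a saved EKB XML file into a string."""
    with open(path, 'r', encoding='utf-8') as fh:
        return fh.read()


def process_saved_ekb(path='trips_output.xml'):
    """Build a TripsProcessor from a previously saved EKB XML file."""
    # Imported lazily so read_ekb_file works without INDRA installed.
    from indra.sources.trips.api import process_xml

    return process_xml(read_ekb_file(path))
```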

TRIPS Processor (indra.sources.trips.processor)

class indra.sources.trips.processor.TripsProcessor(xml_string)

The TripsProcessor extracts INDRA Statements from a TRIPS XML.

For more details on the TRIPS EKB XML format, see http://trips.ihmc.us/parser/cgi/drum

Parameters: xml_string (str) – A TRIPS extraction knowledge base (EKB) in XML format as a string.

tree

xml.etree.ElementTree.Element – An ElementTree object representation of the TRIPS EKB XML.

statements

list[indra.statements.Statement] – A list of INDRA Statements that were extracted from the EKB.

doc_id

str – The PubMed ID of the paper that the extractions are from.

sentences

dict[str: str] – A dict of all sentences in the EKB, keyed by their IDs.

paragraphs

dict[str: str] – A dict of all paragraphs in the EKB, keyed by their IDs.

par_to_sec

dict[str: str] – A map from paragraph IDs to their associated section types

extracted_events

list[xml.etree.ElementTree.Element] – A list of Event elements that have been extracted as INDRA Statements.
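
To illustrate the shape of the tree, sentences and paragraphs attributes, here is a standard-library sketch that parses a small EKB-like string and builds ID-keyed dicts. The XML layout (input/sentences/sentence and input/paragraphs/paragraph elements with id attributes) is an assumption made for illustration, the sample is far simpler than real TRIPS output, and index_by_id is a hypothetical helper rather than the processor's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EKB fragment; real TRIPS output is richer.
EKB_XML = """
<ekb>
  <input>
    <paragraphs>
      <paragraph id="paragraph1">MEK phosphorylates ERK. ERK activates ELK1.</paragraph>
    </paragraphs>
    <sentences>
      <sentence id="1" pid="paragraph1">MEK phosphorylates ERK.</sentence>
      <sentence id="2" pid="paragraph1">ERK activates ELK1.</sentence>
    </sentences>
  </input>
</ekb>
"""


def index_by_id(tree, path):
    """Map the 'id' attribute of each element found at path to its text."""
    return {el.attrib['id']: el.text for el in tree.findall(path)}


# Parse the string into an Element, analogous to the documented tree attribute.
tree = ET.fromstring(EKB_XML.strip())
sentences = index_by_id(tree, 'input/sentences/sentence')
paragraphs = index_by_id(tree, 'input/paragraphs/paragraph')
```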

get_activations()

Extract direct Activation INDRA Statements.

get_activations_causal()

Extract causal Activation INDRA Statements.

get_activations_stimulate()

Extract Activation INDRA Statements via stimulation.

get_active_forms()

Extract ActiveForm INDRA Statements.

get_active_forms_state()

Extract ActiveForm INDRA Statements.

get_agents()

Return list of INDRA Agents corresponding to TERMs in the EKB.

This is meant to be used when entities e.g. “phosphorylated ERK”, rather than events need to be extracted from processed natural language. These entities with their respective states are represented as INDRA Agents.

Returns: agents – List of INDRA Agents extracted from EKB.
Return type: list[indra.statements.Agent]
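
As a rough sketch of the entity-centric use described for get_agents(), assuming INDRA is installed: agents_in_ekb and agent_names are hypothetical helpers, and the only INDRA details relied on are the documented TripsProcessor(xml_string) constructor, get_agents(), and the name attribute of INDRA Agents.

```python
def agent_names(agents):
    """Return the name of each Agent-like object, preserving order."""
    return [agent.name for agent in agents]


def agents_in_ekb(xml_string):
    """Extract entities (rather than events) from an EKB XML string."""
    # Imported lazily so agent_names works without INDRA installed.
    from indra.sources.trips.processor import TripsProcessor

    tp = TripsProcessor(xml_string)
    return tp.get_agents()
```
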
get_all_events()

Make a list of all events in the TRIPS EKB.

The events are stored in self.all_events.

get_complexes()

Extract Complex INDRA Statements.

get_degradations()

Extract Degradation INDRA Statements.

get_modifications()

Extract all types of Modification INDRA Statements.

get_modifications_indirect()

Extract indirect Modification INDRA Statements.

get_regulate_amounts()

Extract Increase/DecreaseAmount Statements.

get_syntheses()

Extract IncreaseAmount INDRA Statements.

TRIPS Web-service Client (indra.sources.trips.client)

indra.sources.trips.client.get_xml(html, content_tag='ekb', fail_if_empty=False)

Extract the content XML from the HTML output of the TRIPS web service.

Parameters:
  • html (str) – The HTML output from the TRIPS web service.
  • content_tag (str) – The XML tag used to label the content. Default is ‘ekb’.
  • fail_if_empty (bool) – If True, and if the XML content found is an empty string, raise an exception. Default is False.
Returns:

The extraction knowledge base (e.g. EKB) XML that contains the event and term extractions.
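
The idea behind get_xml can be illustrated with a standard-library sketch that pulls the first tagged block out of an HTML string. This is not the INDRA implementation: extract_tagged_xml is a hypothetical function, the regular-expression approach and the ValueError are assumptions, and only the parameter names mirror the documented signature.

```python
import re


def extract_tagged_xml(html, content_tag='ekb', fail_if_empty=False):
    """Return the first <content_tag ...>...</content_tag> block in html."""
    tag = re.escape(content_tag)
    # Non-greedy match from the opening tag to its first closing tag;
    # DOTALL lets the content span multiple lines.
    pattern = r'<{tag}\b.*?</{tag}>'.format(tag=tag)
    match = re.search(pattern, html, flags=re.DOTALL)
    xml = match.group(0) if match else ''
    if fail_if_empty and not xml:
        raise ValueError('No <%s> content found in the response.' % content_tag)
    return xml
```

With fail_if_empty left at False, a response that contains no matching tag yields an empty string, which the caller has to check for.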

indra.sources.trips.client.save_xml(xml_str, file_name, pretty=True)

Save the TRIPS EKB XML in a file.

Parameters:
  • xml_str (str) – The TRIPS EKB XML string to be saved.
  • file_name (str) – The name of the file to save the result in.
  • pretty (Optional[bool]) – If True, the XML is pretty printed.
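
The effect of the pretty flag can be sketched with the standard library. write_xml below is a hypothetical stand-in, not INDRA's save_xml, and using xml.dom.minidom for pretty-printing is an assumption made for illustration.

```python
import xml.dom.minidom


def write_xml(xml_str, file_name, pretty=True):
    """Write an XML string to a file, optionally re-indented for readability."""
    if pretty:
        # Re-serialize with indentation; this also adds an XML declaration.
        xml_str = xml.dom.minidom.parseString(xml_str).toprettyxml(indent='  ')
    with open(file_name, 'w', encoding='utf-8') as fh:
        fh.write(xml_str)
```

With pretty=False the string is written unchanged, which matches the case where a downstream tool expects the XML exactly as returned.
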
indra.sources.trips.client.send_query(text, service_endpoint='drum', query_args=None)

Send a query to the TRIPS web service.

Parameters:
  • text (str) – The text to be processed.
  • service_endpoint (Optional[str]) – Selects the TRIPS/DRUM web service endpoint to use: one of “drum” (default), “drum-dev” (a nightly build), or “cwms” (for more general knowledge extraction).
  • query_args (Optional[dict]) – A dictionary of arguments to be passed with the query.
Returns:

html – The HTML result returned by the web service.

Return type:

str
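
The three client functions can be chained into a small pipeline, sketched below under the assumption that INDRA is installed and the web service is reachable. text_to_ekb is a hypothetical helper, and its client parameter exists only so a stand-in object can be supplied for testing; the calls themselves use the signatures documented above.

```python
def text_to_ekb(text, service_endpoint='drum', file_name=None, client=None):
    """Query TRIPS, pull the EKB XML out of the HTML reply, optionally save it."""
    if client is None:
        # Imported lazily so the function can be exercised with a stand-in.
        from indra.sources.trips import client

    html = client.send_query(text, service_endpoint=service_endpoint)
    # fail_if_empty=True turns an empty extraction into an exception.
    xml = client.get_xml(html, content_tag='ekb', fail_if_empty=True)
    if file_name is not None:
        client.save_xml(xml, file_name, pretty=True)
    return xml
```

The returned string can then be handed to process_xml to obtain a TripsProcessor.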

TRIPS/DRUM Local Reader (indra.sources.trips.drum_reader)

class indra.sources.trips.drum_reader.DrumReader(**kwargs)

Agent which processes text through a local TRIPS/DRUM instance.

This class is implemented as a communicative agent which sends and receives KQML messages through a socket. It sends text (ideally in small blocks, such as one sentence at a time) to the running DRUM instance and receives extraction knowledge base (EKB) XML responses asynchronously through the socket.

To install DRUM and its dependencies locally, follow the instructions in the DRUM repository (https://github.com/wdebeaum/drum). Once installed, run drum/bin/trips-drum -nouser to run DRUM without a GUI.

Once DRUM is running, this class can be instantiated as dr = DrumReader(), at which point it attempts to connect to DRUM via the socket. Use dr.read_text(text) to send text for reading. In another usage mode, dr.read_pmc(pmcid) can be used to read a full open-access PMC paper. To start receiving responses, call dr.start(), which waits for responses from the reader and returns once all responses have been received. The list of EKB XML extractions can then be accessed via dr.extractions.

Parameters:
  • run_drum (Optional[bool]) – If True, the DRUM reading system is launched as a subprocess for reading. If False, DRUM is expected to be running independently. Default: False
  • drum_system (Optional[subprocess.Popen]) – A handle to the subprocess of a running DRUM system instance. This can be passed in if the instance is to be reused rather than restarted. Default: None
  • **kwargs – All other keyword arguments are passed through to the DrumReader KQML module’s constructor.
extractions

list[str] – A list of EKB XML extractions corresponding to the input text list.

drum_system

subprocess.Popen – A subprocess handle that points to a running instance of the DRUM reading system. In case the DRUM system is running independently, this is None.

read_pmc(pmcid)

Read a given PMC article.

Parameters: pmcid (str) – The PMC ID of the article to read. Note that only articles in the open-access subset of PMC will work.

read_text(text)

Read a given text phrase.

Parameters: text (str) – The text to read. Typically a sentence or a paragraph.

receive_reply(msg, content)

Handle replies with reading results.
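
The DrumReader workflow described above (instantiate, send text, start receiving, collect extractions) can be sketched as follows, assuming INDRA is installed and a local DRUM instance is already running. read_sentences_locally is a hypothetical helper, and its reader parameter exists only so a stand-in object can be supplied for testing.

```python
def read_sentences_locally(sentences, reader=None):
    """Send sentences to a local DRUM instance and return the EKB XML strings."""
    if reader is None:
        # Imported lazily; instantiating DrumReader attempts the socket connection.
        from indra.sources.trips.drum_reader import DrumReader
        reader = DrumReader()

    # Send text in small blocks, one sentence at a time.
    for sentence in sentences:
        reader.read_text(sentence)

    # Wait for responses to the text that was sent.
    reader.start()
    return list(reader.extractions)
```

Each returned string could then be passed to process_xml to extract INDRA Statements.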