SemRep (indra.sources.semrep)

This module implements an API and processor for the SemRep system, available at https://github.com/lhncbc/SemRep (see also https://doi.org/10.1016/j.jbi.2003.11.003). Setting up SemRep requires downloading several large compressed software releases and data files, and then running installation scripts. A Dockerfile is provided in this module; it automates the build into a ~100GB image and can also serve as a template for a local installation.

Currently, the XML output format of SemRep is supported, which can be produced as follows:

[SemRep base folder]/bin/semrep.v1.8 -X -L 2018 -Z 2018AA input.txt output.xml
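
The same invocation can be scripted from Python. The following is a minimal sketch, not part of this module: the helper names build_semrep_command and run_semrep are hypothetical, and the binary name and flags are copied verbatim from the command above. The SemRep base folder has to be supplied by the caller.

```python
import os
import subprocess


def build_semrep_command(base_folder, input_path, output_path):
    """Return the argument list for the SemRep invocation shown above.

    Hypothetical helper: the binary name and the flags are taken
    verbatim from the documented command line.
    """
    binary = os.path.join(base_folder, "bin", "semrep.v1.8")
    return [binary, "-X", "-L", "2018", "-Z", "2018AA",
            input_path, output_path]


def run_semrep(base_folder, input_path, output_path):
    """Run SemRep and raise CalledProcessError on a non-zero exit code."""
    cmd = build_semrep_command(base_folder, input_path, output_path)
    subprocess.run(cmd, check=True)
    return output_path
```

Passing the arguments as a list avoids shell quoting issues when the base folder or file names contain spaces.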

Predicates produced by SemRep are documented in the Appendix of https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-486#Sec26.

SemRep API (indra.sources.semrep.api)

indra.sources.semrep.api.process_xml_file(fname, use_gilda_grounding=False, predicate_mappings=None)[source]

Process a SemRep output XML file and extract INDRA Statements.

Parameters
  • fname (str) – The name of the SemRep output XML file.

  • use_gilda_grounding (Optional[bool]) – If True, Gilda is used to re-ground entities and assign identifiers. Default: False

  • predicate_mappings (Optional[dict]) – A custom mapping of SemRep predicates to INDRA Statement types. If not provided, the default mappings are used.

Returns

An instance of a SemRepXmlProcessor that carries extracted INDRA Statements in its statements attribute.

Return type

SemRepXmlProcessor
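
A minimal usage sketch based on the signature above. The wrapper name extract_statements is hypothetical, and INDRA is imported inside the function so that the snippet can be loaded without INDRA installed.

```python
def extract_statements(xml_path, use_gilda_grounding=False,
                       predicate_mappings=None):
    """Return the INDRA Statements extracted from a SemRep XML file.

    Hypothetical convenience wrapper around process_xml_file; the
    keyword arguments are passed through unchanged.
    """
    # Lazy import: requires INDRA to be installed at call time
    from indra.sources.semrep.api import process_xml_file

    processor = process_xml_file(
        xml_path,
        use_gilda_grounding=use_gilda_grounding,
        predicate_mappings=predicate_mappings,
    )
    # The processor carries the extracted Statements in this attribute
    return processor.statements
```

For example, extract_statements('output.xml', use_gilda_grounding=True) would re-ground entities with Gilda before returning the Statements. The exact shape of predicate_mappings is not spelled out here; a dict keyed by predicate name with INDRA Statement classes as values is an assumption based on the parameter description.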

SemRep Processor (indra.sources.semrep.processor)

class indra.sources.semrep.processor.SemRepXmlProcessor(tree, use_gilda_grounding=False, predicate_mappings=None)[source]

Processor for XML output from SemRep.

extract_predication(predication, utterance)[source]

Extract a Statement from a single predication.

get_agent_from_entity(entity)[source]

Return an Agent from an entity.

process_statements()[source]

Extract all Statements from an XML tree.
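
The processor can also be driven directly. The sketch below assumes that the tree argument is an ElementTree as returned by xml.etree.ElementTree.parse, and that process_statements fills the statements attribute described for process_xml_file. Both are assumptions rather than documented guarantees, and the function name process_with_processor is hypothetical.

```python
import xml.etree.ElementTree as ET


def process_with_processor(xml_path, use_gilda_grounding=False):
    """Parse a SemRep XML file and run SemRepXmlProcessor on it.

    Assumption: the processor accepts a parsed ElementTree as `tree`.
    """
    # Lazy import: requires INDRA to be installed at call time
    from indra.sources.semrep.processor import SemRepXmlProcessor

    tree = ET.parse(xml_path)
    processor = SemRepXmlProcessor(
        tree, use_gilda_grounding=use_gilda_grounding)
    # Extract Statements from every predication in the tree
    processor.process_statements()
    return processor
```

In most cases process_xml_file is the simpler entry point. Constructing the processor by hand is mainly useful when the XML tree is already in memory or needs preprocessing first.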