Eidos (indra.sources.eidos)

Eidos is an open-domain machine reading system that uses a cascade of grammars to extract causal events from free text. It is well suited to modeling applications that are not tied to a single domain such as molecular biology.

To set up reading with Eidos, the Eidos system and its dependencies need to be compiled and packaged as a fat JAR:

git clone https://github.com/clulab/eidos.git
cd eidos
sbt assembly

This creates a JAR file at eidos/target/scala[version]/eidos-[version].jar. Set the EIDOSPATH environment variable to the absolute path of this file, then append EIDOSPATH to the CLASSPATH environment variable (entries are separated by colons).
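As a rough illustration of the setup described above, the following Python sketch checks that EIDOSPATH points to an existing file and that the same path appears among the CLASSPATH entries. The helper name eidos_env_problems is hypothetical and not part of INDRA; os.pathsep is used as the separator, which is a colon on Linux and macOS.

```python
import os


def eidos_env_problems(env=None):
    """Return a list of human-readable problems with the Eidos setup.

    Hypothetical helper, not part of INDRA. `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    jar = env.get("EIDOSPATH", "")
    if not jar:
        # Nothing else can be checked without a path.
        return ["EIDOSPATH is not set"]
    if not os.path.isabs(jar):
        problems.append("EIDOSPATH is not an absolute path")
    if not os.path.isfile(jar):
        problems.append("EIDOSPATH does not point to an existing file")
    # CLASSPATH entries are separated by os.pathsep (':' on Linux and macOS).
    entries = [e for e in env.get("CLASSPATH", "").split(os.pathsep) if e]
    if jar not in entries:
        problems.append("EIDOSPATH is not listed in CLASSPATH")
    return problems
```

Calling eidos_env_problems() with no arguments inspects the current process environment; an empty list means none of these checks failed.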

The pyjnius package needs to be set up and operational to use Eidos reading in Python. For more details, see the Pyjnius setup instructions in the documentation.

For Eidos to provide grounding information for inclusion in INDRA Statements, the Eidos configuration needs to be adjusted. First, download vectors.txt from https://drive.google.com/open?id=1tffQuLB5XtKcq9wlo0n-tsnPJYQF18oS and put it in a folder called src/main/resources/org/clulab/wm/eidos/w2v within your eidos folder. Next, set the property “useW2V” to true in src/main/resources/eidos.conf. Finally, rerun sbt assembly.

Eidos API (indra.sources.eidos.api)

indra.sources.eidos.api.initialize_reader()[source]

Instantiate an Eidos reader for fast subsequent reading.

indra.sources.eidos.api.process_json(json_dict)[source]

Return an EidosProcessor by processing an Eidos JSON-LD dict.

Parameters: json_dict (dict) – The JSON-LD dict to be processed.
Returns: ep – An EidosProcessor containing the extracted INDRA Statements in its statements attribute.
Return type: EidosProcessor

indra.sources.eidos.api.process_json_file(file_name)[source]

Return an EidosProcessor by processing the given Eidos JSON-LD file.

This function is useful if the output from Eidos is saved as a file and needs to be processed.

Parameters: file_name (str) – The name of the JSON-LD file to be processed.
Returns: ep – An EidosProcessor containing the extracted INDRA Statements in its statements attribute.
Return type: EidosProcessor

indra.sources.eidos.api.process_json_ld(json_dict)[source]

DEPRECATED: see process_json

indra.sources.eidos.api.process_json_ld_file(file_name)[source]

DEPRECATED: see process_json_file

indra.sources.eidos.api.process_json_ld_str(json_str)[source]

DEPRECATED: see process_json_str

indra.sources.eidos.api.process_json_str(json_str)[source]

Return an EidosProcessor by processing the Eidos JSON-LD string.

Parameters: json_str (str) – The JSON-LD string to be processed.
Returns: ep – An EidosProcessor containing the extracted INDRA Statements in its statements attribute.
Return type: EidosProcessor
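
The three non-deprecated entry points above differ only in the form of their input. The sketch below picks one of them based on whether the input is a dict, the path of an existing file, or a JSON-LD string. The dispatcher process_eidos_output and its api parameter are hypothetical; api defaults to the indra.sources.eidos.api module and can be replaced with a stand-in object, so the sketch can be tried without INDRA installed.

```python
import os


def process_eidos_output(source, api=None):
    """Return an EidosProcessor for Eidos JSON-LD given in any of three forms.

    Hypothetical convenience wrapper, not part of INDRA.
    """
    if api is None:
        # Imported lazily so that defining this function does not need INDRA.
        from indra.sources.eidos import api
    if isinstance(source, dict):
        # Already parsed JSON-LD.
        return api.process_json(source)
    if os.path.isfile(source):
        # A path to a saved Eidos output file.
        return api.process_json_file(source)
    # Otherwise treat the input as a serialized JSON-LD string.
    return api.process_json_str(source)
```

In each case the extracted INDRA Statements would then be available in the statements attribute of the returned processor.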
indra.sources.eidos.api.process_text(text, out_format='json_ld', save_json='eidos_output.json', webservice=None)[source]

Return an EidosProcessor by processing the given text.

This constructs a reader object via Java and extracts mentions from the text. It then serializes the mentions into JSON and processes the result with process_json.

Parameters:
  • text (str) – The text to be processed.
  • out_format (Optional[str]) – The type of Eidos output to read into and process. Currently only ‘json_ld’ is supported, which is also the default value.
  • save_json (Optional[str]) – The name of a file in which to dump the JSON output of Eidos.
  • webservice (Optional[str]) – An Eidos reader web service URL to send the request to. If None, the reading is assumed to be done with the Eidos JAR rather than via a web service. Default: None
Returns:

ep – An EidosProcessor containing the extracted INDRA Statements in its statements attribute.

Return type:

EidosProcessor
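
A minimal usage sketch for process_text follows. The wrapper read_statements is hypothetical; its process_text argument defaults to the documented INDRA function and exists only so the sketch can be exercised with a stand-in when no JVM or Eidos service is available.

```python
def read_statements(text, webservice=None, process_text=None):
    """Return the INDRA Statements that Eidos extracts from text.

    Hypothetical wrapper, not part of INDRA.
    """
    if process_text is None:
        # Imported lazily so that defining this function does not need INDRA.
        from indra.sources.eidos.api import process_text
    # With webservice=None the reading is done with the local Eidos JAR;
    # otherwise the request is sent to the given web service URL.
    ep = process_text(text, webservice=webservice)
    return ep.statements
```

Called as read_statements("Rainfall causes flooding."), this would read with the local JAR. Since save_json defaults to 'eidos_output.json', the Eidos output is presumably also written to a file of that name unless save_json is overridden in the underlying call.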

Eidos Processor (indra.sources.eidos.processor)

class indra.sources.eidos.processor.EidosProcessor(json_dict)[source]

This processor extracts INDRA Statements from Eidos JSON-LD output.

Parameters: json_dict (dict) – A JSON dictionary containing the Eidos extractions in JSON-LD format.

tree

objectpath.Tree – The objectpath Tree object representing the extractions.

statements

list[indra.statements.Statement] – A list of INDRA Statements that were extracted by the processor.

static find_arg(event, arg_type)[source]

Return the ID of the first argument of a given type.

static find_args(event, arg_type)[source]

Return the IDs of all arguments of a given type.

geo_context_from_ref(ref)[source]

Return a ref context object given a location reference entry.

get_causal_relations()[source]

Extract causal relations as Statements.

static get_concept(entity)[source]

Return Concept from an Eidos entity.

get_evidence(event)[source]

Return the Evidence object for the INDRA Statement.

static get_groundings(entity)[source]

Return groundings as db_refs for an entity.

static get_hedging(event)[source]

Return hedging markers attached to an event.

Example: “states”: [{“@type”: “State”, “type”: “HEDGE”,
“text”: “could”}]

static get_negation(event)[source]

Return negation attached to an event.

Example: “states”: [{“@type”: “State”, “type”: “NEGATION”,
“text”: “n’t”}]
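
To make the two examples above concrete, here is a small sketch that collects the text of all states of a given type from an event. It assumes the event is a plain dict shaped like the examples; the function state_texts is hypothetical and is not taken from the INDRA source, so get_hedging and get_negation may differ in detail.

```python
def state_texts(event, state_type):
    """Return the text of every state of the given type attached to an event.

    Sketch only: assumes `event` is a dict with an optional "states" list.
    """
    states = event.get("states", [])
    return [s.get("text") for s in states if s.get("type") == state_type]


# An event carrying one hedging marker and one negation marker.
event = {
    "states": [
        {"@type": "State", "type": "HEDGE", "text": "could"},
        {"@type": "State", "type": "NEGATION", "text": "n't"},
    ]
}
hedgings = state_texts(event, "HEDGE")
negations = state_texts(event, "NEGATION")
```

Here hedgings holds the single marker "could" and negations holds "n't"; an event without a "states" entry yields an empty list.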
static ref_context_from_geoloc(geoloc)[source]

Return a RefContext object given a geoloc entry.

static time_context_from_dct(dct)[source]

Return a time context object given a DCT entry.

time_context_from_ref(timex)[source]

Return a time context object given a timex reference entry.

static time_context_from_timex(timex)[source]

Return a TimeContext object given a timex entry.

Eidos Reader (indra.sources.eidos.reader)

class indra.sources.eidos.reader.EidosReader[source]

Reader object keeping an instance of the Eidos reader as a singleton.

This allows the Eidos reader to be initialized only when the first piece of text is read; subsequent readings are done with the same instance of the reader and are therefore faster.

eidos_reader

org.clulab.wm.eidos.EidosSystem – A Scala object, an instance of the Eidos reading system. It is instantiated only when first processing text.
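
The lazy instantiation described above can be pictured with the following minimal sketch. It is not the EidosReader implementation: LazyReader and make_system are hypothetical names, and the factory stands in for the construction of the Eidos system on the JVM.

```python
class LazyReader:
    """Create an expensive reading system on first use, then reuse it."""

    def __init__(self, make_system):
        # make_system is a zero-argument factory standing in for the
        # costly start-up of the actual reading system.
        self._make_system = make_system
        self.eidos_reader = None

    def process_text(self, text):
        if self.eidos_reader is None:
            # Pay the start-up cost only once, on the first call.
            self.eidos_reader = self._make_system()
        return self.eidos_reader(text)


calls = []


def make_system():
    calls.append("init")
    return lambda text: {"text": text}


reader = LazyReader(make_system)
first = reader.process_text("first sentence")
second = reader.process_text("second sentence")
```

After both calls the factory has run exactly once, which is the property that makes repeated readings cheaper than the first one.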

process_text(text, format='json')[source]

Return a mentions JSON object given text.

Parameters:
  • text (str) – Text to be processed.
  • format (str) – The format of the output to produce, one of “json” or “json_ld”. Default: “json”
Returns:

json_dict – A JSON object of mentions extracted from text.

Return type:

dict

Eidos CLI (indra.sources.eidos.cli)

This is a Python-based command line interface to Eidos to complement the Python-Java bridge based interface. EIDOSPATH (in the INDRA config.ini or as an environment variable) needs to point to a fat JAR of the Eidos system.

indra.sources.eidos.cli.extract_and_process(path_in, path_out)[source]

Run Eidos on a set of text files and process output with INDRA.

The output is produced in the specified output folder, and the output files are then processed with INDRA to extract Statements.

Parameters:
  • path_in (str) – Path to an input folder with some text files
  • path_out (str) – Path to an output folder in which Eidos places the output JSON-LD files
Returns:

stmts – A list of INDRA Statements

Return type:

list[indra.statements.Statement]

indra.sources.eidos.cli.extract_from_directory(path_in, path_out)[source]

Run Eidos on a set of text files in a folder.

The output is produced in the specified output folder but the output files aren’t processed by this function.

Parameters:
  • path_in (str) – Path to an input folder with some text files
  • path_out (str) – Path to an output folder in which Eidos places the output JSON-LD files

indra.sources.eidos.cli.run_eidos(endpoint, *args)[source]

Run a given endpoint of Eidos through the command line.

Parameters:
  • endpoint (str) – The class within the Eidos package to run, for instance ‘apps.ExtractFromDirectory’ will run ‘org.clulab.wm.eidos.apps.ExtractFromDirectory’
  • *args – Any further arguments to be passed as inputs to the class being run.
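
The mapping from the endpoint argument to a fully qualified class name can be sketched as below. The shape of the java command is an assumption made for illustration: the function eidos_command is hypothetical, and the JVM options that run_eidos actually passes are not described here.

```python
def eidos_command(jar_path, endpoint, *args):
    """Build an argument list for running one Eidos class from the fat JAR.

    Hypothetical helper; the command layout is assumed, not taken from INDRA.
    """
    # The package prefix follows the example in the parameter description.
    full_class = "org.clulab.wm.eidos." + endpoint
    return ["java", "-cp", jar_path, full_class, *args]


# The JAR path and the folder names are placeholders.
cmd = eidos_command("/path/to/eidos.jar", "apps.ExtractFromDirectory",
                    "texts_in", "jsonld_out")
```

A list built this way could be handed to subprocess.run(cmd, check=True); passing the input and output folders as positional arguments mirrors the signature of extract_from_directory and is likewise an assumption.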

Eidos Webserver (indra.sources.eidos.server)

This is a Python-based web server that can be run to read with Eidos. To run the server, do

python -m indra.sources.eidos.server

and then submit POST requests to the localhost:5000/process_text endpoint with JSON content of the form {"text": "text to read"}. The response will be the Eidos JSON-LD output.
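
A client for this endpoint could look like the following sketch, which uses only the Python standard library. The function names build_request and read_with_server are hypothetical, and the sketch assumes the server is reachable over plain HTTP at the address given above.

```python
import json
import urllib.request


def build_request(text, url="http://localhost:5000/process_text"):
    """Return a POST request carrying the text to read as JSON content."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def read_with_server(text, url="http://localhost:5000/process_text"):
    """Send the text to a running server and return the parsed JSON-LD."""
    with urllib.request.urlopen(build_request(text, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The dict returned by read_with_server could then be passed to process_json to obtain INDRA Statements.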