textToKnowledgeGraph (indra.sources.tkg)

This module implements an input API and processor for the textToKnowledgeGraph method, which uses LLMs to extract BEL statements from publications:

textToKnowledgeGraph: Generation of Molecular Interaction Knowledge Graphs Using Large Language Models for Exploration in Cytoscape. Favour James, Christopher Churas, Dexter Pratt, Augustin Luna. bioRxiv. https://doi.org/10.1101/2025.07.17.664328

class indra.sources.tkg.TkgProcessor(results)[source]

Processor extracting INDRA Statements from textToKnowledgeGraph output.

After parsing BEL to INDRA Statements via PyBEL, this processor attaches metadata (confidence, text, pmid, pmcid, etc.) to Evidence objects.

Parameters:

results (Dict) – The textToKnowledgeGraph output data structure to be processed.

statements

A list of INDRA Statements extracted from the results.

Type:

List[indra.statements.Statement]

extract_statements()[source]

Run the BEL-to-INDRA pipeline for all entries in llm_results.
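Once a processor has run, its statements attribute can be inspected like any other list of INDRA Statements. The following is a minimal sketch of flattening the attached Evidence objects into rows. The helper summarize_evidence is hypothetical (not part of INDRA), and it relies only on the standard Evidence attributes text, pmid and annotations; where this processor stores confidence or the PMCID is not stated here, so the sketch returns annotations unmodified. The stand-in objects and their values are placeholders so the block runs without INDRA installed.

```python
from types import SimpleNamespace


def summarize_evidence(statements):
    """Return one dict per Evidence object found on the given statements.

    Hypothetical helper: it only assumes each statement has an `evidence`
    list whose items expose `text`, `pmid` and `annotations`.
    """
    rows = []
    for stmt in statements:
        for ev in stmt.evidence:
            rows.append({
                "statement_type": type(stmt).__name__,
                "text": ev.text,
                "pmid": ev.pmid,
                # Copy so callers cannot mutate the Evidence by accident.
                "annotations": dict(ev.annotations or {}),
            })
    return rows


# Stand-in objects with placeholder values; with a real processor `tp`,
# pass `tp.statements` instead.
_fake_ev = SimpleNamespace(
    text="A activates B.",
    pmid="12345",
    annotations={"example_key": "example_value"},
)
_fake_stmt = SimpleNamespace(evidence=[_fake_ev])
rows = summarize_evidence([_fake_stmt])
```

With INDRA available, the same helper would be called as summarize_evidence(tp.statements) on a TkgProcessor returned by one of the process_* functions.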

indra.sources.tkg.process_json(data)[source]

Process BEL relations returned directly from the LLM engine.

Parameters:

data (Dict) – Dictionary containing at least a "relations" field.

Returns:

Processor with INDRA Statements derived from BEL.

Return type:

TkgProcessor
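As an illustration of the input shape described above, the sketch below wraps relation records in a dictionary with a "relations" field and checks that shape before handing it to process_json. The helpers make_tkg_payload, check_tkg_payload and to_statements are hypothetical names introduced here, and the structure of the individual relation records is not specified in this reference, so the example uses an empty list. INDRA is imported lazily so the first two helpers run without it.

```python
def make_tkg_payload(relations):
    """Wrap relation records in a dict carrying a "relations" field."""
    return {"relations": list(relations)}


def check_tkg_payload(data):
    """Raise ValueError unless `data` is a dict with a "relations" field.

    This is a local sanity check, not behavior of INDRA itself.
    """
    if not isinstance(data, dict) or "relations" not in data:
        raise ValueError('expected a dict with a "relations" field')
    return data


def to_statements(data):
    """Run process_json on `data` and return the resulting statements."""
    # Lazy import: requires INDRA to be installed when this is called.
    from indra.sources import tkg

    tp = tkg.process_json(check_tkg_payload(data))
    return tp.statements


# The record format inside "relations" is an open assumption, so the
# example payload is left empty.
payload = make_tkg_payload([])
```

Calling to_statements(payload) is the step that actually needs INDRA; it is left out of the module-level code so the sketch stays runnable on its own.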

indra.sources.tkg.process_json_file(path)[source]

Process a single textToKnowledgeGraph JSON results file.

Parameters:

path (Union[str, Path]) – Path to a JSON file containing BEL relations.

Returns:

Processor containing the converted INDRA Statements.

Return type:

TkgProcessor
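A possible workflow is to save results as JSON and process the file later. The sketch below writes a results dictionary to disk with the standard library and defines a thin wrapper around process_json_file. It assumes the file holds the same "relations" dictionary that process_json accepts, which this reference does not confirm; the file name tkg_results.json and both helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_results_file(data, directory):
    """Serialize `data` as JSON into `directory` and return the file path."""
    path = Path(directory) / "tkg_results.json"  # hypothetical file name
    path.write_text(json.dumps(data), encoding="utf-8")
    return path


def statements_from_file(path):
    """Process a saved results file and return the extracted statements."""
    # Lazy import: requires INDRA to be installed when this is called.
    from indra.sources import tkg

    return tkg.process_json_file(path).statements


# Round-trip an empty payload through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    results_path = write_results_file({"relations": []}, tmp)
    round_tripped = json.loads(results_path.read_text(encoding="utf-8"))
```

Because path is documented as Union[str, Path], either results_path or str(results_path) could be passed to statements_from_file.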

indra.sources.tkg.process_pmc(pmc_id, output_base_path, **kwargs)[source]

Run live BEL extraction using textToKnowledgeGraph, if installed.

Parameters:
  • pmc_id (str) – PMCID such as ‘PMC3898398’.

  • kwargs – Additional keyword arguments passed to textToKnowledgeGraph.main().

Returns:

Processor containing INDRA Statements derived from live BEL output.

Return type:

TkgProcessor

Raises:
  • ImportError – If textToKnowledgeGraph is not installed.

  • ValueError – If the returned data structure is unexpected.
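Since process_pmc depends on an optional package, callers may want to degrade gracefully when it is missing. The sketch below is one way to do that: it checks that the identifier looks like a PMCID ("PMC" followed by digits), then returns None if the import fails. The names is_pmcid and try_process_pmc are hypothetical, and the up-front format check is an addition of this example rather than documented library behavior.

```python
import re

_PMCID_RE = re.compile(r"PMC\d+")


def is_pmcid(value):
    """Return True if `value` is a string of the form 'PMC' plus digits."""
    return isinstance(value, str) and _PMCID_RE.fullmatch(value) is not None


def try_process_pmc(pmc_id, output_base_path, **kwargs):
    """Run process_pmc, returning None if a required package is missing.

    Raises ValueError for a malformed PMCID (a check local to this sketch);
    a ValueError raised by process_pmc itself also propagates to the caller.
    """
    if not is_pmcid(pmc_id):
        raise ValueError(f"not a PMCID: {pmc_id!r}")
    try:
        # Lazy import so this module loads without INDRA installed.
        from indra.sources import tkg

        return tkg.process_pmc(pmc_id, output_base_path, **kwargs)
    except ImportError:
        # Either INDRA or textToKnowledgeGraph is not installed.
        return None
```

A caller would then test the result for None before reading its statements attribute.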

textToKnowledgeGraph API (indra.sources.tkg.api)

indra.sources.tkg.api.process_json(data)[source]

Process BEL relations returned directly from the LLM engine.

Parameters:

data (Dict) – Dictionary containing at least a "relations" field.

Returns:

Processor with INDRA Statements derived from BEL.

Return type:

TkgProcessor

indra.sources.tkg.api.process_json_file(path)[source]

Process a single textToKnowledgeGraph JSON results file.

Parameters:

path (Union[str, Path]) – Path to a JSON file containing BEL relations.

Returns:

Processor containing the converted INDRA Statements.

Return type:

TkgProcessor

indra.sources.tkg.api.process_pmc(pmc_id, output_base_path, **kwargs)[source]

Run live BEL extraction using textToKnowledgeGraph, if installed.

Parameters:
  • pmc_id (str) – PMCID such as ‘PMC3898398’.

  • kwargs – Additional keyword arguments passed to textToKnowledgeGraph.main().

Returns:

Processor containing INDRA Statements derived from live BEL output.

Return type:

TkgProcessor

Raises:
  • ImportError – If textToKnowledgeGraph is not installed.

  • ValueError – If the returned data structure is unexpected.

textToKnowledgeGraph Processor (indra.sources.tkg.processor)

class indra.sources.tkg.processor.TkgProcessor(results)[source]

Processor extracting INDRA Statements from textToKnowledgeGraph output.

After parsing BEL to INDRA Statements via PyBEL, this processor attaches metadata (confidence, text, pmid, pmcid, etc.) to Evidence objects.

Parameters:

results (Dict) – The textToKnowledgeGraph output data structure to be processed.

statements

A list of INDRA Statements extracted from the results.

Type:

List[indra.statements.Statement]

extract_statements()[source]

Run the BEL-to-INDRA pipeline for all entries in llm_results.