HTML browsing and curation (indra.assemblers.html.assembler)

Format a set of INDRA Statements into an HTML-formatted report which also supports curation.

class indra.assemblers.html.assembler.HtmlAssembler(statements=None, summary_metadata=None, ev_counts=None, beliefs=None, source_counts=None, curation_dict=None, title='INDRA Results', db_rest_url=None, sort_by='default', custom_stats=None)[source]

Generates an HTML-formatted report from INDRA Statements.

The HTML report format includes statements formatted in English (by the EnglishAssembler), text and metadata for the Evidence object associated with each Statement, and a JavaScript-based curation interface linked to the INDRA database (access permitting). The interface allows for curation of statements at the evidence level by letting the user specify the type of error and (optionally) provide a short description of the error.

Parameters
  • statements (Optional[list[indra.statements.Statement]]) – A list of INDRA Statements to be added to the assembler. Statements can also be added using the add_statements method after the assembler has been instantiated.

  • summary_metadata (Optional[dict]) – Dictionary of statement corpus metadata such as that provided by the INDRA REST API. Default is None. Each value should be a concise, constant-size summary (such as the evidence totals), not something that grows with the length of the statement list. The keys should be informative human-readable strings. This information is displayed as a tooltip when hovering over the page title.

  • ev_counts (Optional[dict]) – A dictionary of the total evidence available for each statement indexed by hash. If not provided, the statements that are passed to the constructor are used to determine these, with whatever evidences these statements carry.

  • beliefs (Optional[dict]) – A dictionary of the belief of each statement indexed by hash. If not provided, the beliefs of the statements passed to the constructor are used.

  • source_counts (Optional[dict]) – A dictionary of the itemized evidence counts, by source, available for each statement, indexed by hash. If not provided, the statements that are passed to the constructor are used to determine these, with whatever evidences these statements carry.

  • title (str) – The title to be printed at the top of the page.

  • db_rest_url (Optional[str]) – The URL to a DB REST API to use for links out to further evidence. If given, this URL will be prepended to links that load additional evidence for a given Statement. One way to obtain this value is from the configuration entry indra.config.get_config('INDRA_DB_REST_URL'). If None, the URLs are constructed as relative links. Default: None

  • sort_by (str or function or None) –

    If str, it indicates which parameter to sort by, such as ‘belief’, ‘ev_count’, or ‘ag_count’. These are the default options because they can be derived from a list of statements; however, if you supply a custom list of stats with the custom_stats argument, you may use any of the parameters used to build it. The default, ‘default’, is mostly a sort by ev_count but also favors statements with fewer agents.

    Alternatively, you may give a function that takes a dict as its single argument, a dictionary of metrics. The contents of this dictionary always include “belief”, “ev_count”, and “ag_count”. If source_counts are given, each source will also be available as an entry (e.g. “reach” and “sparser”). As with string values, you may also add your own custom stats using the custom_stats argument.

    The value may also be None, in which case the sort function will return the same value for all elements, and thus the original order of elements will be preserved. This could have strange effects when statements are grouped (i.e. when grouping_level is not ‘statement’); such functionality is untested.

  • custom_stats (Optional[list]) – A list of StmtStat objects containing custom statement statistics to be used in sorting of statements and statement groups.
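
The function form of sort_by described above can be sketched as follows. This is a minimal illustration, not INDRA code: the function name score_by_belief_and_evidence, the sample metric dicts, and the use of a single numeric return value are all assumptions, and the demo sorts plain dicts with Python's sorted rather than calling the assembler.

```python
def score_by_belief_and_evidence(metrics):
    """Hypothetical sort function: takes a dict of metrics, returns a number.

    Only the keys that the documentation says are always present
    ("belief", "ev_count", "ag_count") are used here.
    """
    return metrics['belief'] * metrics['ev_count'] / metrics['ag_count']


# Hand-made metric dicts standing in for what the assembler would supply.
rows = [
    {'belief': 0.99, 'ev_count': 3, 'ag_count': 3},
    {'belief': 0.65, 'ev_count': 12, 'ag_count': 2},
    {'belief': 0.99, 'ev_count': 3, 'ag_count': 2},
]

# Demo ordering only: highest score first (an assumption of this sketch).
ranked = sorted(rows, key=score_by_belief_and_evidence, reverse=True)

# With INDRA installed, such a function would be passed as
# HtmlAssembler(statements, sort_by=score_by_belief_and_evidence).
```

Returning a single number keeps the sketch independent of how the assembler compares keys internally, which is not specified here.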

statements

A list of INDRA Statements to assemble.

Type

list[indra.statements.Statement]

model

The HTML report formatted as a single string.

Type

str

metadata

Dictionary of statement list metadata such as that provided by the INDRA REST API.

Type

dict

ev_counts

A dictionary of the total evidence available for each statement indexed by hash.

Type

dict

beliefs

A dictionary of the belief score of each statement, indexed by hash.

Type

dict

db_rest_url

The URL to a DB REST API.

Type

str

add_statements(statements)[source]

Add a list of Statements to the assembler.

Parameters

statements (list[indra.statements.Statement]) – A list of INDRA Statements to be added to the assembler.

append_warning(msg)[source]

Append a warning message to the model to expose issues.

make_json_model(grouping_level='agent-pair', no_redundancy=False, **kwargs)[source]

Return the JSON used to create the HTML display.

Parameters
  • grouping_level (Optional[str]) – Statements can be grouped at three levels, ‘statement’ (ungrouped), ‘relation’ (grouped by agents and type), and ‘agent-pair’ (grouped by ordered pairs of agents). Default: ‘agent-pair’.

  • no_redundancy (Optional[bool]) – If True, any group of statements that was already presented under a previous heading will be skipped. This is typically the case for complexes where different permutations of complex members are presented. By setting this argument to True, these can be eliminated. Default: False

Returns

json – A nested JSON dict containing grouped statements and various metadata.

Return type

dict
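
To make the three grouping levels concrete, the sketch below groups simplified records by hand. It is only an illustration of the descriptions above, under stated assumptions: the (statement type, agent names) tuples, the function group_records, and the flat dict it returns are hypothetical and do not reflect the structure of the JSON that make_json_model actually produces.

```python
from collections import defaultdict


def group_records(records, grouping_level='agent-pair'):
    """Group (stmt_type, agents) tuples at one of the three documented levels."""
    groups = defaultdict(list)
    for idx, (stmt_type, agents) in enumerate(records):
        if grouping_level == 'statement':
            key = idx                    # ungrouped: one group per record
        elif grouping_level == 'relation':
            key = (agents, stmt_type)    # same agents and same type
        elif grouping_level == 'agent-pair':
            key = agents                 # same ordered tuple of agents
        else:
            raise ValueError('Unknown grouping level: %s' % grouping_level)
        groups[key].append((stmt_type, agents))
    return dict(groups)


records = [
    ('Phosphorylation', ('MAP2K1', 'MAPK1')),
    ('Activation', ('MAP2K1', 'MAPK1')),
    ('Phosphorylation', ('MAP2K1', 'MAPK1')),
    ('Activation', ('MAPK1', 'MAP2K1')),  # reversed order: a different pair
]

by_statement = group_records(records, 'statement')
by_relation = group_records(records, 'relation')
by_agent_pair = group_records(records, 'agent-pair')
```

In this toy data the same four records fall into four, three, and two groups respectively, since each level merges more records than the one before it.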

make_model(template=None, grouping_level='agent-pair', add_full_text_search_link=False, no_redundancy=False, **template_kwargs)[source]

Return the assembled HTML content as a string.

Parameters
  • template (a Template object) – Manually pass a Jinja template to be used in generating the HTML. The template is essentially responsible for rendering the output of make_json_model.

  • grouping_level (Optional[str]) – Statements can be grouped under sub-headings at three levels, ‘statement’ (ungrouped), ‘relation’ (grouped by agents and type), and ‘agent-pair’ (grouped by ordered pairs of agents). Default: ‘agent-pair’.

  • add_full_text_search_link (bool) – If True, a link to a text fragment search in the PMC journal is added for the statements.

  • no_redundancy (Optional[bool]) – If True, any group of statements that was already presented under a previous heading will be skipped. This is typically the case for complexes where different permutations of complex members are presented. By setting this argument to True, these can be eliminated. Default: False

  • template_kwargs – All other keyword arguments are passed along to the template. If you are using a custom template that takes arguments beyond those listed here, this is how you pass them.

Returns

The assembled HTML as a string.

Return type

str

save_model(fname, **kwargs)[source]

Save the assembled HTML into a file.

Other kwargs are passed directly to make_model.

Parameters

fname (str) – The path to the file to save the HTML into.
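
A typical end-to-end use of the class might look like the sketch below. It uses only the constructor arguments and methods documented in this section; the wrapper function write_report, its default file name, and the chosen argument values are illustrative assumptions. The INDRA import is placed inside the function so that the snippet can be loaded without INDRA installed.

```python
def write_report(statements, path='indra_report.html'):
    """Assemble the given INDRA Statements into an HTML file at `path`."""
    # Imported lazily; calling this function requires INDRA to be installed.
    from indra.assemblers.html.assembler import HtmlAssembler

    assembler = HtmlAssembler(statements, title='Example report',
                              sort_by='belief')
    # Keyword arguments other than fname are passed on to make_model.
    assembler.save_model(path, grouping_level='relation', no_redundancy=True)
    return path
```

Here statements is expected to be a list of indra.statements.Statement objects obtained elsewhere, for example from a reader or the INDRA REST API.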

indra.assemblers.html.assembler.tag_text(text, tag_info_list)[source]

Apply start/end tags to spans of the given text.

Parameters
  • text (str) – Text to be tagged

  • tag_info_list (list of tuples) – Each tuple refers to a span of the given text. Fields are (start_ix, end_ix, substring, start_tag, close_tag), where substring, start_tag, and close_tag are strings. If any of the given spans of text overlap, the longest span is used.

Returns

String where the specified substrings have been surrounded by the given start and close tags.

Return type

str
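
The behavior described for tag_text can be approximated in a few lines of plain Python. The function below, tag_spans, is a hypothetical re-implementation written for illustration and is not the library's code; in particular, it assumes that end_ix is exclusive and that when spans overlap, only the longest one is kept and the others are dropped.

```python
def tag_spans(text, tag_info_list):
    """Wrap spans of `text` in tags, preferring the longest of overlapping spans."""
    # Consider longer spans first so they win any overlap.
    by_length = sorted(tag_info_list, key=lambda t: t[1] - t[0], reverse=True)
    kept = []
    for span in by_length:
        start, end = span[0], span[1]
        # Keep the span only if it does not overlap anything already kept.
        if all(end <= k[0] or start >= k[1] for k in kept):
            kept.append(span)
    kept.sort(key=lambda t: t[0])

    pieces = []
    pos = 0
    for start, end, substring, start_tag, close_tag in kept:
        pieces.append(text[pos:start])                 # untagged text before span
        pieces.append(start_tag + substring + close_tag)
        pos = end
    pieces.append(text[pos:])                          # remaining untagged text
    return ''.join(pieces)


sentence = 'MEK phosphorylates ERK'
spans = [
    (0, 3, 'MEK', '<b>', '</b>'),
    (0, 18, 'MEK phosphorylates', '<u>', '</u>'),  # overlaps the span above
    (19, 22, 'ERK', '<i>', '</i>'),
]
tagged = tag_spans(sentence, spans)
# tagged == '<u>MEK phosphorylates</u> <i>ERK</i>'
```

Because the first two spans overlap, only the longer one is tagged, which matches the overlap rule stated for tag_info_list.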