INDRA Statements (indra.statements)
General information and statement types
Statements represent mechanistic relationships between biological agents.
Statement classes follow an inheritance hierarchy, with all Statement types
inheriting from the parent class Statement. At the next level in the
hierarchy are the following classes:
Open Domain
Biological Domain
There are several types of Statements representing post-translational
modifications that further inherit from Modification (e.g., Phosphorylation,
Dephosphorylation, and Acetylation).
There are additional subtypes of SelfModification (e.g., Autophosphorylation).
Interactions between proteins are often described simply in terms of their
effect on a protein’s “activity”, e.g., “Active MEK activates ERK”, or “DUSP6
inactivates ERK”. These types of relationships are indicated by the
RegulateActivity abstract base class, which has subtypes Activation and
Inhibition, while the RegulateAmount abstract base class has subtypes
IncreaseAmount and DecreaseAmount.
Statements involve one or more Concepts, which, depending on the
semantics of the Statement, are typically biological Agents,
such as proteins, represented by the class Agent. (However,
Influence statements involve two or more Event objects,
each of which takes a Concept as an argument.)
Agents can have several types of context specified on them including
- a specific post-translational modification state (indicated by one or more instances of ModCondition),
- other bound Agents (BoundCondition),
- mutations (MutCondition),
- an activity state (ActivityCondition), and
- cellular location
The active form of an agent (in terms of its post-translational modifications
or bound state) is indicated by an instance of the class ActiveForm.
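The inheritance relationships described in this section can be pictured with a small sketch. The classes below are empty stand-ins that mirror only the class names and base classes listed in this reference; they are not the real `indra.statements` classes, and the `lineage` helper is a hypothetical name introduced for illustration.

```python
# Stand-in classes mirroring only the inheritance relationships named in this
# reference. They carry no behavior and are NOT the real INDRA classes.
class Statement:
    pass

class Modification(Statement):
    pass

class AddModification(Modification):
    pass

class RemoveModification(Modification):
    pass

class Phosphorylation(AddModification):
    pass

class Dephosphorylation(RemoveModification):
    pass

class SelfModification(Statement):
    pass

class Autophosphorylation(SelfModification):
    pass

class RegulateActivity(Statement):
    pass

class Activation(RegulateActivity):
    pass

class Inhibition(RegulateActivity):
    pass

class RegulateAmount(Statement):
    pass

class IncreaseAmount(RegulateAmount):
    pass

class DecreaseAmount(RegulateAmount):
    pass


def lineage(cls):
    """Return class names from cls up to Statement, most specific first."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(lineage(Phosphorylation))
# ['Phosphorylation', 'AddModification', 'Modification', 'Statement']

# Because of the hierarchy, one isinstance check selects a whole family.
stmts = [Phosphorylation(), Activation(), Dephosphorylation()]
modifications = [s for s in stmts if isinstance(s, Modification)]
print(len(modifications))  # 2
```

A practical consequence of this design is that code handling a general category (for example, any Modification) does not need to enumerate each concrete subtype.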
Grounding and DB references
Agents also carry grounding information which links them to database entries. These database references are represented as a dictionary in the db_refs attribute of each Agent. The dictionary can have multiple entries. For instance, INDRA’s input Processors produce genes and proteins that carry both UniProt and HGNC IDs in db_refs, whenever possible. FamPlex provides a name space for protein families that are typically used in the literature. More information about FamPlex can be found here: https://github.com/sorgerlab/famplex
In general, the capitalized version of any identifiers.org name space (see https://registry.identifiers.org/ for full list) can be used in db_refs with a few cases where INDRA’s internal db_refs name space is different from the identifiers.org name space (e.g., UP vs uniprot). These special cases can be programmatically mapped between INDRA and identifiers.org using the identifiers_mappings and identifiers_reverse dictionaries in the indra.databases.identifiers module.
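The name space convention above can be sketched as follows. This is a minimal illustration and not INDRA's implementation: the two dictionaries here are tiny hypothetical stand-ins containing only the one special case named in the text (UP vs uniprot), whereas the real tables are the identifiers_mappings and identifiers_reverse dictionaries in indra.databases.identifiers. The function names are also assumptions made for this example.

```python
# Illustrative subset only: the text names UP vs. uniprot as one special case.
# The real mapping tables live in indra.databases.identifiers.
INDRA_TO_IDENTIFIERS = {"UP": "uniprot"}
IDENTIFIERS_TO_INDRA = {v: k for k, v in INDRA_TO_IDENTIFIERS.items()}


def to_identifiers_ns(indra_ns):
    """Map an INDRA db_refs name space to an identifiers.org name space.

    Special cases come from the lookup table; otherwise the INDRA name space
    is assumed to be the capitalized identifiers.org name space.
    """
    return INDRA_TO_IDENTIFIERS.get(indra_ns, indra_ns.lower())


def to_indra_ns(identifiers_ns):
    """Map an identifiers.org name space back to an INDRA db_refs name space."""
    return IDENTIFIERS_TO_INDRA.get(identifiers_ns, identifiers_ns.upper())


db_refs = {"HGNC": "11998", "UP": "P04637"}
converted = {to_identifiers_ns(ns): db_id for ns, db_id in db_refs.items()}
print(converted)  # {'hgnc': '11998', 'uniprot': 'P04637'}
```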
Examples of the most commonly encountered db_refs name spaces and IDs are listed below.
Type | Database | Example |
---|---|---|
Gene/Protein | HGNC | {'HGNC': '11998'} |
Gene/Protein | UniProt | {'UP': 'P04637'} |
Protein chain | UniProt | {'UPPRO': 'PRO_0000435839'} |
Gene/Protein | Entrez | {'EGID': '5583'} |
Gene/Protein family | FamPlex | {'FPLX': 'ERK'} |
Gene/Protein family | InterPro | {'IP': 'IPR000308'} |
Gene/Protein family | Pfam | {'PF': 'PF00071'} |
Gene/Protein family | NextProt family | {'NXPFA': '03114'} |
Chemical | ChEBI | {'CHEBI': 'CHEBI:63637'} |
Chemical | PubChem | {'PUBCHEM': '42611257'} |
Chemical | LINCS | {'LINCS': '42611257'} |
Metabolite | HMDB | {'HMDB': 'HMDB00122'} |
Process, location, etc. | GO | {'GO': 'GO:0006915'} |
Process, disease, etc. | MeSH | {'MESH': 'D008113'} |
Disease | Disease Ontology | {'DOID': 'DOID:8659'} |
Phenotypic abnormality | Human Pheno. Ont. | {'HP': 'HP:0031296'} |
Experimental factors | Exp. Factor Ont. | {'EFO': '0007820'} |
General terms | NCIT | {'NCIT': 'C28597'} |
Raw text | TEXT | {'TEXT': 'Nf-kappaB'} |
The evidence for a given Statement, which could include relevant citations,
database identifiers, and passages of text from the scientific literature, is
contained in one or more Evidence objects associated with the Statement.
JSON serialization of INDRA Statements
Statements can be serialized into JSON and deserialized from JSON to allow their exchange in a platform-independent way. We also provide a JSON schema (see http://json-schema.org to learn about schemas) in https://raw.githubusercontent.com/sorgerlab/indra/master/indra/resources/statements_schema.json which can be used to validate INDRA Statements JSONs.
Some validation tools include:
- jsonschema
A Python package to validate JSON content with respect to a schema.
- ajv-cli
Available at https://www.npmjs.com/package/ajv-cli. Install with “npm install -g ajv-cli” and then validate with: ajv -s statements_schema.json -d file_to_validate.json. This tool provides more sophisticated and more readily interpretable output than jsonschema.
- Web-based tools
There are a variety of web-based tools for validation with JSON schemas, including https://www.jsonschemavalidator.net
- class indra.statements.statements.Acetylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Acetylation modification.
- class indra.statements.statements.Activation(subj, obj, obj_activity='activity', evidence=None)[source]¶
Bases:
RegulateActivity
Indicates that a protein activates another protein.
This statement is intended to be used for physical interactions where the mechanism of activation is not explicitly specified, which is often the case for descriptions of mechanisms extracted from the literature.
- Parameters
subj (Agent) – The agent responsible for the change in activity, i.e., the “upstream” node.
obj (Agent) – The agent whose activity is influenced by the subject, i.e., the “downstream” node.
obj_activity (Optional[str]) – The activity of the obj Agent that is affected, e.g., its “kinase” activity.
evidence (None or Evidence or list of Evidence) – Evidence objects in support of the modification.
Examples
MEK (MAP2K1) activates the kinase activity of ERK (MAPK1):
>>> mek = Agent('MAP2K1')
>>> erk = Agent('MAPK1')
>>> act = Activation(mek, erk, 'kinase')
- class indra.statements.statements.ActiveForm(agent, activity, is_active, evidence=None)[source]¶
Bases:
Statement
Specifies conditions causing an Agent to be active or inactive.
Types of conditions influencing a specific type of biochemical activity can include modifications, bound Agents, and mutations.
- Parameters
agent (Agent) – The Agent in a particular active or inactive state. The sets of ModConditions, BoundConditions, and MutConditions on the given Agent instance indicate the relevant conditions.
activity (str) – The type of activity influenced by the given set of conditions, e.g., “kinase”.
is_active (bool) – Whether the conditions are activating (True) or inactivating (False).
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.ActivityCondition(activity_type, is_active)[source]¶
Bases:
object
An active or inactive state of a protein.
Examples
Kinase-active MAP2K1:
>>> mek_active = Agent('MAP2K1',
...                    activity=ActivityCondition('kinase', True))
Transcriptionally inactive FOXO3:
>>> foxo_inactive = Agent('FOXO3',
...                       activity=ActivityCondition('transcription', False))
- Parameters
activity_type (str) – The type of activity, e.g. ‘kinase’. The basic, unspecified molecular activity is represented as ‘activity’. Examples of other activity types are ‘kinase’, ‘phosphatase’, ‘catalytic’, ‘transcription’, etc.
is_active (bool) – Specifies whether the given activity type is present or absent.
- class indra.statements.statements.AddModification(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
Modification
- class indra.statements.statements.Agent(name, mods=None, activity=None, bound_conditions=None, mutations=None, location=None, db_refs=None)[source]¶
Bases:
Concept
A molecular entity, e.g., a protein.
- Parameters
name (str) – The name of the agent, preferably a canonicalized name such as an HGNC gene name.
mods (list of ModCondition) – Modification state of the agent.
bound_conditions (list of BoundCondition) – Other agents bound to the agent in this context.
mutations (list of MutCondition) – Amino acid mutations of the agent.
activity (ActivityCondition) – Activity of the agent.
location (str) – Cellular location of the agent. Must be a valid name (e.g. “nucleus”) or identifier (e.g. “GO:0005634”) for a GO cellular compartment.
db_refs (dict) – Dictionary of database identifiers associated with this agent.
- entity_matches_key()[source]¶
Return a key that identifies the Agent itself, independent of its state.
The key is based on the preferred grounding for the Agent, or if not available, the name of the Agent is used.
- Returns
The key used to identify the Agent.
- Return type
- get_grounding(ns_order=None)[source]¶
Return a tuple of a preferred grounding namespace and ID.
- Returns
A tuple whose first element is a grounding namespace (HGNC, CHEBI, etc.) and the second element is an identifier in the namespace. If no preferred grounding is available, a tuple of Nones is returned.
- Return type
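The behavior described for get_grounding and entity_matches_key can be sketched with plain dictionaries. This is a hedged illustration and not INDRA's code: the default name space order, the key format, and the function names `preferred_grounding` and `entity_key` are all assumptions made for this example.

```python
# Hypothetical default priority order; INDRA's actual default may differ.
DEFAULT_NS_ORDER = ["FPLX", "HGNC", "UP", "CHEBI", "GO", "MESH"]


def preferred_grounding(db_refs, ns_order=None):
    """Return (namespace, identifier) for the first name space in ns_order
    that is present in db_refs, or (None, None) if none is present."""
    if ns_order is None:
        ns_order = DEFAULT_NS_ORDER
    for ns in ns_order:
        db_id = db_refs.get(ns)
        if db_id:
            return ns, db_id
    return None, None


def entity_key(name, db_refs):
    """Build a key from the preferred grounding, falling back to the name.

    The 'NS:ID' key format is an assumption made for this sketch.
    """
    ns, db_id = preferred_grounding(db_refs)
    if ns is None:
        return name
    return f"{ns}:{db_id}"


print(preferred_grounding({"UP": "P04637", "HGNC": "11998"}))  # ('HGNC', '11998')
print(preferred_grounding({"TEXT": "Nf-kappaB"}))              # (None, None)
print(entity_key("NFkappaB", {"TEXT": "Nf-kappaB"}))           # NFkappaB
```

Passing a custom ns_order lets a caller prefer, say, UniProt over HGNC without changing the stored db_refs.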
- class indra.statements.statements.Association(members, evidence=None)[source]¶
Bases:
Complex
A set of events associated with each other without causal relationship.
- Parameters
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Autophosphorylation(enz, residue=None, position=None, evidence=None)[source]¶
Bases:
SelfModification
Intramolecular autophosphorylation, i.e., in cis.
Examples
p38 bound to TAB1 cis-autophosphorylates itself (see PMID:19155529).
>>> tab1 = Agent('TAB1')
>>> p38_tab1 = Agent('P38', bound_conditions=[BoundCondition(tab1)])
>>> autophos = Autophosphorylation(p38_tab1)
- class indra.statements.statements.BioContext(location=None, cell_line=None, cell_type=None, organ=None, disease=None, species=None)[source]¶
Bases:
Context
An object representing the context of a Statement in biology.
- Parameters
location (Optional[RefContext]) – Cellular location, typically a sub-cellular compartment.
cell_line (Optional[RefContext]) – Cell line context, e.g., a specific cell line, like BT20.
cell_type (Optional[RefContext]) – Cell type context, broader than a cell line, like macrophage.
organ (Optional[RefContext]) – Organ context.
disease (Optional[RefContext]) – Disease context.
species (Optional[RefContext]) – Species context.
- class indra.statements.statements.BoundCondition(agent, is_bound=True)[source]¶
Bases:
object
Identify Agents bound (or not bound) to a given Agent in a given context.
- Parameters
Examples
EGFR bound to EGF:
>>> egf = Agent('EGF')
>>> egfr = Agent('EGFR', bound_conditions=[BoundCondition(egf)])
BRAF not bound to a 14-3-3 protein (YWHAB):
>>> ywhab = Agent('YWHAB')
>>> braf = Agent('BRAF', bound_conditions=[BoundCondition(ywhab, False)])
- class indra.statements.statements.Complex(members, evidence=None)[source]¶
Bases:
Statement
A set of proteins observed to be in a complex.
- Parameters
members (list of
Agent
) – The set of proteins in the complex.
Examples
BRAF is observed to be in a complex with RAF1:
>>> braf = Agent('BRAF')
>>> raf1 = Agent('RAF1')
>>> cplx = Complex([braf, raf1])
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Concept(name, db_refs=None)[source]¶
Bases:
object
A concept/entity of interest that is the argument of a Statement
- class indra.statements.statements.Conversion(subj, obj_from=None, obj_to=None, evidence=None)[source]¶
Bases:
Statement
Conversion of molecular species mediated by a controller protein.
- Parameters
subj (indra.statements.Agent) – The protein mediating the conversion.
obj_from (list of indra.statements.Agent) – The list of molecular species being consumed by the conversion.
obj_to (list of indra.statements.Agent) – The list of molecular species being created by the conversion.
evidence (None or Evidence or list of Evidence) – Evidence objects in support of the conversion statement.
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Deacetylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Deacetylation modification.
- class indra.statements.statements.DecreaseAmount(subj, obj, evidence=None)[source]¶
Bases:
RegulateAmount
Degradation of a protein, possibly mediated by another protein.
Note that this statement can also be used to represent inhibitors of synthesis (e.g., cycloheximide).
- class indra.statements.statements.Defarnesylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Defarnesylation modification.
- class indra.statements.statements.Degeranylgeranylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Degeranylgeranylation modification.
- class indra.statements.statements.Deglycosylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Deglycosylation modification.
- class indra.statements.statements.Dehydroxylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Dehydroxylation modification.
- class indra.statements.statements.Demethylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Demethylation modification.
- class indra.statements.statements.Demyristoylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Demyristoylation modification.
- class indra.statements.statements.Depalmitoylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Depalmitoylation modification.
- class indra.statements.statements.Dephosphorylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Dephosphorylation modification.
Examples
DUSP6 dephosphorylates ERK (MAPK1) at T185:
>>> dusp6 = Agent('DUSP6')
>>> erk = Agent('MAPK1')
>>> dephos = Dephosphorylation(dusp6, erk, 'T', '185')
- class indra.statements.statements.Deribosylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Deribosylation modification.
- class indra.statements.statements.Desumoylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Desumoylation modification.
- class indra.statements.statements.Deubiquitination(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
RemoveModification
Deubiquitination modification.
- class indra.statements.statements.Event(concept, delta=None, context=None, evidence=None, supports=None, supported_by=None)[source]¶
Bases:
Statement
An event representing the change of a Concept.
- concept¶
The concept over which the event is defined.
- delta¶
Represents a change in the concept, with a polarity and an adjectives entry.
- Type
indra.statements.delta.Delta
- context¶
The context associated with the event.
- to_json(with_evidence=True, use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Evidence(source_api=None, source_id=None, pmid=None, text=None, annotations=None, epistemics=None, context=None, text_refs=None)[source]¶
Bases:
object
Container for evidence supporting a given statement.
- Parameters
source_api (str or None) – String identifying the INDRA API used to capture the statement, e.g., ‘trips’, ‘biopax’, ‘bel’.
source_id (str or None) – For statements drawn from databases, ID of the database entity corresponding to the statement.
pmid (str or None) – String indicating the Pubmed ID of the source of the statement.
text (str) – Natural language text supporting the statement.
annotations (dict) – Dictionary containing additional information on the context of the statement, e.g., species, cell line, tissue type, etc. The entries may vary depending on the source of the information.
epistemics (dict) – A dictionary describing various forms of epistemic certainty associated with the statement.
context (Context or None) – A context object
text_refs (dict) – A dictionary of various reference ids to the source text, e.g. DOI, PMID, URL, etc.
There are some attributes which are not set by the parameters above:
- source_hash (int)
A hash calculated from the evidence text, source api, and pmid and/or source_id if available. This is generated automatically when the object is instantiated.
- stmt_tag (int)
This is a hash calculated by a Statement to which this evidence refers, and is set by said Statement. It is useful for tracing ownership of an Evidence object.
- class indra.statements.statements.Farnesylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Farnesylation modification.
- class indra.statements.statements.Gap(gap, ras, evidence=None)[source]¶
Bases:
Statement
Acceleration of a GTPase protein’s GTP hydrolysis rate by a GAP.
Represents the generic process by which a GTPase activating protein (GAP) catalyzes GTP hydrolysis by a particular small GTPase protein.
Examples
RASA1 catalyzes GTP hydrolysis on KRAS:
>>> rasa1 = Agent('RASA1')
>>> kras = Agent('KRAS')
>>> gap = Gap(rasa1, kras)
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Gef(gef, ras, evidence=None)[source]¶
Bases:
Statement
Exchange of GTP for GDP on a small GTPase protein mediated by a GEF.
Represents the generic process by which a guanine nucleotide exchange factor (GEF) catalyzes nucleotide exchange on a GTPase protein.
Examples
SOS1 catalyzes nucleotide exchange on KRAS:
>>> sos = Agent('SOS1')
>>> kras = Agent('KRAS')
>>> gef = Gef(sos, kras)
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Geranylgeranylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Geranylgeranylation modification.
- class indra.statements.statements.Glycosylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Glycosylation modification.
- class indra.statements.statements.GtpActivation(subj, obj, obj_activity='activity', evidence=None)[source]¶
Bases:
Activation
- class indra.statements.statements.HasActivity(agent, activity, has_activity, evidence=None)[source]¶
Bases:
Statement
States that an Agent has or doesn’t have a given activity type.
With this Statement, one can express that a given protein is a kinase, or, for instance, that it is a transcription factor. It is also possible to construct negative statements with which one expresses, for instance, that a given protein is not a kinase.
- class indra.statements.statements.Hydroxylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Hydroxylation modification.
- class indra.statements.statements.IncreaseAmount(subj, obj, evidence=None)[source]¶
Bases:
RegulateAmount
Synthesis of a protein, possibly mediated by another protein.
- class indra.statements.statements.Influence(subj, obj, evidence=None)[source]¶
Bases:
Statement
An influence on the quantity of a concept of interest.
- Parameters
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Inhibition(subj, obj, obj_activity='activity', evidence=None)[source]¶
Bases:
RegulateActivity
Indicates that a protein inhibits or deactivates another protein.
This statement is intended to be used for physical interactions where the mechanism of inhibition is not explicitly specified, which is often the case for descriptions of mechanisms extracted from the literature.
- Parameters
subj (Agent) – The agent responsible for the change in activity, i.e., the “upstream” node.
obj (Agent) – The agent whose activity is influenced by the subject, i.e., the “downstream” node.
obj_activity (Optional[str]) – The activity of the obj Agent that is affected, e.g., its “kinase” activity.
evidence (None or Evidence or list of Evidence) – Evidence objects in support of the modification.
- exception indra.statements.statements.InvalidLocationError(name)[source]¶
Bases:
ValueError
Invalid cellular component name.
- exception indra.statements.statements.InvalidResidueError(name)[source]¶
Bases:
ValueError
Invalid residue (amino acid) name.
- class indra.statements.statements.Methylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Methylation modification.
- class indra.statements.statements.Migration(concept, delta=None, context=None, evidence=None, supports=None, supported_by=None)[source]¶
Bases:
Event
A special class of Event representing Migration.
- class indra.statements.statements.ModCondition(mod_type, residue=None, position=None, is_modified=True)[source]¶
Bases:
object
Post-translational modification state at an amino acid position.
- Parameters
mod_type (str) – The type of post-translational modification, e.g., ‘phosphorylation’. Valid modification types currently include: ‘phosphorylation’, ‘ubiquitination’, ‘sumoylation’, ‘hydroxylation’, and ‘acetylation’. If an invalid modification type is passed an InvalidModTypeError is raised.
residue (str or None) – String indicating the modified amino acid, e.g., ‘Y’ or ‘tyrosine’. If None, indicates that the residue at the modification site is unknown or unspecified.
position (str or None) – String indicating the position of the modified amino acid, e.g., ‘202’. If None, indicates that the position is unknown or unspecified.
is_modified (bool) – Specifies whether the modification is present or absent. Setting the flag to False specifies that the Agent with the ModCondition is unmodified at the site.
Examples
Doubly-phosphorylated MEK (MAP2K1):
>>> phospho_mek = Agent('MAP2K1', mods=[
...     ModCondition('phosphorylation', 'S', '202'),
...     ModCondition('phosphorylation', 'S', '204')])
ERK (MAPK1) unphosphorylated at tyrosine 187:
>>> unphos_erk = Agent('MAPK1', mods=[
...     ModCondition('phosphorylation', 'Y', '187', is_modified=False)])
- class indra.statements.statements.Modification(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
Statement
Generic statement representing the modification of a protein.
- Parameters
enz (indra.statements.Agent) – The enzyme involved in the modification.
sub (indra.statements.Agent) – The substrate of the modification.
residue (str or None) – The amino acid residue being modified, or None if it is unknown or unspecified.
position (str or None) – The position of the modified amino acid, or None if it is unknown or unspecified.
evidence (None or Evidence or list of Evidence) – Evidence objects in support of the modification.
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.MovementContext(locations=None, time=None)[source]¶
Bases:
Context
An object representing the context of a movement between start and end points in time.
- Parameters
locations (Optional[list[dict]]) – A list of dictionaries each containing a RefContext object representing geographical location context and its role (e.g. ‘origin’, ‘destination’, etc.)
time (Optional[TimeContext]) – A TimeContext object representing the temporal context of the Statement.
- class indra.statements.statements.MutCondition(position, residue_from, residue_to=None)[source]¶
Bases:
object
Mutation state of an amino acid position of an Agent.
- Parameters
Examples
Represent EGFR with a L858R mutation:
>>> egfr_mutant = Agent('EGFR', mutations=[MutCondition('858', 'L', 'R')])
- class indra.statements.statements.Myristoylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Myristoylation modification.
- class indra.statements.statements.Palmitoylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Palmitoylation modification.
- class indra.statements.statements.Phosphorylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Phosphorylation modification.
Examples
MEK (MAP2K1) phosphorylates ERK (MAPK1) at threonine 185:
>>> mek = Agent('MAP2K1')
>>> erk = Agent('MAPK1')
>>> phos = Phosphorylation(mek, erk, 'T', '185')
- class indra.statements.statements.QualitativeDelta(polarity=None, adjectives=None)[source]¶
Bases:
Delta
Qualitative delta defining an Event.
- class indra.statements.statements.QuantitativeState(entity=None, value=None, unit=None, modifier=None, text=None, polarity=None)[source]¶
Bases:
Delta
An object representing numerical value of something.
- Parameters
entity (str) – An entity to capture the quantity of.
unit (str) – Measurement unit of value (e.g. absolute, daily, percentage, etc.)
modifier (str) – Modifier to value (e.g. more than, at least, approximately, etc.)
text (str) – Natural language text describing quantitative state.
polarity (1, -1 or None) – Polarity of an Event.
- static convert_unit(source_unit, target_unit, source_value, source_period=None, target_period=None)[source]¶
Convert a value per unit from the source unit to the target unit. If a unit is absolute, the total timedelta period has to be provided. If a unit is a month or a year, it is recommended to pass a timedelta period object directly; if none is provided, an approximation is used.
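The idea behind this kind of conversion can be sketched as rescaling a value by the ratio of two time periods. The sketch below is not the convert_unit implementation: the function name `convert_rate`, the unit names other than “absolute” and “daily”, and the approximate period lengths (30 days per month, 365 days per year) are assumptions made for illustration.

```python
from datetime import timedelta

# Approximate period lengths used only when no explicit timedelta is passed.
# These values are assumptions made for this sketch.
APPROX_PERIODS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
    "yearly": timedelta(days=365),
}


def _period(unit, explicit):
    """Pick the explicit period if given, else an approximate one."""
    if explicit is not None:
        return explicit
    if unit in APPROX_PERIODS:
        return APPROX_PERIODS[unit]
    # An "absolute" total has no inherent period, so one must be supplied.
    raise ValueError(f"unit {unit!r} needs an explicit timedelta period")


def convert_rate(value, source_unit, target_unit,
                 source_period=None, target_period=None):
    """Rescale value from 'per source period' to 'per target period'."""
    src = _period(source_unit, source_period)
    tgt = _period(target_unit, target_period)
    return value * tgt.total_seconds() / src.total_seconds()


# 70 per week corresponds to 10 per day.
print(convert_rate(70, "weekly", "daily"))  # 10.0

# An absolute total of 300 over a 30-day period corresponds to 10 per day.
print(convert_rate(300, "absolute", "daily",
                   source_period=timedelta(days=30)))  # 10.0
```

Passing an exact timedelta for months or years avoids the error introduced by the fixed 30-day and 365-day approximations.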
- class indra.statements.statements.RefContext(name=None, db_refs=None)[source]¶
Bases:
object
An object representing a context with a name and references.
- Parameters
name (Optional[str]) – The name of the given context. In some cases a text name will not be available so this is an optional parameter with the default being None.
db_refs (Optional[dict]) – A dictionary where each key is a namespace and each value is an identifier in that namespace, similar to the db_refs associated with Concepts/Agents.
- class indra.statements.statements.RegulateActivity[source]¶
Bases:
Statement
Regulation of activity.
This class implements shared functionality of Activation and Inhibition statements and it should not be instantiated directly.
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.RegulateAmount(subj, obj, evidence=None)[source]¶
Bases:
Statement
Superclass handling operations on directed, two-element interactions.
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.RemoveModification(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
Modification
- class indra.statements.statements.Ribosylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Ribosylation modification.
- class indra.statements.statements.SelfModification(enz, residue=None, position=None, evidence=None)[source]¶
Bases:
Statement
Generic statement representing the self-modification of a protein.
- Parameters
enz (indra.statements.Agent) – The enzyme involved in the modification, which is also the substrate.
residue (str or None) – The amino acid residue being modified, or None if it is unknown or unspecified.
position (str or None) – The position of the modified amino acid, or None if it is unknown or unspecified.
evidence (None or Evidence or list of Evidence) – Evidence objects in support of the modification.
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Statement(evidence=None, supports=None, supported_by=None)[source]¶
Bases:
object
The parent class of all statements.
- Parameters
evidence (None or Evidence or list of Evidence) – If a list of Evidence objects is passed to the constructor, the value is set to this list. If a bare Evidence object is passed, it is enclosed in a list. If no evidence is passed (the default), the value is set to an empty list.
supports (list of Statement) – Statements that this Statement supports.
supported_by (list of Statement) – Statements that support this Statement.
- get_hash(shallow=True, refresh=False, matches_fun=None)[source]¶
Get a hash for this Statement.
There are two types of hash, “shallow” and “full”. A shallow hash is as unique as the information carried by the statement, i.e. it is a hash of the matches_key. This means that differences in source, evidence, and so on are not included. As such, it is a shorter hash (14 nibbles). The odds of a collision among all the statements we expect to encounter (well under 10^8) is ~10^-9 (1 in a billion). Checks for collisions can be done by using the matches keys.
A full hash includes, in addition to the matches key, information from the evidence of the statement. These hashes will be equal if the two Statements came from the same sentences, extracted by the same reader, from the same source. These hashes are correspondingly longer (16 nibbles). The odds of a collision for an expected less than 10^10 extractions is ~10^-9 (1 in a billion).
Note that a hash of the Python object will also include the uuid, so it will always be unique for every object.
- Parameters
shallow (bool) – Choose between the shallow and full hashes described above. Default is True (i.e. a shallow hash).
refresh (bool) – Used to get a new copy of the hash. Default is False, so the hash, if it has already been created, will be read from the attribute. This is primarily used for speed testing.
matches_fun (Optional[function]) – A function which takes a Statement as argument and returns a string matches key which is then hashed. If not provided the Statement’s built-in matches_key method is used.
- Returns
hash – A long integer hash.
- Return type
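The quoted odds can be checked with a little arithmetic. A nibble is 4 bits, so a hash of k nibbles can take 16^k distinct values. One reading of the figures above, which is an assumption here rather than something the docstring states, is that they give the chance that a single new hash coincides with any one of N existing hashes, roughly N / 16^k. A minimal sketch under that reading (the helper name is hypothetical):

```python
def single_hash_collision_odds(n_existing, nibbles):
    """Approximate chance that one new hash equals any of n_existing hashes.

    Assumes hashes are uniformly distributed over 16**nibbles values.
    """
    return n_existing / 16 ** nibbles


# Shallow hash: 14 nibbles (56 bits), fewer than 10^8 statements expected.
shallow_odds = single_hash_collision_odds(10 ** 8, 14)
# Full hash: 16 nibbles (64 bits), fewer than 10^10 extractions expected.
full_odds = single_hash_collision_odds(10 ** 10, 16)

print(f"shallow: {shallow_odds:.1e}")
print(f"full:    {full_odds:.1e}")
```

Under this assumption both values come out on the order of 10^-9, in line with the figures quoted in the description.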
- make_generic_copy(deeply=False)[source]¶
Make a new matching Statement with no provenance.
All agents and other attributes besides evidence, uuid, supports, and supported_by will be copied over, and a new uuid will be assigned. Thus, the new Statement will satisfy new_stmt.matches(old_stmt).
If deeply is set to True, all the attributes will be deep-copied, which is comparatively slow. Otherwise, attributes of this statement may be altered by changes to the new matching statement.
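The difference between the two modes mirrors Python's own shallow and deep copying. The sketch below uses a hypothetical stand-in class (`ToyStatement`, not part of INDRA) and the standard `copy` module to show why a non-deep copy can share mutable attributes with the original. It does not reproduce how `make_generic_copy` is implemented, and the identifier values are purely illustrative.

```python
import copy


class ToyStatement:
    """Hypothetical stand-in for a Statement holding a mutable attribute."""

    def __init__(self, agent_refs):
        self.agent_refs = agent_refs  # a db_refs-like dict


original = ToyStatement({'HGNC': '6840'})

# Shallow copy: the new object shares the same dict as the original,
# so this edit is visible through `original` as well.
shallow = copy.copy(original)
shallow.agent_refs['UP'] = 'Q02750'

# Deep copy: nested attributes are duplicated, so this edit stays local.
deep = copy.deepcopy(original)
deep.agent_refs['TEXT'] = 'MEK1'

print(original.agent_refs)
```

The trade-off is the one the description names: deep copying is safer but comparatively slow.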
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Sumoylation(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Sumoylation modification.
- class indra.statements.statements.TimeContext(text=None, start=None, end=None, duration=None)[source]¶
Bases:
object
An object representing the time context of a Statement
- Parameters
text (Optional[str]) – A string representation of the time constraint, typically as seen in text.
start (Optional[datetime]) – A datetime object representing the start time
end (Optional[datetime]) – A datetime object representing the end time
duration (int) – The duration of the time constraint in seconds
- class indra.statements.statements.Translocation(agent, from_location=None, to_location=None, evidence=None)[source]¶
Bases:
Statement
The translocation of a molecular agent from one location to another.
- Parameters
agent (Agent) – The agent which translocates.
from_location (Optional[str]) – The location from which the agent translocates. This must be a valid GO cellular component name (e.g. “cytoplasm”) or ID (e.g. “GO:0005737”).
to_location (Optional[str]) – The location to which the agent translocates. This must be a valid GO cellular component name or ID.
- to_json(use_sbo=False, matches_fun=None)[source]¶
Return serialized Statement as a JSON dict.
- Parameters
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – The JSON-serialized INDRA Statement.
- Return type
- class indra.statements.statements.Transphosphorylation(enz, residue=None, position=None, evidence=None)[source]¶
Bases:
SelfModification
Autophosphorylation in trans.
Transphosphorylation assumes that a kinase is already bound to a substrate (usually of the same molecular species), and phosphorylates it in an intra-molecular fashion. The enz property of the statement must have exactly one bound_conditions entry, and we assume that enz phosphorylates this molecule. The bound_neg property is ignored here.
- class indra.statements.statements.Ubiquitination(enz, sub, residue=None, position=None, evidence=None)[source]¶
Bases:
AddModification
Ubiquitination modification.
- class indra.statements.statements.Unresolved(uuid_str=None, shallow_hash=None, full_hash=None)[source]¶
Bases:
Statement
A special statement type used in support when a uuid can’t be resolved.
When using the stmts_from_json method, it is sometimes not possible to resolve the uuid found in support and supported_by in the json representation of an indra statement. When this happens, this class is used as a place-holder, carrying only the uuid of the statement.
- class indra.statements.statements.WorldContext(time=None, geo_location=None)[source]¶
Bases:
Context
An object representing the context of a Statement in time and space.
- Parameters
time (Optional[TimeContext]) – A TimeContext object representing the temporal context of the Statement.
geo_location (Optional[RefContext]) – The geographical location context represented as a RefContext
- indra.statements.statements.draw_stmt_graph(stmts)[source]¶
Render the attributes of a list of Statements as directed graphs.
The layout works well for a single Statement or a few Statements at a time. This function displays the plot of the graph using plt.show().
- Parameters
stmts (list[indra.statements.Statement]) – A list of one or more INDRA Statements whose attribute graph should be drawn.
- indra.statements.statements.get_all_descendants(parent)[source]¶
Get all the descendants of a parent class, recursively.
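Recursive descent through a class hierarchy can be sketched with the built-in `__subclasses__()` method. The function and the toy classes below are illustrative stand-ins, not INDRA's own code, and the order of the returned list is a choice made for this sketch.

```python
def all_descendants(parent):
    """Return every direct and indirect subclass of parent."""
    found = []
    for child in parent.__subclasses__():
        found.append(child)
        # Recurse so that grandchildren and deeper levels are included.
        found.extend(all_descendants(child))
    return found


# Toy hierarchy loosely mirroring the Modification family.
class Base:
    pass


class Modification(Base):
    pass


class AddModification(Modification):
    pass


class Ribosylation(AddModification):
    pass


names = [cls.__name__ for cls in all_descendants(Base)]
print(names)
```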
- indra.statements.statements.get_statement_by_name(stmt_name)[source]¶
Get a statement class given the name of the statement class.
- indra.statements.statements.get_unresolved_support_uuids(stmts)[source]¶
Get uuids unresolved in support from stmts from stmts_from_json.
- indra.statements.statements.get_valid_residue(residue)[source]¶
Check if the given string represents a valid amino acid residue.
- indra.statements.statements.make_statement_camel(stmt_name)[source]¶
Makes a statement name match the case of the corresponding statement.
- indra.statements.statements.mk_str(mk)[source]¶
Replace class path for backwards compatibility of matches keys.
- indra.statements.statements.pretty_print_stmts(stmt_list, stmt_limit=None, ev_limit=5, width=None)[source]¶
Print a formatted list of statements along with evidence text.
Requires the tabulate package (https://pypi.org/project/tabulate).
- Parameters
stmt_list (List[Statement]) – The list of INDRA Statements to be printed.
stmt_limit (Optional[int]) – The maximum number of INDRA Statements to be printed. If None, all Statements are printed. (Default is None)
ev_limit (Optional[int]) – The maximum number of Evidence to print for each Statement. If None, all evidence will be printed for each Statement. (Default is 5)
width (Optional[int]) – Manually set the width of the table. If None, the function will try to match the current terminal width using os.get_terminal_size(). If this fails, the width defaults to 80 characters. The maximum width can be controlled by setting pretty_print_max_width using the set_pretty_print_max_width() function. This is useful in Jupyter notebooks where the environment returns a terminal size of 80 characters regardless of the width of the window. (Default is None).
- Return type
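The width fallback described for the width parameter can be sketched with the standard library alone. The helper name `detect_table_width` is hypothetical, and the sketch covers only the detection step (explicit width, then terminal width, then 80), not the maximum-width setting.

```python
import os


def detect_table_width(width=None, fallback=80):
    """Return an explicit width, else the terminal width, else the fallback."""
    if width is not None:
        return width
    try:
        return os.get_terminal_size().columns
    except OSError:
        # No attached terminal, e.g. output is piped or redirected.
        return fallback


print(detect_table_width())
print(detect_table_width(width=120))
```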
- indra.statements.statements.print_stmt_summary(statements)[source]¶
Print a summary of a list of statements by statement type
Requires the tabulate package (https://pypi.org/project/tabulate).
- Parameters
statements (List[Statement]) – The list of INDRA Statements to be printed.
- indra.statements.statements.set_pretty_print_max_width(new_max)[source]¶
Set the max display width for pretty prints, in characters.
- indra.statements.statements.stmt_from_json(json_in)[source]¶
Deserialize a single statement JSON into a Statement object.
- indra.statements.statements.stmt_from_json_str(json_in)[source]¶
Deserialize a single statement JSON string into a Statement object.
- indra.statements.statements.stmt_type(obj, mk=True)[source]¶
Return standardized, backwards compatible object type String.
This is a temporary solution to make sure type comparisons and matches keys of Statements and related classes are backwards compatible.
- indra.statements.statements.stmts_from_json(json_in, on_missing_support='handle')[source]¶
Get a list of Statements from Statement jsons.
In the case of pre-assembled Statements which have supports and supported_by lists, the uuids will be replaced with references to Statement objects from the json, where possible. The method of handling missing support is controlled by the on_missing_support keyword argument.
- Parameters
json_in (iterable[dict]) – A json list containing json dict representations of INDRA Statements, as produced by the to_json methods of subclasses of Statement, or equivalently by stmts_to_json.
on_missing_support (Optional[str]) –
Handles the behavior when a uuid reference in the supports or supported_by attribute cannot be resolved. This happens because uuids can only be linked to Statements contained in the json_in list, and some may be missing if only a subset of the Statements from pre-assembly is contained in the list.
Options:
‘handle’ : (default) convert unresolved uuids into Unresolved Statement objects.
‘ignore’ : Simply omit any uuids that cannot be linked to any Statements in the list.
‘error’ : Raise an error upon hitting an un-linkable uuid.
- Returns
stmts – A list of INDRA Statements.
- Return type
list[Statement]
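The three on_missing_support options can be illustrated with plain dictionaries. This is a simplified sketch rather than INDRA's implementation: the function name `link_support`, the use of an `id` key for the uuid, the placeholder dict standing in for an Unresolved Statement, and the choice of ValueError are all assumptions made for the example.

```python
def link_support(json_stmts, on_missing_support='handle'):
    """Replace support uuids with references to records built from json_stmts."""
    # First pass: one output record per input dict, indexed by uuid.
    records = {entry['id']: {'id': entry['id'],
                             'supports': [], 'supported_by': []}
               for entry in json_stmts}
    # Second pass: resolve each uuid against the records that exist.
    for entry in json_stmts:
        for key in ('supports', 'supported_by'):
            for uuid in entry.get(key, []):
                if uuid in records:
                    records[entry['id']][key].append(records[uuid])
                elif on_missing_support == 'handle':
                    # Placeholder that carries only the uuid.
                    placeholder = {'type': 'Unresolved', 'id': uuid}
                    records[entry['id']][key].append(placeholder)
                elif on_missing_support == 'error':
                    raise ValueError(f'Cannot resolve support uuid {uuid}')
                # 'ignore': the unresolved uuid is simply dropped.
    return list(records.values())


stmts = [
    {'id': 'a', 'supports': ['b', 'missing']},
    {'id': 'b', 'supported_by': ['a']},
]
handled = link_support(stmts, 'handle')
ignored = link_support(stmts, 'ignore')
print(len(handled[0]['supports']), len(ignored[0]['supports']))
```

With ‘handle’ the missing uuid survives as a placeholder, so it can still be reported later. With ‘ignore’ that information is lost.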
- indra.statements.statements.stmts_from_json_file(fname, format='json')[source]¶
Return a list of statements loaded from a JSON file.
- Parameters
- Returns
The list of INDRA Statements loaded from the JSON file.
- Return type
list[indra.statements.Statement]
- indra.statements.statements.stmts_to_json(stmts_in, use_sbo=False, matches_fun=None)[source]¶
Return the JSON-serialized form of one or more INDRA Statements.
- Parameters
stmts_in (Statement or list[Statement]) – A Statement or list of Statement objects to serialize into JSON.
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – JSON-serialized INDRA Statements.
- Return type
- indra.statements.statements.stmts_to_json_file(stmts, fname, format='json', **kwargs)[source]¶
Serialize a list of INDRA Statements into a JSON file.
- Parameters
stmts (list[indra.statements.Statement]) – The list of INDRA Statements to serialize into the JSON file.
fname (Union[str, Path, PathLike]) – Path to the JSON file to serialize Statements into.
format (Optional[str]) – One of ‘json’ to use regular JSON with indent=1 formatting or ‘jsonl’ to put each statement on a new line without indents.
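The difference between the two format values can be shown with the standard `json` module. In this sketch plain dicts stand in for the output of stmts_to_json, and the helper names `dump_json_dicts` and `load_json_dicts` are hypothetical. It illustrates the two file layouts, not INDRA's own writer.

```python
import json
import tempfile
from pathlib import Path


def dump_json_dicts(json_dicts, fname, format='json'):
    """Write dicts as one indented array ('json') or one per line ('jsonl')."""
    with open(fname, 'w') as fh:
        if format == 'jsonl':
            for entry in json_dicts:
                fh.write(json.dumps(entry) + '\n')
        else:
            json.dump(json_dicts, fh, indent=1)


def load_json_dicts(fname, format='json'):
    """Read back a file written by dump_json_dicts."""
    with open(fname) as fh:
        if format == 'jsonl':
            return [json.loads(line) for line in fh if line.strip()]
        return json.load(fh)


# Plain dicts stand in for serialized Statements.
entries = [{'type': 'Phosphorylation', 'id': 'a'},
           {'type': 'Ubiquitination', 'id': 'b'}]

with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / 'stmts.json'
    jsonl_path = Path(tmp) / 'stmts.jsonl'
    dump_json_dicts(entries, json_path, format='json')
    dump_json_dicts(entries, jsonl_path, format='jsonl')
    jsonl_lines = jsonl_path.read_text().splitlines()
    round_trip = load_json_dicts(jsonl_path, format='jsonl')
    array_round_trip = load_json_dicts(json_path, format='json')

print(len(jsonl_lines))
```

The line-per-entry layout can be read or appended to one entry at a time, while the indented array is easier to inspect by eye.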
Agents (indra.statements.agent
)¶
- class indra.statements.agent.ActivityCondition(activity_type, is_active)[source]¶
Bases:
object
An active or inactive state of a protein.
Examples
Kinase-active MAP2K1:
>>> mek_active = Agent('MAP2K1',
...                    activity=ActivityCondition('kinase', True))
Transcriptionally inactive FOXO3:
>>> foxo_inactive = Agent('FOXO3',
...                       activity=ActivityCondition('transcription', False))
- Parameters
activity_type (str) – The type of activity, e.g. ‘kinase’. The basic, unspecified molecular activity is represented as ‘activity’. Examples of other activity types are ‘kinase’, ‘phosphatase’, ‘catalytic’, ‘transcription’, etc.
is_active (bool) – Specifies whether the given activity type is present or absent.
- class indra.statements.agent.Agent(name, mods=None, activity=None, bound_conditions=None, mutations=None, location=None, db_refs=None)[source]¶
Bases:
Concept
A molecular entity, e.g., a protein.
- Parameters
name (str) – The name of the agent, preferably a canonicalized name such as an HGNC gene name.
mods (list of ModCondition) – Modification state of the agent.
bound_conditions (list of BoundCondition) – Other agents bound to the agent in this context.
mutations (list of MutCondition) – Amino acid mutations of the agent.
activity (ActivityCondition) – Activity of the agent.
location (str) – Cellular location of the agent. Must be a valid name (e.g. “nucleus”) or identifier (e.g. “GO:0005634”) for a GO cellular compartment.
db_refs (dict) – Dictionary of database identifiers associated with this agent.
- entity_matches_key()[source]¶
Return a key to identify the identity of the Agent not its state.
The key is based on the preferred grounding for the Agent, or if not available, the name of the Agent is used.
- Returns
The key used to identify the Agent.
- Return type
- get_grounding(ns_order=None)[source]¶
Return a tuple of a preferred grounding namespace and ID.
- Returns
A tuple whose first element is a grounding namespace (HGNC, CHEBI, etc.) and the second element is an identifier in the namespace. If no preferred grounding is available, a tuple of Nones is returned.
- Return type
- class indra.statements.agent.BoundCondition(agent, is_bound=True)[source]¶
Bases:
object
Identify Agents bound (or not bound) to a given Agent in a given context.
- Parameters
Examples
EGFR bound to EGF:
>>> egf = Agent('EGF')
>>> egfr = Agent('EGFR', bound_conditions=[BoundCondition(egf)])
BRAF not bound to a 14-3-3 protein (YWHAB):
>>> ywhab = Agent('YWHAB')
>>> braf = Agent('BRAF', bound_conditions=[BoundCondition(ywhab, False)])
- class indra.statements.agent.ModCondition(mod_type, residue=None, position=None, is_modified=True)[source]¶
Bases:
object
Post-translational modification state at an amino acid position.
- Parameters
mod_type (str) – The type of post-translational modification, e.g., ‘phosphorylation’. Valid modification types currently include: ‘phosphorylation’, ‘ubiquitination’, ‘sumoylation’, ‘hydroxylation’, and ‘acetylation’. If an invalid modification type is passed an InvalidModTypeError is raised.
residue (str or None) – String indicating the modified amino acid, e.g., ‘Y’ or ‘tyrosine’. If None, indicates that the residue at the modification site is unknown or unspecified.
position (str or None) – String indicating the position of the modified amino acid, e.g., ‘202’. If None, indicates that the position is unknown or unspecified.
is_modified (bool) – Specifies whether the modification is present or absent. Setting the flag to False specifies that the Agent with the ModCondition is unmodified at the site.
Examples
Doubly-phosphorylated MEK (MAP2K1):
>>> phospho_mek = Agent('MAP2K1', mods=[
...     ModCondition('phosphorylation', 'S', '202'),
...     ModCondition('phosphorylation', 'S', '204')])
ERK (MAPK1) unphosphorylated at tyrosine 187:
>>> unphos_erk = Agent('MAPK1', mods=[
...     ModCondition('phosphorylation', 'Y', '187', is_modified=False)])
Concepts (indra.statements.concept
)¶
- class indra.statements.concept.Concept(name, db_refs=None)[source]¶
Bases:
object
A concept/entity of interest that is the argument of a Statement
- indra.statements.concept.compositional_sort_key(entry)[source]¶
Return a sort key from a compositional grounding entry
Evidence (indra.statements.evidence
)¶
- class indra.statements.evidence.Evidence(source_api=None, source_id=None, pmid=None, text=None, annotations=None, epistemics=None, context=None, text_refs=None)[source]¶
Bases:
object
Container for evidence supporting a given statement.
- Parameters
source_api (str or None) – String identifying the INDRA API used to capture the statement, e.g., ‘trips’, ‘biopax’, ‘bel’.
source_id (str or None) – For statements drawn from databases, ID of the database entity corresponding to the statement.
pmid (str or None) – String indicating the Pubmed ID of the source of the statement.
text (str) – Natural language text supporting the statement.
annotations (dict) – Dictionary containing additional information on the context of the statement, e.g., species, cell line, tissue type, etc. The entries may vary depending on the source of the information.
epistemics (dict) – A dictionary describing various forms of epistemic certainty associated with the statement.
context (Context or None) – A context object
text_refs (dict) – A dictionary of various reference ids to the source text, e.g. DOI, PMID, URL, etc.
There are some attributes which are not set by the parameters above:
- source_hash (int)
A hash calculated from the evidence text, source api, and pmid and/or source_id if available. This is generated automatically when the object is instantiated.
- stmt_tag (int)
A hash calculated by the Statement to which this evidence refers, and set by that Statement. It is useful for tracing ownership of an Evidence object.
Context (indra.statements.context
)¶
- class indra.statements.context.BioContext(location=None, cell_line=None, cell_type=None, organ=None, disease=None, species=None)[source]¶
Bases:
Context
An object representing the context of a Statement in biology.
- Parameters
location (Optional[RefContext]) – Cellular location, typically a sub-cellular compartment.
cell_line (Optional[RefContext]) – Cell line context, e.g., a specific cell line, like BT20.
cell_type (Optional[RefContext]) – Cell type context, broader than a cell line, like macrophage.
organ (Optional[RefContext]) – Organ context.
disease (Optional[RefContext]) – Disease context.
species (Optional[RefContext]) – Species context.
- class indra.statements.context.MovementContext(locations=None, time=None)[source]¶
Bases:
Context
An object representing the context of a movement between start and end points in time.
- Parameters
locations (Optional[list[dict]]) – A list of dictionaries each containing a RefContext object representing geographical location context and its role (e.g. ‘origin’, ‘destination’, etc.)
time (Optional[TimeContext]) – A TimeContext object representing the temporal context of the Statement.
- class indra.statements.context.RefContext(name=None, db_refs=None)[source]¶
Bases:
object
An object representing a context with a name and references.
- Parameters
name (Optional[str]) – The name of the given context. In some cases a text name will not be available so this is an optional parameter with the default being None.
db_refs (Optional[dict]) – A dictionary where each key is a namespace and each value is an identifier in that namespace, similar to the db_refs associated with Concepts/Agents.
- class indra.statements.context.TimeContext(text=None, start=None, end=None, duration=None)[source]¶
Bases:
object
An object representing the time context of a Statement
- Parameters
text (Optional[str]) – A string representation of the time constraint, typically as seen in text.
start (Optional[datetime]) – A datetime object representing the start time
end (Optional[datetime]) – A datetime object representing the end time
duration (int) – The duration of the time constraint in seconds
- class indra.statements.context.WorldContext(time=None, geo_location=None)[source]¶
Bases:
Context
An object representing the context of a Statement in time and space.
- Parameters
time (Optional[TimeContext]) – A TimeContext object representing the temporal context of the Statement.
geo_location (Optional[RefContext]) – The geographical location context represented as a RefContext
Input/output, serialization (indra.statements.io
)¶
- indra.statements.io.draw_stmt_graph(stmts)[source]¶
Render the attributes of a list of Statements as directed graphs.
The layout works well for a single Statement or a few Statements at a time. This function displays the plot of the graph using plt.show().
- Parameters
stmts (list[indra.statements.Statement]) – A list of one or more INDRA Statements whose attribute graph should be drawn.
- indra.statements.io.pretty_print_stmts(stmt_list, stmt_limit=None, ev_limit=5, width=None)[source]¶
Print a formatted list of statements along with evidence text.
Requires the tabulate package (https://pypi.org/project/tabulate).
- Parameters
stmt_list (List[Statement]) – The list of INDRA Statements to be printed.
stmt_limit (Optional[int]) – The maximum number of INDRA Statements to be printed. If None, all Statements are printed. (Default is None)
ev_limit (Optional[int]) – The maximum number of Evidence to print for each Statement. If None, all evidence will be printed for each Statement. (Default is 5)
width (Optional[int]) – Manually set the width of the table. If None, the function will try to match the current terminal width using os.get_terminal_size(). If this fails, the width defaults to 80 characters. The maximum width can be controlled by setting pretty_print_max_width using the set_pretty_print_max_width() function. This is useful in Jupyter notebooks where the environment returns a terminal size of 80 characters regardless of the width of the window. (Default is None).
- Return type
- indra.statements.io.print_stmt_summary(statements)[source]¶
Print a summary of a list of statements by statement type
Requires the tabulate package (https://pypi.org/project/tabulate).
- Parameters
statements (List[Statement]) – The list of INDRA Statements to be printed.
- indra.statements.io.set_pretty_print_max_width(new_max)[source]¶
Set the max display width for pretty prints, in characters.
- indra.statements.io.stmt_from_json(json_in)[source]¶
Deserialize a single statement JSON into a Statement object.
- Parameters
json_in (dict) – A JSON representation of the INDRA Statement.
- Returns
stmt – The INDRA Statement.
- Return type
Statement
- indra.statements.io.stmt_from_json_str(json_in)[source]¶
Deserialize a single statement JSON string into a Statement object.
- Parameters
json_in (str) – A JSON-string serialized INDRA Statement.
- Returns
stmt – The deserialized INDRA Statement.
- Return type
Statement
- indra.statements.io.stmts_from_json(json_in, on_missing_support='handle')[source]¶
Get a list of Statements from Statement jsons.
In the case of pre-assembled Statements which have supports and supported_by lists, the uuids will be replaced with references to Statement objects from the json, where possible. The method of handling missing support is controlled by the on_missing_support keyword argument.
- Parameters
json_in (iterable[dict]) – A json list containing json dict representations of INDRA Statements, as produced by the to_json methods of subclasses of Statement, or equivalently by stmts_to_json.
on_missing_support (Optional[str]) –
Handles the behavior when a uuid reference in the supports or supported_by attribute cannot be resolved. This happens because uuids can only be linked to Statements contained in the json_in list, and some may be missing if only a subset of the Statements from pre-assembly is contained in the list.
Options:
‘handle’ : (default) convert unresolved uuids into Unresolved Statement objects.
‘ignore’ : Simply omit any uuids that cannot be linked to any Statements in the list.
‘error’ : Raise an error upon hitting an un-linkable uuid.
- Returns
stmts – A list of INDRA Statements.
- Return type
list[Statement]
- indra.statements.io.stmts_from_json_file(fname, format='json')[source]¶
Return a list of statements loaded from a JSON file.
- Parameters
- Returns
The list of INDRA Statements loaded from the JSON file.
- Return type
list[indra.statements.Statement]
- indra.statements.io.stmts_to_json(stmts_in, use_sbo=False, matches_fun=None)[source]¶
Return the JSON-serialized form of one or more INDRA Statements.
- Parameters
stmts_in (Statement or list[Statement]) – A Statement or list of Statement objects to serialize into JSON.
use_sbo (Optional[bool]) – If True, SBO annotations are added to each applicable element of the JSON. Default: False
matches_fun (Optional[function]) – A custom function which, if provided, is used to construct the matches key which is then hashed and put into the return value. Default: None
- Returns
json_dict – JSON-serialized INDRA Statements.
- Return type
- indra.statements.io.stmts_to_json_file(stmts, fname, format='json', **kwargs)[source]¶
Serialize a list of INDRA Statements into a JSON file.
- Parameters
stmts (list[indra.statements.Statement]) – The list of INDRA Statements to serialize into the JSON file.
fname (Union[str, Path, PathLike]) – Path to the JSON file to serialize Statements into.
format (Optional[str]) – One of ‘json’ to use regular JSON with indent=1 formatting or ‘jsonl’ to put each statement on a new line without indents.
Validation (indra.statements.validate
)¶
This module implements a number of functions that can be used to validate INDRA Statements. The available functions include ones that raise custom exceptions derived from ValueError if an invalidity is found. These come with a helpful error message that can be caught and printed to learn about the specific issue. Another set of functions do not raise exceptions, rather, return True or False depending on whether the given input is valid or invalid.
For validating namespaces and identifiers, there are two validators available, one that uses data from identifiers.org and another for Bioregistry.
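The two families of functions described above, one raising ValueError subclasses and one returning booleans, follow a common pattern that can be sketched without INDRA. Everything below is a hypothetical stand-in: the `Demo*` exception names, the `demo_*` functions, and the single-entry pattern table are illustrative only. Whether the real validate_* functions are built by wrapping the assert_* ones is an assumption of this sketch.

```python
import re

# Illustrative pattern table; the real validators draw on registry data.
DEMO_PATTERNS = {'HGNC': re.compile(r'^\d+$')}


class DemoUnknownNamespace(ValueError):
    """Hypothetical counterpart of UnknownNamespace."""


class DemoInvalidIdentifier(ValueError):
    """Hypothetical counterpart of InvalidIdentifier."""


def demo_assert_valid_id(db_ns, db_id):
    """Raise a ValueError subclass describing the first problem found."""
    if db_ns not in DEMO_PATTERNS:
        raise DemoUnknownNamespace(f'Unknown namespace: {db_ns}')
    if not DEMO_PATTERNS[db_ns].match(str(db_id)):
        raise DemoInvalidIdentifier(f'Invalid {db_ns} ID: {db_id}')


def demo_validate_id(db_ns, db_id):
    """Boolean variant: True if valid, False otherwise."""
    try:
        demo_assert_valid_id(db_ns, db_id)
    except ValueError:
        return False
    return True


print(demo_validate_id('HGNC', '6840'))
print(demo_validate_id('HGNC', 'MAP2K1'))
try:
    demo_assert_valid_id('XYZ', '1')
except ValueError as err:
    # The message explains the specific issue.
    print(err)
```

The raising variant suits debugging, since the message names the problem. The boolean variant suits filtering large lists.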
- class indra.statements.validate.BioregistryValidator[source]¶
Bases:
object
A class that can be used to validate INDRA Statements.
- class indra.statements.validate.IdentifiersValidator[source]¶
Bases:
object
A class that can be used to validate INDRA Statements.
- exception indra.statements.validate.InvalidAgent[source]¶
Bases:
ValueError
- exception indra.statements.validate.InvalidContext[source]¶
Bases:
ValueError
- exception indra.statements.validate.InvalidIdentifier[source]¶
Bases:
ValueError
Raised when the identifier doesn’t match the pattern.
- exception indra.statements.validate.InvalidStatement[source]¶
Bases:
ValueError
- exception indra.statements.validate.InvalidTextRefs[source]¶
Bases:
ValueError
- exception indra.statements.validate.MissingIdentifier[source]¶
Bases:
ValueError
Raised when the identifier is None.
- exception indra.statements.validate.UnknownIdentifier[source]¶
Bases:
ValueError
Raised when the database is neither registered with identifiers.org nor manually added to the indra.databases.identifiers.non_registry list.
- exception indra.statements.validate.UnknownNamespace[source]¶
Bases:
ValueError
- indra.statements.validate.assert_valid_agent(agent, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise InvalidAgent if there is an invalidity in the Agent.
- Parameters
agent (indra.statements.Agent) – The agent to check.
- indra.statements.validate.assert_valid_bio_context(context, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise InvalidContext error if the given bio-context is invalid.
- Parameters
context (indra.statements.BioContext) – The context object to validate.
- indra.statements.validate.assert_valid_context(context, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise InvalidContext error if the given context is invalid.
- Parameters
context (indra.statements.Context) – The context object to validate.
- indra.statements.validate.assert_valid_db_refs(db_refs, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise InvalidIdentifier error if any of the entries in the given db_refs are invalid.
- Parameters
db_refs (dict) – A dict of database references, typically part of an INDRA Agent.
- indra.statements.validate.assert_valid_evidence(evidence, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise an error if the given evidence is invalid.
- Parameters
evidence (indra.statements.Evidence) – The evidence object to validate.
- indra.statements.validate.assert_valid_id(db_ns, db_id, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise InvalidIdentifier error if the ID is invalid in the given namespace.
- indra.statements.validate.assert_valid_ns(db_ns, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise UnknownNamespace error if the given namespace is unknown.
- Parameters
db_ns (str) – The namespace.
- indra.statements.validate.assert_valid_pmid_text_refs(evidence)[source]¶
Return True if the pmid attribute is consistent with text refs
- indra.statements.validate.assert_valid_statement(stmt, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise an error if there is anything invalid in the given statement.
- Parameters
stmt (indra.statements.Statement) – An INDRA Statement to validate.
- indra.statements.validate.assert_valid_statement_semantics(stmt)[source]¶
Raise InvalidStatement error if the given statement is invalid.
- Parameters
stmt (indra.statements.Statement) – The statement to check.
- indra.statements.validate.assert_valid_statements(stmts, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Raise an error if any of the given statements is invalid.
- Parameters
stmts (list[indra.statements.Statement]) – A list of INDRA Statements to validate.
- indra.statements.validate.assert_valid_text_refs(text_refs)[source]¶
Raise an InvalidTextRefs error if the given text refs are invalid.
- indra.statements.validate.print_validation_report(stmts, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Log the first validation error encountered for each given statement.
- Parameters
stmts (list[indra.statements.Statement]) – A list of INDRA Statements to validate.
- indra.statements.validate.validate_agent(agent, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Return False if there is an invalidity in the Agent, otherwise True.
- Parameters
agent (indra.statements.Agent) – The agent to check.
- Returns
True if the agent is valid, False otherwise.
- Return type
- indra.statements.validate.validate_db_refs(db_refs, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Return True if all the entries in the given db_refs are valid.
- indra.statements.validate.validate_evidence(evidence, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Return False if the given evidence is invalid, otherwise True.
- Parameters
evidence (indra.statements.Evidence) – The evidence object to validate.
- Returns
True if the evidence is valid, otherwise False.
- Return type
- indra.statements.validate.validate_id(db_ns, db_id, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Return True if the given ID is valid in the given namespace.
- indra.statements.validate.validate_ns(db_ns, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Return True if the given namespace is known.
- indra.statements.validate.validate_statement(stmt, validator=<indra.statements.validate.IdentifiersValidator object>)[source]¶
Return True if all the groundings in the given statement are valid.
- Parameters
stmt (indra.statements.Statement) – An INDRA Statement to validate.
- Returns
True if all the db_refs entries of the Agents in the given Statement are valid, else False.
- Return type
Resource access (indra.statements.resources
)¶
- exception indra.statements.resources.InvalidLocationError(name)[source]¶
Bases:
ValueError
Invalid cellular component name.
- exception indra.statements.resources.InvalidResidueError(name)[source]¶
Bases:
ValueError
Invalid residue (amino acid) name.