Using natural language to build models

In this tutorial we build a simple model using natural language, and export it into different formats.

Read INDRA Statements from a natural language string

First we import INDRA’s API to the TRIPS reading system. We then define the text describing the mechanism to be modeled in the model_text variable. Finally, we call indra.sources.trips.process_text, which sends a request to the TRIPS web service, receives a response, and processes the extraction knowledge base to obtain a list of INDRA Statements.

In [1]: from indra.sources import trips

In [2]: model_text = 'MAP2K1 phosphorylates MAPK1 and DUSP6 dephosphorylates MAPK1.'

In [3]: tp = trips.process_text(model_text)

At this point tp.statements should contain 2 INDRA Statements: a Phosphorylation Statement and a Dephosphorylation Statement. Note that the evidence sentence for each Statement is propagated:

In [4]: for st in tp.statements:
   ...:     print('%s with evidence "%s"' % (st, st.evidence[0].text))
Phosphorylation(MAP2K1(), MAPK1()) with evidence "MAP2K1 phosphorylates MAPK1 and DUSP6 dephosphorylates MAPK1."
Dephosphorylation(DUSP6(), MAPK1()) with evidence "MAP2K1 phosphorylates MAPK1 and DUSP6 dephosphorylates MAPK1."
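
A quick way to confirm what the reader returned is to tally the Statements by type. The helper below is a minimal sketch: count_statement_types is a hypothetical name (not part of INDRA), and it assumes only that each Statement is an instance of a class named after its type, as the printed output above suggests.

```python
from collections import Counter


def count_statement_types(statements):
    """Tally statements by class name, e.g. 'Phosphorylation'.

    Hypothetical helper for illustration; not an INDRA function.
    """
    return Counter(type(st).__name__ for st in statements)


# Usage with the processor above (requires INDRA and access to TRIPS):
# count_statement_types(tp.statements)
```

For the text used here, the tally should contain one Phosphorylation and one Dephosphorylation entry if reading succeeded as described.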

Assemble the INDRA Statements into a rule-based executable model

We next use INDRA’s PySB Assembler to automatically assemble a rule-based model representing the biochemical mechanisms described in model_text. First a PysbAssembler object is instantiated, then the list of INDRA Statements is added to the assembler. Finally, the assembler’s make_model method is called, which assembles the model and returns it while also storing it in pa.model. Note that we pass policies='two_step' to make_model. This directs the assembler to use rules in which enzymatic catalysis is modeled as a two-step process: enzyme and substrate first bind reversibly, and the enzyme-substrate complex then irreversibly produces and releases the product.

In [5]: from indra.assemblers.pysb import PysbAssembler

In [6]: pa = PysbAssembler()

In [7]: pa.add_statements(tp.statements)

In [8]: pa.make_model(policies='two_step')
Out[8]: <Model 'indra_model' (monomers: 3, rules: 6, parameters: 9, expressions: 0, compartments: 0, energypatterns: 0) at 0x7ff3dfc19d10>

At this point pa.model contains a PySB model object with 3 monomers,

In [9]: for monomer in pa.model.monomers:
   ...:     print(monomer)
Monomer('MAP2K1', ['mapk'])
Monomer('MAPK1', ['phospho', 'map2k', 'dusp'], {'phospho': ['u', 'p']})
Monomer('DUSP6', ['mapk'])

6 rules,

In [10]: for rule in pa.model.rules:
   ....:     print(rule)
Rule('MAP2K1_phosphorylation_bind_MAPK1_phospho', MAP2K1(mapk=None) + MAPK1(phospho='u', map2k=None) >> MAP2K1(mapk=1) % MAPK1(phospho='u', map2k=1), kf_mm_bind_1)
Rule('MAP2K1_phosphorylation_MAPK1_phospho', MAP2K1(mapk=1) % MAPK1(phospho='u', map2k=1) >> MAP2K1(mapk=None) + MAPK1(phospho='p', map2k=None), kc_mm_phosphorylation_1)
Rule('MAP2K1_dissoc_MAPK1', MAP2K1(mapk=1) % MAPK1(map2k=1) >> MAP2K1(mapk=None) + MAPK1(map2k=None), kr_mm_bind_1)
Rule('DUSP6_dephosphorylation_bind_MAPK1_phospho', DUSP6(mapk=None) + MAPK1(phospho='p', dusp=None) >> DUSP6(mapk=1) % MAPK1(phospho='p', dusp=1), kf_dm_bind_1)
Rule('DUSP6_dephosphorylation_MAPK1_phospho', DUSP6(mapk=1) % MAPK1(phospho='p', dusp=1) >> DUSP6(mapk=None) + MAPK1(phospho='u', dusp=None), kc_dm_phosphorylation_1)
Rule('DUSP6_dissoc_MAPK1', DUSP6(mapk=1) % MAPK1(dusp=1) >> DUSP6(mapk=None) + MAPK1(dusp=None), kr_dm_bind_1)

and 9 parameters (6 kinetic rate constants and 3 total protein amounts) that are set to nominal but plausible values,

In [11]: for parameter in pa.model.parameters:
   ....:     print(parameter)
Parameter('kf_mm_bind_1', 1e-06)
Parameter('kr_mm_bind_1', 0.1)
Parameter('kc_mm_phosphorylation_1', 100.0)
Parameter('kf_dm_bind_1', 1e-06)
Parameter('kr_dm_bind_1', 0.1)
Parameter('kc_dm_phosphorylation_1', 100.0)
Parameter('MAP2K1_0', 10000.0)
Parameter('MAPK1_0', 10000.0)
Parameter('DUSP6_0', 10000.0)
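
To get a feel for what the two-step policy and these nominal values imply, the kinase half of the model can be written as mass-action ODEs for E + S <-> ES -> E + P and integrated by hand. The following is a rough sketch in plain Python, not how PySB simulates a model: simulate_two_step is a hypothetical helper, forward Euler is used only for brevity, and DUSP6 (and any complex involving it) is deliberately left out.

```python
def simulate_two_step(kf, kr, kc, e0, s0, t_end, dt=1e-3):
    """Forward-Euler integration of E + S <-> ES -> E + P (mass action).

    Hypothetical helper for illustration. Returns the final amounts
    (E, S, ES, P). dt must be small relative to 1 / (kr + kc).
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = kf * e * s    # E + S -> ES
        unbind = kr * es     # ES -> E + S
        cat = kc * es        # ES -> E + P
        e += dt * (unbind + cat - bind)
        s += dt * (unbind - bind)
        es += dt * (bind - unbind - cat)
        p += dt * cat
    return e, s, es, p


# Nominal values from the parameter list above (MAP2K1 acting on MAPK1 only).
e, s, es, p = simulate_two_step(kf=1e-6, kr=0.1, kc=100.0,
                                e0=10000.0, s0=10000.0, t_end=100.0)
print('free MAPK1 (u): %.1f, complex: %.2f, MAPK1 (p): %.1f' % (s, es, p))
```

Because catalysis (kc) is much faster than unbinding (kr), almost every binding event leads to phosphorylation, and the complex stays at a low level throughout. The totals E + ES and S + ES + P are conserved, which is a useful sanity check on any such hand-written integration.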

The model also contains extensive annotations that tie the monomers to database identifiers and annotate the semantics of each component of each rule.

In [12]: for annotation in pa.model.annotations:
   ....:     print(annotation)
Annotation(MAP2K1, '', 'is')
Annotation(MAP2K1, '', 'is')
Annotation(MAP2K1, '', 'is')
Annotation(MAPK1, '', 'is')
Annotation(MAPK1, '', 'is')
Annotation(MAPK1, '', 'is')
Annotation(DUSP6, '', 'is')
Annotation(DUSP6, '', 'is')
Annotation(DUSP6, '', 'is')
Annotation(MAP2K1_phosphorylation_bind_MAPK1_phospho, '5649ee7a-3411-4fe1-a76c-354ef3131faa', 'from_indra_statement')
Annotation(MAP2K1_phosphorylation_MAPK1_phospho, 'MAP2K1', 'rule_has_subject')
Annotation(MAP2K1_phosphorylation_MAPK1_phospho, 'MAPK1', 'rule_has_object')
Annotation(MAP2K1_phosphorylation_MAPK1_phospho, '5649ee7a-3411-4fe1-a76c-354ef3131faa', 'from_indra_statement')
Annotation(MAP2K1_dissoc_MAPK1, '5649ee7a-3411-4fe1-a76c-354ef3131faa', 'from_indra_statement')
Annotation(DUSP6_dephosphorylation_bind_MAPK1_phospho, '0fbf1c70-36e7-453c-9187-b60488712dfe', 'from_indra_statement')
Annotation(DUSP6_dephosphorylation_MAPK1_phospho, 'DUSP6', 'rule_has_subject')
Annotation(DUSP6_dephosphorylation_MAPK1_phospho, 'MAPK1', 'rule_has_object')
Annotation(DUSP6_dephosphorylation_MAPK1_phospho, '0fbf1c70-36e7-453c-9187-b60488712dfe', 'from_indra_statement')
Annotation(DUSP6_dissoc_MAPK1, '0fbf1c70-36e7-453c-9187-b60488712dfe', 'from_indra_statement')
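
The from_indra_statement annotations make it possible to trace each rule back to the Statement it came from. The sketch below groups rule names by Statement UUID. rules_by_statement is a hypothetical helper, and it assumes that each annotation exposes subject, object and predicate attributes and that the subject has a name, as PySB Annotation objects are expected to.

```python
from collections import defaultdict


def rules_by_statement(annotations):
    """Map each INDRA Statement UUID to the names of the rules derived from it.

    Hypothetical helper for illustration; assumes annotations expose
    .subject (with a .name), .object and .predicate.
    """
    groups = defaultdict(list)
    for ann in annotations:
        if ann.predicate == 'from_indra_statement':
            groups[ann.object].append(ann.subject.name)
    return dict(groups)


# Usage with the assembled model (requires INDRA and PySB):
# rules_by_statement(pa.model.annotations)
```

For the model above, each of the two UUIDs should map to three rules: the binding, catalysis and dissociation rules generated from that Statement.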

Exporting the model into other common formats

From the assembled PySB format it is possible to export the model into other common formats such as SBML, BNGL and Kappa. One can also generate a Matlab or Mathematica script with ODEs corresponding to the model.
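
As a minimal sketch of exporting into several formats at once, the helper below calls the assembler's export_model method with a format name only. export_formats is a hypothetical helper; it assumes that export_model returns the exported model as a string when no file name is given, and that 'sbml', 'bngl' and 'kappa' are accepted format names.

```python
def export_formats(assembler, formats=('sbml', 'bngl', 'kappa')):
    """Return {format: exported model text} for each requested format.

    Hypothetical helper for illustration; assumes
    assembler.export_model(fmt) returns the exported model as a string.
    """
    return {fmt: assembler.export_model(fmt) for fmt in formats}


# Usage with the assembler above (requires INDRA; some exporters may
# need additional dependencies to be installed):
# models = export_formats(pa)
# print(models['bngl'])
```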


One can also pass a file name argument to the export_model function to save the exported model directly into a file:

pa.export_model('sbml', 'example_model.sbml')