Using natural language to build models

In this tutorial we build a simple model using natural language, then contextualize and parameterize it, and export it into different formats.

Read INDRA Statements from a natural language string

First we import INDRA’s API to the TRIPS reading system. We then define a block of text describing the mechanism to be modeled and store it in the model_text variable. Finally, we call indra.sources.trips.process_text, which sends a request to the TRIPS web service, receives a response, and processes the extraction knowledge base to obtain a list of INDRA Statements.

In [1]: from indra.sources import trips

In [2]: model_text = 'MAP2K1 phosphorylates MAPK1 and DUSP6 dephosphorylates MAPK1.'

In [3]: tp = trips.process_text(model_text)

At this point tp.statements should contain 2 INDRA Statements: a Phosphorylation Statement and a Dephosphorylation Statement. Note that the evidence sentence for each Statement is propagated:

In [4]: for st in tp.statements:
   ...:     print('%s with evidence "%s"' % (st, st.evidence[0].text))
   ...: 
Phosphorylation(MAP2K1(), MAPK1()) with evidence "MAP2K1 phosphorylates MAPK1 and DUSP6 dephosphorylates MAPK1."
Dephosphorylation(DUSP6(), MAPK1()) with evidence "MAP2K1 phosphorylates MAPK1 and DUSP6 dephosphorylates MAPK1."
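For longer texts it can help to summarize what was extracted before assembling a model. The following is a minimal sketch that tallies Statements by class name. It uses two empty stand-in classes (assumptions made only so the snippet runs without INDRA or access to the TRIPS web service); with a real processor one would pass tp.statements to the hypothetical helper count_by_type instead.

```python
from collections import Counter


# Stand-in classes so the sketch runs without INDRA or network access.
# They are NOT the INDRA classes; real code would use tp.statements.
class Phosphorylation:
    pass


class Dephosphorylation:
    pass


def count_by_type(statements):
    """Return a Counter mapping each Statement class name to its count."""
    return Counter(type(st).__name__ for st in statements)


example_statements = [Phosphorylation(), Dephosphorylation()]
counts = count_by_type(example_statements)
for name, n in sorted(counts.items()):
    print('%s: %d' % (name, n))
```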

Assemble the INDRA Statements into a rule-based executable model

We next use INDRA’s PySB Assembler to automatically assemble a rule-based model representing the biochemical mechanisms described in model_text. First a PysbAssembler object is instantiated, then the list of INDRA Statements is added to the assembler. Finally, the assembler’s make_model method is called, which assembles the model and returns it while also storing it in pa.model. Note that we pass policies='two_step' as an argument to make_model. This directs the assembler to use rules in which enzymatic catalysis is modeled as a two-step process: the enzyme and substrate first bind reversibly, and the enzyme-substrate complex then irreversibly produces and releases the product.
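The two-step scheme described above can be written as E + S <-> ES -> E + P. The sketch below integrates that scheme with a simple forward Euler loop in plain Python. It is only an illustration of the kinetics, not INDRA or PySB code; the function name simulate_two_step, the rate constants kf, kr and kc, and the initial amounts are all arbitrary assumptions chosen for the example. Each enzymatic step involves three reactions (binding, unbinding, catalysis), which would be consistent with two Statements giving rise to six rules and six rate constants.

```python
def simulate_two_step(e0, s0, kf, kr, kc, dt=0.001, steps=20000):
    """Integrate E + S <-> ES -> E + P with forward Euler.

    All arguments are illustrative values in arbitrary units.
    """
    e, s, es, p = float(e0), float(s0), 0.0, 0.0
    for _ in range(steps):
        bind = kf * e * s    # E + S -> ES (reversible binding, forward)
        unbind = kr * es     # ES -> E + S (reversible binding, reverse)
        cat = kc * es        # ES -> E + P (irreversible product release)
        e += dt * (-bind + unbind + cat)
        s += dt * (-bind + unbind)
        es += dt * (bind - unbind - cat)
        p += dt * cat
    return e, s, es, p


# Arbitrary example values, not taken from any assembled model.
e, s, es, p = simulate_two_step(e0=1.0, s0=10.0, kf=1.0, kr=0.1, kc=1.0)
print('free enzyme=%.3f substrate=%.3f complex=%.3f product=%.3f'
      % (e, s, es, p))
```

Total enzyme (e + es) and total substrate (s + es + p) stay constant throughout the simulation, which is a quick sanity check on any implementation of this scheme.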

In [5]: from indra.assemblers.pysb_assembler import PysbAssembler

In [6]: pa = PysbAssembler()

In [7]: pa.add_statements(tp.statements)

In [8]: pa.make_model(policies='two_step')

At this point pa.model contains a PySB model object with 3 monomers,

In [9]: for monomer in pa.model.monomers:
   ...:     print(monomer)
   ...: 

6 rules,

In [10]: for rule in pa.model.rules:
   ....:     print(rule)
   ....: 

and 9 parameters (6 kinetic rate constants and 3 total protein amounts) that are set to nominal but plausible values,

In [11]: for parameter in pa.model.parameters:
   ....:     print(parameter)
   ....: 

The model also contains extensive annotations that tie the monomers to database identifiers and also annotate the semantics of each component of each rule.

In [12]: for annotation in pa.model.annotations:
   ....:     print(annotation)
   ....: 

Set the model to a particular cell line context

We can use INDRA’s contextualization module, which is built into the PysbAssembler, to set the amounts of proteins in the model to the total amounts measured (or estimated) in a given cancer cell line. In this example, we use the A375 melanoma cell line.

In [13]: pa.set_context('A375_SKIN')

At this point the PySB model has total protein amounts set consistent with the A375 cell line:

In [14]: for monomer_pattern, parameter in pa.model.initial_conditions:
   ....:     print('%s = %d' % (monomer_pattern, parameter.value))
   ....: 

Exporting the model into other common formats

From the assembled PySB format it is possible to export the model into other common formats such as SBML, BNGL and Kappa. One can also generate a Matlab or Mathematica script with ODEs corresponding to the model.

pa.export_model('sbml')
pa.export_model('bngl')

One can also pass a file name argument to the export_model function to save the exported model directly into a file:

pa.export_model('sbml', 'example_model.sbml')
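When several formats are needed, the two-argument call shown above can be wrapped in a small loop. The sketch below assumes only that the assembler exposes export_model(format, file_name) as used in this tutorial. The helper export_formats, the file naming scheme, and the StubAssembler class are hypothetical and exist only so the snippet runs without INDRA installed; with a real model one would pass pa instead of the stub.

```python
import os
import tempfile


def export_formats(assembler, formats, directory, stem='example_model'):
    """Export the model once per format and return {format: file path}."""
    paths = {}
    for fmt in formats:
        path = os.path.join(directory, '%s.%s' % (stem, fmt))
        assembler.export_model(fmt, path)
        paths[fmt] = path
    return paths


class StubAssembler:
    """Hypothetical stand-in mimicking only the export_model call shape."""

    def export_model(self, format, file_name=None):
        text = '<model exported as %s>' % format
        if file_name is not None:
            with open(file_name, 'w') as fh:
                fh.write(text)
        return text


out_dir = tempfile.mkdtemp()
written = export_formats(StubAssembler(), ['sbml', 'bngl'], out_dir)
for fmt, path in sorted(written.items()):
    print('%s -> %s' % (fmt, path))
```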