Path finding algorithms for explanation (indra.explanation.pathfinding)

Path finding functions (indra.explanation.pathfinding.pathfinding)

Do breadth first search from a given node and yield paths

Parameters
  • g (DiGraph) – An nx.DiGraph to search in. Can also be a signed node graph. It is required that node data contains ‘ns’ (namespace) and edge data contains ‘belief’.

  • source_node (Union[str, Tuple[str, int]]) – Node in the graph to start from.

  • reverse (Optional[bool]) – If True go upstream from source, otherwise go downstream. Default: False.

  • depth_limit (Optional[int]) – Stop when all paths with this many edges have been found. Default: 2.

  • path_limit (Optional[int]) – The maximum number of paths to return. Default: no limit.

  • max_per_node (Optional[int]) – The maximum number of paths to yield per parent node. If 1 is chosen, the search only goes down to the leaf node of its first encountered branch. Default: 5

  • node_filter (Optional[List[str]]) – The allowed namespaces (node attribute ‘ns’) for the nodes in the path

  • node_blacklist (Optional[Set[Union[str, Tuple[str, int]]]]) – A set of nodes to ignore. Default: None.

  • terminal_ns (Optional[List[str]]) – Force a path to terminate when any of the namespaces in this list are encountered and only yield paths that terminate at these namespaces

  • sign (Optional[int]) – If set, defines the search to be a signed search. Default: None.

  • max_memory (Optional[int]) – The maximum memory usage in bytes allowed for the variables queue and visited. Default: 1073741824 bytes (== 1 GiB).

  • hashes (Optional[List[int]]) – List of hashes used (if not empty) to select edges for path finding

  • allow_edge (Optional[Callable[[Union[str, Tuple[str, int]], Union[str, Tuple[str, int]]], bool]]) – Function telling the edge must be omitted

  • strict_mesh_id_filtering (Optional[bool]) – If true, exclude all edges not relevant to provided hashes

  • edge_filter (Optional[Callable[[DiGraph, Union[str, Tuple[str, int]], Union[str, Tuple[str, int]]], bool]]) –

    If provided, must be a function that takes three arguments: a graph g, and the nodes u, v of the edge between u and v. The function must return a boolean. The function must return True if the edge is allowed, otherwise False. Example of function that only allows edges that have an edge belief above 0.75:

    >>> g = nx.DiGraph({'CHEK1': {'FANC': {'belief': 1}}})
    >>> def filter_example(g, u, v):
    ...    return g.edges[u, v].get('belief', 0) > 0.75
    >>> path_generator = bfs_search(g, source_node='CHEK1',
    ...                             edge_filter=filter_example)
    

Yields

Tuple[Node, …] – Paths in the bfs search starting from source.

Raises

StopIteration – Raises StopIteration when no more paths are available or when the memory limit is reached

Return type

Generator[Tuple[Union[str, Tuple[str, int]]], Tuple[Optional[Set[Union[str, Tuple[str, int]]]], Optional[Set[Tuple[Union[str, Tuple[str, int]], Union[str, Tuple[str, int]]]]]], None]

indra.explanation.pathfinding.pathfinding.bfs_search_multiple_nodes(g, source_nodes, path_limit=None, **kwargs)[source]

Do breadth first search from each of given nodes and yield paths until path limit is met.

Parameters
  • g (nx.Digraph) – An nx.DiGraph to search in. Can also be a signed node graph. It is required that node data contains ‘ns’ (namespace) and edge data contains ‘belief’.

  • source_nodes (list[node]) – List of nodes in the graph to start from.

  • path_limit (int) – The maximum number of paths to return. Default: no limit.

  • **kwargs (keyword arguments) – Any kwargs to pass to bfs_search.

Yields

path (tuple(node)) – Paths in the bfs search starting from source.

indra.explanation.pathfinding.pathfinding.find_sources(graph, target, sources, filter_func=None)[source]

Get the set of source nodes with paths to the target.

Given a common target and a list of sources (or None if test statement subject is None), perform a breadth-first search upstream from the target to determine whether there are any sources that have paths to the target. For efficiency, does not return the full path, but identifies the upstream sources and the length of the path.

Parameters
  • graph (nx.DiGraph) – A DiGraph with signed nodes to find paths in.

  • target (node) – The signed node (usually common target node) in the graph to start looking upstream for matching sources.

  • sources (list[node]) – Signed nodes corresponding to the subject or upstream influence being checked.

  • filter_func (Optional[function]) – A function to constrain the intermediate nodes in the path. A function should take a node as a parameter and return True if the node is allowed to be in a path and False otherwise.

Returns

Yields tuples of source node and path length (int). If there are no paths to any of the given source nodes, the generator is empty.

Return type

generator of (source, path_length)

indra.explanation.pathfinding.pathfinding.get_path_iter(graph, source, target, path_length, loop, dummy_target, filter_func)[source]

Return a generator of paths with path_length cutoff from source to target.

Parameters
  • graph (nx.Digraph) – An nx.DiGraph to search in.

  • source (node) – Starting node for path.

  • target (node) – Ending node for path.

  • path_length (int) – Maximum depth of the paths.

  • loop (bool) – Whether the path should be a loop. If True, source is appended to path.

  • dummy_target (bool) – Whether provided target is a dummy node and should be removed from path

  • filter_func (function or None) – A function to constrain the search. A function should take a node as a parameter and return True if the node is allowed to be in a path and False otherwise. If None, then no filtering is done.

Returns

path_generator – A generator of the paths between source and target.

Return type

generator

Do Dijkstra search from a given node and yield paths

Parameters
  • g (nx.Digraph) – An nx.DiGraph to search in.

  • start (node) – Node in the graph to start from.

  • reverse (bool) – If True go upstream from source, otherwise go downstream. Default: False.

  • path_limit (int) – The maximum number of paths to return. Default: no limit.

  • node_filter (list[str]) – The allowed namespaces (node attribute ‘ns’) for the nodes in the path

  • hashes (list) – List of hashes used to set edge weights

  • ignore_nodes (container of nodes) – nodes to ignore, optional

  • ignore_edges (container of edges) – edges to ignore, optional

  • terminal_ns (list[str]) – Force a path to terminate when any of the namespaces in this list are encountered and only yield paths that terminate at these namepsaces

  • weight (str) – Name of edge’s attribute used as its weight

  • ref_counts_function (function) – function counting references and PMIDs of an edge from its statement hashes

  • const_c (int) – Constant used in MeSH IDs-based weight calculation

  • const_tk (int) – Constant used in MeSH IDs-based weight calculation

Yields

path (tuple(node)) – Paths in the bfs search starting from source.

indra.explanation.pathfinding.pathfinding.shortest_simple_paths(G, source, target, weight=None, ignore_nodes=None, ignore_edges=None, hashes=None, ref_counts_function=None, strict_mesh_id_filtering=False, const_c=1, const_tk=10)[source]
Generate all simple paths in the graph G from source to target,

starting from shortest ones.

A simple path is a path with no repeated nodes.

If a weighted shortest path search is to be used, no negative weights are allowed.

Parameters
  • G (NetworkX graph) –

  • source (node) – Starting node for path

  • target (node) – Ending node for path

  • weight (string) – Name of the edge attribute to be used as a weight. If None all edges are considered to have unit weight. Default value None.

  • ignore_nodes (container of nodes) – nodes to ignore, optional

  • ignore_edges (container of edges) – edges to ignore, optional

  • hashes (list) – hashes specifying (if not empty) allowed edges

  • ref_counts_function (function) – function counting references and PMIDs of an edge from its statement hashes

  • strict_mesh_id_filtering (bool) – if true, exclude all edges not relevant to provided hashes

  • const_c (int) – Constant used in MeSH IDs-based weight calculation

  • const_tk (int) – Constant used in MeSH IDs-based weight calculation

Returns

path_generator – A generator that produces lists of simple paths, in order from shortest to longest.

Return type

generator

Raises
  • NetworkXNoPath – If no path exists between source and target.

  • NetworkXError – If source or target nodes are not in the input graph.

  • NetworkXNotImplemented – If the input graph is a Multi[Di]Graph.

Examples

>>> G = nx.cycle_graph(7)
>>> paths = list(nx.shortest_simple_paths(G, 0, 3))
>>> print(paths)
[[0, 1, 2, 3], [0, 6, 5, 4, 3]]

You can use this function to efficiently compute the k shortest/best paths between two nodes.

>>> from itertools import islice
>>> def k_shortest_paths(G, source, target, k, weight=None):
...     return list(islice(nx.shortest_simple_paths(G, source, target,
...         weight=weight), k))
>>> for path in k_shortest_paths(G, 0, 3, 2):
...     print(path)
[0, 1, 2, 3]
[0, 6, 5, 4, 3]

Notes

This procedure is based on algorithm by Jin Y. Yen 1. Finding the first $K$ paths requires $O(KN^3)$ operations.

See also

all_shortest_paths, shortest_path, all_simple_paths

References

1

Jin Y. Yen, “Finding the K Shortest Loopless Paths in a Network”, Management Science, Vol. 17, No. 11, Theory Series (Jul., 1971), pp. 712-716.

Path finding utilities (indra.explanation.pathfinding.util)

indra.explanation.pathfinding.util.get_sorted_neighbors(G, node, reverse, force_edges=None, edge_filter=None)[source]

Filter and sort neighbors of a node in descending order by belief

Parameters
  • G (DiGraph) – A networkx DiGraph

  • node (Union[str, Tuple[str, int]]) – A valid node name or signed node name

  • reverse (bool) – Indicates direction of search. Neighbors are either successors (downstream search) or predecessors (reverse search).

  • force_edges (Optional[List[Tuple[Union[str, Tuple[str, int]], Union[str, Tuple[str, int]]]]]) – A list of allowed edges. If provided, only allow neighbors that can be reached by the allowed edges.

  • edge_filter (Optional[Callable[[DiGraph, Union[str, Tuple[str, int]], Union[str, Tuple[str, int]]], bool]]) – If provided, must be a function that takes three arguments: a graph g, and the nodes u, v of the edge between u and v. The function must return a boolean. The function must return True if the edge is allowed, otherwise False.

Returns

A list of nodes representing the filtered and sorted neighbors

Return type

List[Node]

indra.explanation.pathfinding.util.get_subgraph(g, edge_filter_func)[source]

Get a subgraph of original graph filtered by a provided function.

indra.explanation.pathfinding.util.path_sign_to_signed_nodes(source, target, edge_sign)[source]

Translates a signed edge or path to valid signed nodes

Pairs with a negative source node are filtered out.

Parameters
  • source (str|int) – The source node

  • target (str|int) – The target node

  • edge_sign (int) – The sign of the edge

Returns

sign_tuple – Tuple of tuples of the valid combination of signed nodes

Return type

(a, sign), (b, sign)

indra.explanation.pathfinding.util.signed_nodes_to_signed_edge(source, target)[source]

Create the triple (node, node, sign) from a pair of signed nodes

Assuming source, target forms an edge of signed nodes: edge = (a, sign), (b, sign), return the corresponding signed edge triple

Parameters
Returns

A tuple, (source, target, sign), representing the corresponding signed edge.

Return type

tuple