Path finding algorithms for explanation (indra.explanation.pathfinding
)
Path finding functions (indra.explanation.pathfinding.pathfinding
)
- indra.explanation.pathfinding.pathfinding.bfs_search(g, source_node, reverse=False, depth_limit=2, path_limit=None, max_per_node=5, node_filter=None, node_blacklist=None, terminal_ns=None, sign=None, max_memory=536870912, hashes=None, allow_edge=None, strict_mesh_id_filtering=False, edge_filter=None, **kwargs)[source]
Do breadth first search from a given node and yield paths
- Parameters
g (
DiGraph
) – An nx.DiGraph to search in. Can also be a signed node graph. It is required that node data contains ‘ns’ (namespace) and edge data contains ‘belief’.source_node (
Union
[str
,Tuple
[str
,int
]]) – Node in the graph to start from.reverse (
Optional
[bool
]) – If True go upstream from source, otherwise go downstream. Default: False.depth_limit (
Optional
[int
]) – Stop when all paths with this many edges have been found. Default: 2.path_limit (
Optional
[int
]) – The maximum number of paths to return. Default: no limit.max_per_node (
Optional
[int
]) – The maximum number of paths to yield per parent node. If 1 is chosen, the search only goes down to the leaf node of its first encountered branch. Default: 5node_filter (
Optional
[List
[str
]]) – The allowed namespaces (node attribute ‘ns’) for the nodes in the pathnode_blacklist (
Optional
[Set
[Union
[str
,Tuple
[str
,int
]]]]) – A set of nodes to ignore. Default: None.terminal_ns (
Optional
[List
[str
]]) – Force a path to terminate when any of the namespaces in this list are encountered and only yield paths that terminate at these namespacessign (
Optional
[int
]) – If set, defines the search to be a signed search. Default: None.max_memory (
Optional
[int
]) – The maximum memory usage in bytes allowed for the variables queue and visited. Default: 1073741824 bytes (== 1 GiB).hashes (
Optional
[List
[int
]]) – List of hashes used (if not empty) to select edges for path findingallow_edge (
Optional
[Callable
[[Union
[str
,Tuple
[str
,int
]],Union
[str
,Tuple
[str
,int
]]],bool
]]) – Function telling the edge must be omittedstrict_mesh_id_filtering (
Optional
[bool
]) – If true, exclude all edges not relevant to provided hashesedge_filter (
Optional
[Callable
[[DiGraph
,Union
[str
,Tuple
[str
,int
]],Union
[str
,Tuple
[str
,int
]]],bool
]]) –If provided, must be a function that takes three arguments: a graph g, and the nodes u, v of the edge between u and v. The function must return a boolean. The function must return True if the edge is allowed, otherwise False. Example of function that only allows edges that have an edge belief above 0.75:
>>> g = nx.DiGraph({'CHEK1': {'FANC': {'belief': 1}}}) >>> def filter_example(g, u, v): ... return g.edges[u, v].get('belief', 0) > 0.75 >>> path_generator = bfs_search(g, source_node='CHEK1', ... edge_filter=filter_example)
- Yields
Tuple[Node, …] – Paths in the bfs search starting from source.
- Raises
StopIteration – Raises StopIteration when no more paths are available or when the memory limit is reached
- Return type
Generator
[Tuple
[Union
[str
,Tuple
[str
,int
]]],Tuple
[Optional
[Set
[Union
[str
,Tuple
[str
,int
]]]],Optional
[Set
[Tuple
[Union
[str
,Tuple
[str
,int
]],Union
[str
,Tuple
[str
,int
]]]]]],None
]
- indra.explanation.pathfinding.pathfinding.bfs_search_multiple_nodes(g, source_nodes, path_limit=None, **kwargs)[source]
Do breadth first search from each of given nodes and yield paths until path limit is met.
- Parameters
g (nx.Digraph) – An nx.DiGraph to search in. Can also be a signed node graph. It is required that node data contains ‘ns’ (namespace) and edge data contains ‘belief’.
source_nodes (list[node]) – List of nodes in the graph to start from.
path_limit (int) – The maximum number of paths to return. Default: no limit.
**kwargs (keyword arguments) – Any kwargs to pass to bfs_search.
- Yields
path (tuple(node)) – Paths in the bfs search starting from source.
- indra.explanation.pathfinding.pathfinding.find_sources(graph, target, sources, filter_func=None)[source]
Get the set of source nodes with paths to the target.
Given a common target and a list of sources (or None if test statement subject is None), perform a breadth-first search upstream from the target to determine whether there are any sources that have paths to the target. For efficiency, does not return the full path, but identifies the upstream sources and the length of the path.
- Parameters
graph (nx.DiGraph) – A DiGraph with signed nodes to find paths in.
target (node) – The signed node (usually common target node) in the graph to start looking upstream for matching sources.
sources (list[node]) – Signed nodes corresponding to the subject or upstream influence being checked.
filter_func (Optional[function]) – A function to constrain the intermediate nodes in the path. A function should take a node as a parameter and return True if the node is allowed to be in a path and False otherwise.
- Returns
Yields tuples of source node and path length (int). If there are no paths to any of the given source nodes, the generator is empty.
- Return type
generator of (source, path_length)
- indra.explanation.pathfinding.pathfinding.get_path_iter(graph, source, target, path_length, loop, dummy_target, filter_func)[source]
Return a generator of paths with path_length cutoff from source to target.
- Parameters
graph (nx.Digraph) – An nx.DiGraph to search in.
source (node) – Starting node for path.
target (node) – Ending node for path.
path_length (int) – Maximum depth of the paths.
loop (bool) – Whether the path should be a loop. If True, source is appended to path.
dummy_target (bool) – Whether provided target is a dummy node and should be removed from path
filter_func (function or None) – A function to constrain the search. A function should take a node as a parameter and return True if the node is allowed to be in a path and False otherwise. If None, then no filtering is done.
- Returns
path_generator – A generator of the paths between source and target.
- Return type
generator
- indra.explanation.pathfinding.pathfinding.open_dijkstra_search(g, start, reverse=False, path_limit=None, node_filter=None, hashes=None, ignore_nodes=None, ignore_edges=None, terminal_ns=None, weight=None, ref_counts_function=None, const_c=1, const_tk=10)[source]
Do Dijkstra search from a given node and yield paths
- Parameters
g (nx.Digraph) – An nx.DiGraph to search in.
start (node) – Node in the graph to start from.
reverse (bool) – If True go upstream from source, otherwise go downstream. Default: False.
path_limit (int) – The maximum number of paths to return. Default: no limit.
node_filter (list[str]) – The allowed namespaces (node attribute ‘ns’) for the nodes in the path
hashes (list) – List of hashes used to set edge weights
ignore_nodes (container of nodes) – nodes to ignore, optional
ignore_edges (container of edges) – edges to ignore, optional
terminal_ns (list[str]) – Force a path to terminate when any of the namespaces in this list are encountered and only yield paths that terminate at these namepsaces
weight (str) – Name of edge’s attribute used as its weight
ref_counts_function (function) – function counting references and PMIDs of an edge from its statement hashes
const_c (int) – Constant used in MeSH IDs-based weight calculation
const_tk (int) – Constant used in MeSH IDs-based weight calculation
- Yields
path (tuple(node)) – Paths in the bfs search starting from source.
- indra.explanation.pathfinding.pathfinding.shortest_simple_paths(G, source, target, weight=None, ignore_nodes=None, ignore_edges=None, hashes=None, ref_counts_function=None, strict_mesh_id_filtering=False, const_c=1, const_tk=10)[source]
- Generate all simple paths in the graph G from source to target,
starting from shortest ones.
A simple path is a path with no repeated nodes.
If a weighted shortest path search is to be used, no negative weights are allowed.
- Parameters
G (NetworkX graph) –
source (node) – Starting node for path
target (node) – Ending node for path
weight (string) – Name of the edge attribute to be used as a weight. If None all edges are considered to have unit weight. Default value None.
ignore_nodes (container of nodes) – nodes to ignore, optional
ignore_edges (container of edges) – edges to ignore, optional
hashes (list) – hashes specifying (if not empty) allowed edges
ref_counts_function (function) – function counting references and PMIDs of an edge from its statement hashes
strict_mesh_id_filtering (bool) – if true, exclude all edges not relevant to provided hashes
const_c (int) – Constant used in MeSH IDs-based weight calculation
const_tk (int) – Constant used in MeSH IDs-based weight calculation
- Returns
path_generator – A generator that produces lists of simple paths, in order from shortest to longest.
- Return type
generator
- Raises
NetworkXNoPath – If no path exists between source and target.
NetworkXError – If source or target nodes are not in the input graph.
NetworkXNotImplemented – If the input graph is a Multi[Di]Graph.
Examples
>>> G = nx.cycle_graph(7) >>> paths = list(nx.shortest_simple_paths(G, 0, 3)) >>> print(paths) [[0, 1, 2, 3], [0, 6, 5, 4, 3]]
You can use this function to efficiently compute the k shortest/best paths between two nodes.
>>> from itertools import islice >>> def k_shortest_paths(G, source, target, k, weight=None): ... return list(islice(nx.shortest_simple_paths(G, source, target, ... weight=weight), k)) >>> for path in k_shortest_paths(G, 0, 3, 2): ... print(path) [0, 1, 2, 3] [0, 6, 5, 4, 3]
Notes
This procedure is based on algorithm by Jin Y. Yen 1. Finding the first $K$ paths requires $O(KN^3)$ operations.
See also
all_shortest_paths
,shortest_path
,all_simple_paths
References
- 1
Jin Y. Yen, “Finding the K Shortest Loopless Paths in a Network”, Management Science, Vol. 17, No. 11, Theory Series (Jul., 1971), pp. 712-716.
Path finding utilities (indra.explanation.pathfinding.util
)
- indra.explanation.pathfinding.util.get_sorted_neighbors(G, node, reverse, force_edges=None, edge_filter=None)[source]
Filter and sort neighbors of a node in descending order by belief
- Parameters
G (
DiGraph
) – A networkx DiGraphnode (
Union
[str
,Tuple
[str
,int
]]) – A valid node name or signed node namereverse (
bool
) – Indicates direction of search. Neighbors are either successors (downstream search) or predecessors (reverse search).force_edges (
Optional
[List
[Tuple
[Union
[str
,Tuple
[str
,int
]],Union
[str
,Tuple
[str
,int
]]]]]) – A list of allowed edges. If provided, only allow neighbors that can be reached by the allowed edges.edge_filter (
Optional
[Callable
[[DiGraph
,Union
[str
,Tuple
[str
,int
]],Union
[str
,Tuple
[str
,int
]]],bool
]]) – If provided, must be a function that takes three arguments: a graph g, and the nodes u, v of the edge between u and v. The function must return a boolean. The function must return True if the edge is allowed, otherwise False.
- Returns
A list of nodes representing the filtered and sorted neighbors
- Return type
List[Node]
- indra.explanation.pathfinding.util.get_subgraph(g, edge_filter_func)[source]
Get a subgraph of original graph filtered by a provided function.
- indra.explanation.pathfinding.util.path_sign_to_signed_nodes(source, target, edge_sign)[source]
Translates a signed edge or path to valid signed nodes
Pairs with a negative source node are filtered out.