DAA HT23, Depth-first search
Depth-first search
In the DFS algorithm, the notion of a “colour” is used. What are the three colours of a node, and what do they each represent?
- White, a node has not been explored yet.
- Gray, a node has been visited but its adjacency list has not been fully explored.
- Black, every node in its adjacency list has been fully explored.
The output of DFS can be seen as three arrays, $d, f, \pi$, all indexed by vertices. What does each represent?
- $d[v]$, the discovery time of a vertex
- $f[v]$, the finishing time of a vertex
- $\pi[v]$, the predecessor of a vertex
What is the time complexity of DFS, in terms of $(V, E)$?
The output of DFS includes an array indexed by vertices $\pi[v]$ of “backpointers”, i.e. the predecessor of a given node in the exploration process. This can be used to define a set of edges $E _ \pi$ to make a graph $G _ \pi = (V, E _ \pi)$ called the “depth-first search forest”. How is $E _ \pi$ defined in this case?
The output of DFS includes an array indexed by vertices $\pi[v]$ of “backpointers”, i.e. the predecessor of a given node in the exploration process. This can be used to define a set of edges $E _ \pi$ to make a graph $G _ \pi = (V, E _ \pi)$ called the “depth-first search forest”. What is true about the colour of each node $u$ and $v$ when $(u, v)$ is in $E _ \pi$?
When $(u, v)$ is explored, $u$ is grey and $v$ is white.
Do the outputs of DFS like the finishing times $f$ and the backpointers $\pi$ depend on the order in which the nodes are visited?
Yes.
Can you state the parenthesis theorem about the discovery and finishing times for vertices $u$ and $v$ in a graph explored by depth first search?
Exactly one of the following holds:
- $d[v] < d[u] < f[u] < f[v]$ and $u$ is a descendant of $v$.
- $d[u] < d[v] < f[v] < f[u]$ and $v$ is a descendant of $u$.
- $d[u] < f[u] < d[v] < f[v]$ or $d[v] < f[v] < d[u] < f[u]$ and neither are a descendant of the other.
The parenthesis theorem states that for all $u, v$ in a graph explored by DFS, exactly one of the following holds:
- $d[v] < d[u] < f[u] < f[v]$ and $u$ is a descendant of $v$.
- $d[u] < d[v] < f[v] < f[u]$ and $v$ is a descendant of $u$.
- $d[u] < f[u] < d[v] < f[v]$ or $d[v] < f[v] < d[u] < f[u]$ and neither are a descendant of the other.
Why is this called the parenthesis theorem?
Let
- $( _ u$ correspond to $d[u]$,
- ${} _ u)$ correspond to $f[u]$,
- $[ _ v$ correspond to $d[v]$,
- ${} _ v]$ correspond to $f[v]$
Then the following are valid
\[\begin{aligned} (_u \, {}_u) [_v \, {}_v] \\\\ (_u \, [_v \, {}_v] \, {}_u) \\\\ \end{aligned}\]but
\[(_u \, [_v \, {}_u) \, {}_v]\]is not.
The parenthesis theorem states that for all $u, v$ in a graph explored by DFS, exactly one of the following holds:
- $d[v] < d[u] < f[u] < f[v]$ and $u$ is a descendant of $v$.
- $d[u] < d[v] < f[v] < f[u]$ and $v$ is a descendant of $u$.
- $d[u] < f[u] < d[v] < f[v]$ or $d[v] < f[v] < d[u] < f[u]$ and neither are a descendant of the other.
What does this let you say in general about when $v$ is a descendant of $u$?
$v$ is a descendant of $u$ if and only if $d[u] < d[v] < f[v] < f[u]$.
Running depth-first search gives three outputs,
- $d[v]$, an array of discovery times,
- $f[v]$, an array of finishing times,
- $\pi[v]$, an array of backpointers.
The array $\pi$ gives rise to a “depth-first search forest”. But there is also an implicit characterisation of the different edges of the original graph, as “tree” edges, “back” edges, “forward” edges and “cross” edges. What does each represent?
- Tree edge; edges of the DFS forest
- Back edge, edge from a node in a DFS forest to a previous (ancestor) node in the same DFS forest
- Forward edge, edge from a node in a DFS forest to a successive (non-child descendant) node in the DFS tree.
- Cross edge, edge from a node to neither a descendant or a successor in the tree.
Running depth-first search gives three outputs,
- $d[v]$, an array of discovery times,
- $f[v]$, an array of finishing times,
- $\pi[v]$, an array of backpointers.
The array $\pi$ gives rise to a “depth-first search forest”. But there is also an implicit characterisation of the different edges of the original graph, as “tree” edges, “back” edges, “forward” edges and “cross” edges:
- Tree edge; edges of the DFS forest
- Back edge, edge from a node in a DFS forest to a previous (ancestor) node in the same DFS forest
- Forward edge, edge from a node in a DFS forest to a successive (non-child descendant) node in the DFS tree.
- Cross edge, edge from a node to neither a descendant or a successor in the tree.
When does each occur in terms of the starting a finishing times for $(u, v)$?
- Tree or forward edge, $d[u] < d[v] < f[v] < f[u]$
- Back edge, $d[v] < d[u] < f[u] < f[v]$
- Cross edge, $d[v] < f[v] < d[u] < f[u]$
Can you give the characterisation of a tree or forward edge $(u, v)$ in terms of its discovery and finishing times?
Can you give the characterisation of a back edge $(u, v)$ in terms of its discovery and finishing times?
Can you give the characterisation of a cross edge $(u, v)$ in terms of its discovery and finishing times?
What lemma gives rise the linear time DFS-based algorithm for detecting if a graph has a cycle?
A directed graph has a cycle if and only if its DFS has a back edge.
What is a topological sorting of a DAG?
An ordering of the vertices so that for every directed edge $(u, v)$, $u$ comes before $v$ in the ordering.
Why couldn’t you have the topological sorting of a graph that isn’t a DAG?
Because then it would be impossible for some nodes to be before their predecessors.
What’s a very simple algorithm to compute a topological ordering of a directed graph $(V, E)$?
Run DFS to find finishing times, and then output nodes in the reverse order of finishing time.
What type of data structure is used when implementing depth-first search non-recursively?
A stack.
What is the difference between a cycle and a simple cycle?
A simple cycle cannot repeat vertices.