Notes - ADS HT24, NP-hardness and NP-completeness
Flashcards
What does it mean for an algorithm $\mathcal A$ to run in polynomial time?
There exists a constant $c > 0$ such that for every input $x$, the algorithm terminates in at most $ \vert x \vert ^c$ steps.
Can you define the “certificate” version of whether a decision problem $\Pi$ belongs to $\mathbf{NP}$?
There exists a polynomial time verifier $\mathcal A$ and a constant $c > 0$ such that for all instances $x$ of $\Pi$,
- $\Pi(x) = \text{Yes} \implies$ these exists a certificate $y$ with $ \vert y \vert \le \vert x \vert ^c$ such that $\mathcal A(x, y) = \text{Yes}$
- $\Pi(x) = \text{No} \implies$ for all certificates $y$, $\mathcal A(x, y) = \text{No}$.
Suppose we want to show that a particular problem is a NP, e.g. 3SAT but where we want to decide if there is a satisfying assignment where at least half of the variables are set to true. Rather than doing a reduction, what can we instead do?
Use the polynomial time verifier characterisation: given a candidate assignment, we can efficiently check whether it is satisfiable, and we can efficiently check if at least half of the variables are set to true.
What is the difference between a Karp reduction and a Turing reduction for showing $A \le B$ in the “$A$ reduces to $B$” sense?
- Karp reduction: You are allowed to use only a single call to $B$ to solve $A$
- Turing reduction: You are allowed to use multiple calls to $B$ to solve $A$
What is the $\mathbf{Clique}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a clique of size $\ge k$, where a clique is a set $S \subseteq V$ of mutually adjacent vertices?
What is the $\mathbf{IndependentSet}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
What is the $\mathbf{VertexCover}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
What is the $\mathbf{DominatingSet}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a dominating set $D$ of size $\le k$, where a dominating set $D$ is a set such that $\forall v \in V \setminus D$, $\exists u \in D$ s.t. $(u, v) \in E$.
The $\mathbf{IndependentSet}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
The $\textbf{VertexCover}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
Quickly prove that
\[\mathbf{VertexCover} \le \mathbf{IndependentSet}\]
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
Suppose $(G, k)$ is an instance of $\mathbf{VertexCover}$. Then $S$ is a vertex cover of $G$ iff $V \setminus S$ is an independent set of $G$, and since this is a polynomial time reduction we are done.
The $\mathbf{IndependentSet}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
The $\textbf{Clique}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a clique of size $\ge k$, where a clique is a set $S \subseteq V$ of mutually adjacent vertices?
Quickly prove that
\[\mathbf{IndependentSet} \le \mathbf{Clique}\]
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a clique of size $\ge k$, where a clique is a set $S \subseteq V$ of mutually adjacent vertices?
Suppose $(G, k)$ is an instance of $\mathbf{IndependentSet}$. Then $S$ is an independent set of $G$ iff $S$ is an clique of $G^C$ (where this denotes the complement of the graph), and since this is a polynomial time reduction we are done.
What is the $\mathbf{SAT}$ decision problem?
- Input: CNF formula $\Phi$
- Output: Is $\Phi$ satisfiable?
What does it mean for a Boolean formula $\Phi$ to be in CNF?
- $\Phi$ is a conjunction (AND) of clauses. ($C _ 1 \wedge \cdots \wedge C _ n$)
- Clauses are disjunctions (OR) of literals. $(\alpha _ 1 \vee \cdots \vee \alpha _ n)$
- Literals are variables or their negation ($x$ or $\bar x$)
What is a literal in a logical clause?
A variable or its negation, e.g. $x$ or $\overline x$.
Can you state the Cook-Levin theorem?
$\mathbf{SAT}$ is NP-complete.
What is the $\mathbf{3SAT}$ decision problem?
- Input: 3CNF formula $\Phi$ (each clause has at most three literals)
- Output: Is $\Phi$ satisfiable?
The $\mathbf{Max3SAT}$ optimisation problem is:
- Input: 3CNF formula $\Phi$ (each clause has at most three literals) (suppose we have $m$ clauses)
- Output: An assignment with the maximum number of clauses satisfied
Given that
\[t(n) \le 7t(n-3) \implies t(n) \le 1.913^n\]
and
\[t(n) \le t(n-1) + t(n-2) + t(n-3) \implies t(n) \le 1.840^n\]
Describe two different algorithms based on the bounded search tree approach, and highlight the better one.
- Input: 3CNF formula $\Phi$ (each clause has at most three literals) (suppose we have $m$ clauses)
- Output: An assignment with the maximum number of clauses satisfied
- If $\Phi$ is a 2CNF formula, then we can find the answer in time $O(m)$.
- Else pick a clause $c = (x _ 1 \vee x _ 2 \vee x _ 3)$ with three literals
- We can either:
- Recurse on all 7 satisfying assignments ($x _ 1 = 0, x _ 2 = 0, x _ 3 = 1$, or $x _ 1 = 0, x _ 2 = 1, x _ 3 = 0$, etc). (Algorithm 1)
- Recurse on 3 assignments $x _ 1 = 1$, or $x _ 1 = 0$, $x _ 2 = 1$, or $x _ 1 = 0$, $x _ 2 = 0$, $x _ 3 = 1$. (Algorithm 2)
- Then output the maximum over all the cases
We have the cases, if $t(n)$ represents the number of leaves in the search tree for $n$ variables
- $t(n) \le 7t(n-3)$
- $t(n) \le t(n-1) + t(n-2) + t(n-3)$
Hence the second algorithm is better.
The $\mathbf{VertexCover}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
The $\mathbf{3SAT}$ decision problem is:
- Input: CNF formula $\Phi$
- Output: Is $\Phi$ satisfiable?
Quickly prove that
\[\mathbf{3SAT} \le \mathbf{VertexCover}\]
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
- Input: CNF formula $\Phi$
- Output: Is $\Phi$ satisfiable?
Suppose are given a $\mathbf{3SAT}$ problem, the goal is to construct in polynomial time an instance $(G _ \phi, k _ \phi)$ of $\mathbf{VertexCover}$ such that
\[\phi \text{ is satisfiable} \iff G_\phi \text{ has a vertex covering of size}\le k_\phi\]Suppose $\phi$ is a 3CNF formula with $n$ variables and $m$ clauses. We construct the graph $G$ out of $n$ “variable gadgets” and $m$ “clause gadgets”.
(each variable $x _ i$ has a gadget consisting of $x _ i$ connected to $\overline{x _ i}$, and each clause $x _ i \vee x _ j \vee x _ k$ has a clause gadget which is all three variables connected one another)
Then we connect each occurence of $x _ i$ or $\overline{x _ i}$ in a variable gadget to each occurence of $x _ i$ or $\overline{x _ i}$ in a clause gadget. Note then that any vertex cover $S$ has at least $n + 2m$ vertices, since:
- At least two vertices must be in $S$ for each clause gadget
- At least one vertex must be in $S$ for each variable gadget
Let $k _ \phi = n + 2m$ (this turns out to be the right value to choose for $\phi \text{ is satisfiable} \iff G _ \phi \text{ has a vertex covering of size}\le k _ \phi$. For the backward direction, suppose $S$ is a vertex covering where $ \vert S \vert \le k _ \phi$, then by the above we must have that:
- Exactly two vertices in each clause gadget must be in $S$
- Exactly one vertex in each variable gadget must be in $S$
Define $\sigma(x _ i) = \text{True}$ if $x _ i \in S$ and $\text{False}$ otherwise.
Now consider some arbitrary clause $C _ j$ in $\phi$. By the above, the corresponding clause gadget must contain at least one vertex not in $S$. Since $S$ is a vertex cover, this vertex must be connected to a variable in the variable gadget which is in $S$ (otherwise, this edge would have endpoints not in $S$, so $S$ wouldn’t be a vertex cover). By definition, this clause then contains one true variable, hence the overall clause must be satisfied.
For the forward direction, suppose that $\phi$ is a satisfiable formula, where $\sigma$ is a satisfying assignment.
For each variable $x _ i$:
- If $\sigma(x _ i) = \text{True}$ then add $x _ i$ vertex from $i$-th variable gadget to $S$
- If $\sigma(x _ i) = \text{False}$, then add $\overline x _ i$ vertex from $i$-th variable gadget to $S$
By the above argument, for every clause $C _ j$, at least one literal is satisfied in $C _ j$ by $\sigma$. Choose one, and then add the other two literals from the $j$-th clause gadget to $S$.
What is the $\mathbf{LargeCut}$ NP-complete decision problem?
- Input: Graph $G$, integer $M > 0$.
- Output: Does $G$ have a cut of capacity $\ge M$?
What is the $\mathbf{HamiltonianCycle}$ NP-complete decision problem?
- Input: Graph $G$
- Output: Does $G$ have a cycle that goes through every vertex exactly once?
What is the $\mathbf{3COL}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$
- Output: Is the graph $3$-colorable?
What is the $\mathbf{SetCover}$ NP-complete decision problem?
- Input: Set $U$, collection $\mathcal F$ of subsets of $U$, integer $s$
- Output: Can we choose $s$ sets from $\mathcal F$ such that their union is $U$?
What is the $\mathbf{HittingSet}$ NP-complete decision problem?
- Input: Set $U$, collection $\mathcal F$ of subsets of $U$, integer $k$
- Output: Is there $U’ \subseteq U$ with $ \vert U’ \vert \le k$ such that $U’ \cap \mathcal F _ i \ne \emptyset$ for all $i$? i.e. is there a subset of $U$ that contains at least one element of each $\mathcal F _ i$ for each $i$, and has size less than $k$?
What is the $\mathbf{SubsetSum}$ NP-complete decision problem?
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
What is the $\mathbf{EqualPartition}$ NP-complete decision problem?
- Input: Integers $a _ 1, \ldots, a _ n$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = \sum _ {i \notin I} a _ i$
The $\mathbf{EqualPartition}$ decision problem is:
- Input: Integers $a _ 1, \ldots, a _ n$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = \sum _ {i \notin I} a _ i$
The $\mathbf{SubsetSum}$ decision problem is:
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
Quickly prove that
\[\mathbf{SubsetSum} \le \mathbf{EqualPartition}\]
You can assume that $\mathbf{SubsetSum}$ is also NP-complete when we add the restriction that all the integers are positive.
- Input: Integers $a _ 1, \ldots, a _ n$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = \sum _ {i \notin I} a _ i$
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
Let $(L, K)$ be an instance of $\mathbf{SubsetSum}$ where $L$ is a list of integers and $K$ is the target sum.
Define $S = \sum L$, and let
\[L' := L \cup \\{S+K, 2S-K\\}\]If $2S - K$ is negative, we can conclude $(L, K) \notin \mathbf{SubsetSum}$ and we are done. Otherwise, both $S + K$ and $2S - K$ are positive.
We aim to show
\[(L, K) \in \mathbf{SubsetSum} \iff L' \in \mathbf{Partition}\]Valid subset sum implies valid partition: if $\exists M \subseteq L$ such that $\sum M = K$, then
\[L' = (M \cup \\{2S - K\\}) \quad \cup \quad (L'\setminus(M \cup \\{2S - K\\}))\]both of which sum to $2S$, so $L’ \in \mathbf{Partition}$.
For the backward direction, suppose $L’$ can be partitioned into two parts $P _ 1$ and $P _ 2$ such that
\[\sum P_1 = \sum P_2\]By the assumption that all elements of $L’$ are positive, these must both equal $2S$. Hence it must be the case that $S + K$ and $2S - K$ belong to seperate partitions (or the partition they belong to would have sum exceeding $3S$). Then
\[P_2 \setminus \\{2S + K\\}\]has sum $K$, which gives a sublist of $L$ summing to $K$. Hence $(L, K) \in \mathbf{SubsetSum}$.
What is the $\mathbf{MinTSP}$ and $\mathbf{MinMetricTSP}$ optimisation problem?
For the $\mathbf{MinTSP}$:
- Input: Complete graph $G = (V, E)$, cost function $c : V \times V \to \mathbb R _ {\ge 0}$
- Output: A Hamiltonian cycle with minimum cost
For the $\mathbf{MinMetricTSP}$ problem, we have a couple additional restrictions:
- The cost function is symmetric
- The cost function satisfies the triangle inequality
What is the $\mathbf{DisjointTriangles}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does $G$ contain $k$ disjoint triangles?
What is the $\mathbf{LongPath}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does $G$ contain a path with $k$ vertices?
Quickly prove that integer linear programming is NP-complete.
This problem is NP since a candidate solution can be verified in polynomial time.
We can do this by reduction from $\mathbf{3SAT}$. Let $\Phi$ be a 3CNF formula. For each clause of the form $\ell _ i \vee \ell _ j \vee \ell _ k$, where $\ell _ i \in \{x _ i, \overline{x _ i}\}$, $\ell _ j \in \{x _ j, \overline{x _ j}\}$, $\ell _ k \in \{x _ k, \overline{x _ k}\}$, add a constraint of the form $y _ i + y _ j + y _ k \ge 1$ where $y _ i = x _ i$ if $\ell _ i = x _ i$, and $y _ i = 1 - x _ i$ if $\ell _ i = \overline{x _ i}$.
Then this reduction runs in polynomial time, and there is a satisfying assingment of $\Phi$ if and only if the integer linear program admits a solution.
What is the $\mathbf{LineCover}$ decision problem?
- Input: Set $P$ of $n$ points in the plane, integer $k$
- Output: Are there $\le k$ lines in the plane that cover all the points?
The $\mathbf{SubsetSum}$ decision problem is:
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
The $\mathbf{3SAT}$ decision problem is:
- Input: 3CNF formula $\Phi$ (each clause has at most three literals)
- Output: Is $\Phi$ satisfiable?
Briefly outline the construction used to prove $\mathbf{SubsetSum}$ is NP-complete, by reducing from $\mathbf{3SAT}$. Use the example of
\[\Phi = (\overline x \vee y) \wedge (x \vee y \vee \overline z) \wedge (x \vee z) \wedge (\overline y \vee z)\]
Let $\Phi$ be an instance of $\mathbf{3SAT}$ with $n$ vars and $m$ clauses.
For each literal in $\Phi$, introduce a number to be used in $\mathbf{SubsetSum}$ with $n + m$ digits:
- The first $n$ digits indicate the index of the variable it corresponds to
- The last $m$ digits will indicate the indices of clauses it satisfied
For each clause $c _ i$ in $\Phi$, add two numbers that can be used as slack where we have a $1$ in the last $m$ digits corresponding to the clause.
E.g. for
\[\Phi = (\overline x \vee y) \wedge (x \vee y \vee \overline z) \wedge (x \vee z) \wedge (\overline y \vee z)\] \[\begin{aligned} x &\mapsto a_1 = (100 \mid 0110) \\\\ y &\mapsto a_1 = (010 \mid 1100) \\\\ z &\mapsto a_1 = (001 \mid 0011) \\\\ \overline x &\mapsto a_1 = (100 \mid 1000) \\\\ \overline y &\mapsto a_1 = (010 \mid 0001) \\\\ \overline z &\mapsto a_1 = (001 \mid 0100) \\\\ \\\\ C_1 &\mapsto c_1, c_1' = (000 \mid 1000) \\\\ C_2 &\mapsto c_1, c_1' = (000 \mid 0100) \\\\ C_3 &\mapsto c_1, c_1' = (000 \mid 0010) \\\\ C_4 &\mapsto c_1, c_1' = (000 \mid 0001) \\\\ \end{aligned}\]Then pick the target $K$ as
\[K = (111 \mid 3333)\]