ADS HT24, NP-hardness and NP-completeness
Flashcards
Basic definitions
What does it mean for an algorithm $\mathcal A$ to run in polynomial time?
There exists a constant $c > 0$ such that for every input $x$, the algorithm terminates in at most $ \vert x \vert ^c$ steps.
Can you define the “certificate” version of whether a decision problem $\Pi$ belongs to $\mathbf{NP}$?
There exists a polynomial time verifier $\mathcal A$ and a constant $c > 0$ such that for all instances $x$ of $\Pi$,
- $\Pi(x) = \text{Yes} \implies$ these exists a certificate $y$ with $ \vert y \vert \le \vert x \vert ^c$ such that $\mathcal A(x, y) = \text{Yes}$
- $\Pi(x) = \text{No} \implies$ for all certificates $y$, $\mathcal A(x, y) = \text{No}$.
Suppose we want to show that a particular problem is a NP, e.g. 3SAT but where we want to decide if there is a satisfying assignment where at least half of the variables are set to true. Rather than doing a reduction, what can we instead do?
Use the polynomial time verifier characterisation: given a candidate assignment, we can efficiently check whether it is satisfiable, and we can efficiently check if at least half of the variables are set to true.
NP-complete decision problems
What is the $\mathbf{Clique}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a clique of size $\ge k$, where a clique is a set $S \subseteq V$ of mutually adjacent vertices?
What is the $\mathbf{IndependentSet}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
What is the $\mathbf{VertexCover}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
What is the $\mathbf{DominatingSet}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a dominating set $D$ of size $\le k$, where a dominating set $D$ is a set such that $\forall v \in V \setminus D$, $\exists u \in D$ s.t. $(u, v) \in E$.
Reductions
What is the difference between a Karp reduction and a Turing reduction for showing $A \le B$ in the “$A$ reduces to $B$” sense?
- Karp reduction: You are allowed to use only a single call to $B$ to solve $A$
- Turing reduction: You are allowed to use multiple calls to $B$ to solve $A$
The $\mathbf{IndependentSet}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
The $\textbf{VertexCover}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
Quickly prove that
\[\mathbf{VertexCover} \le \mathbf{IndependentSet}\]
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
Suppose $(G, k)$ is an instance of $\mathbf{VertexCover}$. Then $S$ is a vertex cover of $G$ iff $V \setminus S$ is an independent set of $G$, and since this is a polynomial time reduction we are done.
The $\mathbf{IndependentSet}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
The $\textbf{Clique}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a clique of size $\ge k$, where a clique is a set $S \subseteq V$ of mutually adjacent vertices?
Quickly prove that
\[\mathbf{IndependentSet} \le \mathbf{Clique}\]
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist an independent set of size $\ge k$, where a independent set is a set $S \subseteq V$ of mutually non-adjacent vertices?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does there exist a clique of size $\ge k$, where a clique is a set $S \subseteq V$ of mutually adjacent vertices?
Suppose $(G, k)$ is an instance of $\mathbf{IndependentSet}$. Then $S$ is an independent set of $G$ iff $S$ is an clique of $G^C$ (where this denotes the complement of the graph), and since this is a polynomial time reduction we are done.
What is the $\mathbf{SAT}$ decision problem?
- Input: CNF formula $\Phi$
- Output: Is $\Phi$ satisfiable?
What does it mean for a Boolean formula $\Phi$ to be in CNF?
- $\Phi$ is a conjunction (AND) of clauses. ($C _ 1 \wedge \cdots \wedge C _ n$)
- Clauses are disjunctions (OR) of literals. $(\alpha _ 1 \vee \cdots \vee \alpha _ n)$
- Literals are variables or their negation ($x$ or $\bar x$)
What is a literal in a logical clause?
A variable or its negation, e.g. $x$ or $\overline x$.
Can you state the Cook-Levin theorem?
$\mathbf{SAT}$ is NP-complete.
What is the $\mathbf{3SAT}$ decision problem?
- Input: 3CNF formula $\Phi$ (each clause has at most three literals)
- Output: Is $\Phi$ satisfiable?
The $\mathbf{Max3SAT}$ optimisation problem is:
- Input: 3CNF formula $\Phi$ (each clause has at most three literals) (suppose we have $m$ clauses)
- Output: An assignment with the maximum number of clauses satisfied
Given that
\[t(n) \le 7t(n-3) \implies t(n) \le 1.913^n\]
and
\[t(n) \le t(n-1) + t(n-2) + t(n-3) \implies t(n) \le 1.840^n\]
Describe two different algorithms based on the bounded search tree approach, and highlight the better one.
- Input: 3CNF formula $\Phi$ (each clause has at most three literals) (suppose we have $m$ clauses)
- Output: An assignment with the maximum number of clauses satisfied
- If $\Phi$ is a 2CNF formula, then we can find the answer in time $O(m)$.
- Else pick a clause $c = (x _ 1 \vee x _ 2 \vee x _ 3)$ with three literals
- We can either:
- Recurse on all 7 satisfying assignments ($x _ 1 = 0, x _ 2 = 0, x _ 3 = 1$, or $x _ 1 = 0, x _ 2 = 1, x _ 3 = 0$, etc). (Algorithm 1)
- Recurse on 3 assignments $x _ 1 = 1$, or $x _ 1 = 0$, $x _ 2 = 1$, or $x _ 1 = 0$, $x _ 2 = 0$, $x _ 3 = 1$. (Algorithm 2)
- Then output the maximum over all the cases
We have the cases, if $t(n)$ represents the number of leaves in the search tree for $n$ variables
- $t(n) \le 7t(n-3)$
- $t(n) \le t(n-1) + t(n-2) + t(n-3)$
Hence the second algorithm is better.
The $\mathbf{VertexCover}$ decision problem is:
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
The $\mathbf{3SAT}$ decision problem is:
- Input: CNF formula $\Phi$
- Output: Is $\Phi$ satisfiable?
Quickly prove that
\[\mathbf{3SAT} \le \mathbf{VertexCover}\]
- Input: Graph $G = (V, E)$, integer $k$
- Output: Is there a vertex cover of size $\le k$, where a vertex cover is a set $S \subseteq V$ such that every edge $e \in E$ is incident to some vertex in $S$.
- Input: CNF formula $\Phi$
- Output: Is $\Phi$ satisfiable?
Suppose are given a $\mathbf{3SAT}$ problem, the goal is to construct in polynomial time an instance $(G _ \phi, k _ \phi)$ of $\mathbf{VertexCover}$ such that
\[\phi \text{ is satisfiable} \iff G_\phi \text{ has a vertex covering of size}\le k_\phi\]Suppose $\phi$ is a 3CNF formula with $n$ variables and $m$ clauses. We construct the graph $G$ out of $n$ “variable gadgets” and $m$ “clause gadgets”.
(each variable $x _ i$ has a gadget consisting of $x _ i$ connected to $\overline{x _ i}$, and each clause $x _ i \vee x _ j \vee x _ k$ has a clause gadget which is all three variables connected one another)
Then we connect each occurence of $x _ i$ or $\overline{x _ i}$ in a variable gadget to each occurence of $x _ i$ or $\overline{x _ i}$ in a clause gadget. Note then that any vertex cover $S$ has at least $n + 2m$ vertices, since:
- At least two vertices must be in $S$ for each clause gadget
- At least one vertex must be in $S$ for each variable gadget
Let $k _ \phi = n + 2m$ (this turns out to be the right value to choose for $\phi \text{ is satisfiable} \iff G _ \phi \text{ has a vertex covering of size}\le k _ \phi$. For the backward direction, suppose $S$ is a vertex covering where $ \vert S \vert \le k _ \phi$, then by the above we must have that:
- Exactly two vertices in each clause gadget must be in $S$
- Exactly one vertex in each variable gadget must be in $S$
Define $\sigma(x _ i) = \text{True}$ if $x _ i \in S$ and $\text{False}$ otherwise.
Now consider some arbitrary clause $C _ j$ in $\phi$. By the above, the corresponding clause gadget must contain at least one vertex not in $S$. Since $S$ is a vertex cover, this vertex must be connected to a variable in the variable gadget which is in $S$ (otherwise, this edge would have endpoints not in $S$, so $S$ wouldn’t be a vertex cover). By definition, this clause then contains one true variable, hence the overall clause must be satisfied.
For the forward direction, suppose that $\phi$ is a satisfiable formula, where $\sigma$ is a satisfying assignment.
For each variable $x _ i$:
- If $\sigma(x _ i) = \text{True}$ then add $x _ i$ vertex from $i$-th variable gadget to $S$
- If $\sigma(x _ i) = \text{False}$, then add $\overline x _ i$ vertex from $i$-th variable gadget to $S$
By the above argument, for every clause $C _ j$, at least one literal is satisfied in $C _ j$ by $\sigma$. Choose one, and then add the other two literals from the $j$-th clause gadget to $S$.
What is the $\mathbf{LargeCut}$ NP-complete decision problem?
- Input: Graph $G$, integer $M > 0$.
- Output: Does $G$ have a cut of capacity $\ge M$?
What is the $\mathbf{HamiltonianCycle}$ NP-complete decision problem?
- Input: Graph $G$
- Output: Does $G$ have a cycle that goes through every vertex exactly once?
What is the $\mathbf{3COL}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$
- Output: Is the graph $3$-colorable?
What is the $\mathbf{SetCover}$ NP-complete decision problem?
- Input: Set $U$, collection $\mathcal F$ of subsets of $U$, integer $s$
- Output: Can we choose $s$ sets from $\mathcal F$ such that their union is $U$?
What is the $\mathbf{HittingSet}$ NP-complete decision problem?
- Input: Set $U$, collection $\mathcal F$ of subsets of $U$, integer $k$
- Output: Is there $U’ \subseteq U$ with $ \vert U’ \vert \le k$ such that $U’ \cap \mathcal F _ i \ne \emptyset$ for all $i$? i.e. is there a subset of $U$ that contains at least one element of each $\mathcal F _ i$ for each $i$, and has size less than $k$?
What is the $\mathbf{SubsetSum}$ NP-complete decision problem?
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
What is the $\mathbf{EqualPartition}$ NP-complete decision problem?
- Input: Integers $a _ 1, \ldots, a _ n$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = \sum _ {i \notin I} a _ i$
The $\mathbf{EqualPartition}$ decision problem is:
- Input: Integers $a _ 1, \ldots, a _ n$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = \sum _ {i \notin I} a _ i$
The $\mathbf{SubsetSum}$ decision problem is:
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
Quickly prove that
\[\mathbf{SubsetSum} \le \mathbf{EqualPartition}\]
You can assume that $\mathbf{SubsetSum}$ is also NP-complete when we add the restriction that all the integers are positive.
- Input: Integers $a _ 1, \ldots, a _ n$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = \sum _ {i \notin I} a _ i$
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
Let $(L, K)$ be an instance of $\mathbf{SubsetSum}$ where $L$ is a list of integers and $K$ is the target sum.
Define $S = \sum L$, and let
\[L' := L \cup \\{S+K, 2S-K\\}\]If $2S - K$ is negative, we can conclude $(L, K) \notin \mathbf{SubsetSum}$ and we are done. Otherwise, both $S + K$ and $2S - K$ are positive.
We aim to show
\[(L, K) \in \mathbf{SubsetSum} \iff L' \in \mathbf{Partition}\]Valid subset sum implies valid partition: if $\exists M \subseteq L$ such that $\sum M = K$, then
\[L' = (M \cup \\{2S - K\\}) \quad \cup \quad (L'\setminus(M \cup \\{2S - K\\}))\]both of which sum to $2S$, so $L’ \in \mathbf{Partition}$.
For the backward direction, suppose $L’$ can be partitioned into two parts $P _ 1$ and $P _ 2$ such that
\[\sum P_1 = \sum P_2\]By the assumption that all elements of $L’$ are positive, these must both equal $2S$. Hence it must be the case that $S + K$ and $2S - K$ belong to seperate partitions (or the partition they belong to would have sum exceeding $3S$). Then
\[P_2 \setminus \\{2S + K\\}\]has sum $K$, which gives a sublist of $L$ summing to $K$. Hence $(L, K) \in \mathbf{SubsetSum}$.
What is the $\mathbf{MinTSP}$ and $\mathbf{MinMetricTSP}$ optimisation problem?
For the $\mathbf{MinTSP}$:
- Input: Complete graph $G = (V, E)$, cost function $c : V \times V \to \mathbb R _ {\ge 0}$
- Output: A Hamiltonian cycle with minimum cost
For the $\mathbf{MinMetricTSP}$ problem, we have a couple additional restrictions:
- The cost function is symmetric
- The cost function satisfies the triangle inequality
What is the $\mathbf{DisjointTriangles}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does $G$ contain $k$ disjoint triangles?
What is the $\mathbf{LongPath}$ NP-complete decision problem?
- Input: Graph $G = (V, E)$, integer $k$
- Output: Does $G$ contain a path with $k$ vertices?
Quickly prove that integer linear programming is NP-complete.
This problem is NP since a candidate solution can be verified in polynomial time.
We can do this by reduction from $\mathbf{3SAT}$. Let $\Phi$ be a 3CNF formula. For each clause of the form $\ell _ i \vee \ell _ j \vee \ell _ k$, where $\ell _ i \in \{x _ i, \overline{x _ i}\}$, $\ell _ j \in \{x _ j, \overline{x _ j}\}$, $\ell _ k \in \{x _ k, \overline{x _ k}\}$, add a constraint of the form $y _ i + y _ j + y _ k \ge 1$ where $y _ i = x _ i$ if $\ell _ i = x _ i$, and $y _ i = 1 - x _ i$ if $\ell _ i = \overline{x _ i}$.
Then this reduction runs in polynomial time, and there is a satisfying assingment of $\Phi$ if and only if the integer linear program admits a solution.
What is the $\mathbf{LineCover}$ decision problem?
- Input: Set $P$ of $n$ points in the plane, integer $k$
- Output: Are there $\le k$ lines in the plane that cover all the points?
The $\mathbf{SubsetSum}$ decision problem is:
- Input: Integers $a _ 1, \ldots, a _ n$, integer $K$
- Output: Is there a subset $I \subseteq \{1, \cdots, n\}$ such that $\sum _ {i \in I} a _ i = K$
The $\mathbf{3SAT}$ decision problem is:
- Input: 3CNF formula $\Phi$ (each clause has at most three literals)
- Output: Is $\Phi$ satisfiable?
Briefly outline the construction used to prove $\mathbf{SubsetSum}$ is NP-complete, by reducing from $\mathbf{3SAT}$. Use the example of
\[\Phi = (\overline x \vee y) \wedge (x \vee y \vee \overline z) \wedge (x \vee z) \wedge (\overline y \vee z)\]
Let $\Phi$ be an instance of $\mathbf{3SAT}$ with $n$ vars and $m$ clauses.
For each literal in $\Phi$, introduce a number to be used in $\mathbf{SubsetSum}$ with $n + m$ digits:
- The first $n$ digits indicate the index of the variable it corresponds to
- The last $m$ digits will indicate the indices of clauses it satisfied
For each clause $c _ i$ in $\Phi$, add two numbers that can be used as slack where we have a $1$ in the last $m$ digits corresponding to the clause.
E.g. for
\[\Phi = (\overline x \vee y) \wedge (x \vee y \vee \overline z) \wedge (x \vee z) \wedge (\overline y \vee z)\] \[\begin{aligned} x &\mapsto a_1 = (100 \mid 0110) \\\\ y &\mapsto a_1 = (010 \mid 1100) \\\\ z &\mapsto a_1 = (001 \mid 0011) \\\\ \overline x &\mapsto a_1 = (100 \mid 1000) \\\\ \overline y &\mapsto a_1 = (010 \mid 0001) \\\\ \overline z &\mapsto a_1 = (001 \mid 0100) \\\\ \\\\ C_1 &\mapsto c_1, c_1' = (000 \mid 1000) \\\\ C_2 &\mapsto c_1, c_1' = (000 \mid 0100) \\\\ C_3 &\mapsto c_1, c_1' = (000 \mid 0010) \\\\ C_4 &\mapsto c_1, c_1' = (000 \mid 0001) \\\\ \end{aligned}\]Then pick the target $K$ as
\[K = (111 \mid 3333)\]