\documentclass[11pt]{article}
%\usepackage{mathrsfs}
%\usepackage{psbox}
\usepackage{amsmath,amssymb,complexity}
\usepackage{epsf,epic,eepic}
\usepackage{pgf,pgfarrows,pgfnodes,pgfautomata,pgfheaps}
\textwidth=6in
\oddsidemargin=0.25in
\evensidemargin=0.25in
\topmargin=-0.1in
\footskip=0.8in
\parindent=0.0cm
\parskip=0.3cm
\textheight=8.00in
\setcounter{tocdepth} {3}
\setcounter{secnumdepth} {2}
\sloppy
\newtheorem{example}{Example}
\newcommand{\ntime}{\hbox{NTIME}}
\newcommand{\nspace}{\hbox{NSPACE}}
\newcommand{\conspace}{\hbox{co-NSPACE}}
\newcommand{\np}{\hbox{NP}}
\newcommand{\reach}{{\sf reach}}
\newcommand{\pspace}{\hbox{PSPACE}}
\newcommand{\lspace}{\hbox{L}}
\newcommand{\conp}{\hbox{coNP}}
\newcommand{\exptime}{\hbox{EXPTIME}}
\newcommand{\elem}{\hbox{E}}
\newcommand{\nl}{\hbox{NL}}
\newcommand{\bpp}{\hbox{BPP}}
\newcommand{\nregexp}{\hbox{NREGEXP}}
\newcommand{\tqbf}{\hbox{TQBF}}
\newcommand{\MAJ}{\hbox{\sc Maj}}
\newcommand{\threesat}{\hbox{3SAT}}
\newcommand{\cvp}{\hbox{CVP}}
\newcommand{\stconn}{\hbox{STCONN}}
\newcommand{\ispath}{\hbox{ISPATH}}
\renewcommand{\C}{\mathcal{C}}
\newtheorem{conjecture}{Conjecture}
%\newcommand{\class}{\hbox{$\mathbb{C}$}}
%\newcommand{\class}{\hbox{$\mathbf{C}$}}
\newcommand{\lep}{\leq _{\hbox{P}}}
\newcommand{\lel}{\leq _{\hbox{L}}}
\newcommand{\aspace}[1]{{\rm ASPACE}(#1)}
\newcommand{\atime}[1]{{\rm ATIME}(#1)}
\newcommand{\spa}[1]{{\rm SPACE}(#1)}
\newcommand{\ti}[1]{{\rm TIME}(#1)}
\newcommand{\ap}{{\rm AP}}
\newcommand{\al}{{\rm AL}}
\newcommand{\bhp}{{\sc BHP}}
\newcommand{\sat}{{\sc SAT}}
\newcommand{\cktsat}{{\sc Circuit-Sat}}
\newcommand{\cktval}{{\sc Circuit-Value}}
\begin{document}
\input{preamble.tex}
\lecture{7}{Apr 8, 2009}{Jayalal Sarma M.N.}{Shiteng Chen}
In this lecture, we will discuss two of the basic concepts in complexity
theory: the notions of reduction and completeness.
In the past few lectures, we saw ways of formally arguing about a
class of problems which can be solved under certain resource
requirements. In order to understand this at a finer level, we need to
be able to analyse them at an individual level, and answer questions
like ``how do we compare resource requirements of various
computational problems?'' and ``what is the hardest problem in a class
$\C$?'' etc. The notion of reductions comes in exactly at this
point.
The outline of this lecture is as follows. We will motivate and define
different notions of reductions between problems. In particular, we
will define many-one reductions (also known as Karp reductions), Turing
reductions (also known as Cook reductions) and finally truth-table reductions.
Then we will define the notion of completeness for a complexity class,
and describe (and sometimes prove) completeness of problems for various
complexity classes that we described in earlier lectures. This gives
us a concrete machine-independent way of thinking about these resource
requirements, which is going to be crucial in the upcoming lectures.
We will end by mentioning a few interesting concepts about the structure of
$\NP$-complete sets. We will define sparse sets and polynomial-size circuits,
and state a collapse result which we will prove in the upcoming lecture.
\section{Reductions}
In order to make comparisons between two problems (say $A$ and $B$), a
simple line of thought is to transform an instance of $A$ into an
instance of $B$ such that if you solve $B$'s instance then you can
recover $A$'s solution from it. There are some basic issues with
this. First of all, how do we formally capture the computation that is
needed to make this transformation? Well, the answer is again our
favorite abstractions: Turing machines and circuits.
%\subsection{Resource Bounds for Reductions}
If we impose resource bounds on the reductions (since they are
also Turing machine computations), the comparisons between the
problems can vary in nature. In particular, if the reduction uses only
$\log$ space then we call it a $\log$-space reduction, and if it runs
in $\poly$ time, then we call it a $\poly$-time reduction.
Another natural question is how much computation is needed to
recover the solution of $A$ from that of $B$. Varying this
restriction will give us different kinds of transformations between
problems and gives us tools to analyse the complexity classes at a
finer level.
\subsection{Many-one reduction}
This is the most basic (or strictest) abstraction of what we described
in the previous section. Intuitively, it says that we should not have to
do any computation (not even complementation) after we do the
transformation. This leads to the notion of many-one reduction (we
state this only for $\poly$-time).
\begin{definition}[poly-time many-one reductions]
Let $A, B \subseteq \Sigma^*$ be languages encoding computational problems. We
say that $A$ $\poly$-time many-one reduces to $B$, denoted by $A
\le_m^p B$, if there exists a polynomial-time computable function $f
: \Sigma^* \to \Sigma^*$ such that,
\[ x \in A \iff f(x) \in B \]
\end{definition}
We shall quickly see some basic properties of this reduction. It
defines a relation among subsets of $\Sigma^*$. It is quite clear
(exercise!) that the relation is reflexive and transitive. It is not
symmetric (no proper subset of $\Sigma^*$ can be many-one reduced to
$\Sigma^*$!).
A complexity class $\mathcal{C}$ is said to be closed under a
reduction if, whenever $B \in \C$ and $A \le B$, then $A \in \mathcal{C}$ as
well. We leave the proof of the following easy proposition as an
exercise.
\begin{proposition}
The classes $\NP$ and $\PSPACE$ are both closed under $\poly$-time
many-one reductions.
\end{proposition}
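The closure argument is simply composition: run the reduction $f$, then a decider for $B$. A minimal sketch in Python, with toy languages chosen purely for illustration (they are not from the lecture):

```python
def f(x: str) -> str:
    """Illustrative many-one reduction: appending a '1' flips the
    parity of ones, so x has an odd number of 1s iff f(x) has an
    even number of 1s."""
    return x + "1"

def decide_B(y: str) -> bool:
    """B = binary strings with an even number of 1s (poly-time decidable)."""
    return y.count("1") % 2 == 0

def decide_A(x: str) -> bool:
    """A = binary strings with an odd number of 1s, decided by
    composing the reduction with the decider for B."""
    return decide_B(f(x))
```

Since both the reduction and the decider for $B$ run in polynomial time, so does the composition; this is exactly why the classes above are closed under $\le_m^p$.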
\subsection{Hardness and Completeness}
Having defined a way to compare problems, we can address the second
question that we started out with, namely to use it to define the
``hardest'' problem in a class. Intuitively, the hard problems in a class
$\C$ should be those problems to which every other problem in $\C$
reduces. Clearly, there may be problems which are reducible to each
other, and hence more than one hard problem can exist.
For the notion of hardness to be useful, the reduction (machine $M$)
should run within resource bounds smaller than those of the class
itself. Otherwise, the computation that computes the reduction to the
hard problem could solve the problem by itself.
Now we will make this formal.
\begin{definition}[Hardness \& Completeness]
A language $L \subseteq \Sigma^*$ is hard for a class $\C$ under
$\poly$-time many-one reductions if and only if for every $L' \in \C$, $L'
\le_m^p L$. If in addition $L$ is in the complexity class $\C$, then
$L$ is defined to be complete for the class $\C$.
\end{definition}
Thus to show that a problem $L$ is complete for a class $\C$, we have
to show that (1) every problem in $\C$ reduces to $L$ and (2) $L$
itself is in $\C$. Suppose we have done the hard work once, to prove
that a problem (say $L$) is hard for the class $\C$. Now, in order to
show that another problem $L'$ is hard for $\C$ we won't have to work
from scratch. We saw that reductions are transitive, and hence it
suffices to show that $L$ reduces to $L'$.
Another simple observation is that if a problem $L$ is complete for
the class $\C$ then $\overline{L}$ is complete for the complement
class $\co$-$\C$.
Thus we need a hard problem to begin with. We will demonstrate this
with an example. Let us say our $\C$ is $\NP$. We describe below a
generic $\NP$-complete problem, usually called the {\em canonical
complete problem}.
\begin{definition}[Bounded Halting Problem]
Given the description (encoded as a string in $\Sigma^*$) of a
non-deterministic Turing machine $M$, an input $x$ and an integer $k$
given in unary, test if $M$ accepts input $x$ within $k$ steps.
\end{definition}
\begin{theorem}
\bhp is $\NP$-complete.
\end{theorem}
\begin{proof}
Easy part first: \bhp\ $\in \NP$. Indeed, a non-deterministic machine
can simulate $M$, keep a counter initialised to $k$, and check if
the machine accepts within the stipulated time limit $k$.
We will show that \bhp\ is $\NP$-hard. Let $L \in \NP$. We will show
that $L \le_m^p$ \bhp. By definition, there is a non-deterministic
Turing machine $M$ that decides $L$ and runs in time $p(n)$ for
some fixed polynomial $p$. Thus,
\begin{eqnarray*}
x \in L & \iff & \textrm{$M$ accepts $x$ in time $p(|x|)$} \\
& \iff & (M,x,p(|x|)) \in \textrm{\bhp}.
\end{eqnarray*}
\end{proof}
Using this problem as the starting point, we can derive new
$\NP$-complete problems. Stephen Cook pioneered this conceptual
framework and proved that a number of important computational problems
are actually $\NP$-hard. This came as a surprise to many, and
highlighted the importance of the $\P$ vs.\ $\NP$ problem.
We will not completely describe the $\NP$-hardness proofs by Cook and
others, as they have been covered in previous courses. We will put them
into perspective, draw your attention to some aspects of the
reduction, and leave the details for you to check against what you have
already seen in those past courses.
We will define a more general problem.
\begin{definition}[\cktsat]
Given a Boolean circuit $C(x_1, \ldots, x_n)$, the \cktsat\ problem asks if
there exists an assignment $a_1, \ldots, a_n$ to the variables $x_1,
\ldots, x_n$ such that $C(a_1, \ldots, a_n) = 1$.
\end{definition}
The first problem in Cook's list was the \sat\ problem, which asks for
checking satisfiability of a given Boolean formula $F$ on $n$
variables. A formula, by definition, is also a circuit. Hence
\cktsat, if anything, is harder. But it is also clearly in $\NP$
because the non-deterministic TM can guess an assignment (which has
length only $n$), and then deterministically evaluate the
circuit at this input to check if it indeed evaluates to 1. Notice
that the verification procedure (this is the circuit value problem,
as we will see below) runs in time linear in the size of the circuit
(which is also a part of the input), since it just has to evaluate
the circuit in the most natural way from leaves to root. Thus we
define the following problem as well.
\begin{definition}[\cktval]
Given a Boolean circuit $C(x_1, \ldots, x_n)$ and an assignment $a_1,
\ldots, a_n$ to the variables, the \cktval\ problem asks if $C(a_1, \ldots,
a_n) = 1$.
\end{definition}
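The linear-time, leaves-to-root evaluation described above can be sketched as follows. The gate encoding (a list of gates in topological order) is an assumption made for illustration; any reasonable circuit encoding works.

```python
def circuit_value(circuit, assignment):
    """Evaluate a Boolean circuit in one linear pass.  Each gate is a
    tuple: ("in", j) reads input variable x_j; ("not", a), ("and", a, b)
    and ("or", a, b) refer to earlier gates by index.  The list is in
    topological order and the last gate is the output."""
    val = []
    for gate in circuit:
        op = gate[0]
        if op == "in":
            val.append(assignment[gate[1]])
        elif op == "not":
            val.append(not val[gate[1]])
        elif op == "and":
            val.append(val[gate[1]] and val[gate[2]])
        else:  # "or"
            val.append(val[gate[1]] or val[gate[2]])
    return val[-1]

# Example circuit: C(x0, x1) = x0 AND (NOT x1)
C = [("in", 0), ("in", 1), ("not", 1), ("and", 0, 2)]
```

Each gate is touched exactly once, which is the linear-time bound used in the discussion above.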
You have all seen Cook's reduction from \bhp\ to \sat\ before, so we
will skip it completely in this course. We will enumerate certain
aspects of the reduction that are relevant for us now and state a few
more propositions based on them. We reiterate the basic implication:
{\em if there is a polynomial-time algorithm for solving \sat\ then it
gives us a polynomial-time algorithm for all the problems in $\NP$,
thus proving $\P = \NP$.}
A first observation is that the same idea can be adapted\footnote{we
will skip the details, since it is technical and we are not doing
Cook's proof anyway.} to show that \cktval\ is hard for $\P$.
We also notice that the reduction can be done in $\AC^0$.
\begin{proposition} Under $\AC^0$ many-one reductions :
\begin{itemize}
\item \cktsat\ is $\NP$-complete.
\item \cktval\ is $\P$-complete.
\end{itemize}
\end{proposition}
By now, a lot of interesting natural problems are known to be
$\NP$-complete. The following is a short list of the first few which
were shown to be so. We will not do more hardness proofs in this
course, but it is highly recommended to go through some of them, if
you have not studied them in a previous course.
\begin{remark}
As an intriguing fact, most of the known reductions showing
$\NP$-hardness are in fact $\AC^0$-reductions. Here is an
interesting question to think about: are $\AC^0$ many-one
reductions strictly weaker than $\poly$-time many-one reductions?
\end{remark}
\[
\begin{array}{|l|l|}
\hline {\rm Problem} & {\rm Definition} \\
\hline \textrm{\sc Clique}
&\textrm{Given a graph $G$ and an integer $k$}, \\
&\textrm{Test if $G$ has a complete subgraph of size $k$ or not.} \\
\hline \textrm{\sc Independent Set}
&\textrm{Given a graph $G(V,E)$ and an integer $k$}, \\
&\textrm{Test if there is $V' \subseteq V$ of size $k$}\\
&\textrm{such that for any $u,v \in V'$, $(u,v) \notin E$} \\
\hline \textrm{\sc Vertex Cover}
&\textrm{Given a graph $G(V,E)$ and an integer $k$}, \\
&\textrm{Test if there is $V' \subseteq V$ of size $k$}\\
&\textrm{such that each $e \in E$ has at least one end point in $V'$.} \\
\hline
\end{array}
\]
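Two of the problems in the table are linked by a classical many-one reduction: $G$ has a clique of size $k$ iff the complement graph has an independent set of size $k$. A sketch of the reduction map (the instance is $(G,k)$ and $k$ is passed through unchanged):

```python
from itertools import combinations

def clique_to_independent_set(vertices, edges, k):
    """Many-one reduction Clique -> Independent Set: complement the
    edge set and keep k.  Every clique of G becomes an independent
    set of the complement graph, and vice versa."""
    present = {frozenset(e) for e in edges}
    comp_edges = [(u, v) for u, v in combinations(vertices, 2)
                  if frozenset((u, v)) not in present]
    return vertices, comp_edges, k
```

Note that the map itself does all the work: no post-processing of the answer is needed, which is exactly the many-one requirement.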
\section{Turing Reductions}
Now we will define another way to compare problems. To motivate this,
we start with an example. Consider the trivial way in which \sat\ and
$\overline{\textrm{\sat}}$ are related: we know
that:
\[ x \in \textrm{\sat} \iff x \notin \overline{\textrm{\sat}} \]
From the definition, it is unclear whether \sat\ $\le_m^p
\overline{\textrm{\sat}}$, because recovering the answer for \sat\
from the answer to the $\overline{\textrm{\sat}}$ instance involves a
computation (namely switching yes to no and vice versa).
Generalising this further, we can imagine a reduction model which
allows $M$ (the Turing machine performing the reduction) to do
any computation after getting the result from $B$, and even allows it
to solve many instances of $B$. This leads to the notion of a Turing
reduction (also called a Cook reduction). Formally:
\begin{definition}[Poly time Turing Reduction]
Let $A, B \subseteq \Sigma^*$ be languages encoding computational problems. We
say that $A$ $\poly$-time Turing reduces to $B$, denoted by $A \le_T^p
B$, if there exists a polynomial-time oracle Turing machine $M$ which
has oracle access to the set $B$ and decides membership of strings in
$A$.
\end{definition}
A first observation is that this definition still captures what we
started out with while trying to define the notion of reductions.
That is, if $B \in \P$, then $A \in \P$ as well. We will
look for more properties. For instance:
\begin{proposition}
\[ \textrm{\sat} \le_T^p \overline{\textrm{ \sat}}. \]
\[ \textrm{\sat} \le_m^p \overline{\textrm{ \sat}} \iff
\textrm{$\NP = \co$-$\NP$}. \]
\end{proposition}
\begin{proof}
Part (1) follows from the definition itself. Indeed, the Turing
machine asks whether $x \in \overline{\textrm{\sat}}$ and answers yes
if the oracle says no, and vice versa.
To see part (2): suppose $\NP = \co$-$\NP$. By the
$\co$-$\NP$-completeness of $\overline{\textrm{\sat}}$ and the
$\NP$-completeness of \sat, we have that for every formula $\phi$
there is another formula $\phi'$ such that (1) given $\phi$, $\phi'$
can be constructed in $\poly$ time, and (2) $\phi'$ is satisfiable if
and only if $\phi$ is not satisfiable. This itself gives the many-one
reduction from \sat\ to $\overline{\textrm{\sat}}$. Conversely, if
\sat\ $\le_m^p \overline{\textrm{\sat}}$, then every language in $\NP$
reduces to \sat\ and hence to the $\co$-$\NP$ language
$\overline{\textrm{\sat}}$, so $\NP \subseteq \co$-$\NP$; taking
complements gives $\co$-$\NP \subseteq \NP$ as well, and hence $\NP =
\co$-$\NP$.
\end{proof}
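The one-query Turing reduction in part (1) can be made concrete. In this sketch a brute-force solver stands in for the $\overline{\textrm{\sat}}$ oracle, and the CNF encoding (literals $\pm i$ for variable $x_i$) is an assumption made for illustration:

```python
from itertools import product

def sat_brute_force(clauses, n):
    """Exponential-time stand-in used only to realise the oracle below.
    A clause is a list of literals; literal +i / -i means x_i / NOT x_i."""
    return any(all(any(bits[abs(l) - 1] == (l > 0) for l in c)
                   for c in clauses)
               for bits in product([False, True], repeat=n))

def cosat_oracle(clauses, n):
    """Stand-in oracle for the complement language: unsatisfiable CNFs."""
    return not sat_brute_force(clauses, n)

def sat_via_cosat(clauses, n):
    """Turing reduction SAT <=_T co-SAT: one oracle query, then flip
    the answer bit -- a step a many-one reduction is not allowed."""
    return not cosat_oracle(clauses, n)
```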
It is widely believed that $\NP \ne \co$-$\NP$. This may be taken as
evidence that $\le_m^p$ and $\le_T^p$ are not the same.
Now we relate Turing reductions to the oracle complexity classes that we
have seen in previous classes. The following theorem follows from the definitions,
but characterises Turing reductions in terms of oracle Turing machines.
\begin{theorem}
For any languages $A, B \subseteq \Sigma^*$, if $A \le_T^p B$ then $A
\in \P^{B}$. In particular, if $B=\textrm{\sat}$ then $A \in \P^{\NP}$.
\end{theorem}
We will demonstrate the power of Turing reductions by describing one more example.
We consider the following problem. Recall that a clique is a complete subgraph.
\noindent {\sc Exact-Clique}: Given a graph $G$ and an integer $k$,
determine if the maximum size of a clique in $G$ is exactly $k$.
\begin{theorem}
{\sc Exact-Clique} $\le_T^p$ {\sc Clique} and {\sc Clique} $\le_T^p$ {\sc Exact-Clique}.
\end{theorem}
\begin{proof}
{\sc Exact-Clique} $\le_T^p$ {\sc Clique} : Let $G(V,E)$ be the given
graph and $k$ be an integer. Notice that the maximum clique size of
$G$ is $k$ if and only if $(G,k) \in$ {\sc Clique} and $(G,k+1)
\notin$ {\sc Clique}. This gives 2-query Turing reduction to {\sc
Clique}. \\[2mm]
%
\noindent {\sc Clique} $\le_T^p$ {\sc Exact-Clique} : Let $G(V,E)$ be
the given graph and $\ell$ be an integer. To check if there is a
clique of size $\ell$, query the {\sc Exact-Clique} oracle with $k$
ranging over $1, \ldots, |V|$. Exactly one of these queries, say the
one for $k'$, is answered {\sc Yes}; if $k' \ge \ell$ then answer
{\sc Yes}, otherwise {\sc No}.
\end{proof}
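The first direction of the proof can be sketched in code; a brute-force clique tester stands in for the {\sc Clique} oracle:

```python
from itertools import combinations

def clique_oracle(vertices, edges, k):
    """Brute-force stand-in for the Clique oracle: does G contain a
    clique on k vertices?"""
    present = {frozenset(e) for e in edges}
    return any(all(frozenset((u, v)) in present
                   for u, v in combinations(subset, 2))
               for subset in combinations(vertices, k))

def exact_clique(vertices, edges, k):
    """Two-query Turing reduction: the maximum clique size is exactly
    k iff a k-clique exists but a (k+1)-clique does not.  Note the
    two queries are non-adaptive: neither depends on the other's answer."""
    return clique_oracle(vertices, edges, k) and \
           not clique_oracle(vertices, edges, k + 1)
```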
The above two examples, although they are Turing reductions, do not
use their full power. More precisely, the queries that are
asked are not adaptive. In the example of \sat\ to
$\overline{\textrm{\sat}}$, there was only one query. In the above
example ({\sc Exact-Clique} $\le_T^p$ {\sc Clique}) the second query
does not depend on the answer to the first query. This leads us to a
type of reduction that is in between many-one reductions and
general Turing reductions, namely truth-table reductions.
\subsection{Truth Table Reductions}
A truth-table reduction must present all of its (finitely many) oracle
queries at the same time. Recalling the characterisation of Turing
reductions using the oracle model of computation, we can imagine this
as a {\em non-adaptive oracle Turing machine}, where the queries must
be written down simultaneously. This restriction ensures that once the
oracle gives out its responses (the answer bits, 0/1, of the queries),
the remaining computation (since it does not involve oracle queries any
more) is just a look-up into a truth table, indexed by these response
bits, to decide whether to accept or reject. This leads to the
name. More formally,
\begin{definition}[Poly time Truth Table Reduction]
Let $A, B \subseteq \Sigma^*$ be languages encoding computational problems. We
say that $A$ $\poly$-time truth-table reduces to $B$, denoted by $A
\le_{tt}^p B$, if there exists a polynomial-time non-adaptive oracle
Turing machine $M$ which has oracle access to the set $B$ and decides
membership of strings in $A$.
\end{definition}
In fact, all the examples of Turing reductions that we
presented above are truth-table reductions. In addition, the
following comparison of the strength of the reductions follows from
our discussion.
\begin{proposition}
Let $A, B \subseteq \Sigma^*$ be languages encoding computational problems. Then:
\[ A \le_{m}^p B ~~\implies ~~A \le_{tt}^p B ~~\implies~~ A \le_{T}^p B \]
\end{proposition}
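The reduction {\sc Clique} $\le_T^p$ {\sc Exact-Clique} from earlier is in fact a truth-table reduction: the queries $(G,1), \ldots, (G,|V|)$ can all be written down before any answer is seen, and the final decision is a pure function (a truth-table look-up) of the answer bits. A sketch of that final evaluator, where the bit-vector interface is an assumption made for illustration:

```python
def clique_from_answer_bits(answer_bits, ell):
    """answer_bits[i] is the oracle's answer to the non-adaptive query
    'is the maximum clique size of G exactly i+1?'.  Assuming G has at
    least one vertex, exactly one bit is True; G has a clique of size
    ell iff that unique Yes occurs at a position k' >= ell."""
    k_star = answer_bits.index(True) + 1  # the maximum clique size
    return k_star >= ell
```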
\section{Structure of $\NP$-complete sets}
We saw that there is a set of problems (one of them being \sat) which
are the {\em hardest} problems in the complexity class $\NP$. There
was a lot of interest in studying the structure of these sets in
the 1970s, and that led to a lot of insight into the density of
$\NP$-complete sets. In this lecture we will mention some of these
ideas (we will skip the details, giving pointers for further reading).
A first question is about the density of an $\NP$-complete set. This
leads us to the definition of sparse sets.
\subsection{Sparse sets}
We start with the definition of sparse sets.
\begin{definition}[Sparse sets]
A set $A \subseteq \Sigma^*$ is said to be sparse, if there exists a
polynomial $p(n)$ such that $A_n$, the $n^{\textrm{th}}$ slice of $A$,
defined as,
\[ A_n = \{ x \in A : |x| = n \} \]
has at most $p(n)$ strings in it.
\end{definition}
Is an $\NP$-complete set likely to be sparse? The following theorem
connects this question to the well-known problem regarding the collapse of the
polynomial hierarchy that we saw in the previous lecture.
\begin{theorem}
If a sparse set is $\NP$-complete then $\PH$ collapses to $\Sigma_2$.
\end{theorem}
We will postpone the proof of this theorem to the next lecture. In fact,
we will show more. We will see that (1) any sparse set can be computed
by polynomial-size circuits, and (2) if an $\NP$-complete problem can be
computed by polynomial-size circuits then $\PH$ collapses to
$\Sigma_2$.
Since \cktval\ is $\P$-complete, we can conclude that any language in
$\P$ has polynomial-size circuits computing it. Thus if we prove
that there is no polynomial-size circuit family computing \sat, we
prove $\P \ne \NP$.
\subsection{Berman-Hartmanis Conjecture}
This is another attempt to study the structure of $\NP$-complete sets.
The basic question considered is whether all $\NP$-complete sets
are structurally the same. In particular, is there an easily
computable transformation among these sets (as subsets of $\Sigma^*$)?
This is the famous Berman-Hartmanis conjecture. More formally,
\begin{definition}[$\poly$-time isomorphism]
Two sets $A$ and $B$ are polynomial-time isomorphic if there exists a
function $f: \Sigma^* \to \Sigma^*$ such that
\begin{itemize}
\item $x \in A \iff f(x) \in B$.
\item $f$ is a bijection on $\Sigma^*$. Hence $f^{-1}$ exists.
\item Both $f$ and $f^{-1}$ are polynomial-time computable.
\end{itemize}
\end{definition}
The relation defined by the isomorphism between pairs of languages is
an equivalence relation.
\begin{conjecture}[Berman-Hartmanis, 1976]
Every pair of $\NP$-complete sets are $\poly$-time isomorphic to each
other.
\end{conjecture}
Now what is the connection with the $\P$ vs.\ $\NP$ question? If $\P = \NP$
then there would be a finite $\NP$-complete set (the $\poly$-time
reduction can simply solve the problem and produce an instance inside
or outside a fixed finite set $S$). Now this set, which is finite, can't
be polynomial-time isomorphic to \sat, which is an infinite set. Thus the
isomorphism conjecture, if proved, would imply $\P \ne \NP$. However,
it is widely believed these days that the isomorphism conjecture is
too strong to be true.
There is a lot of classical literature published since the late
seventies which addresses different versions of the isomorphism
conjecture. We will leave it to you to explore these using the links
provided at the course website entry for this lecture.
\end{document}