Quantum Computing Since Democritus, Aaronson
- Chapter 3, “Gödel, Turing and friends”
- Chapter 4, “Minds and machines”
- Chapter 5, “Paleocomplexity”
- Chapter 6, “P, NP, and Friends”
- Chapter 7, “Randomness”
- Chapter 8, “Crypto”
- Chapter 9, “Quantum”
- Chapter 10, “Quantum computing”
- Chapter 11, “Penrose”
- Chapter 12, “Decoherence and hidden variables”
- Chapter 13, “Proofs”
- Chapter 14, “How big are quantum states?”
- Chapter 15, “Skepticism of quantum computing”
- Chapter 16, “Learning”
- Chapter 17, “Interactive proofs and more”
- Chapter 18, “Fun with the anthropic principle”
- Chapter 19, “Free will”
- Chapter 20, “Time travel”
- Chapter 21, “Ask me anything”
“Quantum mechanics is a beautiful generalization of the laws of probability: a generalization based on the 2-norm rather than the 1-norm, and on complex numbers rather than nonnegative real numbers. It can be studied completely separately from its applications to physics (and indeed, doing so provides a good starting point for learning the physical applications later). This generalized probability theory leads naturally to a new model of computation – the quantum computing model – that challenges ideas about computation once considered a priori, and that theoretical computer scientists might have been driven to invent for their own purposes, even if there were no relation to physics. In short, while quantum mechanics was invented a century ago to solve technical problems in physics, today it can be fruitfully explained from an extremely different perspective: as part of the history of ideas, in math, logic, computation, and philosophy, about the limits of the knowable.”
This was great – enjoyed reading it just as much as I did [[Gödel, Escher, Bach]]N. I wanted to read it after seeing a really inspiring seminar given by Scott Aaronson himself called “AI Safety and Theoretical Computer Science”.
Like the talk, the book was super interesting from a technical standpoint but also very funny. And like [[Gödel, Escher, Bach]]N, it was just page after page of stuff that I want to learn more about.
In taking these notes from the book, I have accidentally sucked all of the humour out of the writing. So these notes (and really all of my book notes) are no substitute for actually reading it yourself.
Chapter 3, “Gödel, Turing and friends”
Gödel’s incompleteness theorems from the Halting Problem
Related to [[Notes - Models of Computation MT23, Decidability]]U.
- Gödel’s first incompleteness theorem says that given any consistent, computable set of axioms, there’s a true statement about the integers that can never be proved from those axioms
- How to prove it via the Halting problem?
- Suppose there did exist a consistent, computable proof system $F$ from which any statement about the integers could be proved or disproved
- Then for any computer program, we could translate it into a statement in $F$ and search through proofs until we found a proof that it halts or a proof that it doesn’t
- Since the Halting problem is undecidable, this is impossible
A similar trick works for the second incompleteness theorem:
- Gödel’s second incompleteness theorem says that if $F$ is consistent, then $F$ can’t prove its own consistency
- Let $P$ be the program that, given a program $Q$, decides whether $Q$ halts by the strategy above (searching through proofs in $F$ until it finds one that settles the question)
- Modify $P$ to produce a new program $P’$ that:
- Runs forever if $Q$ given its own code as input is proved to halt, or
- Halts if $Q$ given its own code as input is proved to run forever
- Then feed $P’$ its own code as input
- Then $P’$ will run forever, never finding a proof that it halts or a proof that it runs forever
- (If it found a proof that it halts, it would run forever; if it found a proof that it runs forever, it would halt – either way contradicting the proof it found)
- So, assuming $F$ is consistent, “$P’$ run on its own code runs forever” is true but unprovable in $F$
- If $F$ were inconsistent, then there could be a proof that $P’$ halts, even if $P’$ really did run forever
- But this whole argument can be carried out inside $F$ – so if $F$ could prove its own consistency, $F$ could also prove that $P’$ ran forever, which we just saw it can’t
- So the conclusion is that if $F$ is consistent, then $F$ can’t prove its own consistency
Chapter 4, “Minds and machines”
Turing degrees
- Problem $A$ is Turing reducible to problem $B$ if $A$ is solvable by a Turing machine given an oracle for $B$
- Two problems are Turing equivalent if each is Turing reducible to the other
- A Turing degree is the set of all problems that are Turing equivalent to a given problem
- Two famous Turing degrees:
- The set of all computable problems
- The set of all problems Turing reducible to the halting problem
- Are there other Turing degrees?
- Yes! Take the “super halting problem”: given a Turing machine with an oracle for the halting problem, decide if it halts.
- The exact same proof as the one in [[Notes - Models of Computation MT23, Decidability]]U goes through as before by giving all the machines an oracle for the halting problem.
- When this happens, the proof is said to “relativize”
- What about Turing degrees “between” the set of computable problems and the set of all problems Turing reducible to the halting problem? (Where “between” is in the sense of $A \le _ T B \le _ T C$)
- Yes, these can be constructed. But there are currently no known examples of a “natural” problem with an intermediate Turing degree
Arguments about thinking machines
“One can indeed give weighty and compelling arguments against the possibility of thinking machines. The only problems with these arguments is that they’re also arguments against the possibility of thinking brains!”
- One argument against AI is that its intelligence is just a reflection of the humans who programmed it
- But what if human intelligence is just a reflection of the process that trained it, namely a billion years of evolution?
- AI skeptics seem to take the qualia of other people for granted, while the qualia of machines is put into question
The Chinese Room argument
“In the last 60 years, have there been any new insights about the Turing Test itself? In my opinion not many. There has, on the other hand, been a famous “attempted” insight, which is called Searle’s Chinese Room.”
“You sit in a room and someone passes you paper slips through a hole in the wall with questions written in Chinese, and you’re able to answer the questions (again in Chinese) just by consulting a rule book. In this case, you might be carrying out an intelligent Chinese conversation, yet by assumption, you don’t understand a word of Chinese! Therefore, symbol-manipulation can’t produce understanding.”
“…The third thing that annoys me about the Chinese Room argument is that the way it gets so much mileage from a possibly misleading choice of imagery, or, one might say, by trying to sidestep the entire issue of computational complexity purely through clever framing.”
- “How many slips of paper are we talking about?”
- “How big would such a rule book have to be?”
- “How quickly would you have to consult it, to carry out an intelligent Chinese conversation in anything resembling real time?”
I had a debate with someone once about whether ChatGPT (or LLMs in general) were really “intelligent”, and they used this as a pretty convincing argument that it wasn’t. But this rebuttal makes things more complicated, and I’m not so sure anymore.
I still think LLMs can show “intelligent behaviour” in the Turing test sense, where you could conceivably fool someone into thinking that its outputs were produced by an intelligent human. It can do this without having to have a giant rule book which literally says “given this as input, give this as output” – instead this is encoded much more compactly as a list of numbers (that is still very large, but at least it’s not just input-output pairs).
The roles of computers in proof
“…consider the computer proof of the Four-Color Theorem, […]. That proof solved a great, century-old mathematical problem, but it did so by reducing the problem to the tedious enumeration of thousands of cases.”
“What it boils down to, I think, is that there is a sense in which the Four-Color Theorem has been proved, and there’s another sense in which many mathematicians understand proof, and those two senses aren’t the same.”
“For many mathematicians, a statement isn’t proved when a physical process ([…]) terminates saying that it’s been proved – however good the reasons might be to believe that the physical process is reliable. Rather, the statement is proved when they (the mathematicians) feel that their minds can directly perceive its truth.”
Chapter 5, “Paleocomplexity”
Faster and faster matrix multiplication
- The simplest matrix multiplication algorithm takes $O(n^3)$ time
- A more complicated algorithm (Strassen’s) takes $O(n^{2.81})$ time
- A still more complicated algorithm takes $O(n^{2.376})$ time
- The lower bound is definitely $\Omega(n^2)$, as you need to at least pass over each of the $n^2$ entries
- Some possibilities: let $\omega$ be the “optimal exponent for matrix multiplication”. Is it:
- Equal to $2$?
- Equal to $2+\varepsilon$ for some $\varepsilon > 0$?
- Ill-defined: could it be that for every $\varepsilon > 0$, there exists a matrix multiplication algorithm that runs in $O(n^{2+\varepsilon})$ time, but no single algorithm achieving $O(n^2)$?
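The $O(n^{2.81})$ bound is Strassen’s algorithm, which multiplies $2 \times 2$ block matrices using seven recursive multiplications instead of eight. A minimal sketch in Python (my own, assuming the matrices are square with power-of-two size):

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(A):
    """Split a matrix into four equal quadrants."""
    n = len(A) // 2
    return ([row[:n] for row in A[:n]], [row[n:] for row in A[:n]],
            [row[:n] for row in A[n:]], [row[n:] for row in A[n:]])

def strassen(A, B):
    """Strassen multiplication for square matrices of power-of-two size."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    a, b, c, d = split(A)
    e, f, g, h = split(B)
    # The seven products (instead of the naive eight)
    p1 = strassen(a, mat_sub(f, h))
    p2 = strassen(mat_add(a, b), h)
    p3 = strassen(mat_add(c, d), e)
    p4 = strassen(d, mat_sub(g, e))
    p5 = strassen(mat_add(a, d), mat_add(e, h))
    p6 = strassen(mat_sub(b, d), mat_add(g, h))
    p7 = strassen(mat_sub(a, c), mat_add(e, f))
    top = [r1 + r2 for r1, r2 in zip(
        mat_add(mat_sub(mat_add(p5, p4), p2), p6),   # C11
        mat_add(p1, p2))]                            # C12
    bot = [r1 + r2 for r1, r2 in zip(
        mat_add(p3, p4),                             # C21
        mat_sub(mat_sub(mat_add(p1, p5), p3), p7))]  # C22
    return top + bot
```

Seven multiplications per halving gives the recurrence $T(n) = 7\,T(n/2) + O(n^2)$, i.e. $T(n) = O(n^{\log_2 7}) = O(n^{2.81})$.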
Chapter 6, “P, NP, and Friends”
Why is $\mathbf{P} \ne \mathbf{NP}$ so hard?
- Two problems can sound very similar in difficulty to one another, but one is in $\mathbf{P}$ while the other is $\mathbf{NP}$-complete.
- e.g. finding the edit distance between two strings “sounds” difficult, but is actually possible in $\mathbf P$ (using [[Notes - DAA HT23, Dynamic programming]]U).
- And the travelling salesperson problem “sounds” like it would be of similar difficulty, but it is $\mathbf{NP}$-complete.
- So the difficulty is that any proof of $\mathbf{P} \ne \mathbf{NP}$ has to be very sensitive to this complex boundary between problems.
- “While there have been dozens of purported $\mathbf{P} \ne \mathbf{NP}$ proofs over the years, almost all of them could be rejected immediately for the simple reason that, if they worked, then they would rule out polynomial-time algorithms that we already know to exist”.
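For concreteness, the edit-distance problem mentioned above has the classic dynamic programming solution, running in $O(mn)$ time for strings of lengths $m$ and $n$. A sketch:

```python
def edit_distance(s, t):
    """Levenshtein distance via dynamic programming, O(len(s) * len(t))."""
    m, n = len(s), len(t)
    # dp[i][j] = edit distance between s[:i] and t[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # delete all of s[:i]
    for j in range(n + 1):
        dp[0][j] = j          # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]
```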
Cook, Karp and Levin’s discovery
- Cook, Karp and Levin didn’t show that $\mathbf{NP}$-complete problems exist, but instead showed that natural $\mathbf{NP}$-complete problems exist.
- They showed $\mathbf{3SAT}$ is $\mathbf{NP}$-complete by taking the witness definition of $\mathbf{NP}$-completeness and showing that for the verifier $M$ you can come up with a load of logical relations which encode, for a given instance, whether $M$ will accept.
$\mathbf{NP} = \mathbf{coNP}$?
- $\mathbf{coNP}$ is just the “complement” of $\mathbf{NP}$: a problem is in $\mathbf{coNP}$ if its complement is in $\mathbf{NP}$, i.e. “no” answers have short proofs checkable in polynomial time.
- Does $\mathbf{NP} = \mathbf{coNP}$? We don’t know!
The polynomial hierarchy
- Any $\mathbf{NP}$ problem can be put into the form: “Does there exist an $n$-bit string $X$ such that $A(X) = 1$?”, where $A$ is computable in polynomial time (e.g. is there a satisfying assignment for this 3SAT formula?)
- Any $\mathbf{coNP}$ problem can be put into the form: “Does $A(X) = 1$ for every $X$?” (e.g. are there no satisfying assignments for this 3SAT formula?)
- You can add another quantifier: “Does there exist an $X$ such that for every $Y$, $A(X, Y) = 1$”, or “For every $X$, does there exist a $Y$ such that $A(X, Y) = 1$?”. These classes are called $\Sigma _ 2 P$ and $\Pi _ 2 P$.
- Or even another: “Does there exist an $X$ such that for every $Y$, there exists a $Z$ such that $A(X, Y, Z) = 1$”, or “For every $X$, does there exist a $Y$ such that for every $Z$, $A(X, Y, Z) = 1$?”. These are called $\Sigma _ 3 P$ and $\Pi _ 3 P$.
- These generalise to $\Sigma _ k P$ and $\Pi _ k P$ for every $k$, with $\Sigma _ 1 P = \mathbf{NP}$ and $\Pi _ 1 P = \mathbf{coNP}$; together they form the polynomial hierarchy.
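To make the quantifier structure concrete, here’s a toy sketch (my own, not from the book) that decides $\Sigma _ 2 P$- and $\Pi _ 2 P$-style statements by brute force – taking exponential time, of course, which is exactly why these classes are believed hard:

```python
from itertools import product

def exists_forall(A, n):
    """Brute-force a Sigma_2-style statement: does there exist an n-bit X
    such that for every n-bit Y, A(X, Y) = 1?  Takes 2^(2n) evaluations."""
    return any(all(A(X, Y) for Y in product([0, 1], repeat=n))
               for X in product([0, 1], repeat=n))

def forall_exists(A, n):
    """The dual Pi_2-style statement: for every X, does some Y make A(X, Y) = 1?"""
    return all(any(A(X, Y) for Y in product([0, 1], repeat=n))
               for X in product([0, 1], repeat=n))
```

Here $A$ is any polynomial-time predicate on pairs of bit strings; the point is only that each extra quantifier multiplies the brute-force cost by another factor of $2^n$.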
Chapter 7, “Randomness”
Some random complexity classes
- $\mathbf{PP}$ is the class of decision problems solvable by a polynomial-time randomised algorithm with error probability less than $1/2$.
- You can still be exponentially close to $1/2$, e.g. $1/2 - 2^{-n}$ – it’s for this reason $\mathbf{PP}$ contains $\mathbf{NP}$.
- $\mathbf{BPP}$ is a related class: decision problems that can be solved with a polynomial-time procedure that accepts with probability greater than $2/3$ if the answer is yes, or less than $1/3$ if the answer is no.
- The constant gap between $1/3$ and $2/3$ (rather than error arbitrarily close to $1/2$) is super useful – it means that by running the algorithm multiple times and taking a majority vote, you can quickly become all but certain you have the correct answer.
- It’s for this reason (I think) $\mathbf{BPP}$ is constantly referred to as “the class of problems that are feasibly solvable by computer in a universe governed by classical physics”.
- $\mathbf{RP}$ is a complexity class where you can be sure you aren’t wrong: there exists a polynomial time procedure that accepts with probability greater than $1/2$ if the answer is yes, or probability zero if the answer is no.
- $\mathbf{ZPP}$ is the intersection of $\mathbf{RP}$ and $\mathbf{coRP}$. Problems here are solvable by a polynomial time randomised algorithm that has to be correct when it outputs an answer, but can output “don’t know” up to half the time.
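The amplification claim for $\mathbf{BPP}$ can be made quantitative: take the majority vote of $k$ independent runs, and the error probability drops exponentially in $k$. A quick check of the exact binomial expression:

```python
from math import comb

def majority_error(p_err, k):
    """Probability that the majority vote of k independent runs is wrong,
    when each run errs independently with probability p_err (k odd).
    The majority is wrong exactly when more than half the runs err."""
    return sum(comb(k, i) * p_err**i * (1 - p_err)**(k - i)
               for i in range((k + 1) // 2, k + 1))
```

E.g. with per-run error $1/3$, a hundred-odd runs already push the failure probability below $1\%$ (the Hoeffding bound gives $e^{-k/18}$).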
Taking advice
- $\mathbf{P} / f(n)$ is the set of all problems solvable in deterministic polynomial time on a Turing machine, with help from an $f(n)$-bit advice string $a _ n$ that depends only on the input length $n$.
- The advice can’t depend on the specific problem, only its size.
- $\mathbf P / \mathbf{poly}$ is the set of all problems solvable in polynomial time using polynomial-size advice.
- $\mathbf P / 1$ is strictly bigger than $\mathbf P$, for example it contains a “unary version” of the Halting problem.
Chapter 8, “Crypto”
A fun puzzle
“You and a friend want to flip a coin, but the only coin you have is crooked: it lands heads with some fixed but unknown probability $p$. Can you use this coin to simulate a fair coin flip?”
“The solution is the ‘von Neumann trick’: flip the biased coin twice, interpreting $HT$ as heads and $TH$ as tails. If the flips come up $HH$ or $TT$, then try again.”
Both then occur with probability $p(1-p)$.
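The trick is easy to check empirically – a small sketch:

```python
import random

def biased_coin(p):
    """A crooked coin: heads ('H') with some fixed probability p."""
    return 'H' if random.random() < p else 'T'

def fair_flip(p):
    """von Neumann's trick: flip the biased coin twice; HT -> heads,
    TH -> tails; on HH or TT, try again.  HT and TH each occur with
    probability p*(1-p), so the output is exactly fair."""
    while True:
        a, b = biased_coin(p), biased_coin(p)
        if a != b:
            return a  # 'H' for the HT case, 'T' for the TH case
```

Each attempt succeeds with probability $2p(1-p)$, so the expected number of raw flips is $1/(p(1-p))$ – blowing up only as the coin gets extremely biased.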
Throwing shade at Wolfram
“…Indeed, you could even base a candidate PRG on the apparent unpredictability of (say) the “Rule 110” cellular automaton, as advocated by Stephen Wolfram in his groundbreaking, revolutionary, paradigm-smashing book.”
Trapdoor one-way functions
- The core of any public-key cryptosystem is a “trapdoor one-way function”. It has these properties:
- Easy to compute,
- Hard to invert, and
- Easy to invert given some secret “trapdoor” information.
- So they are like one-way functions but with a secret piece of information that makes them easy to invert.
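The classic example is RSA, where the trapdoor is the factorisation of the modulus. A toy sketch with deliberately tiny (and hence totally insecure) textbook numbers:

```python
# Toy RSA-style trapdoor function -- illustration only, numbers far too small
p, q = 61, 53                        # the secret trapdoor: the factors of n
n = p * q                            # 3233, public
e = 17                               # public exponent, coprime to (p-1)*(q-1)
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent: easy only knowing p, q

def forward(m):
    """Easy for anyone who knows the public key: m -> m^e mod n."""
    return pow(m, e, n)

def invert_with_trapdoor(c):
    """Easy with the trapdoor d; believed hard (for large n) without it,
    since computing d is as hard as factoring n."""
    return pow(c, d, n)
```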
Chapter 9, “Quantum”
Why are unitary matrices everywhere?
- They are the most general sort of matrix that always maps a unit vector in the $2$-norm to another unit vector in the $2$-norm.
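A quick numerical illustration (my own sketch, not the book’s): the Hadamard gate, a unitary, preserves the 2-norm of any state, while a non-unitary matrix generally doesn’t.

```python
import math

def apply(M, v):
    """Apply a 2x2 matrix to a 2-vector of (possibly complex) amplitudes."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def norm2(v):
    """The 2-norm: square root of the summed squared magnitudes."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]   # Hadamard gate (unitary)
M = [[1, 1], [0, 1]]    # a shear (not unitary)
v = [0.6, 0.8j]         # a unit vector: 0.36 + 0.64 = 1
```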
Why the Born rule?
- [[Notes - Quantum Information HT24, Basic measurements]]U
- Why do we square the amplitudes, instead of cubing them or raising them to the fourth power?
- Gleason’s theorem: suppose you have a procedure $f$ that takes a unit vector of real numbers and outputs the probability of an event. Assume further that whenever $v _ 1, v _ 2, v _ 3$ are orthogonal (in dimension at least $3$), $f(v _ 1) + f(v _ 2) + f(v _ 3) = 1$. Then there exists a mixed state $\rho$ such that $f$ arises by measuring $\rho$ according to the Born rule.
- If there are any “non-trivial” linear transformations that preserve the $p$-norm, then $p = 1$ or $p = 2$. “If $p = 1$, we get classical probability theory, while if $p = 2$, we get quantum mechanics. So if you don’t want something boring, you have to set $p = 1$ or $p = 2$”.
Why complex numbers rather than just real numbers?
- A compelling reason is that the complex numbers are algebraically closed.
A quick proof of the no-cloning theorem
- [[Notes - Quantum Information HT24, No-cloning theorem]]U
- Suppose there was a cloning map which wrote a copy of $ \vert \psi\rangle = \alpha \vert 0\rangle + \beta \vert 1\rangle$ into another qubit initialised to $ \vert 0\rangle$. It would need to do the following:
- $(\alpha \vert 0\rangle + \beta \vert 1\rangle) \vert 0\rangle \mapsto \alpha^2 \vert 00\rangle + \alpha\beta \vert 01\rangle + \alpha\beta \vert 10\rangle + \beta^2 \vert 11\rangle$
- But the coefficients are quadratic functions of $\alpha, \beta$, which is impossible with a linear map.
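A concrete illustration (my own sketch): the CNOT gate is a linear map that perfectly “copies” the basis states $ \vert 0\rangle$ and $ \vert 1\rangle$, but applied to the superposition $ \vert +\rangle$ it produces an entangled Bell state rather than two copies – exactly the linearity obstruction above.

```python
import math

# State vectors over the basis |00>, |01>, |10>, |11>
def cnot(state):
    """CNOT (control = first qubit): swaps the amplitudes of |10> and |11>."""
    a, b, c, d = state
    return [a, b, d, c]

def tensor(q1, q2):
    """Tensor product of two single-qubit states."""
    return [q1[0] * q2[0], q1[0] * q2[1], q1[1] * q2[0], q1[1] * q2[1]]

s = 1 / math.sqrt(2)
plus = [s, s]                        # |+> = (|0> + |1>)/sqrt(2)
zero = [1, 0]

cloned_target = tensor(plus, plus)   # what a true cloner would have to output
actual = cnot(tensor(plus, zero))    # what the linear map actually outputs:
                                     # the Bell state (|00> + |11>)/sqrt(2)
```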
Chapter 10, “Quantum computing”
Polynomial-size quantum circuits and uniformity
- A polynomial-size quantum circuit is one constructed from $p(n)$ gates from the gate set, where $p$ is a polynomial and $n$ is the number of bits of the problem instance we want to solve.
- But in fact, it should be an infinite family of circuits, one for each input length $n$.
- But there have to be some restrictions on the family, otherwise you could efficiently solve the unary version of the Halting problem. The fix is “uniformity”: there should exist a classical algorithm that, given $n$ as input, outputs the $n$th quantum circuit in polynomial time.
The $\textbf{BQP}$ complexity class
$\mathbf{BQP}$ is the class of languages $L \subseteq \{0, 1\}^*$ for which there exists a uniform family of polynomial-size quantum circuits $\{C _ n\}$ such that for all $x \in \{0, 1\}^n$:
- If $x \in L$, then $C _ n$ accepts input $ \vert x\rangle \vert 0 \cdots 0\rangle$ with probability at least $2/3$
- If $x \notin L$, then $C _ n$ accepts input $ \vert x\rangle \vert 0\cdots 0\rangle$ with probability at most $1/3$.
$\mathbf{BPP} \ne \mathbf{BQP}$?
- Simon’s problem is a problem that quantum computers can provably solve exponentially faster than classical computers
- This doesn’t prove that $\mathbf{BPP} \ne \mathbf{BQP}$; instead, it proves that there exists an oracle relative to which $\mathbf{BPP} \ne \mathbf{BQP}$.
Misconceptions about quantum computing and $\mathbf{NP}$
- From the outside, quantum computers are advertised as being able to solve $\mathbf{NP}$-complete problems by trying every possible solution in parallel
- We can’t prove that this isn’t the case – we can’t even prove that $\mathbf{P} \ne \mathbf{NP}$.
- Grover’s algorithm, together with a matching lower bound, shows that there exists an oracle relative to which $\mathbf{NP} \not\subseteq \mathbf{BQP}$.
- Before Grover’s algorithm was even presented, Bennett et al. had proved that any quantum algorithm to find a needle in a size $2^n$ haystack needs at least $2^{n/2}$ steps.
- So for “unstructured” problems, the best you can achieve is a quadratic speedup over classical computers (although there do exist specific problems for which there is an exponential speedup).
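Grover’s algorithm is simple enough to simulate classically by just tracking all $2^n$ amplitudes (a sketch of mine, with real amplitudes):

```python
import math

def grover(n, marked):
    """Simulate Grover search over N = 2^n items classically, by tracking
    the full amplitude vector (which takes exponential space, of course)."""
    N = 2 ** n
    amps = [1 / math.sqrt(N)] * N              # uniform superposition
    iterations = round(math.pi / 4 * math.sqrt(N))
    for _ in range(iterations):
        amps[marked] *= -1                     # oracle: flip the marked amplitude
        mean = sum(amps) / N
        amps = [2 * mean - a for a in amps]    # inversion about the mean
    return amps[marked] ** 2                   # probability of measuring marked
```

After about $\frac{\pi}{4}\sqrt{N}$ iterations the marked item’s probability is close to $1$ – the quadratic speedup over the $\sim N$ classical queries, and no better, per Bennett et al.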
Chapter 11, “Penrose”
Penrose’s argument for human intelligence not being algorithmic
- The core of Penrose’s argument relies on Gödel’s incompleteness theorem
- No computer working within a fixed formal system $F$ like $ZFC$ can prove the sentence “this sentence cannot be proved in $F$”
- But humans can immediately see that this sentence is true, because if it is false, then there is a contradiction
- Aaronson’s rebuttal: can humans really “see” the consistency of $F$?
Penrose doesn’t need to be talking about Gödel’s theorem at all. The Gödel argument turns out to be just a mathematical restatement of a much older argument against reductionism: “sure, a computer could say it perceives $G(F)$, but it’d just be shuffling symbols around! When I say I perceive $G(F)$, I really mean it! There’s something it feels like to be me!”. The obvious response is equally old: “what makes you so sure that it doesn’t feel like anything to be a computer?”
A common misconception is that Penrose thinks the brain is a quantum computer. In reality, a quantum computer would be much weaker than he wants! (…) Penrose, by contrast, wants the brain to solve uncomputable problems, by exploiting hypothetical collapse effects from a yet-to-be-discovered quantum theory of gravity.
Even if we supposed the brain was solving a hard computational problem, it’s not clear why that would bring us any closer to understanding consciousness. If it doesn’t feel like anything to be a Turing machine, then why does it feel like something to be a Turing machine with an oracle for the halting problem?
- “Why Philosophers Should Care About Computational Complexity”, http://www.scottaaronson.com/papers/philos.pdf.
Teletransportation paradox and quantum mechanics
- Famous paradox about personal identity based on a hypothetical teleportation machine on Mars
- One boring way to resolve things like this is to reject the premise: maybe it’s not feasible
- If some of the information that made you up was quantum information, then it would be impossible to build such a cloning machine, by the no-cloning theorem
Chapter 12, “Decoherence and hidden variables”
Fault-tolerance theorem
If the rate of decoherence per qubit per gate operation is below a constant threshold, then it’s possible in principle to correct errors faster than they occur, and thereby perform an arbitrarily long quantum computation.
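The flavour of the threshold claim can already be seen in the simplest classical analogue (this is my sketch, not the actual quantum construction): a 3-bit repetition code with majority decoding turns a physical error rate $p$ into a logical error rate $3p^2 - 2p^3$, which is smaller than $p$ exactly when $p < 1/2$.

```python
import random

def logical_error_rate(p, trials=100000):
    """Encode one bit as three copies; flip each copy independently with
    probability p; decode by majority vote.  Returns the empirical rate
    at which the decoded bit is wrong (exactly when >= 2 copies flip)."""
    errors = 0
    for _ in range(trials):
        flips = sum(random.random() < p for _ in range(3))
        if flips >= 2:                 # majority corrupted -> logical error
            errors += 1
    return errors / trials
```

E.g. at $p = 0.1$ the logical error rate is $3(0.01) - 2(0.001) = 0.028$, already better than the raw bit; concatenating the code drives it down further.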
Chapter 13, “Proofs”
What is a proof?
- Links back to the above, role of computers in proof
- Two definitions:
- “A proof is something that induces in the audience (or at least the prover!) an intuitive sense of certainty that the result is correct. In this view, a proof is an inner transformative experience – a way for your soul to make contact with the eternal verities of Platonic heaven.”
- “The second notion is that a proof is just a sequence of symbols obeying certain rules – or more generally, that if we’re going to take this view to what I see as its logical conclusion, a proof is a computation. In other words, a proof is a physical, mechanical process, such that, if the process terminates with a particular outcome, then you should accept that a given theorem is true.”
Lots of zero-knowledge proofs
- If you assume that one-way functions exist, then there exist zero-knowledge proofs for every $\mathbf{NP}$-complete problem.
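For intuition, here’s a sketch (my own, much simplified) of one round of the standard zero-knowledge protocol for graph 3-colouring, an $\mathbf{NP}$-complete problem. The prover commits to a randomly relabelled colouring and opens only the two endpoints of one verifier-chosen edge: a cheating prover is caught with probability at least $1/|E|$ per round, while the verifier only ever sees two random-looking, distinct colours.

```python
import hashlib, random, secrets

def commit(value):
    """Hash-based commitment: returns (commitment, opening nonce)."""
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256((nonce + str(value)).encode()).hexdigest()
    return digest, nonce

def zk_round(edges, colouring):
    """One round: prover permutes the three colours, commits to every
    vertex's colour, and opens just the two endpoints of one edge
    chosen by the verifier.  Returns True iff the check passes."""
    perm = random.sample(range(3), 3)                  # random relabelling
    permuted = {v: perm[c] for v, c in colouring.items()}
    commitments, openings = {}, {}
    for v, c in permuted.items():
        commitments[v], openings[v] = commit(c)
    u, w = random.choice(edges)                        # verifier's challenge
    for v in (u, w):                                   # verify the two openings
        expected = hashlib.sha256(
            (openings[v] + str(permuted[v])).encode()).hexdigest()
        if commitments[v] != expected:
            return False
    return permuted[u] != permuted[w]                  # endpoints must differ
```

Repeating the round many times (with fresh permutations) makes the soundness error exponentially small, without ever leaking the colouring.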
Chapter 14, “How big are quantum states?”
Misnomer
(…), “computer science” is a bit of a misnomer, maybe it should be called “quantitative epistemology”. It’s sort of the study of the capacity of finite beings such as us to learn mathematical truths.
- [[AI - A Modern Approach]]N also considered the similar “computational rationality”.
Chapter 15, “Skepticism of quantum computing”
…didn’t make any highlights…
Chapter 16, “Learning”
…didn’t make any highlights…
Chapter 17, “Interactive proofs and more”
…also didn’t make any highlights…
Chapter 18, “Fun with the anthropic principle”
A random fragment of consciousness
From one of the dialogues:
Student: I’ve also sometimes wondered “why am I human?” Maybe I’m not a random human, but a random fragment of consciousness. In that case, since humans have more brain matter than other animals, I’m much more likely to be a human.
Chapter 19, “Free will”
Newcomb’s paradox
From Wikipedia:
There is a reliable predictor, another player, and two boxes designated A and B. The player is given a choice between taking only box B or taking both boxes A and B. The player knows the following:
- Box A is transparent and always contains a visible $1,000.
- Box B is opaque, and its content has already been set by the predictor:
- If the predictor has predicted that the player will take both boxes A and B, then box B contains nothing.
- If the predictor has predicted that the player will take only box B, then box B contains $1,000,000.
The player does not know what the predictor predicted or what box B contains while making the choice.
- The predictor has to solve a “you-complete” problem, since you could “dredge up some childhood memory and count the letters in the name of your first-grade teacher or something and based on that, choose whether to take one or two boxes”.
- So the predictor needs to run a very accurate simulation of you, so accurate that it’s effectively a copy.
Later on:
One reason I like this Newcomb’s Paradox is that it gets at a connection between “free will” and the inability to predict future behaviour. Inability to predict the future behaviour of an entity doesn’t seem sufficient for free will, but it does seem somewhat necessary.
The “f or d” program
A program that guesses which key you’re going to press next with around 70% accuracy: Aaronson Oracle.
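The idea behind it is simple enough to sketch: keep counts of which key followed each recent context, and predict the more frequent one (this is my guess at the mechanism, not Aaronson’s actual code):

```python
from collections import defaultdict

class FOrDPredictor:
    """Toy 'f or d' predictor: guess the next key from which key most
    often followed the current context of recent keypresses."""
    def __init__(self, order=4):
        self.order = order
        self.history = ''
        self.counts = defaultdict(lambda: {'f': 0, 'd': 0})

    def predict(self):
        c = self.counts[self.history[-self.order:]]
        return 'f' if c['f'] >= c['d'] else 'd'

    def update(self, key):
        self.counts[self.history[-self.order:]][key] += 1
        self.history += key
```

Against a truly random sequence this can’t beat 50%, but against the patterns humans actually produce when trying to “be random”, something this simple does surprisingly well.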
Chapter 20, “Time travel”
This one trick for solving $\mathbf{NP}$-complete problems in polynomial time
- Start your computer working on an $\mathbf{NP}$-complete problem
- Board a space ship travelling close to the speed of light
- Return to Earth
- Pick up the solution
- …
- The big issue (apart from all of your friends being dead on your return) is that you would need an exponential amount of energy to make this happen.
CTCs and computational complexity
- CTC = “closed timelike curve”, the fancy way of talking about hypothetical time travel
- How could backwards time travel be used to speed up computation? Compute the answer, then send it back in time.
- This doesn’t work for a few reasons:
- The universe might end while you’re computing the answer
- How does the computer deal with inconsistencies relating to the Grandfather paradox?
Chapter 21, “Ask me anything”
Darwinism and religion
Years ago I was struck by an irony: in contemporary America, you’ve got these stereotypical coastal elites who believe in Darwinism and often live by themselves well into their thirties or forties, and then you’ve got these stereotypical heartland folks who reject Darwinism but marry young and have $7$ kids, $49$ grandkids and $343$ great-grandkids. So then it’s not really a contest between “Darwinists” and “anti-Darwinists”; it’s just a contest between Darwinian theorists and Darwinian practitioners!
If this idea is right – that is, if religion has played this role throughout history of helping inspire people to win wars, have more babies, etc. – then the question arises, how are you ever going to counter religion except with a competing religion?
So the theory is that religion is a way of committing yourself. Someone might say that he believes in a certain moral code, but others might figure talk is cheap and not trust him. On the other hand, if he has a long beard and prays every day and really seems to believe that he’ll have an eternity in hellfire if he breaks the code, then he’s making this very expensive commitment to his belief.
Playing chicken
…talking about rational irrationality…
Student: The standard example is if you’re playing Chicken with someone, it’s advantageous to you if you break your steering wheel so it can’t turn.