Quantum Computing
Lecture Notes
Ronald de Wolf
Preface
These lecture notes were formed in small chunks during my “Quantum computing” course at the
University of Amsterdam, Feb-May 2011, and compiled into one text thereafter. Each chapter
was covered in a lecture of 2 × 45 minutes, with an additional 45-minute lecture for exercises and
homework. The first half of the course (Chapters 1–7) covers quantum algorithms, the second half
covers quantum complexity (Chapters 8–9), stuff involving Alice and Bob (Chapters 10–13), and
error-correction (Chapter 14). A 15th lecture about physical implementations and general outlook
was more sketchy, and I didn’t write lecture notes for it.
These chapters may also be read as a general introduction to the area of quantum computation
and information from the perspective of a theoretical computer scientist. While I made an effort
to make the text self-contained and consistent, it may still be somewhat rough around the edges; I
hope to continue polishing and adding to it. Comments & constructive criticism are very welcome,
and can be sent to [email protected]
Version 2 (Jan’12): updated and corrected a few things for the Feb-Mar 2013 version of this course,
and included exercises for each chapter. Thanks to Harry Buhrman, Florian Speelman, Jeroen
Zuiddam for spotting some typos in the earlier version.
Ronald de Wolf
January 2012, Amsterdam
Contents

1 Quantum Computing
1.1 Introduction
1.2 Quantum mechanics
1.2.1 Superposition
1.2.2 Measurement
1.2.3 Unitary evolution
1.3 Quantum memory
3 Simon's Algorithm
3.1 The problem
3.2 The quantum algorithm
3.3 Classical lower bound for Simon's problem
5.3 Shor's period-finding algorithm
5.4 Continued fractions
13 Quantum Cryptography
13.1 Quantum key distribution
13.2 Reduced density matrices and the Schmidt decomposition
13.3 The impossibility of perfect bit commitment
13.4 More quantum cryptography
Chapter 1
Quantum Computing
1.1 Introduction
Today’s computers—both in theory (Turing machines) and practice (PCs)—are based on classical
physics. However, modern quantum physics tells us that the world behaves quite differently. A
quantum system can be in a superposition of many different states at the same time, and can exhibit
interference effects during the course of its evolution. Moreover, spatially separated quantum
systems may be entangled with each other and operations may have “non-local” effects because of
this.
Quantum computation is the field that investigates the computational power and other prop-
erties of computers based on quantum-mechanical principles. An important objective is to find
quantum algorithms that are significantly faster than any classical algorithm solving the same
problem. The field started in the early 1980s with suggestions for analog quantum computers by
Paul Benioff [11] and Richard Feynman [31, 32], and reached more digital ground when in 1985
David Deutsch defined the universal quantum Turing machine [25]. The following years saw only
sparse activity, notably the development of the first algorithms by Deutsch and Jozsa [27] and by
Simon [65], and the development of quantum complexity theory by Bernstein and Vazirani [14].
However, interest in the field increased tremendously after Peter Shor’s very surprising discovery
of efficient quantum algorithms for the problems of integer factorization and discrete logarithms in
1994 [64]. Since most of current classical cryptography is based on the assumption that these two
problems are computationally hard, the ability to actually build and use a quantum computer would
allow us to break most current classical cryptographic systems, notably the RSA system [60, 61].
(In contrast, a quantum form of cryptography due to Bennett and Brassard [13] is unbreakable
even for quantum computers.)
Let us mention three different motivations for studying quantum computers, from practical to
more philosophical:
1. The process of miniaturization that has made current classical computers so powerful and
cheap, has already reached micro-levels where quantum effects occur. Chip-makers tend to
go to great lengths to suppress those quantum effects, but instead one might also try to make
good use of them.
2. Making use of quantum effects allows one to speed up certain computations enormously
(sometimes exponentially), and even enables some things that are impossible for classical
computers. The main purpose of this course is to explain these things (algorithms, crypto,
etc.) in detail.
3. Finally, one might state the main goal of theoretical computer science as “study the power
and limitations of the strongest-possible computational devices that nature allows us.” Since
our current understanding of nature is quantum mechanical, theoretical computer science
should be studying the power of quantum computers, not classical ones.
Before limiting ourselves to theory, let us say a few words about practice: to what extent will
quantum computers ever be built? At this point in time, it is just too early to tell. The first small
2-qubit quantum computer was built in 1997, and in 2001 a 5-qubit quantum computer was used to
successfully factor the number 15 [69]. Since then, experimental progress on a number of different
technologies has been steady but slow. The practical problems facing physical realizations of quan-
tum computers seem formidable. The problems of noise and decoherence have to some extent been
solved in theory by the discovery of quantum error-correcting codes and fault-tolerant computing
(see e.g. [56, Chapter 10]), but these problems are by no means solved in practice. On the other
hand, we should realize that the field of physical realization of quantum computing is still in its
infancy and that classical computing had to face and solve many formidable technical problems as
well—interestingly, often these problems were even of the same nature as those now faced by quan-
tum computing (e.g., noise-reduction and error-correction). Moreover, the difficulties facing the
implementation of a full quantum computer may seem daunting, but more limited things involving
quantum communication have already been implemented with some success, for example telepor-
tation (which is the process of sending qubits using entanglement and classical communication),
and quantum cryptography is nowadays even commercially available.
Even if the theory of quantum computing never materializes to a real physical computer,
quantum-mechanical computers are still an extremely interesting idea which will bear fruit in other
areas than practical fast computing. On the physics side, it may improve our understanding of
quantum mechanics. The emerging theory of entanglement has already done this to some extent.
On the computer science side, the theory of quantum computation generalizes and enriches classical
complexity theory and may help resolve some of its problems.
1.2 Quantum mechanics

1.2.1 Superposition

Consider some physical system that can be in N different, mutually exclusive classical states. Call
these states |1⟩, |2⟩, . . . , |N⟩. Roughly, by a "classical" state we mean a state in which the system
can be found if we observe it. A pure quantum state (usually just called state) |φ⟩ is a superposition
of classical states, written

  |φ⟩ = α1|1⟩ + α2|2⟩ + · · · + αN|N⟩.

Here αi is a complex number that is called the amplitude of |i⟩ in |φ⟩. Intuitively, a system in
quantum state |φ⟩ is in all classical states at the same time! It is in state |1⟩ with amplitude
α1, in state |2⟩ with amplitude α2, and so on. Mathematically, the states |1⟩, . . . , |N⟩ form an
orthonormal basis of an N-dimensional Hilbert space (i.e., an N-dimensional vector space equipped
with an inner product), and a quantum state |φ⟩ is a vector in this space.
1.2.2 Measurement
There are two things we can do with a quantum state: measure it or let it evolve unitarily without
measuring it. We will deal with measurement first.

Measurement in the computational basis

Suppose we measure |φ⟩. We will then see one and only one classical state |j⟩; which one we see
is not determined in advance, but we see |j⟩ with probability |αj|², the squared modulus of the
corresponding amplitude. Since the probabilities must sum to 1, the vector of amplitudes has
(Euclidean) norm 1. If we measure |φ⟩ and see classical state |j⟩ as a result, then |φ⟩ itself has
"disappeared", and all that is left is |j⟩. In other words, observing |φ⟩ "collapses" the quantum
superposition |φ⟩ to the classical state |j⟩ that we saw, and all "information" that might have been
contained in the amplitudes αi is gone.
Projective measurement
A somewhat more general kind of measurement than the above "measurement in the computational
(or standard) basis" is possible. This will be used only sparsely in the course, so it may be skipped
on a first reading. Such a projective measurement is described by projectors P1, . . . , Pm (m ≤ N)
which sum to the identity. These projectors are then pairwise orthogonal, meaning that PiPj = 0 if
i ≠ j. The projector Pj projects on some subspace Vj of the total Hilbert space V, and every state
|φ⟩ ∈ V can be decomposed in a unique way as |φ⟩ = Σ_{j=1}^{m} |φj⟩, with |φj⟩ = Pj|φ⟩ ∈ Vj. Because the
projectors are orthogonal, the subspaces Vj are orthogonal as well, as are the states |φj⟩. When we
apply this measurement to the pure state |φ⟩, then we will get outcome j with probability ‖|φj⟩‖² =
Tr(Pj|φ⟩⟨φ|) and the state will then "collapse" to the new state |φj⟩/‖|φj⟩‖ = Pj|φ⟩/‖Pj|φ⟩‖.

For example, a measurement in the standard basis is the specific projective measurement where
m = N and Pj = |j⟩⟨j|. That is, Pj projects onto the standard basis state |j⟩ and the corresponding
subspace Vj is the space spanned by |j⟩. Consider the state |φ⟩ = Σ_{j=1}^{N} αj|j⟩. Note that Pj|φ⟩ =
αj|j⟩, so applying our measurement to |φ⟩ will give outcome j with probability ‖αj|j⟩‖² = |αj|²,
and in that case the state collapses to αj|j⟩/‖αj|j⟩‖ = (αj/|αj|)|j⟩. The norm-1 factor αj/|αj| may be
disregarded because it has no physical significance, so we end up with the state |j⟩ as we saw
before.

As another example, a measurement that distinguishes between |j⟩ with j ≤ N/2 and |j⟩
with j > N/2 corresponds to the two projectors P1 = Σ_{j≤N/2} |j⟩⟨j| and P2 = Σ_{j>N/2} |j⟩⟨j|.
Applying this measurement to the state |φ⟩ = (1/√3)|1⟩ + √(2/3)|N⟩ will give outcome 1 with probability
‖P1|φ⟩‖² = 1/3, in which case the state collapses to |1⟩, and will give outcome 2 with probability
‖P2|φ⟩‖² = 2/3, in which case the state collapses to |N⟩.
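This worked example is easy to check numerically. Below is a minimal sketch (assuming numpy; the dimension N = 4 and the projectors are just the example above) that computes the outcome probabilities ‖Pj|φ⟩‖² = Tr(Pj|φ⟩⟨φ|) and the collapsed state:

```python
import numpy as np

N = 4  # example dimension; basis states |1>, ..., |N> as in the text

def ket(j):
    """Standard basis state |j> (1-indexed, as in the text) as a vector."""
    v = np.zeros(N, dtype=complex)
    v[j - 1] = 1.0
    return v

# |phi> = (1/sqrt(3))|1> + sqrt(2/3)|N>
phi = ket(1) / np.sqrt(3) + np.sqrt(2 / 3) * ket(N)

# P1 projects onto span{|j> : j <= N/2}, P2 onto span{|j> : j > N/2}
P1 = sum(np.outer(ket(j), ket(j).conj()) for j in range(1, N // 2 + 1))
P2 = sum(np.outer(ket(j), ket(j).conj()) for j in range(N // 2 + 1, N + 1))

# Outcome probabilities ||P_j |phi>||^2
p1 = np.linalg.norm(P1 @ phi) ** 2
p2 = np.linalg.norm(P2 @ phi) ** 2
print(round(p1, 3), round(p2, 3))  # 0.333 0.667

# Collapsed state for outcome 1: P1|phi> / ||P1|phi>||, which equals |1>
post1 = P1 @ phi / np.linalg.norm(P1 @ phi)
print(np.allclose(post1, ket(1)))  # True
```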
1.2.3 Unitary evolution
Instead of measuring |φ⟩, we can also apply some operation to it, i.e., change the state to some
other state |ψ⟩ = β1|1⟩ + β2|2⟩ + · · · + βN|N⟩.
Quantum mechanics only allows linear operations to be applied to quantum states. What this
means is: if we view a state like |φ⟩ as an N-dimensional vector (α1, . . . , αN)^T, then applying an
operation that changes |φ⟩ to |ψ⟩ corresponds to multiplying |φ⟩ with an N × N complex-valued
matrix U:

  U (α1, . . . , αN)^T = (β1, . . . , βN)^T.

Note that by linearity we have |ψ⟩ = U|φ⟩ = U(Σ_i αi|i⟩) = Σ_i αi U|i⟩.
Because measuring |ψ⟩ should also give a probability distribution, we have the constraint
Σ_{j=1}^{N} |βj|² = 1. This implies that the operation U must preserve the norm of vectors, and hence
must be a unitary transformation. A matrix U is unitary if its inverse U⁻¹ equals its conjugate
transpose U*. This is equivalent to saying that U always maps a vector of norm 1 to a vector of
norm 1. Because a unitary transformation always has an inverse, it follows that any (non-measuring)
operation on quantum states must be reversible: by applying U⁻¹ we can always "undo" the action
of U, and nothing is lost in the process. On the other hand, a measurement is clearly non-reversible,
because we cannot reconstruct |φ⟩ from the observed classical state |j⟩.
A unitary that acts on a small number of qubits (say, at most 3) is often called a gate, in analogy
to classical logic gates (more about that in the next chapter). Two simple but important 1-qubit
gates are the bitflip gate X (which negates the bit, i.e., swaps |0⟩ and |1⟩) and the phaseflip gate
Z (which puts a − in front of |1⟩). Represented as 2 × 2 unitary matrices, these are

  X = [ 0 1 ]      Z = [ 1  0 ]
      [ 1 0 ] ,        [ 0 −1 ] .

Possibly the most important 1-qubit gate is the Hadamard transform, specified by:

  H|0⟩ = (1/√2)|0⟩ + (1/√2)|1⟩
  H|1⟩ = (1/√2)|0⟩ − (1/√2)|1⟩

As a unitary matrix, this is represented as

  H = (1/√2) [ 1  1 ]
             [ 1 −1 ] .

If we apply H to the initial state |0⟩ and then measure, we have equal probability of observing |0⟩
or |1⟩. Similarly, applying H to |1⟩ and observing gives equal probability of |0⟩ or |1⟩. However,
if we apply H to the superposition (1/√2)|0⟩ + (1/√2)|1⟩ then we obtain |0⟩: the positive and negative
amplitudes for |1⟩ cancel out! (Note that this also means that H is its own inverse.) This effect is
called interference, and is analogous to interference patterns between light or sound waves.
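The cancellation is easy to see in a quick numerical sketch (assuming numpy):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# Measuring H|0> or H|1> gives |0> or |1> with probability 1/2 each
print(np.abs(H @ ket0) ** 2)  # [0.5 0.5]
print(np.abs(H @ ket1) ** 2)  # [0.5 0.5]

# Applying H to (|0>+|1>)/sqrt(2) gives |0>: the two contributions
# to the amplitude of |1> cancel (interference)
plus = (ket0 + ket1) / np.sqrt(2)
print(np.allclose(H @ plus, ket0))  # True

# H is its own inverse: HH = I
print(np.allclose(H @ H, np.eye(2)))  # True
```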
We will see gates acting on more than one qubit in the next chapter.
1.3 Quantum memory
In classical computation, the unit of information is a bit, which can be 0 or 1. In quantum compu-
tation, this unit is a quantum bit (qubit), which is a superposition of 0 and 1. Consider a system
with 2 basis states, call them |0⟩ and |1⟩. We identify these basis states with the vectors (1, 0)^T
and (0, 1)^T, respectively. A single qubit can be in any superposition

  α0|0⟩ + α1|1⟩,   with |α0|² + |α1|² = 1.

Accordingly, a single qubit "lives" in the vector space C². Similarly we can think of systems of
more than 1 qubit, which "live" in the tensor product space of several qubit systems. For instance,
a 2-qubit system has 4 basis states: |0⟩ ⊗ |0⟩, |0⟩ ⊗ |1⟩, |1⟩ ⊗ |0⟩, |1⟩ ⊗ |1⟩. Here for instance |1⟩ ⊗ |0⟩
means that the first qubit is in its basis state |1⟩ and the second qubit is in its basis state |0⟩. We
will often abbreviate this to |1⟩|0⟩, |1, 0⟩, or even |10⟩.
More generally, a register of n qubits has 2^n basis states, each of the form |b1⟩ ⊗ |b2⟩ ⊗ · · · ⊗ |bn⟩,
with bi ∈ {0, 1}. We can abbreviate this to |b1b2 . . . bn⟩. We will often abbreviate 0 . . . 0 to 0^n.
Since bitstrings of length n can be viewed as numbers between 0 and 2^n − 1, we can also write
the basis states as numbers |0⟩, |1⟩, |2⟩, . . . , |2^n − 1⟩. A quantum register of n qubits can be in any
superposition

  α0|0⟩ + α1|1⟩ + · · · + α_{2^n−1}|2^n − 1⟩,   with Σ_{j=0}^{2^n−1} |αj|² = 1.

If we measure this in the standard basis, we obtain the n-bit state |j⟩ with probability |αj|².
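In a simulation, the tensor product corresponds to the Kronecker product of vectors. A small sketch (assuming numpy; the random 2-qubit state is just an example) of a register and a standard-basis measurement:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# |10> = |1> (x) |0> is the third of the four basis states |00>, |01>, |10>, |11>
ket10 = np.kron(ket1, ket0)
print(ket10)  # [0. 0. 1. 0.]

# An arbitrary 2-qubit state: 4 complex amplitudes, normalized to norm 1
rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

# Squared moduli of the amplitudes give the measurement probabilities
probs = np.abs(psi) ** 2
print(round(probs.sum(), 6))  # 1.0

# Simulate one standard-basis measurement: outcome j with probability |alpha_j|^2
outcome = int(rng.choice(4, p=probs))
```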
Measuring just the first qubit of a state would correspond to the projective measurement that
has the two projectors P0 = |0⟩⟨0| ⊗ I_{2^{n−1}} and P1 = |1⟩⟨1| ⊗ I_{2^{n−1}}. For example, applying this
measurement to the state (1/√3)|0⟩|φ⟩ + √(2/3)|1⟩|ψ⟩ will give outcome 0 with probability 1/3 (the state
then becomes |0⟩|φ⟩) and outcome 1 with probability 2/3 (the state then becomes |1⟩|ψ⟩). Similarly,
measuring the first n qubits of an (n + m)-qubit state in the standard basis corresponds to the
projective measurement that has 2^n projectors Pi = |i⟩⟨i| ⊗ I_{2^m} for i ∈ {0, 1}^n.
An important property that deserves to be mentioned is entanglement, which refers to quantum
correlations between different qubits. For instance, consider a 2-qubit register that is in the state

  (1/√2)|00⟩ + (1/√2)|11⟩.

Such 2-qubit states are sometimes called EPR-pairs in honor of Einstein, Podolsky, and Rosen [29],
who first examined such states and their seemingly paradoxical properties. Initially neither of the
two qubits has a classical value |0⟩ or |1⟩. However, if we measure the first qubit and observe, say,
a |0⟩, then the whole state collapses to |00⟩. Thus observing the first qubit immediately fixes also
the second, unobserved qubit to a classical value. Since the two qubits that make up the register
may be far apart, this example illustrates some of the non-local effects that quantum systems can
exhibit. In general, a bipartite state |φ⟩ is called entangled if it cannot be written as a tensor
product |φA⟩ ⊗ |φB⟩ where |φA⟩ lives in the first space and |φB⟩ lives in the second.
At this point, a comparison with classical probability distributions may be helpful. Suppose
we have two probability spaces, A and B, the first with 2^n possible outcomes, the second with 2^m
possible outcomes. A distribution on the first space can be described by 2^n numbers (non-negative
reals summing to 1; actually there are only 2^n − 1 degrees of freedom here) and a distribution
on the second by 2^m numbers. Accordingly, a product distribution on the joint space can be
described by 2^n + 2^m numbers. However, an arbitrary (non-product) distribution on the joint
space takes 2^{n+m} real numbers, since there are 2^{n+m} possible outcomes in total. Analogously, an
n-qubit state |φA⟩ can be described by 2^n numbers (complex numbers whose squared moduli sum
to 1), an m-qubit state |φB⟩ by 2^m numbers, and their tensor product |φA⟩ ⊗ |φB⟩ by 2^n + 2^m
numbers. However, an arbitrary (possibly entangled) state in the joint space takes 2^{n+m} numbers,
since it lives in a 2^{n+m}-dimensional space. We see that the number of parameters required to
describe quantum states is the same as the number of parameters needed to describe probability
distributions. Also note the analogy between statistical independence of two random variables A
and B and non-entanglement of the product state |φA⟩ ⊗ |φB⟩. However, despite the similarities
between probabilities and amplitudes, quantum states are much more powerful than distributions,
because amplitudes may be negative (or even complex), which can lead to interference effects;
amplitudes only become probabilities when we take their squared modulus. The art of quantum
computing is to use these special properties for interesting computational purposes.
Exercises
1. Compute the result of applying a Hadamard transform to both qubits of |0⟩ ⊗ |1⟩ in two ways
(the first way using tensor product of vectors, the second using tensor product of matrices),
and show that the two results are equal.

2. Show that a bit-flip operation, preceded and followed by Hadamard transforms, equals a
phase-flip operation: HXH = Z.

3. Suppose A is a linear map on a real vector space with inner product.
(a) Prove that A is norm-preserving if, and only if, A is inner product-preserving.
(b) Prove that A is inner product-preserving iff A*A = AA* = I.
(c) Conclude that A is norm-preserving iff A is unitary.
Bonus: prove the same for complex instead of real vector spaces.
Chapter 2
Similarly, a language L is decidable by a uniformly
polynomial randomized circuit family iff L ∈ BPP, where BPP ("Bounded-error Probabilistic
Polynomial time") is the class of languages that can efficiently be recognized by randomized Turing
machines with success probability at least 2/3.
Applying the n-fold Hadamard transform H^{⊗n} to a basis state |i⟩ gives
H^{⊗n}|i⟩ = (1/√(2^n)) Σ_{j∈{0,1}^n} (−1)^{i·j} |j⟩ (this is Equation 2.1, which we will use many
times below). For example,

  H^{⊗2}|01⟩ = (1/√2)(|0⟩ + |1⟩) ⊗ (1/√2)(|0⟩ − |1⟩) = (1/2) Σ_{j∈{0,1}²} (−1)^{01·j} |j⟩.
The n-fold Hadamard transform will be very useful for all the quantum algorithms explained later.
Another important 1-qubit gate is the phase gate Rφ, which merely rotates the phase of the
|1⟩-state by an angle φ:

  Rφ|0⟩ = |0⟩
  Rφ|1⟩ = e^{iφ}|1⟩

This corresponds to the unitary matrix

  Rφ = [ 1  0     ]
       [ 0  e^{iφ} ] .

An example of a 2-qubit gate is the controlled-not gate CNOT. It negates the second bit of its
input if the first bit is 1, and does nothing if it's 0:

  CNOT|0⟩|b⟩ = |0⟩|b⟩
  CNOT|1⟩|b⟩ = |1⟩|1 − b⟩
Adding another control register, we get the 3-qubit Toffoli gate, also called controlled-controlled-
not (CCNOT) gate. This negates the third bit of its input if both of the first two bits are 1. The
Toffoli gate is important because it is complete for classical reversible computation: any classical
computation can be implemented by a circuit of Toffoli gates. This is easy to see: using ancilla
wires with fixed values, Toffoli can implement AND (fix the 3rd ingoing wire to 0) and NOT (fix the
1st and 2nd ingoing wire to 1). It is known that AND and NOT-gates together suffice to implement
any classical Boolean circuit.
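Since the Toffoli gate permutes classical basis states, this reduction can be checked exhaustively on bits. A sketch in plain Python (the helper `toffoli` is just for illustration):

```python
def toffoli(a, b, c):
    """CCNOT on classical bits: flip the third bit iff the first two are both 1."""
    return a, b, c ^ (a & b)

# AND: fix the third ingoing wire to 0; the third outgoing wire carries a AND b
for a in (0, 1):
    for b in (0, 1):
        assert toffoli(a, b, 0)[2] == (a & b)

# NOT: fix the first and second ingoing wires to 1; the third wire gets negated
for c in (0, 1):
    assert toffoli(1, 1, c)[2] == 1 - c

print("Toffoli simulates AND and NOT")
```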
As in the classical case, a quantum circuit is a finite directed acyclic graph of input nodes,
gates, and output nodes. There are n nodes that contain the input (as classical bits); in addition
we may have some more input nodes that are initially |0i (“workspace”). The internal nodes of
the quantum circuit are quantum gates that each operate on at most 2 qubits of the state. The
gates in the circuit transform the initial state vector into a final state, which will generally be
a superposition. We measure some dedicated output bits of this final state to (probabilistically)
obtain an answer.
In analogy to the classical class BPP, we will define BQP (“Bounded-error Quantum Poly-
nomial time”) as the class of languages that can efficiently be computed with success probability
at least 2/3 by (a family of) quantum circuits whose size grows at most polynomially with the
input length. We will study this quantum complexity class and its relation with various classical
complexity classes in more detail in Chapter 9.
Alice then measures her two qubits in the computational basis and sends the result (2 random
classical bits) to Bob over a classical channel. Bob now knows which transformation he must do
on his qubit in order to regain the qubit α0|0⟩ + α1|1⟩. For instance, if Alice sent 11 then Bob
knows that his qubit is α0|1⟩ − α1|0⟩. A bitflip (X) followed by a phaseflip (Z) will give him Alice's
original qubit α0|0⟩ + α1|1⟩. In fact, if Alice's qubit had been entangled with other qubits, then
teleportation preserves this entanglement: Bob then receives a qubit that is entangled in the same
way as Alice's original qubit was.

Note that the qubit on Alice's side has been destroyed: teleporting moves a qubit from A to B,
rather than copying it. In fact, copying an unknown qubit is impossible [72]. This can be seen as
follows. Suppose C were a 1-qubit copier, i.e., C|φ⟩|0⟩ = |φ⟩|φ⟩ for every qubit |φ⟩. In particular
C|0⟩|0⟩ = |0⟩|0⟩ and C|1⟩|0⟩ = |1⟩|1⟩. But then C would not copy |φ⟩ = (1/√2)(|0⟩ + |1⟩) correctly,
since by linearity C|φ⟩|0⟩ = (1/√2)(C|0⟩|0⟩ + C|1⟩|0⟩) = (1/√2)(|0⟩|0⟩ + |1⟩|1⟩) ≠ |φ⟩|φ⟩.
We applied U just once, but the final superposition contains f(x) for all 2^n input values x! However,
by itself this is not very useful and does not give more than classical randomization, since observing
the final superposition will give just one random |x⟩|f(x)⟩ and all other information will be lost.
As we will see below, quantum parallelism needs to be combined with the effects of interference
and entanglement in order to get something that is better than classical.
  O_x : |i, 0⟩ ↦ |i, x_i⟩.

The first n qubits of the state are called the address bits (or address register), while the (n + 1)st
qubit is called the target bit. Since this operation must be unitary, we also have to specify what
happens if the initial value of the target bit is 1. Therefore we actually let O_x be the following
unitary transformation:

  O_x : |i, b⟩ ↦ |i, b ⊕ x_i⟩,

where i ∈ {0, 1}^n, b ∈ {0, 1}, and ⊕ denotes exclusive-or (addition modulo 2). In matrix representa-
tion, O_x is now a permutation matrix and hence unitary. Note also that a quantum computer can
apply O_x on a superposition of various i, something a classical computer cannot do. One applica-
tion of this black-box is called a query, and counting the required number of queries to compute
this or that function of x is something we will do a lot in the first half of these notes.

Given the ability to make a query of the above type, we can also make a query of the form
|i⟩ ↦ (−1)^{x_i}|i⟩ by setting the target bit to the state |∆⟩ = (1/√2)(|0⟩ − |1⟩):

  O_x(|i⟩|∆⟩) = |i⟩ (1/√2)(|x_i⟩ − |1 − x_i⟩) = (−1)^{x_i} |i⟩|∆⟩.

This ±-kind of query puts the output variable in the phase of the state: if x_i is 1 then we get a −1 in
the phase of basis state |i⟩; if x_i = 0 then nothing happens to |i⟩. This "phase-oracle" is sometimes
more convenient than the standard type of query. We sometimes denote the corresponding n-qubit
unitary transformation by O_{x,±}.
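Both forms of the query are easy to realize as explicit matrices. A sketch (assuming numpy; the input string x is an arbitrary example) that builds O_x as a permutation matrix and verifies the phase-kickback identity:

```python
import numpy as np

n = 3
N = 2 ** n
x = np.array([0, 1, 1, 0, 1, 0, 0, 1])  # example input x of N = 8 bits

# O_x on n+1 qubits: |i, b> -> |i, b XOR x_i>; index a basis state |i, b> as 2i+b
O = np.zeros((2 * N, 2 * N))
for i in range(N):
    for b in (0, 1):
        O[2 * i + (b ^ x[i]), 2 * i + b] = 1

# O_x is a permutation matrix, hence unitary
print(np.allclose(O @ O.T, np.eye(2 * N)))  # True

# Phase kickback: with target |Delta> = (|0> - |1>)/sqrt(2),
# O_x(|i>|Delta>) = (-1)^{x_i} |i>|Delta>
delta = np.array([1.0, -1.0]) / np.sqrt(2)
ok = True
for i in range(N):
    e_i = np.zeros(N)
    e_i[i] = 1.0
    out = O @ np.kron(e_i, delta)
    ok = ok and np.allclose(out, (-1) ** x[i] * np.kron(e_i, delta))
print(ok)  # True
```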
2.5.1 Deutsch-Jozsa
Deutsch-Jozsa problem [27]:
For N = 2^n, we are given x ∈ {0, 1}^N such that either
(1) all x_i have the same value ("constant"), or
(2) N/2 of the x_i are 0 and N/2 are 1 ("balanced").
The goal is to find out whether x is constant or balanced.
The algorithm of Deutsch and Jozsa is as follows. We start in the n-qubit zero state |0^n⟩, apply
a Hadamard transform to each qubit, apply a query (in its ±-form), apply another Hadamard to
each qubit, and then measure the final state. As a unitary transformation, the algorithm would
be H^{⊗n} O_± H^{⊗n}. We have drawn the corresponding quantum circuit in Figure 2.1 (where time
progresses from left to right).

[Figure 2.1: each of the n wires starts in |0⟩ and passes through an H gate, the query, another H gate, and a measurement.]
Let us follow the state through these operations. Initially we have the state |0^n⟩. By Equa-
tion 2.1, after the first Hadamard transforms we have obtained the uniform superposition
of all i:

  (1/√(2^n)) Σ_{i∈{0,1}^n} |i⟩.

The ±-query turns this into

  (1/√(2^n)) Σ_{i∈{0,1}^n} (−1)^{x_i} |i⟩.

Applying the second batch of Hadamards gives (again by Equation 2.1) the final superposition

  (1/2^n) Σ_{i∈{0,1}^n} Σ_{j∈{0,1}^n} (−1)^{x_i} (−1)^{i·j} |j⟩,

where i · j = Σ_{k=1}^{n} i_k j_k as before. Since i · 0^n = 0 for all i ∈ {0, 1}^n, we see that the amplitude of
the |0^n⟩-state in the final superposition is

  (1/2^n) Σ_{i∈{0,1}^n} (−1)^{x_i}  =  1 if x_i = 0 for all i,
                                      −1 if x_i = 1 for all i,
                                       0 if x is balanced.

Hence the final observation will yield |0^n⟩ if x is constant and will yield some other state if x
is balanced. Accordingly, the Deutsch-Jozsa problem can be solved with certainty using only 1
quantum query and O(n) other operations (the original solution of Deutsch and Jozsa used 2
queries; the 1-query solution is from [24]).
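For small n the whole algorithm can be simulated directly using the phase form of the query. A sketch (assuming numpy; n = 3 and the particular balanced input are example choices):

```python
import numpy as np
from functools import reduce

n = 3
N = 2 ** n
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Hn = reduce(np.kron, [H] * n)  # the n-fold Hadamard transform

def deutsch_jozsa(x):
    """Run the circuit H^n O_+- H^n on |0^n> and decide constant vs balanced."""
    state = np.zeros(N)
    state[0] = 1.0                           # |0^n>
    state = Hn @ state                       # uniform superposition over all i
    state = (-1.0) ** np.asarray(x) * state  # the +--query puts x in the phases
    state = Hn @ state                       # second batch of Hadamards
    p0 = abs(state[0]) ** 2                  # probability of observing |0^n>
    return "constant" if np.isclose(p0, 1.0) else "balanced"

print(deutsch_jozsa([0] * N))            # constant
print(deutsch_jozsa([1] * N))            # constant
print(deutsch_jozsa([0, 1] * (N // 2)))  # balanced
```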
In contrast, it is easy to see that any classical deterministic algorithm needs at least N/2 + 1
queries: if it has made only N/2 queries and seen only 0s, the correct output is still undetermined.
However, a classical algorithm can solve this problem efficiently if we allow a small error probability:
just query x at two random positions, output “constant” if those bits are the same and “balanced”
if they are different. This algorithm outputs the correct answer with probability 1 if x is constant
and outputs the correct answer with probability 1/2 if x is balanced. Thus the quantum-classical
separation of this problem only holds if we consider algorithms without error probability.
2.5.2 Bernstein-Vazirani
Bernstein-Vazirani problem [14]:
For N = 2^n, we are given x ∈ {0, 1}^N with the property that there is some unknown a ∈ {0, 1}^n
such that x_i = (i · a) mod 2. The goal is to find a.

The Bernstein-Vazirani algorithm is exactly the same as the Deutsch-Jozsa algorithm, but now
the final observation miraculously yields a. Since (−1)^{x_i} = (−1)^{(i·a) mod 2} = (−1)^{i·a}, we can write
the state obtained after the query as:

  (1/√(2^n)) Σ_{i∈{0,1}^n} (−1)^{x_i} |i⟩ = (1/√(2^n)) Σ_{i∈{0,1}^n} (−1)^{i·a} |i⟩.

Applying a Hadamard to each qubit will turn this into the classical state |a⟩ and hence solves
the problem with 1 query and O(n) other operations. In contrast, any classical algorithm (even
a randomized one with small error probability) needs to ask n queries for information-theoretic
reasons: the final answer consists of n bits and one classical query gives at most 1 bit of information.

Bernstein and Vazirani also defined a recursive version of this problem, which can be solved
exactly by a quantum algorithm in poly(n) steps, but for which any classical randomized algorithm
needs n^{Ω(log n)} steps.
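This, too, is easy to simulate: after the final Hadamards the state vector is exactly the basis state |a⟩. A sketch (assuming numpy; n = 4 and a = 1011 are example values):

```python
import numpy as np
from functools import reduce

n = 4
N = 2 ** n
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Hn = reduce(np.kron, [H] * n)

a = 0b1011  # the hidden string a, as an n-bit integer
x = np.array([bin(i & a).count("1") % 2 for i in range(N)])  # x_i = (i . a) mod 2

state = np.zeros(N)
state[0] = 1.0               # |0^n>
state = Hn @ state           # uniform superposition
state = (-1.0) ** x * state  # the +--query
state = Hn @ state           # final Hadamards: the state is now exactly |a>

recovered = int(np.argmax(np.abs(state)))
print(recovered == a)  # True
```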
Exercises
1. Is the controlled-NOT operation C Hermitian? Determine C⁻¹.
2. Construct a CNOT from two Hadamard gates and one controlled-Z (the controlled-Z gate
maps |11i 7→ −|11i and acts like the identity on the other basis states).
3. A SWAP-gate interchanges two qubits: it maps basis state |a, bi to |b, ai. Implement a
SWAP-gate using a few CNOTs.
5. During the lecture we showed that a query of the type |i, b⟩ ↦ |i, b ⊕ x_i⟩ (where i ∈ {1, . . . , n}
and b ∈ {0, 1}) can be used to implement a phase-query, i.e., one of the type |i⟩ ↦ (−1)^{x_i}|i⟩.
Is the converse possible: can a query of the first type be implemented using phase-queries,
and possibly some ancilla qubits and other gates? If yes, show how. If no, explain why not.
6. Give a randomized classical algorithm (i.e., one that can flip coins during its operation) that
makes only two queries to x, and decides the Deutsch-Jozsa problem with success probability
at least 2/3 on every possible input. A high-level description is enough, no need to write out
the classical circuit.
8. Suppose Alice and Bob are not entangled. If Alice sends a qubit to Bob, then this can give
Bob at most one bit of information about Alice.1 However, if they share an EPR-pair, they
can transmit two classical bits by sending one qubit over the channel; this is called superdense
coding. This exercise will show how this works.
(a) They start with a shared EPR-pair, (1/√2)(|00⟩ + |11⟩). Alice has classical bits a and b.
Suppose she does an X-operation on her half of the EPR-pair if a = 1, and then a Z-
operation if b = 1 (she does both if ab = 11, and neither if ab = 00). Write the resulting
2-qubit state.
(b) Suppose Alice sends her half of the state to Bob, who now has two qubits. Show that
Bob can determine both a and b from his state. Write Bob’s operation as a quantum
circuit with Hadamard and CNOT gates.
¹This is actually a deep statement, a special case of Holevo's theorem. More about this may be found in Chapter 10.
Chapter 3
Simon’s Algorithm
The Deutsch-Jozsa problem showed an exponential quantum improvement over the best determin-
istic classical algorithms; the Bernstein-Vazirani problem shows a polynomial improvement over
the best randomized classical algorithms with error probability ≤ 1/3. In this chapter we will
combine these two features: we will see a problem where quantum computers are exponentially
more efficient than bounded-error randomized algorithms.
3.1 The problem

Simon's problem [65]:
For N = 2^n, we are given x = (x_1, . . . , x_N), with each x_i ∈ {0, 1}^n, with the property that there
is some unknown nonzero s ∈ {0, 1}^n such that x_i = x_j iff (i = j or i = j ⊕ s), identifying the
indices i with n-bit strings. The goal is to find s.

Note that x, viewed as a function from [N] to [N], is a 2-to-1 function, where the 2-to-1-ness
is determined by the unknown mask s. The queries to the input here are slightly different from
before: the input x = (x_1, . . . , x_N) now has variables x_i that themselves are n-bit strings, and one
query gives such a string completely (|i, 0^n⟩ ↦ |i, x_i⟩). However, we can also view this problem as
having n2^n binary variables that we can query individually. Since we can simulate one x_i-query
using only n binary queries (just query all n bits of x_i), this alternative view will not affect the
number of queries very much.
3.2 The quantum algorithm

The algorithm starts in the (2n)-qubit state |0^n⟩|0^n⟩ and applies a Hadamard transform to each
qubit of the first n-qubit register, giving the uniform superposition (1/√(2^n)) Σ_{i∈{0,1}^n} |i⟩|0^n⟩.
At this point, the second n-qubit register still holds only zeroes. A query turns this into

  (1/√(2^n)) Σ_{i∈{0,1}^n} |i⟩|x_i⟩.
Now the algorithm measures the second n-bit register (this measurement is actually not necessary,
but it facilitates analysis). The measurement outcome will be some value xi and the first register
will collapse to the superposition of the two indices having that xi -value:
1
√ (|ii + |i ⊕ si)|xi i.
2
We will now ignore the second register and apply Hadamard transforms to the first n qubits. Using
Equation 2.1 and the fact that (i ⊕ s) · j = (i · j) ⊕ (s · j), we can write the resulting state as
1 X X
√ (−1)i·j |ji + (−1)(i⊕s)·j |ji =
2n+1
j∈{0,1}n j∈{0,1}n
1 X
√ (−1)i·j 1 + (−1)s·j |ji .
2n+1 j∈{0,1}n
Note that |j⟩ has non-zero amplitude iff s · j = 0 mod 2. Accordingly, if we measure the final state we get a linear equation that gives information about s. Repeating this algorithm an expected number of O(n) times, we obtain n independent linear equations involving s, from which we can extract s efficiently by a classical algorithm (Gaussian elimination over GF(2)). Simon's algorithm thus finds s using an expected number of O(n) x_i-queries and polynomially many other operations.
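The final classical step, Gaussian elimination over GF(2), can be sketched as follows. This is a minimal sketch in Python; the function name and the representation of n-bit strings as integer bitmasks are our own choices, not from the text.

```python
def nullspace_vector(equations, n):
    """Given n-bit masks j satisfying j . s = 0 (mod 2), find a non-zero
    s in {0,1}^n consistent with all of them (if one exists)."""
    # Bring the equations into row-echelon form over GF(2).
    pivots = {}  # leading bit position -> reduced equation
    for eq in equations:
        for b in reversed(range(n)):
            if not (eq >> b) & 1:
                continue
            if b in pivots:
                eq ^= pivots[b]      # eliminate this bit using the pivot row
            else:
                pivots[b] = eq       # new pivot with leading bit b
                break
    # Bits without a pivot are free variables: set one of them to 1,
    # then fix each pivot bit so its equation has even parity.
    free = [b for b in range(n) if b not in pivots]
    if not free:
        return 0                     # only the trivial solution s = 0^n
    s = 1 << free[0]
    for b, eq in sorted(pivots.items()):   # all bits below b are already fixed
        if bin(eq & s).count("1") % 2 == 1:
            s ^= 1 << b
    return s
```

Feeding in the measured strings j from the O(n) runs of the quantum subroutine (zero outcomes are simply ignored by the elimination) yields a candidate s once enough independent equations have been collected.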
Consider the input distribution µ defined as follows. With probability 1/2, x is a random permutation of {0,1}^n; this corresponds to the case s = 0^n. With probability 1/2, we pick a non-zero string s at random, and for each pair (i, i ⊕ s), we pick a unique value for x_i = x_{i⊕s} at random. If there exists a randomized T-query algorithm that achieves success probability p under this input distribution µ, then there is also a deterministic T-query algorithm that achieves success probability p under µ (because the behavior of the randomized algorithm is the average over a number of deterministic algorithms). Now consider a deterministic algorithm with error ≤ 1/3 under µ, that makes T queries to x. We want to show that T = Ω(√(2^n)).
First consider the case s = 0^n. We can assume the algorithm never queries the same point twice. Then the T outcomes of the queries are T distinct n-bit strings, and each sequence of T strings is equally likely.
Now consider the case s ≠ 0^n. Suppose the algorithm queries the indices i_1, . . . , i_T (this sequence depends on x) and gets outputs x_{i_1}, . . . , x_{i_T}. Call a sequence of queries i_1, . . . , i_T good if it shows a collision (i.e., x_{i_k} = x_{i_ℓ} for some k ≠ ℓ), and bad otherwise. If the sequence of queries of the algorithm is good, then we can find s, since i_k ⊕ i_ℓ = s. On the other hand, if the sequence is bad, then each sequence of T distinct outcomes is equally likely—just as in the s = 0^n case! We will now show that the probability of the bad case is very close to 1 for small T.
If i_1, . . . , i_{k−1} is bad, then we have excluded (k−1 choose 2) possible values of s (one for each pair among i_1, . . . , i_{k−1}), and all other values of s are equally likely. The probability that the next query i_k makes the sequence good, is the probability that x_{i_k} = x_{i_j} for some j < k, equivalently, that the set S = {i_k ⊕ i_j | j < k} happens to contain the string s. But S has only k − 1 members, while there are 2^n − 1 − (k−1 choose 2) equally likely remaining possibilities for s. This means that the probability that the sequence is still bad after query i_k is made, is very close to 1. In formulas:
$$\Pr[i_1, \dots, i_T \text{ is bad}] = \prod_{k=2}^{T} \Pr[i_1, \dots, i_k \text{ is bad} \mid i_1, \dots, i_{k-1} \text{ is bad}] = \prod_{k=2}^{T}\left(1 - \frac{k-1}{2^n - 1 - \binom{k-1}{2}}\right) \geq 1 - \sum_{k=2}^{T} \frac{k-1}{2^n - 1 - \binom{k-1}{2}}.$$
Here we used the fact that (1 − a)(1 − b) ≥ 1 − (a + b) if a, b ≥ 0. Note that Σ_{k=2}^{T} (k − 1) = T(T − 1)/2 ≈ T²/2, and 2^n − 1 − (k−1 choose 2) ≈ 2^n as long as k ≪ 2^{n/2}. Hence we can approximate the last formula by 1 − T²/2^{n+1}. Accordingly, if T ≪ √(2^n) then with probability nearly 1 (probability taken over the distribution µ) the algorithm's sequence of queries is bad. If it gets a bad sequence, it cannot "see" the difference between the s = 0^n case and the s ≠ 0^n case, since both cases result in a uniformly random sequence of T distinct n-bit strings as answers to the T queries. This shows that T has to be Ω(√(2^n)) in order to enable the algorithm to get a good sequence of queries with high probability.
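As a numerical sanity check on the approximation, one can evaluate the exact product form of Pr[bad] and compare it against 1 − T²/2^{n+1}. A small sketch (the parameters n and T below are arbitrary illustrative choices):

```python
from math import comb

def prob_bad(n, T):
    """Exact value of the product formula for Pr[the T queries stay bad]."""
    p = 1.0
    for k in range(2, T + 1):
        p *= 1 - (k - 1) / (2**n - 1 - comb(k - 1, 2))
    return p

# illustrative (made-up) parameters
n, T = 20, 100
exact = prob_bad(n, T)
approx = 1 - T**2 / 2**(n + 1)   # the 1 - T^2/2^{n+1} approximation
```

For these values the two quantities agree to several decimal places, confirming that the bad case dominates whenever T ≪ √(2^n).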
Exercises
1. Prove that an EPR-pair (1/√2)(|00⟩ + |11⟩) is an entangled state, i.e., that it cannot be written as the tensor product of two separate qubits.
2. Show that every unitary one-qubit gate with real entries can be written as a rotation matrix,
possibly preceded and followed by Z-gates. In other words, show that for every 2 × 2 real
unitary U , there exist signs s1 , s2 , s3 ∈ {1, −1} and angle θ ∈ [0, 2π) such that
$$U = s_1 \begin{pmatrix} 1 & 0 \\ 0 & s_2 \end{pmatrix} \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & s_3 \end{pmatrix}.$$
3. Suppose we run Simon’s algorithm on the following input x (with N = 8 and hence n = 3):
Note that x is 2-to-1 and xi = xi⊕111 for all i ∈ {0, 1}3 , so s = 111.
4. Given a string x ∈ {0,1}^N (N = 2^n) with the promise that there exists a string s ∈ {0,1}^n such that x_i = i · s (mod 2) for all i ∈ {0,1}^n. We would like to learn what s is.
(a) Give a quantum algorithm that makes only 1 query to x and that computes s with
success probability 1. Hint: Use the Deutsch-Jozsa algorithm.
(b) Argue that any classical algorithm to compute s needs to query x at least n times.
Chapter 4
Let ω_N = e^{2πi/N}; this is a primitive N-th root of unity (meaning the smallest positive integer k with ω_N^k = 1 is k = N). The rows of the matrix will be indexed by j ∈ {0, . . . , N − 1} and the columns by k ∈ {0, . . . , N − 1}. Define the (j, k)-entry of the matrix F_N by (1/√N)·ω_N^{jk}:
$$F_N = \frac{1}{\sqrt{N}}\begin{pmatrix} & \vdots & \\ \cdots & \omega_N^{jk} & \cdots \\ & \vdots & \end{pmatrix}$$
Note that F_N is a unitary matrix, since each column has norm 1, and any pair of columns (say those indexed by k and k′) is orthogonal:
$$\sum_{j=0}^{N-1} \frac{1}{\sqrt{N}}\left(\omega_N^{jk}\right)^* \cdot \frac{1}{\sqrt{N}}\,\omega_N^{jk'} = \frac{1}{N}\sum_{j=0}^{N-1} \omega_N^{j(k'-k)} = \begin{cases} 1 & \text{if } k = k' \\ 0 & \text{otherwise} \end{cases}$$
Since F_N is unitary and symmetric, the inverse F_N^{−1} = F_N^* only differs from F_N by having minus signs in the exponent of the entries. For a vector v ∈ R^N, the vector v̂ = F_N v is called the Fourier transform of v.¹ Doing the matrix-vector multiplication, its entries are given by
$$\hat{v}_j = \frac{1}{\sqrt{N}}\sum_{k=0}^{N-1} \omega_N^{jk} v_k.$$
¹The literature on Fourier analysis usually talks about the Fourier transform of a function rather than of a vector, but on finite domains that's just a notational variant of what we do here: a vector v ∈ R^N can also be viewed as a function v : {0, . . . , N − 1} → R defined by v(i) = v_i. Also, in the classical literature people sometimes use the term "Fourier transform" for what we call the inverse Fourier transform.
4.2 The Fast Fourier Transform
The naive way of computing the Fourier transform v̂ = F_N v of v ∈ R^N just does the matrix-vector multiplication to compute all the entries of v̂. This would take O(N) steps (additions and multiplications) per entry, and O(N²) steps to compute the whole vector v̂. However, there is a more efficient way of computing v̂. This algorithm is called the Fast Fourier Transform (FFT, due to Cooley and Tukey in 1965), and takes only O(N log N) steps. This difference between the quadratic N² steps and the near-linear N log N is tremendously important in practice when N is large, and is the main reason that Fourier transforms are so widely used.
We will assume N = 2^n, which is usually fine because we can add zeroes to our vector to make its dimension a power of 2 (but similar FFTs can be given also directly for most N that aren't a power of 2). The key to the FFT is to rewrite the entries of v̂ as follows:
$$\hat{v}_j = \frac{1}{\sqrt{N}}\sum_{k=0}^{N-1}\omega_N^{jk}v_k = \frac{1}{\sqrt{N}}\left(\sum_{\text{even } k}\omega_N^{jk}v_k + \sum_{\text{odd } k}\omega_N^{jk}v_k\right) = \frac{1}{\sqrt{2}}\left(\frac{1}{\sqrt{N/2}}\sum_{\text{even } k}\omega_{N/2}^{jk/2}v_k + \frac{\omega_N^j}{\sqrt{N/2}}\sum_{\text{odd } k}\omega_{N/2}^{j(k-1)/2}v_k\right)$$
Note that we’ve rewritten the entries of the N -dimensional Fourier transform vb in terms of two
N/2-dimensional Fourier transforms, one of the even-numbered entries of v, and one of the odd-
numbered entries of v.
This suggest a recursive procedure for computing bv : first separately compute the Fourier trans-
form v[even of the N/2-dimensional vector of even-numbered entries of v and the Fourier transform
vd
odd of the N/2-dimensional vector of odd-numbered entries of v, and then compute the N entries
1 j
vbj = √ ([vevenj + ωN vd
oddj ).
2
Strictly speaking this is not well-defined, because v[even and vd odd are just N/2-dimensional vectors.
However, if we define v[evenj+N/2 = v[evenj (and similarly for d
vodd ) then it all works out.
The time T (N ) it takes to implement FN this way can be written recursively as T (N ) =
2T (N/2) + O(N ), because we need to compute two N/2-dimensional Fourier transforms and do
O(N ) additional operations to compute vb. This works out to time T (N ) = O(N log N ), as promised.
Similarly, we have an equally efficient algorithm for the inverse Fourier transform F_N^{−1} = F_N^*, whose entries are (1/√N)·ω_N^{−jk}.
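The recursion above translates almost literally into code. Below is a minimal sketch using the unitary normalization 1/√N and the sign convention ω_N = e^{2πi/N} from this chapter (production FFT libraries usually differ in both conventions):

```python
import cmath, math

def fft(v):
    """Recursive Cooley-Tukey FFT for len(v) a power of 2; computes
    vhat_j = (1/sqrt(N)) * sum_k omega_N^(jk) v_k with omega_N = e^(2 pi i/N)."""
    N = len(v)
    if N == 1:
        return [complex(v[0])]
    even = fft(v[0::2])   # N/2-dim transform of the even-numbered entries
    odd = fft(v[1::2])    # N/2-dim transform of the odd-numbered entries
    vhat = []
    for j in range(N):
        w = cmath.exp(2j * cmath.pi * j / N)   # omega_N^j
        half = j % (N // 2)                    # indices wrap: vhat_even[j+N/2] = vhat_even[j]
        vhat.append((even[half] + w * odd[half]) / math.sqrt(2))
    return vhat
```

The running time satisfies exactly the recurrence T(N) = 2T(N/2) + O(N) from the text, hence O(N log N).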
Suppose we are given two polynomials p(x) = Σ_{j=0}^{d} a_j x^j and q(x) = Σ_{k=0}^{d} b_k x^k, each of degree at most d. We would like to compute the product of these two polynomials, which is
$$(p \cdot q)(x) = \left(\sum_{j=0}^{d} a_j x^j\right)\left(\sum_{k=0}^{d} b_k x^k\right) = \sum_{\ell=0}^{2d}\underbrace{\left(\sum_{j=0}^{2d} a_j b_{\ell-j}\right)}_{c_\ell} x^\ell,$$
where implicitly we set b_{ℓ−j} = 0 if j > ℓ. Clearly, each coefficient c_ℓ by itself takes O(d) steps (additions and multiplications) to compute, which suggests an algorithm for computing the coefficients of p · q that takes O(d²) steps. However, using the fast Fourier transform we can do this in O(d log d) steps, as follows.
The convolution of two vectors a, b ∈ R^N is a vector a ∗ b ∈ R^N whose ℓ-th entry is defined by
$$(a * b)_\ell = \frac{1}{\sqrt{N}}\sum_{j=0}^{N-1} a_j b_{(\ell-j) \bmod N}.$$
Let us set N = 2d + 1 (the number of coefficients of p · q) and make the (d + 1)-dimensional vectors of coefficients a and b N-dimensional by adding d zeroes. Then the coefficients of the polynomial p · q are proportional to the entries of the convolution: c_ℓ = √N·(a ∗ b)_ℓ. It is easy to show that the Fourier coefficients of the convolution of a and b are the products of the Fourier coefficients of a and b: for every ℓ ∈ {0, . . . , N − 1} we have $\widehat{a * b}_\ell = \hat{a}_\ell \cdot \hat{b}_\ell$.
This immediately suggests an algorithm for computing the vector of coefficients c_ℓ: apply the FFT to a and b to get â and b̂, multiply those two vectors entrywise to get $\widehat{a * b}$, apply the inverse FFT to get a ∗ b, and finally multiply a ∗ b with √N to get the vector c of the coefficients of p · q. Since the FFTs and their inverse take O(N log N) steps, and pointwise multiplication of two N-dimensional vectors takes O(N) steps, this whole algorithm takes O(N log N) = O(d log d) steps.
Note that if two numbers ad · · · a1 a0 and bd · · · b1 b0 are given in decimal notation, then we can
interpret the digits as coefficients of polynomials p and q, respectively, and the two numbers will
be p(10) and q(10). Their product is the evaluation of the product-polynomial p · q at the point
x = 10. This suggests that we can use the above procedure (for fast multiplication of polynomials)
to multiply two numbers in O(d log d) steps, which would be a lot faster than the standard O(d2 )
algorithm for multiplication that one learns in primary school. However, in this case we have to
be careful since the steps of the above algorithm are themselves multiplications between numbers,
which we cannot count at unit cost anymore if our goal is to implement a multiplication between
numbers! Still, it turns out that implementing this idea carefully allows one to multiply two d-digit
numbers in O(d log d log log d) elementary operations. This is known as the Schönhage-Strassen
algorithm. We’ll skip the details.
The quantum Fourier transform F_N (for N = 2^n) can be implemented by a quantum circuit on n qubits using only O(n²) elementary gates. This is exponentially faster than even the FFT (which takes O(N log N) = O(2^n n) steps), but it achieves something different: computing the QFT won't give us the entries of the Fourier transform written down on a piece of paper, but only as the amplitudes of the resulting state.
The key to doing this efficiently is to rewrite F_N|k⟩, which turns out to be a product state. Let |k⟩ = |k_1 . . . k_n⟩ (k_1 being the most significant bit). Note that for integer j = j_1 . . . j_n, we can write j/2^n = Σ_{ℓ=1}^{n} j_ℓ 2^{−ℓ}. For example, binary 0.101 is 1·2^{−1} + 0·2^{−2} + 1·2^{−3} = 5/8. We have
$$F_N|k\rangle = \frac{1}{\sqrt{N}}\sum_{j=0}^{N-1} e^{2\pi i jk/2^n}|j\rangle = \frac{1}{\sqrt{N}}\sum_{j=0}^{N-1} e^{2\pi i \left(\sum_{\ell=1}^{n} j_\ell 2^{-\ell}\right)k}|j_1 \dots j_n\rangle = \frac{1}{\sqrt{N}}\sum_{j=0}^{N-1} \prod_{\ell=1}^{n} e^{2\pi i j_\ell k/2^\ell}|j_1 \dots j_n\rangle = \bigotimes_{\ell=1}^{n} \frac{1}{\sqrt{2}}\left(|0\rangle + e^{2\pi i k/2^\ell}|1\rangle\right).$$
Note that e^{2πik/2^ℓ} = e^{2πi·0.k_{n−ℓ+1}...k_n}: the n − ℓ most significant bits of k don't matter for this.
As an example, for n = 3 we have the 3-qubit product state
$$F_8|k_1k_2k_3\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + e^{2\pi i 0.k_3}|1\rangle\right) \otimes \frac{1}{\sqrt{2}}\left(|0\rangle + e^{2\pi i 0.k_2k_3}|1\rangle\right) \otimes \frac{1}{\sqrt{2}}\left(|0\rangle + e^{2\pi i 0.k_1k_2k_3}|1\rangle\right).$$
This example suggests what the circuit should be. To prepare the first qubit of the desired state F_8|k_1k_2k_3⟩, just apply a Hadamard to |k_3⟩, giving state (1/√2)(|0⟩ + (−1)^{k_3}|1⟩), and observe that (−1)^{k_3} = e^{2πi·0.k_3}. To prepare the second qubit of the desired state, apply a Hadamard to |k_2⟩, giving (1/√2)(|0⟩ + e^{2πi·0.k_2}|1⟩), and then conditioned on k_3 (before we apply the Hadamard to |k_3⟩) apply R_2 (here R_s denotes the phase gate that maps |1⟩ ↦ e^{2πi/2^s}|1⟩ and leaves |0⟩ alone). This multiplies |1⟩ with a phase e^{2πi·0.0k_3}, producing the correct qubit (1/√2)(|0⟩ + e^{2πi·0.k_2k_3}|1⟩). Finally, to prepare the third qubit of the desired state, we apply a Hadamard to |k_1⟩, apply R_2 conditioned on k_2 and R_3 conditioned on k_3. This produces the correct qubit (1/√2)(|0⟩ + e^{2πi·0.k_1k_2k_3}|1⟩).
Figure 4.1: The circuit for the 3-qubit QFT
We have now produced all three qubits of the desired state F_8|k_1k_2k_3⟩, but in the wrong order: the first qubit should be the third and vice versa. So the final step is just to swap qubits 1 and 3.
Figure 4.1 illustrates the circuit in the case n = 3. The general case works analogously: starting
with ℓ = 1, we apply a Hadamard to |kℓ i and then “rotate in” the additional phases required,
conditioned on the values of the later bits kℓ+1 . . . kn . Some swap gates at the end then put the
qubits in the right order.
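One can verify the product-state formula for F_N|k⟩ numerically, by comparing the k-th column of F_N against the tensor product of the n single-qubit states. A small sketch (the function names are our own):

```python
import cmath, math

def qft_column(n, k):
    """Column k of F_{2^n}: the amplitude of |j> is e^(2 pi i jk/2^n)/sqrt(2^n)."""
    N = 2**n
    return [cmath.exp(2j * cmath.pi * j * k / N) / math.sqrt(N) for j in range(N)]

def product_state(n, k):
    """Tensor product of the qubits (|0> + e^(2 pi i k/2^l)|1>)/sqrt(2), l = 1..n,
    with the l = 1 factor as the most significant qubit."""
    state = [1 + 0j]
    for l in range(1, n + 1):
        qubit = [1 / math.sqrt(2), cmath.exp(2j * cmath.pi * k / 2**l) / math.sqrt(2)]
        state = [s * q for s in state for q in qubit]  # append as least significant qubit
    return state
```

Comparing the two amplitude vectors entry by entry (e.g. for n = 3, k = 5) confirms that F_N|k⟩ is indeed a product state.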
Since the circuit involves n qubits, and at most n gates are applied to each qubit, the overall
circuit uses at most n2 gates. In fact, many of those gates are phase gates Rs with s ≫ log n,
which are very close to the identity and hence don’t do much anyway. We can actually omit those
from the circuit, keeping only O(log n) gates per qubit and O(n log n) gates overall. Intuitively,
the overall error caused by these omissions will be small (a homework exercise asks you to make
this precise). Finally, note that by inverting the circuit (i.e., reversing the order of the gates and
taking the adjoint U ∗ of each gate U ) we obtain an equally efficient circuit for the inverse Fourier
transform FN−1 = FN∗ .
Phase estimation works as follows: given a unitary U with eigenvector |ψ⟩ and eigenvalue e^{2πiφ} (where φ = 0.φ_1 . . . φ_n can be written exactly with n bits), start with the state |0^n⟩|ψ⟩ and apply the Fourier transform F_N (N = 2^n) to the first register, creating a uniform superposition over k. Then:
3. Apply U to the second register for a number of times given by the first register. In other words, apply the map |k⟩|ψ⟩ ↦ |k⟩U^k|ψ⟩ = e^{2πiφk}|k⟩|ψ⟩.
4. Apply the inverse Fourier transform F_N^{−1} to the first n qubits and measure the result.
Note that after step 3, the first n qubits are in state
$$\frac{1}{\sqrt{N}}\sum_{k=0}^{N-1} e^{2\pi i \varphi k}|k\rangle = F_N|2^n\varphi\rangle,$$
hence the inverse Fourier transform is going to give us |2^n φ⟩ = |φ_1 . . . φ_n⟩ with probability 1.
In case φ cannot be written exactly with n bits of precision, one can show that this procedure
still (with high probability) spits out a good n-bit approximation to φ. We’ll omit the calculation.
Exercises
1. For ω = e^{2πi/3} and
$$F_3 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{pmatrix}, \quad \text{calculate} \quad F_3\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad \text{and} \quad F_3\begin{pmatrix} 1 \\ \omega^2 \\ \omega \end{pmatrix}.$$
2. Prove that the Fourier coefficients of the convolution of vectors a and b are the product of the Fourier coefficients of a and b. In other words, prove that for every a, b ∈ R^N and every ℓ ∈ {0, . . . , N − 1} we have $\widehat{a * b}_\ell = \hat{a}_\ell \cdot \hat{b}_\ell$. Here the Fourier transform â is defined as the vector F_N a, and the ℓ-th entry of the convolution-vector a ∗ b is $(a * b)_\ell = \frac{1}{\sqrt{N}}\sum_{j=0}^{N-1} a_j b_{(\ell-j) \bmod N}$.
3. The Euclidean distance between two states |φ⟩ = Σ_i α_i|i⟩ and |ψ⟩ = Σ_i β_i|i⟩ is defined as $\| |\varphi\rangle - |\psi\rangle \| = \sqrt{\sum_i |\alpha_i - \beta_i|^2}$. Assume the states are unit vectors with (for simplicity) real amplitudes. Suppose the distance is small: ∥|φ⟩ − |ψ⟩∥ = ε. Show that then the probabilities resulting from a measurement on the two states are also close: $\sum_i |\alpha_i^2 - \beta_i^2| \leq 2\varepsilon$.
Hint: use |α_i² − β_i²| = |α_i − β_i| · |α_i + β_i| and the Cauchy-Schwarz inequality.
4. The distance between two matrices A and B is defined as $\|A - B\| = \max_{v : \|v\| = 1} \|(A - B)v\|$.
(a) What is the distance between the 2 × 2 identity matrix and the phase-gate $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\phi} \end{pmatrix}$?
(b) Suppose we have a product of n-qubit unitaries U = U_T U_{T−1} · · · U_1 (for instance, each U_i could be an elementary gate on a few qubits, tensored with identity on the other qubits). Suppose we drop the j-th gate from this sequence: U′ = U_T U_{T−1} · · · U_{j+1} U_{j−1} · · · U_1. Show that ∥U′ − U∥ = ∥I − U_j∥.
(c) Suppose we also drop the k-th unitary: U′′ = U_T U_{T−1} · · · U_{j+1} U_{j−1} · · · U_{k+1} U_{k−1} · · · U_1. Show that ∥U′′ − U∥ ≤ ∥I − U_j∥ + ∥I − U_k∥. Hint: use triangle inequality.
(d) Give a quantum circuit with O(n log n) elementary gates that has distance less than 1/n from the Fourier transform F_{2^n}. Hint: drop all phase-gates with small angles φ < 1/n³ from the O(n²)-gate circuit for F_{2^n} explained in the lecture. Calculate how many gates are left in the circuit, and analyze the distance between the unitaries corresponding to the new circuit and the original circuit.
5. Suppose a ∈ R^N is a vector which is r-periodic in the following sense: there exists an integer r such that a_ℓ = 1 whenever ℓ is an integer multiple of r, and a_ℓ = 0 otherwise. Compute the Fourier transform F_N a of this vector, i.e., write down a formula for the entries of the vector F_N a. Assuming r divides N and r ≪ N, what are the entries with largest magnitude in the vector F_N a?
Chapter 5
5.1 Factoring
Probably the most important quantum algorithm so far is Shor's factoring algorithm [64]. It can find a factor of a composite number N in roughly (log N)² steps, which is polynomial in the length log N of the input. On the other hand, there is no known classical (deterministic or randomized)
algorithm that can factor N in polynomial time. The best known classical randomized algorithms
run in time roughly
$$2^{(\log N)^{\alpha}},$$
where α = 1/3 for a heuristic upper bound [47] and α = 1/2 for a rigorous upper bound [48].
In fact, much of modern cryptography is based on the conjecture that no fast classical factoring
algorithm exists [61]. All this cryptography (for example RSA) would be broken if Shor’s algorithm
could be physically realized. In terms of complexity classes: factoring (rather, the decision problem
equivalent to it) is provably in BQP but is not known to be in BPP. If indeed factoring is not
in BPP, then the quantum computer would be the first counterexample to the “strong” Church-
Turing thesis, which states that all “reasonable” models of computation are polynomially equivalent
(see [30] and [57, p.31,36]).
Shor also gave a similar algorithm for solving the discrete logarithm problem. His algorithm
was subsequently generalized to solve the so-called Abelian hidden subgroup problem [43, 24, 52].
We will not go into those issues here, and restrict to an explanation of the quantum factoring
algorithm.
The reduction from factoring to period-finding goes as follows: choose a random integer x ∈ {2, . . . , N − 1} that is coprime to N (if x is not coprime to N, then gcd(x, N) is already a non-trivial factor of N), and let r be the period of the function f(a) = x^a mod N, i.e., the smallest positive integer with x^r ≡ 1 mod N. One can show that for odd N that is not a prime power, with probability ≥ 1/2, r is even and x^{r/2} + 1 and x^{r/2} − 1 are not multiples of N.¹ In that case we have:
$$x^r \equiv 1 \bmod N \iff \left(x^{r/2}\right)^2 \equiv 1 \bmod N \iff \left(x^{r/2} + 1\right)\left(x^{r/2} - 1\right) \equiv 0 \bmod N \iff \left(x^{r/2} + 1\right)\left(x^{r/2} - 1\right) = kN \text{ for some } k.$$
Note that k > 0 because both x^{r/2} + 1 > 0 and x^{r/2} − 1 > 0 (x > 1). Hence x^{r/2} + 1 or x^{r/2} − 1 will share a factor with N. Because x^{r/2} + 1 and x^{r/2} − 1 are not multiples of N this factor will be < N, and in fact both these numbers will share a non-trivial factor with N. Accordingly, if we have r then we can efficiently (in roughly log N steps) compute the greatest common divisors gcd(x^{r/2} + 1, N) and gcd(x^{r/2} − 1, N), and both of these two numbers will be non-trivial factors of N. If we are unlucky we might have chosen an x that does not give a factor (which we can detect efficiently), but trying a few different random x gives a high probability of finding a factor.
Thus the problem of factoring reduces to finding the period r of the function given by modular
exponentiation f(a) = x^a mod N. In general, the period-finding problem can be stated as follows: given the ability to compute a (black-box) function f that is periodic with unknown period r < N, in the sense that f(a) = f(b) iff a ≡ b mod r, find r.
We will show below how we can solve this problem efficiently, using O(log log N) evaluations of
f and O(log log N ) quantum Fourier transforms. An evaluation of f can be viewed as analogous
to the application of a query in the previous algorithms. Even a somewhat more general kind of period-finding can be solved by Shor's algorithm with very few f-evaluations, whereas any classical bounded-error algorithm would need to evaluate the function Ω(N^{1/3}/√(log N)) times in order to find the period [21].
How many steps (elementary gates) will the algorithm take? For a = N^{O(1)}, we can compute f(a) = x^a mod N in O((log N)² log log N log log log N) steps: compute x² mod N, x⁴ mod N, x⁸ mod N, . . . by repeated squaring (using the Schönhage-Strassen algorithm for fast multiplication [46]) and take an appropriate product of these to get x^a mod N. Moreover, as explained
in the previous chapter, the quantum Fourier transform can be implemented using O((log N)²) steps. Accordingly, Shor's algorithm finds a factor of N using an expected number of roughly (log N)²(log log N)² log log log N steps, which is only slightly worse than quadratic in the length of the input.
¹Here's a proof for those familiar with basic number theory. Let N = p_1^{e_1} · · · p_k^{e_k} be its factorization into odd primes (the exponents e_i are positive integers). By the Chinese remainder theorem, choosing a uniformly random x mod N is equivalent to choosing, independently for all i ∈ {1, . . . , k}, a uniformly random x_i mod p_i^{e_i}. Let r be the period of the sequence (x^a mod N)_a, and r_i the period of the sequence (x_i^a mod p_i^{e_i})_a. It is easy to see that r_i divides r, so if r is odd then all r_i must be odd. The probability that r_i is odd is at most 1/2, because the group of numbers mod p_i^{e_i} is cyclic, so half of its elements are squares. Hence the probability that r is odd, is at most 1/2^k (in particular, it's at most 1/4 if N is the product of k = 2 distinct primes). Now condition on the period r being even. Then x^{r/2} ≠ 1 mod N, for otherwise the period would be at most r/2. If x^{r/2} = −1 mod N, then x^{r/2} = −1 mod each p_i^{e_i}. The latter won't happen if r_i is divisible by 4, which (since we are already conditioning on r_i being even) happens with probability at least 1/2. Hence, conditioning on r being even, the probability that x^{r/2} = −1 mod N, is at most 1/2^k. Hence the probability that r is odd or x^{r/2} = ±1 mod N, is at most 2/2^k ≤ 1/2.
5.3 Shor’s period-finding algorithm
Now we will show how Shor's algorithm finds the period r of the function f, given a "black-box" that maps |a⟩|0⟩ ↦ |a⟩|f(a)⟩. We can always efficiently pick some q = 2^ℓ such that N² < q ≤ 2N². Then we can implement the Fourier transform F_q using O((log N)²) gates. Let O_f denote the unitary that maps |a, 0^n⟩ ↦ |a, f(a)⟩, where the first register consists of ℓ qubits, and the second of n = ⌈log N⌉ qubits.
Figure 5.1: Shor's period-finding circuit: start with |0^ℓ⟩|0^n⟩, apply F_q to the first register, then the black-box O_f, then F_q again, and measure both registers.
Shor’s period-finding algorithm is illustrated in Figure 5.1. Start with |0ℓ i|0n i. Apply the QFT
(or just ℓ Hadamard gates) to the first register to build the uniform superposition
q−1
1 X
√ |ai|0n i.
q
a=0
The second register still consists of zeroes. Now use the "black-box" to compute f(a) in quantum parallel:
$$\frac{1}{\sqrt{q}}\sum_{a=0}^{q-1}|a\rangle|f(a)\rangle.$$
Observing the second register gives some value f (s), with s < r. Let m be the number of elements
of {0, . . . , q − 1} that map to the observed value f (s). Because f (a) = f (s) iff a = s mod r, the a
of the form a = jr + s (0 ≤ j < m) are exactly the a for which f (a) = f (s). Thus the first register
collapses to a superposition of |si, |r + si, |2r + si, . . . , |q − r + si and the second register collapses
to the classical state |f (s)i. We can now ignore the second register, and have in the first:
$$\frac{1}{\sqrt{m}}\sum_{j=0}^{m-1}|jr + s\rangle.$$
Applying the Fourier transform F_q to this state gives a superposition Σ_b α_b|b⟩ over all b ∈ {0, . . . , q − 1}, where (up to a global phase of e^{2πisb/q}) the amplitude α_b equals $\frac{1}{\sqrt{mq}}\sum_{j=0}^{m-1} e^{2\pi i jrb/q}$. We want to see which |b⟩ have large squared amplitudes—those are the b we are likely to see if we now measure. Using that $\sum_{j=0}^{m-1} a^j = (1 - a^m)/(1 - a)$ for a ≠ 1 (see Appendix B), we compute:
$$\sum_{j=0}^{m-1}\left(e^{2\pi i \frac{rb}{q}}\right)^j = \begin{cases} m & \text{if } e^{2\pi i rb/q} = 1 \\[6pt] \dfrac{1 - e^{2\pi i \frac{mrb}{q}}}{1 - e^{2\pi i \frac{rb}{q}}} & \text{if } e^{2\pi i rb/q} \neq 1 \end{cases} \qquad (5.1)$$
Easy case: r divides q. Let us do an easy case first. Suppose r divides q, so the whole period "fits" an integer number of times in the domain {0, . . . , q − 1} of f, and m = q/r. For the first case of Eq. (5.1), note that e^{2πirb/q} = 1 iff rb/q is an integer iff b is a multiple of q/r. Such b will have squared amplitude equal to (m/√(mq))² = m/q = 1/r. Since there are exactly r such b, together
they have all the amplitude. Thus we are left with a superposition where only the b that are integer
multiples of q/r have non-zero amplitude. Observing this final superposition gives some random
multiple b = cq/r, with c a random number 0 ≤ c < r. Thus we get a b such that
$$\frac{b}{q} = \frac{c}{r},$$
where b and q are known and c and r are not. There are φ(r) ∈ Ω(r/ log log r) numbers smaller than r that are coprime to r [37, Theorem 328], so c will be coprime to r with probability Ω(1/ log log r). Accordingly, an expected number of O(log log N) repetitions of the procedure of this section suffices to obtain a b = cq/r with c coprime to r.² Once we have such a b, we can obtain r as the denominator by writing b/q in lowest terms.
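In code, the easy case is a one-liner with exact rational arithmetic, since Python's Fraction automatically reduces b/q to lowest terms. The example values below are made up purely for illustration:

```python
from fractions import Fraction

def recover_period(b, q):
    """Easy case: the measured b equals c*q/r with c coprime to r, so
    writing b/q in lowest terms reveals r as the denominator."""
    return Fraction(b, q).denominator

# hypothetical run: q = 256 and the measurement yielded b = 192;
# b/q = 192/256 = 3/4 in lowest terms, so the period would be r = 4
```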
Before continuing with the harder case, notice the resemblance of the basic subroutine of
Shor’s algorithm (Fourier, f -evaluation, Fourier) with the basic subroutine of Simon’s algorithm
(Hadamard, query, Hadamard).
Hard case: r does not divide q. Because our q is a power of 2, it is actually quite likely that r does not divide q. However, the same algorithm will still yield with high probability a b which is close to a multiple of q/r. Note that q/r is no longer an integer, and m = ⌊q/r⌋, possibly +1. All calculations up to and including Eq. (5.1) are still valid. Using |1 − e^{iθ}| = 2|sin(θ/2)|, we can rewrite the absolute value of the second case of Eq. (5.1) to
$$\frac{\left|1 - e^{2\pi i \frac{mrb}{q}}\right|}{\left|1 - e^{2\pi i \frac{rb}{q}}\right|} = \frac{|\sin(\pi mrb/q)|}{|\sin(\pi rb/q)|}.$$
The right-hand side is the ratio of two sine-functions of b, where the numerator oscillates much
faster than the denominator because of the additional factor of m. Note that the denominator is
close to 0 (making the ratio large) iff b is close to an integer multiple of q/r. For most of those
b, the numerator won’t be close to 0. Hence, roughly speaking, the ratio will be small if b is far
from an integer multiple of q/r, and large for most b that are close to a multiple of q/r. Doing the
calculation precisely, one can show that with high probability the measurement yields a b such that
$$\left|\frac{b}{q} - \frac{c}{r}\right| \leq \frac{1}{2q}.$$
²The number of required f-evaluations for period-finding can actually be reduced from O(log log N) to O(1).
As in the easy case, b and q are known to us while c and r are unknown.
Two distinct fractions, each with denominator ≤ N, must be at least 1/N² > 1/q apart.³ Therefore c/r is the only fraction with denominator ≤ N at distance ≤ 1/2q from b/q. Applying a
classical method called “continued-fraction expansion” to b/q efficiently gives us the fraction with
denominator ≤ N that is closest to b/q (see the next section). This fraction must be c/r. Again,
with good probability c and r will be coprime, in which case writing c/r in lowest terms gives us r.
Consider a real number written in the form
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}$$
This is called a continued fraction (CF). The a_i are the partial quotients. We assume these to be positive natural numbers ([37, p.131] calls such CF "simple"). [a_0, . . . , a_n] is the n-th convergent of the fraction. [37, Theorem 149 & 157] gives a simple way to compute numerator and denominator of the n-th convergent from the partial quotients:
If
$$p_0 = a_0, \quad p_1 = a_1a_0 + 1, \quad p_n = a_np_{n-1} + p_{n-2},$$
$$q_0 = 1, \quad q_1 = a_1, \quad q_n = a_nq_{n-1} + q_{n-2},$$
then [a_0, . . . , a_n] = p_n/q_n. Moreover, this fraction is in lowest terms.
Note that q_n increases at least exponentially with n (q_n ≥ 2q_{n−2}). Given a real number x, the following "algorithm" gives a continued fraction expansion of x [37, p.135]:
a_0 := ⌊x⌋, x_1 := 1/(x − a_0)
a_1 := ⌊x_1⌋, x_2 := 1/(x_1 − a_1)
a_2 := ⌊x_2⌋, x_3 := 1/(x_2 − a_2)
. . .
Informally, we just take the integer part of the number as the partial quotient and continue with the inverse of its fractional part. The convergents of the CF approximate x fast [37, Theorem 164 & 171] (recall that q_n increases exponentially with n):
$$\text{If } x = [a_0, a_1, \ldots] \text{ then } \left|x - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2}.$$
Moreover, p_n/q_n provides the best approximation of x among all fractions with denominator ≤ q_n [37, Theorem 181]:
$$\text{If } n > 1,\ q \leq q_n,\ p/q \neq p_n/q_n, \text{ then } \left|x - \frac{p_n}{q_n}\right| < \left|x - \frac{p}{q}\right|.$$
³Consider two fractions z = x/y and z′ = x′/y′ with y, y′ ≤ N. If z ≠ z′ then |xy′ − x′y| ≥ 1, and hence |z − z′| = |(xy′ − x′y)/yy′| ≥ 1/N².
Exercises
1. This exercise is about efficient classical implementation of modular exponentiation.
(a) Given n-bit numbers x and N (where n = ⌈log₂ N⌉), compute the whole sequence x⁰ mod N, x¹ mod N, x² mod N, x⁴ mod N, x⁸ mod N, x¹⁶ mod N, . . . , x^{2^{n−1}} mod N, using O(n² log(n) log log(n)) steps. Hint: The Schönhage-Strassen algorithm computes the product of two n-bit integers mod N in O(n log(n) log log(n)) steps.
(b) Suppose the n-bit number a can be written as a = a_{n−1} . . . a_1a_0 in binary. Express x^a mod N as a product of the numbers computed in part (a).
(c) Show that you can compute f(a) = x^a mod N in O(n² log(n) log log(n)) steps.
2. Use Shor’s algorithm to find the period of the function f (a) = 7a mod 10, using a Fourier
transform over q = 128 elements. Write down all intermediate superpositions of the algorithm.
You may assume you’re lucky, meaning the first run of the algorithm already gives a b = cq/r
where c is coprime with r.
3. Suppose we can apply a unitary U and we are given an eigenvector |ψ⟩ of U (U|ψ⟩ = λ|ψ⟩), and we would like to approximate the corresponding eigenvalue λ. Since U is unitary, λ must have magnitude 1, so we can write it as λ = e^{2πiφ} for some real number φ ∈ [0, 1); the only thing that matters is this phase φ. Suppose for simplicity that we know that φ = 0.φ_1 . . . φ_n can be written with n bits of precision.
(a) Let V be a two-register unitary that maps |k⟩|ψ⟩ ↦ |k⟩U^k|ψ⟩, for any n-bit integer k. Write down the state that results from applying V to $\frac{1}{\sqrt{2^n}}\sum_{k \in \{0,1\}^n}|k\rangle|\psi\rangle$ (to avoid trivialities, your answer shouldn't contain the letter 'U' anymore).
(b) Give a quantum algorithm that computes φ exactly, using one application of V and a
small (polynomial in n) number of other gates. Hint: use the inverse QFT.
4. This exercise explains RSA encryption. Suppose Alice wants to allow other people to send
encrypted messages to her, such that she is the only one who can decrypt them. She believes
that factoring an n-bit number can’t be done efficiently (efficient = in time polynomial in n).
So in particular, she doesn’t believe in quantum computing.
Alice chooses two large random prime numbers, p and q, and computes their product N = p·q.
She computes the so-called Euler φ-function: φ(N ) = (p − 1)(q − 1); she also chooses an
encryption exponent e, which doesn’t share any nontrivial factor with φ(N ) (i.e., e and φ(N )
are coprime). Group theory guarantees there is an efficiently computable decryption exponent
d such that de = 1 mod φ(N ). The public key consists of e and N (Alice puts this on her
homepage), while the secret key consists of d and N . Any number m ∈ {1, . . . , N − 1} that is
coprime to N , can be used as a message. There are φ(N ) such m, and these numbers form a
group under the operation of multiplication mod N . The number of bits n = ⌈log2 N ⌉ of N is
the maximal length (in bits) of a message m and also the length (in bits) of the encryption.4
The encryption function is defined as C(m) = m^e mod N, and the decryption function is D(c) = c^d mod N.
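A toy numerical instance of the scheme (with insecurely small primes, purely to see the arithmetic; the modular inverse via pow(e, -1, phi) requires Python ≥ 3.8):

```python
from math import gcd

# toy RSA instance with (insecurely) small primes
p, q = 61, 53
N = p * q                   # public modulus, here 3233
phi = (p - 1) * (q - 1)     # Euler phi-function: 3120
e = 17                      # encryption exponent
assert gcd(e, phi) == 1     # e must be coprime to phi(N)
d = pow(e, -1, phi)         # decryption exponent: d*e = 1 mod phi(N)

m = 65                      # a message coprime to N
c = pow(m, e, N)            # encryption: C(m) = m^e mod N
assert pow(c, d, N) == m    # decryption: D(C(m)) = m
```

Both encryption and decryption are single modular exponentiations, so by exercise 1 they are efficient.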
⁴A typical size is n = 1024 bits, which corresponds to both p and q being numbers of roughly 512 bits.
(a) Give a randomized algorithm by which Alice can efficiently generate the secret and public
key. Hint: the prime number theorem implies that roughly 1/ log N of the numbers between 1 and N are
prime; also there is an efficient algorithm to test if a given number is prime. Be explicit in how many
bits your primes p and q will have.
(b) Show that Bob can efficiently compute the encoding C(m) of the message m that he
wants to send to Alice, knowing the public key but not the private key. Hint: exercise 1
(c) Show that D(C(m)) = m for all possible messages m.
Hint: the set of all possible messages forms a group of size φ(N). Euler's Theorem says that in any group G, we have a^{|G|} = 1 for all a ∈ G (here '1' is the identity element in the group).
(d) Show that Alice can efficiently and correctly decrypt the encryption C(m) that she
receives from Bob.
(e) Show that if Charly could factor N , then he could efficiently decrypt Bob’s message.
Chapter 6
The second-most important quantum algorithm after Shor's is Grover's quantum search algorithm from 1996 [36]. While it doesn't provide an exponential speed-up, it is much more widely applicable than Shor's algorithm.
This problem may be viewed as a simplification of the problem of searching an N-slot unordered database. Classically, a randomized algorithm would need Θ(N) queries to solve the search problem. Grover's algorithm solves it in O(√N) queries, and O(√N log N) other gates.
[Figure: circuit for Grover's algorithm — each of the n qubits starts in |0⟩ and gets a Hadamard gate H, then the Grover iterate G is applied k times, and finally the first register is measured.]
Then the uniform state over all indices can be written as

|U⟩ = (1/√N) Σ_{i=0}^{N−1} |i⟩ = sin(θ)|G⟩ + cos(θ)|B⟩,  for θ = arcsin(√(t/N)).
The Grover iterate G is actually the product of two reflections: O_{x,±} is a reflection through |B⟩, and

−H^{⊗n} O_0 H^{⊗n} = H^{⊗n} (2|0^n⟩⟨0^n| − I) H^{⊗n} = 2|U⟩⟨U| − I

is a reflection through |U⟩ (here O_0 puts a −1 in front of the basis state |0^n⟩ and leaves the other basis states alone). Here is Grover's algorithm restated, assuming we know the fraction of solutions is ε = t/N:

1. Set up the starting state |U⟩ = H^{⊗n}|0^n⟩
2. Repeat the following k = O(1/√ε) times:
   (a) Reflect through |B⟩ (i.e., apply O_{x,±})
   (b) Reflect through |U⟩ (i.e., apply −H^{⊗n} O_0 H^{⊗n})
3. Measure the first register and check that the resulting i is a solution
Geometric argument: There is a fairly simple geometric argument why the algorithm works.
The analysis is in the 2-dimensional real plane spanned by |Bi and |Gi. We start with
|U i = sin(θ)|Gi + cos(θ)|Bi.
The two reflections (a) and (b) increase the angle from θ to 3θ, moving us towards the good state,
as illustrated in Figure 6.2.
The next two reflections (a) and (b) increase the angle by another 2θ, etc. More generally, after k applications of (a) and (b) our state has become

sin((2k + 1)θ)|G⟩ + cos((2k + 1)θ)|B⟩.
Figure 6.2: The first iteration of Grover: (1) start with |U⟩, (2) reflect through |B⟩ to get O_{x,±}|U⟩, (3) reflect through |U⟩ to get G|U⟩
Algebraic argument: For those who don't like geometry, here's an alternative (but equivalent) algebraic argument. Let a_k denote the amplitude of the indices of the t 1-bits after k Grover iterates, and b_k the amplitude of the indices of the 0-bits. Initially, for the uniform superposition |U⟩ we have a_0 = b_0 = 1/√N. Using that −H^{⊗n} O_0 H^{⊗n} = [2/N] − I, where [2/N] is the matrix in which all entries are 2/N, we find the following recursion:
a_{k+1} = ((N − 2t)/N) a_k + (2(N − t)/N) b_k
b_{k+1} = (−2t/N) a_k + ((N − 2t)/N) b_k
The following formulas, due to Boyer et al. [15], provide a closed form for a_k and b_k (which may be verified by substituting them into the recursion):

a_k = (1/√t) sin((2k + 1)θ)
b_k = (1/√(N − t)) cos((2k + 1)θ),

where θ = arcsin(√(t/N)).
Accordingly, after k iterations the success probability (the sum of squares of the amplitudes of the t 1-bits) is P_k = t · a_k² = sin((2k + 1)θ)², the same as in the geometric analysis.
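As a sanity check, the recursion and the closed form can be compared numerically. The Python sketch below (with illustrative values N = 1024, t = 3) iterates the recursion and checks it against the Boyer et al. formulas at every step:

```python
import math

# Check that the closed form solves the Grover amplitude recursion,
# for an illustrative instance with N slots and t solutions.
N, t = 1024, 3
theta = math.asin(math.sqrt(t / N))

a, b = 1 / math.sqrt(N), 1 / math.sqrt(N)   # k = 0: uniform superposition
for k in range(1, 51):
    # one step of the recursion for the two amplitudes
    a, b = ((N - 2 * t) / N) * a + (2 * (N - t) / N) * b, \
           (-2 * t / N) * a + ((N - 2 * t) / N) * b
    # closed form after k Grover iterates
    a_cf = math.sin((2 * k + 1) * theta) / math.sqrt(t)
    b_cf = math.cos((2 * k + 1) * theta) / math.sqrt(N - t)
    assert abs(a - a_cf) < 1e-9 and abs(b - b_cf) < 1e-9
# success probability after k iterates: t * a^2 = sin((2k+1) theta)^2
```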
Now we have a bounded-error quantum search algorithm with O(√(N/t)) queries, assuming we know t. In fact, if we know t then the algorithm can be tweaked to end up in exactly the good state. Roughly speaking, you can make the angle θ slightly smaller, such that k̃ = π/(4θ) − 1/2 becomes an integer. We will skip the details.
If we do not know t, then there is a problem: we do not know which k to use, so we do not
know when to stop doing the Grover iterates. Note that if k gets too big, the success probability
P_k = sin((2k + 1)θ)² goes down again! However, a slightly more complicated algorithm due to [15] (basically running the above algorithm with systematically different guesses for k) shows that an expected number of O(√(N/t)) queries still suffices to find a solution if there are t solutions. If there is no solution (t = 0) we can easily detect that by checking x_i for the i that the algorithm outputs.
3. Measure the state and check that the measurement outcome is a solution.

Defining θ = arcsin(√p) and good and bad states |G⟩ and |B⟩ in analogy with the earlier geometric
argument for Grover’s algorithm, the same reasoning shows that amplitude amplification indeed
finds a solution with high probability. This way, we can speed up a very large class of classical
heuristic algorithms: any algorithm that has some non-trivial probability of finding a solution
can be amplified to success probability nearly 1 (provided we can efficiently check solutions, i.e.,
implement Oχ ).
Grover’s algorithm itself is a special case of amplitude amplification, where Oχ = Ox and
A = H ⊗n is the Hadamard transform, which can be viewed as an algorithm with success probability
p = t/N .
We can also use it to speed up the solution of NP-complete problems, albeit only quadratically, not exponentially. As an example, consider the satisfiability problem: we are given a Boolean formula φ(i_1, . . . , i_n) and want to know if it has a satisfying assignment, i.e., a setting of the bits i_1, . . . , i_n that makes φ(i_1, . . . , i_n) = 1. A classical brute-force search over all 2^n possible assignments takes time roughly 2^n.
To find a satisfying assignment faster, define the N = 2^n-bit input to Grover's algorithm by x_i = φ(i), where i ∈ {0, 1}^n. For a given i = i_1 . . . i_n it is easy to compute φ(i) classically in polynomial time. We can write that computation as a reversible circuit (using only Toffoli gates), corresponding to a unitary U_φ that maps |i, 0, 0⟩ ↦ |i, φ(i), w_i⟩, where the third register holds some classical workspace the computation may have needed. To apply Grover we need an oracle that puts the answer in the phase and doesn't leave workspace around (as that would mess up the interference effects). Define O_x as the unitary that first applies U_φ, then applies a Z-gate to the second register, and then applies U_φ^{−1} to "clean up" the workspace again. This has the form we need for Grover: O_{x,±}|i⟩ = (−1)^{x_i}|i⟩. Now we can run Grover and find a satisfying assignment with high probability if there is one, using a number of elementary operations that is √(2^n) = 2^{n/2} times some polynomial factor.
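For small n the whole procedure can be simulated by brute force on all 2^n amplitudes. The Python sketch below uses a made-up 3-variable formula φ; the oracle is applied as a sign flip and the reflection through |U⟩ as an "inversion about the mean":

```python
import math

# Brute-force statevector simulation of Grover search for a satisfying
# assignment. The 3-variable formula phi below is an illustrative example.
n = 3
N = 2 ** n

def phi(i):
    i1, i2, i3 = (i >> 2) & 1, (i >> 1) & 1, i & 1
    return int((i1 or i2) and not i3)       # satisfied by 010, 100, 110

t = sum(phi(i) for i in range(N))           # number of solutions
theta = math.asin(math.sqrt(t / N))
k = max(1, round(math.pi / (4 * theta) - 1 / 2))

state = [1 / math.sqrt(N)] * N              # uniform superposition H^{(x)n}|0^n>
for _ in range(k):
    # (a) oracle O_{x,+-}: flip the sign of amplitudes of solutions
    state = [-amp if phi(i) else amp for i, amp in enumerate(state)]
    # (b) reflection 2|U><U| - I: "inversion about the mean"
    avg = sum(state) / N
    state = [2 * avg - amp for amp in state]

p_success = sum(amp ** 2 for i, amp in enumerate(state) if phi(i))
# p_success matches sin((2k+1) theta)^2 from the analysis
```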
Exercises
1. (a) Suppose n = 2, and x = x_{00} x_{01} x_{10} x_{11} = 0001. Give the initial, intermediate, and final superpositions in Grover's algorithm, for k = 1 queries. What is the success probability?
(b) Give the final superposition for the above x after k = 2 iterations. What is now the success probability?
2. Show that if the number of solutions is t = N/4, then Grover’s algorithm always finds a
solution with certainty after just one query. How many queries would a classical algorithm
need to find a solution with certainty if t = N/4? And if we allow the classical algorithm
error probability 1/10?
4. Let x = x_0 . . . x_{N−1}, where N = 2^n and x_i ∈ {0, 1}^n, be an input that we can query in the usual way. We are promised that this input is 2-to-1: for each i there is exactly one other j such that x_i = x_j.2 Such an (i, j)-pair is called a collision.
2 The 2-to-1 inputs for Simon's algorithm are a very special case of this, where x_i equals x_j if i = j ⊕ s for fixed but unknown s ∈ {0, 1}^n.
(a) Suppose S is a randomly chosen set of s elements of {0, . . . , N − 1}. What is the
probability that there exists a collision in S?
(b) Give a classical randomized algorithm that finds a collision (with probability ≥ 2/3) using O(√N) queries to x. Hint: What is the above probability if s = 2√N?
(c) Give a quantum algorithm that finds a collision (with probability ≥ 2/3) using O(N^{1/3}) queries. Hint: Choose a set S of size s = N^{1/3}, and classically query all its elements. First check if S contains a collision. If yes, you're done. If not, use Grover to find a j ∉ S that collides with an i ∈ S.
Chapter 7
This may seem like a stupid algorithm, but it has certain advantages. For instance, it only needs
space O(log N ), because you only need to keep track of the current vertex y, and maybe a counter
that keeps track of how many steps you’ve already taken.1 Such an algorithm can for example
decide whether there is a path from a specific vertex y to a specific vertex x using O(log N ) space.
We’d start the walk at y and only x would be marked; one can show that if there exists a path
from y to x in G, then we’ll reach x in poly(N ) steps.
Let us restrict attention to d-regular graphs without self-loops, so each vertex has exactly d
neighbors. Also assume the graph is connected. A random walk on such a graph corresponds to
an N × N symmetric matrix P , where Px,y = 1/d if (x, y) is an edge in G, and Px,y = 0 otherwise.
If v ∈ RN is a vector with a 1 at position y and 0s elsewhere, then P v is a vector whose xth entry
is (P v)x = 1/d if (x, y) is an edge, and (P v)x = 0 otherwise. In other words, P v is the uniform
probability distribution over the neighbors of y, which is what you get by taking one step of the
random walk starting at y. More generally, if v is a probability distribution on the vertices, then
P v is the new probability distribution on vertices after taking one step of the random walk, and
P k v is the probability distribution after taking k steps.
Suppose we start with some probability-distribution vector v (which may or may not be con-
centrated at one vertex y). Then P k v will converge to the uniform distribution over all vertices,
and the speed of convergence is determined by the difference between the first and second-largest
eigenvalues of P. This can be seen as follows. Let λ_1 ≥ λ_2 ≥ · · · ≥ λ_N be the eigenvalues of P, ordered by size, and v_1, . . . , v_N be corresponding orthogonal eigenvectors. First, the largest eigenvalue is λ_1 = 1, and corresponds to the eigenvector v_1 = u = (1/N, . . . , 1/N), which is the uniform distribution over all vertices. Second, all other eigenvalues λ_i will be in (−1, 1) (here we use that the graph is connected and not bipartite), and the corresponding normalized
1 Here we're assuming the neighbors of any one vertex are efficiently computable, so you don't actually need to keep the whole graph in memory. This will be true for all graphs we consider here.
eigenvector v_i will be orthogonal to u (hence the sum of the entries of v_i is 0). Let δ be the difference between λ_1 = 1 and max_{i≥2} |λ_i| (this δ is known as the "spectral gap"). Then |λ_i| ≤ 1 − δ for all i ≥ 2. Now decompose the starting distribution v as v = Σ_i α_i v_i. Since the sum of v's entries is 1, and the sum of v_1's entries is 1, while each other eigenvector v_i has entries summing to 0, it follows that α_1 = 1. Now let's see what happens if we apply the random walk for k steps, starting from v:
P^k v = P^k (Σ_i α_i v_i) = Σ_i α_i λ_i^k v_i = u + Σ_{i≥2} α_i λ_i^k v_i.
Since v is a probability distribution we have ||v||² = Σ_i v_i² ≤ Σ_i v_i = 1. Choosing k = ln(1/η)/δ we get ||P^k v − u|| ≤ η. In particular, if δ is not too small, then we get quick convergence of the random walk to the uniform distribution u, no matter which distribution v we started with.2 Once we are close to the uniform distribution, we have probability roughly ε of hitting a marked vertex. Of course, the same happens if we just pick a vertex at random, but that may not always be an option if the graph is given implicitly.
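This convergence bound is easy to verify numerically. The numpy sketch below runs the walk on the 9-cycle (an illustrative connected, non-bipartite regular graph), computes the spectral gap δ from the eigenvalues, and checks that k = ln(1/η)/δ steps bring P^k v within η of uniform:

```python
import numpy as np

# Random walk on the 9-cycle: P[x,y] = 1/2 if x and y are adjacent.
N = 9
P = np.zeros((N, N))
for x in range(N):
    P[x, (x + 1) % N] = P[x, (x - 1) % N] = 1 / 2

# spectral gap: 1 minus the second-largest eigenvalue in absolute value
eig = np.sort(np.abs(np.linalg.eigvalsh(P)))[::-1]
delta = 1 - eig[1]

v = np.zeros(N); v[0] = 1               # start concentrated on one vertex
u = np.full(N, 1 / N)                   # uniform distribution
eta = 1e-6
k = int(np.ceil(np.log(1 / eta) / delta))
dist = np.linalg.norm(np.linalg.matrix_power(P, k) @ v - u)
assert dist <= eta                      # k = ln(1/eta)/delta steps suffice
```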
Suppose it costs S to set up an initial state v, it costs U to perform one step of the random
walk, and it costs C to check whether a given vertex is marked. “Cost” is left undefined for now,
but typically it will count number of queries to some input, or number of elementary operations.
Consider a classical search algorithm that starts at v, and then repeats the following until it finds a
marked vertex: check if the current vertex is marked, and if not run a random walk for roughly 1/δ
steps to get close to the uniform distribution. Ignoring constant factors, the expected cost before
this procedure finds a marked item, is on the order of

S + (1/ε) (C + (1/δ) U).    (7.1)
otherwise. Define |p_x⟩ = Σ_y √(P_{xy}) |y⟩ to be the uniform superposition over the neighbors of x. As for Grover, define "good" and "bad" states as the superpositions over good and bad basis states:

|G⟩ = (1/√|M|) Σ_{x∈M} |x⟩|p_x⟩   and   |B⟩ = (1/√(N − |M|)) Σ_{x∉M} |x⟩|p_x⟩,
where M denotes the set of marked vertices. If ε = |M|/N and θ := arcsin(√ε), then the uniform state over all edges can be written as

|U⟩ = (1/√N) Σ_x |x⟩|p_x⟩ = sin(θ)|G⟩ + cos(θ)|B⟩.
Here is the algorithm for searching marked vertices, if an ε-fraction of the vertices is marked3 :
1. Set up the starting state |U⟩
2. Repeat the following O(1/√ε) times:
   (a) Reflect through |B⟩
   (b) Reflect through |U⟩
3. Measure the first register and check that the resulting vertex x is marked.
We’ll explain in a moment how to implement (a) and (b). Assuming we know how to do that,
the proof that this algorithm finds a marked vertex is the same as for Grover. We start with
|U⟩ = sin(θ)|G⟩ + cos(θ)|B⟩. The two reflections (a) and (b) increase the angle from θ to 3θ, moving us towards the good state (as for Grover, draw a 2-dimensional picture with axes |B⟩ and |G⟩ to see this). More generally, after k applications of (a) and (b) our state has become sin((2k + 1)θ)|G⟩ + cos((2k + 1)θ)|B⟩.
(a) Reflect through |Bi. Reflecting through |Bi is relatively straightforward: we just have to
“recognize” whether the first register contains a marked x, and put a −1 if so.
(b) Reflect through |U⟩. This is where the quantum random walk comes in. Let A be the subspace span{|x⟩|p_x⟩} and B be span{|p_y⟩|y⟩}. Let ref(A) denote the unitary which is a reflection
through A (i.e., ref(A)v = v for all vectors v ∈ A, and ref(A)w = −w for all vectors w orthogonal
to A) and similarly define ref(B). Define W (P ) = ref(B)ref(A) to be the product of these two
reflections. This is the unitary analogue of P , and may be called “one step of a quantum random
walk.” It’s somewhat hard to quickly get a good intuition for this, and we won’t try this here.
To do the reflection through |U⟩, we want to construct a unitary R(P) that maps |U⟩ ↦ |U⟩ and |ψ⟩ ↦ −|ψ⟩ for all |ψ⟩ that are orthogonal to |U⟩ (and that are in the span of the eigenvectors of
3 As in Grover, if we don't know ε then we just run the algorithm repeatedly with exponentially decreasing guesses for ε (1/2, 1/4, 1/8, . . . ). If at the end we still haven't found a marked item, we'll conclude that probably none exists.
W(P)). We will do that by means of phase estimation on W(P) (see Section 4.6). The eigenvalues of W(P) can be related to the eigenvalues of P as follows. Let λ_1 = cos(θ_1), λ_2 = cos(θ_2), . . . be the eigenvalues of P. It turns out that the eigenvalues of W(P) are of the form e^{±2iθ_j}. W(P) has one eigenvalue-1 eigenvector, which is |U⟩, corresponding to θ_1 = 0. The spectral gap of P is δ, so all other eigenvectors of W(P) correspond to an eigenvalue e^{±2iθ_j} where |θ_j| ≥ √(2δ) (because 1 − δ ≥ |λ_j| = |cos(θ_j)| ≥ 1 − θ_j²/2). The procedure R(P) will add an ancilla register and do phase estimation with precision √δ/2 to detect the unique eigenvalue-1 eigenvector |U⟩. This requires O(1/√δ) applications of W(P). Let's analyze this on some eigenvector |w⟩ of W(P), with corresponding eigenvalue e^{±2iθ_j}. Assume for simplicity that phase estimation gives the best estimate θ̃_j of θ_j within precision √δ/2 (phase estimation will actually have some error probability, but we'll skip the technical tricks needed to deal with this). Because the non-zero |θ_j| are at least √(2δ), approximating them within √δ/2 is good enough to determine whether θ_j itself is 0 or not.
If θ_j ≠ 0, then R(P) puts a minus in front of the state. Finally, it reverses the phase estimation. In formulas, R(P) maps |w⟩|0⟩ ↦ |w⟩|θ̃_j⟩ ↦ (−1)^{[θ̃_j ≠ 0]}|w⟩|θ̃_j⟩ ↦ (−1)^{[θ_j ≠ 0]}|w⟩|0⟩.
This has the desired effect: R(P) maps |U⟩ ↦ |U⟩, and |ψ⟩ ↦ −|ψ⟩ for all |ψ⟩ orthogonal to |U⟩.
Now that we know how to implement the algorithm, let’s look at its complexity. Consider the
following setup, update, and checking costs:
• Setup cost S: the cost of constructing the initial state |U⟩
• Update cost U: the cost of realizing the following two maps and their inverse:
  (1) |x⟩|0⟩ ↦ |x⟩|p_x⟩
  (2) |0⟩|y⟩ ↦ |p_y⟩|y⟩
• Checking cost C: the cost of the unitary map |x⟩|y⟩ ↦ m_x |x⟩|y⟩, where m_x = −1 if x is marked, and m_x = 1 otherwise
Note that ref(A) can be implemented by applying the inverse of (1), putting a minus if the second register is not |0⟩, and applying (1). We can similarly implement ref(B). Since R(P) takes O(1/√δ) applications of W(P) = ref(B)ref(A), the cost of part (b) of the algorithm is O(U/√δ). The cost of part (a) of the algorithm is C. Ignoring constant factors, the total cost of the algorithm is then

S + (1/√ε) (C + (1/√δ) U).    (7.2)
Compare this with the classical cost of Eq. (7.1): quantum search square-roots both ε and δ.
7.3 Applications
There are a number of interesting quantum random walk algorithms that beat the best classical
algorithms. We’ll give three examples here. More can be found in [62].
7.3.1 Grover search
Let’s first derive a quantum algorithm for search. Suppose we have an N -bit string x of weight t,
and we know t/N ≥ ε. Consider the complete graph G on N vertices. Then the matrix P for the random walk on G has 0s on its diagonal, and its off-diagonal entries are all equal to 1/(N − 1). This can be written as P = (1/(N − 1)) J − (1/(N − 1)) I, where J is the all-1 matrix and I is the identity. It is easy to see that λ_1 = N/(N − 1) − 1/(N − 1) = 1 (corresponding to the uniform vector) and all other eigenvalues are −1/(N − 1). Hence δ = 1 − 1/(N − 1). We'll mark a vertex i iff x_i = 1. Then, measuring cost by number of queries, a quantum random walk on G will have S = U = 0 and C = 1. Plugging this into Eq. (7.2), it will probably find a marked vertex in time O(1/√ε). The worst case is ε = 1/N, in which case we'll use O(√N) queries. Not surprisingly, we've essentially rederived Grover's algorithm.
The decision version of this problem (deciding if there exists at least one collision) is also known
as element distinctness.
Consider the graph whose vertices are the sets R ⊆ {0, . . . , n − 1} of r elements. The total number of vertices is N = (n choose r). We'll put an edge between two vertices R and R′ iff they differ in exactly two elements; in other words, you can get from R to R′ by removing one element i from R and replacing it by a new element j. The resulting graph J(n, r) is known as the Johnson graph. It is r(n − r)-regular, since every R has r(n − r) different neighbors R′. Its spectral gap is known to be δ = n/(r(n − r)); we won't prove that here.
We’ll mark a vertex in J(n, r) if it contains a collision. In the worst case, there is exactly one
colliding pair i, j (more collisions only make the problem easier). The probability that i and j are
r−1
both in a random r-set R, is ε = nr n−1 . Hence the fraction of marked vertices is at least ε.
For each set R we also keep track of {(i, x_i) | i ∈ R}. Hence the setup cost (measured in terms of queries) is S = r + 1: create a uniform superposition over all edges R, R′, and for each such basis state query all r + 1 elements of R ∪ R′. Mapping the second register |R⟩|0⟩ to a superposition of all neighbors R′ means removing the value x_i of the element i that is removed from R, and querying the value x_j of the element j that was added to get R′. Hence the update cost is U = 2. Checking
so C = 0. Plugging this into Eq. (7.2), the cost of a quantum random walk algorithm for collision-finding is

S + (1/√ε) (C + (1/√δ) U) = O(r + n/√r).
This is O(n^{2/3}) if we set r = n^{2/3}. This O(n^{2/3}) turns out to be the optimal query complexity for the collision problem [2]. By some more work involving efficient data structures, the time complexity (= total number of elementary quantum gates) can be brought down to n^{2/3} (log n)^{O(1)} [7].
4 Say, all x_i ≤ n² to avoid having to use too much space to store these numbers.
7.3.3 Finding a triangle in a graph
Consider the following triangle-finding problem: decide whether a given n-vertex graph H contains a triangle, i.e., three vertices that are pairwise connected.
We’ll assume we have query access to the entries of the adjacency matrix of H, which tells us
n
whether (u, v) is an edge or not. There are 2 bits in this oracle, one for each potential edge
of H. It is not hard to see that a classical algorithm needs Ω(n2 ) queries before it can decide
with good probability whether a graph contains a triangle or not. For example, take a bipartite
graph consisting of 2 sets of n/2 vertices each, such that any pair of vertices from different sets is
connected by an edge. Such a graph is triangle-free, but adding any one edge will create a triangle.
A classical algorithm would have to query all those edges separately.
Let’s try a quantum random walk approach. Again consider the Johnson graph J(n, r). Each
r
vertex R will be a set of r vertices, annotated with the result of querying all possible 2 edges
having both endpoints in R. The setup cost will be S = 2r . The update cost will be U = 2r − 2,
because if we remove one vertex i from R then we have to remove information about r − 1 edges
in H, and if we add a new j to R we have to query r − 1 new edges in H.
Mark a vertex R if it contains one edge of a triangle. If there is at least one triangle in the graph, the fraction of marked vertices is at least ε = (r/n) · ((r − 1)/(n − 1)) ≈ (r/n)². Getting a good upper bound
for the checking cost C requires some work—namely Grover search plus another quantum random
walk! Suppose we are given a set R of r vertices. How do we decide whether R contains an edge
of a triangle? If we can efficiently decide, for a given u and R, whether R contains vertices v, w
such that u, v, w form a triangle in H, then we could combine this with a Grover search over all
n possible vertices u of H. Given u and R, let’s design a subroutine based on another quantum
random walk, this time on the Johnson graph J(r, r 2/3 ). Each vertex of this Johnson graph is a
subset R′ ⊆ R of r ′ = r 2/3 vertices. Its spectral gap is δ′ = r/r ′ (r − r ′ ) ≈ 1/r 2/3 . We’ll mark
R′ if it contains vertices v, w such that u, v, w form a triangle. If there is at least one triangle
involving u and some v, w ∈ R, then the fraction of marked vertices R′ in J(r, r 2/3 ) is at least
′ ′ −1
ε′ = rr rr−1 ≈ 1/r 2/3 . For this subroutine, the setup cost is O(r 2/3 ), the update cost is O(1), and
the checking cost is 0. Plugging this into Eq. (7.2), we can decide whether a fixed u forms a triangle
with two vertices in R′ , using O(r 2/3 ) queries. Let’s ignore the small error probability of the latter
subroutine (it can be dealt with, but that’s technical). Then we can combine it with Grover search
√
over all n vertices u to get checking cost C = O( nr 2/3 ).
Plugging these S, U, and C into Eq. (7.2), the overall cost of a quantum random walk algorithm for triangle-finding is

S + (1/√ε) (C + (1/√δ) U) = O(r² + (n/r)(√n · r^{2/3} + r^{3/2})).
This is O(n^{13/10}) if we set r = n^{3/5} [50]. The exponent 13/10 can be slightly improved further, and it's an open question what the optimal quantum query complexity for triangle-finding is; the best lower bound is only Ω(n). Also, the optimal quantum time complexity of this problem is still wide open.
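The exponent 13/10 comes from balancing the terms r², n^{3/2}/r^{1/3}, and n√r. A quick numeric check (with an illustrative n) verifies that r = n^{3/5} indeed achieves Θ(n^{13/10}):

```python
# Overall triangle-finding cost, ignoring constants:
#   cost(r) = r^2 + (n/r) * (sqrt(n) * r^{2/3} + r^{3/2})
#           = r^2 + n^{3/2} / r^{1/3} + n * sqrt(r).
# n is an illustrative value.
n = 10 ** 5
cost = lambda r: r ** 2 + n ** 1.5 / r ** (1 / 3) + n * r ** 0.5

r = round(n ** (3 / 5))                  # r = n^{3/5}
# all three terms are at most n^{13/10} at this r, and any r gives
# cost at least n^{13/10}, so this choice is optimal up to constants
assert n ** 1.3 <= cost(r) <= 3 * n ** 1.3
```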
Exercises
1. Let P be the projector on a d-dimensional subspace V ⊆ R^n that is spanned by orthonormal vectors v_1, . . . , v_d. This means that P v = v for all v ∈ V, and P w = 0 for all w that are orthogonal to V.
(a) Show that P can be written in Dirac notation as P = Σ_{i=1}^{d} |v_i⟩⟨v_i|.
(b) Show that R = 2P − I is a reflection through the subspace corresponding to P , i.e.,
Rv = v for all v in the subspace and Rw = −w for all w that are orthogonal to the
subspace.
2. Let A, B, and C be n × n matrices with real entries. We’d like to decide whether or not
AB = C. Of course, you could multiply A and B and compare the result with C, but matrix
multiplication is expensive (the current best algorithm takes time roughly O(n^{2.38})).
(a) Give a classical randomized algorithm that verifies whether AB = C (with success probability at least 2/3) using O(n²) steps, using the fact that matrix-vector multiplication can be done in O(n²) steps. Hint: Choose a uniformly random vector v ∈ {0, 1}^n, calculate ABv and Cv, and check whether these two vectors are the same.
(b) Show that if we have query-access to the entries of the matrices (i.e., oracles that map i, j, 0 ↦ i, j, A_{i,j} and similarly for B and C), then any classical algorithm with small error probability needs at least n² queries to detect a difference between AB and C. Hint: Consider the case A = I.
(c) Give a quantum random walk algorithm that verifies whether AB = C (with success probability at least 2/3) using O(n^{5/3}) queries to matrix-entries. Hint: Modify the algorithm for collision-finding: use a random walk on the Johnson graph J(n, r), where each vertex corresponds to a set R ⊆ [n], and that vertex is marked if there are i, j ∈ R such that (AB)_{i,j} ≠ C_{i,j}.
One can show that this algorithm has probability at least (3/4)^n of finding a satisfying assignment (if φ is satisfiable). You may assume this without proof.
(a) Use the above to give a classical algorithm that finds a satisfying assignment with high probability in time (4/3)^n · p(n), where p(n) is some polynomial factor (no need to use the C, U, S-framework of the lecture notes here; the answer is much simpler).
(b) Give a quantum algorithm that finds one (with high probability) in time √((4/3)^n) · p(n). Hint: view the 3n-step random walk algorithm as a deterministic algorithm with an additional input r ∈ {0, 1}^n × {1, 2, 3}^{3n}, where the first n bits determine x, and the last 3n entries determine which variable of the leftmost false clause will be flipped in the 3n steps of the random walk. Use Grover search on the space of all such r (no need to write out complete circuits here).
Chapter 8
8.1 Introduction
Almost all the algorithms we have seen so far in this course worked in the query model. Here the goal is to compute some function f : {0, 1}^N → {0, 1} on a given input x ∈ {0, 1}^N. The distinguishing feature of the query model is the way x is accessed: x is not given explicitly, but is stored in a random access memory, and we're being charged unit cost for each query that we make to this memory. Informally, a query asks for and receives the i-th element x_i of the input.
Formally, we model a query unitarily as the following 2-register quantum operation O_x, where the first register is N-dimensional and the second is 2-dimensional:1

O_x : |i, b⟩ ↦ |i, b ⊕ x_i⟩.

In particular, |i, 0⟩ ↦ |i, x_i⟩. This only states what O_x does on basis states, but by linearity determines the full unitary. Note that a quantum algorithm can apply O_x to a superposition of basis states, gaining some sort of access to several input bits x_i at the same time.
A T-query quantum algorithm starts in a fixed state, say the all-0 state |0 . . . 0⟩, and then interleaves fixed unitary transformations U_0, U_1, . . . , U_T with queries. The algorithm's fixed unitaries may act on a workspace-register, in addition to the two registers on which O_x acts. In this case we implicitly extend O_x by tensoring it with the identity operation on this extra register, so it maps |i, b, w⟩ ↦ |i, b ⊕ x_i, w⟩. Hence the final state of the algorithm can be written as the following matrix-vector product:

U_T O_x U_{T−1} O_x · · · O_x U_1 O_x U_0 |0 . . . 0⟩.
This state depends on the input x only via the T queries. The output of the algorithm is obtained
by a measurement of the final state. For instance, if the output is Boolean, the algorithm could
just measure the final state in the computational basis and output the first bit of the result.
The query complexity of some function f is now the minimal number of queries needed for an
algorithm that outputs the correct value f (x) for every x in the domain of f (with error probability
1 If the input x consists of non-binary items x_i (as is the case for instance with the input for Simon's algorithm) then those can be simulated by querying individual bits of each x_i.
at most 1/3, say). Note that we just count queries to measure the complexity of the algorithm, while
the intermediate fixed unitaries are treated as costless. In many cases, the overall computation time
of quantum query algorithms (as measured by the total number of elementary gates, say) is not
much bigger than the query complexity. This justifies analyzing the latter as a proxy for the former.
This is the model in which essentially all the quantum algorithms we've seen work: Deutsch-Jozsa, Simon, Grover, the various random walk algorithms. Even the period-finding algorithm that is the quantum core of Shor's algorithm works because it needs only a few queries to the periodic function.
for some complex numbers a_S (here S ranges over subsets of the N variables, and the monomial corresponding to S is the product Π_{i∈S} x_i). The degree of p is deg(p) = max{|S| : a_S ≠ 0}. It is easy to show that every function f : {0, 1}^N → C has a unique representation as such a polynomial; deg(f) is defined as the degree of that polynomial. For example, the 2-bit AND function is p(x_1, x_2) = x_1 x_2, and the 2-bit Parity function is p(x_1, x_2) = x_1 + x_2 − 2x_1 x_2. Both polynomials have degree 2.
Sometimes a lower degree suffices for a polynomial to approximate the function. For example, p(x_1, x_2) = (1/3)(x_1 + x_2) approximates the 2-bit AND function up to error 1/3 for all inputs, using degree 1.
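These small representations can be verified exhaustively. The Python sketch below checks the AND and Parity polynomials, and the degree-1 approximation of AND, on all four inputs:

```python
from itertools import product

# Check the representations quoted above on all four Boolean inputs:
# AND(x1,x2) = x1*x2 and Parity(x1,x2) = x1 + x2 - 2*x1*x2 (both degree 2),
# and the degree-1 polynomial (x1+x2)/3 approximating AND within 1/3.
max_err = 0.0
for x1, x2 in product([0, 1], repeat=2):
    assert x1 * x2 == min(x1, x2)                   # exact AND
    assert x1 + x2 - 2 * x1 * x2 == (x1 + x2) % 2   # exact Parity
    max_err = max(max_err, abs((x1 + x2) / 3 - min(x1, x2)))
assert max_err <= 1 / 3 + 1e-12                     # approximation error 1/3
```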
A very useful property of T -query algorithms is that the amplitudes of their final state are
degree-T N -variate polynomials of x [33, 9]. More precisely: consider a T -query algorithm with
input x ∈ {0, 1}^N acting on an m-qubit space. Then its final state can be written as

Σ_{z∈{0,1}^m} α_z(x) |z⟩,
where each α_z is a multilinear polynomial in x of degree at most T. Each basis state |z⟩ = |i, b, w⟩ consists of 3 registers: the two registers |i, b⟩ of the query, and a workspace register containing basis state |w⟩. The algorithm now makes another query O_x followed by a unitary U_{T+1}. The query swaps basis states |i, 0, w⟩ and |i, 1, w⟩ if x_i = 1, and doesn't do anything to these basis states if x_i = 0. This changes amplitudes as follows:
Now the new amplitudes are of the form (1 − x_i)α_{i,0,w}(x) + x_i α_{i,1,w}(x) or x_i α_{i,0,w}(x) + (1 − x_i)α_{i,1,w}(x). The new amplitudes are still polynomials in x_1, . . . , x_N. Their degree is at most 1 more than the degree of the old amplitudes, so at most T + 1. Finally, since U_{T+1} is a linear map that is independent of x, it does not increase the degree of the amplitudes further (the amplitudes after U_{T+1} are linear combinations of the amplitudes before U_{T+1}). This concludes the induction step.
Note that this construction could introduce terms of degree higher than 1 in a single variable, e.g., terms of the form x_i². However, our inputs x_i are 0/1-valued, so we have x_i^k = x_i for all integers k ≥ 1. Accordingly, we can reduce such higher powers to 1, making the polynomials multilinear without increasing the degree. ✷
Suppose our algorithm acts on an m-qubit state. If we measure the first qubit of the final state and output the resulting bit, then the probability of output 1 is given by

p(x) = Σ_{z∈1{0,1}^{m−1}} |α_z(x)|²,
which is a real-valued polynomial of x of degree at most 2T . Note that if the algorithm computes
f with error ≤ 1/3, then p is an approximating polynomial for f : if f (x) = 0 then p(x) ∈ [0, 1/3]
and if f (x) = 1 then p(x) ∈ [2/3, 1]. This gives a method to lower bound the minimal number
of queries needed to compute f : if one can show that every polynomial that approximates f has
degree at least d, then every quantum algorithm computing f with error ≤ 1/3 must use at least
d/2 queries.
Applications of the polynomial method. For our examples we will restrict attention to symmetric functions.2 Those are the ones where the function value f(x) only depends on the Hamming weight (number of 1s) in the input x. Examples are N-bit OR, AND, Parity, Majority, etc.
Suppose we have a polynomial p(x_1, . . . , x_N) that approximates f with error ≤ 1/3. Then it is easy to see that the polynomial that averages over all permutations of the input bits x_1, . . . , x_N,

q(x) = (1/N!) Σ_{π∈S_N} p(π(x)),
still approximates f . As it turns out, we can define a single-variate polynomial r(z) of the same
degree as q, such that q(x) = r(|x|).3 This r is defined on all real numbers, and we know something
about its behavior on integer points {0, . . . , N }. Thus it suffices to lower bound the degree of
single-variate polynomials with the appropriate behavior.
For an important example, consider the N-bit OR function. Grover's algorithm can find an i such that x_i = 1 (if such an i exists) and hence can compute the OR function with error ≤ 1/3 using O(√N) queries. By the above reasoning, any T-query quantum algorithm that computes the OR with error ≤ 1/3 induces a single-variate polynomial r satisfying
2 One can also use the polynomial method for non-symmetric functions, for instance to prove a tight lower bound of Ω(N^{2/3}) queries for the general problem of collision-finding [2]; this matches the quantum random walk algorithm of the previous lecture. However, that proof is substantially more complicated and we won't give it here.
3 To see why this is the case, note that for every degree i, all degree-i monomials in the symmetrized polynomial q have the same coefficient a_i. Moreover, on input x ∈ {0,1}^N of Hamming weight z, exactly (z choose i) of the degree-i monomials are 1, while the others are 0. Hence q(x) = Σ_{i=0}^{d} a_i (|x| choose i). Since (z choose d) = z(z − 1) · · · (z − d + 1)/d! is a single-variate polynomial in z of degree d, we can define r(z) = Σ_{i=0}^{d} a_i (z choose i).
r(0) ∈ [0, 1/3], and r(t) ∈ [2/3, 1] for all integers t ∈ {1, . . . , N}.
This polynomial r(x) "jumps" between x = 0 and x = 1 (i.e., it has a derivative r′(x) ≥ 1/3 for some x ∈ [0, 1]), while it remains fairly constant on the domain {1, . . . , N}. By a classical theorem from approximation theory (proved independently around the same time by Ehlich and Zeller, and by Rivlin and Cheney), such polynomials must have degree d ≥ Ω(√N). Hence T ≥ Ω(√N) as well. Accordingly, Grover's algorithm is optimal (up to a constant factor) in terms of number of queries.
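For intuition on why Θ(√N) is the right order, here is a numerical sketch of the matching upper bound: the classical Chebyshev-polynomial construction of a degree-O(√N) approximation to OR. The parameters N = 100 and d = 12 are illustrative choices, not from the notes:

```python
import numpy as np
from numpy.polynomial import chebyshev as Cheb

# Degree ~ sqrt(N) approximation of N-bit OR via Chebyshev polynomials,
# the classical construction matching the Omega(sqrt(N)) lower bound.
N, d = 100, 12                     # d around 1.2*sqrt(N) suffices here

def T_d(u):                        # Chebyshev polynomial T_d, evaluated at u
    return Cheb.chebval(u, [0] * d + [1])

def a(z):                          # affine map sending [1, N] onto [-1, 1]
    return (2 * z - N - 1) / (N - 1)

eta = 1 / abs(T_d(a(0)))           # |T_d| <= 1 on [-1, 1] but huge at a(0)

def r(z):                          # r(0) = 0, and r(t) in [2/3, 1] on {1..N}
    return (1 - T_d(a(z)) / T_d(a(0))) / (1 + eta)

assert abs(r(0)) < 1e-9
assert all(2 / 3 <= r(t) <= 1 + 1e-9 for t in range(1, N + 1))
```

The point is that T_d grows exponentially in d outside [−1, 1], so a degree of order √N already makes eta small enough for a 1/3-approximation; by the Ehlich-Zeller/Rivlin-Cheney theorem, no asymptotically smaller degree can work.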
What about exact algorithms for OR? Could we tweak Grover's algorithm so that it always finds a solution with probability 1 (if one exists), using O(√N) queries? This turns out not to be the case: a T-query exact algorithm for OR induces a polynomial r of degree ≤ 2T that satisfies

r(0) = 0, and r(t) = 1 for all integers t ∈ {1, . . . , N}.

It is not hard to see that such a polynomial needs degree at least N: observe that r(x) − 1 is a non-constant polynomial with at least N roots.4 Hence T ≥ N/2. Accordingly, Grover cannot be made exact without losing the square-root speed-up!
Using the polynomial method, one can in fact show for every symmetric function f that is defined on all 2^N inputs, that quantum algorithms cannot provide a more-than-quadratic speed-up over classical algorithms. More generally, for every function f (symmetric or non-symmetric) that is defined on all inputs,5 quantum algorithms cannot provide a more-than-6th-root speed-up over classical algorithms.
The quantum adversary method. A second lower-bound method, due to Ambainis, views a T-query quantum algorithm as a sequence of operations

U_T O_x U_{T−1} O_x · · · O_x U_1 O_x U_0,

applied to the fixed starting state |0 . . . 0⟩, where the basic "query transformation" O_x depends on the input x, and U_0, U_1, . . . , U_T are arbitrary unitaries that don't depend on x. Consider the evolution of our quantum state under all possible choices of x; formally, we let |ψ_x^t⟩ denote the state at time t (i.e., after applying O_x for the t-th time) under input x. In particular, |ψ_x^0⟩ = |0 . . . 0⟩ for all x (and ⟨ψ_x^0|ψ_y^0⟩ = 1 for each x, y). Now if the algorithm computes the Boolean function f with success probability 2/3 on every input, then the final measurement must accept every x ∈ f^{−1}(0)
4 A "root" is an x such that r(x) = 0. It is a well-known fact from algebra that every non-constant polynomial of degree d has at most d roots (over any field).
5 Note that this doesn't include functions where the input has to satisfy a certain promise, such as Deutsch-Jozsa and Simon's problem.
with probability ≤ 1/3, and accept every y ∈ f^{−1}(1) with success probability ≥ 2/3. It is not hard to verify that this implies |⟨ψ_x^T|ψ_y^T⟩| ≤ 17/18.6 This suggests that we find a set R ⊆ f^{−1}(0) × f^{−1}(1) of hard-to-distinguish (x, y)-pairs, and consider the progress measure
S_t = Σ_{(x,y)∈R} |⟨ψ_x^t|ψ_y^t⟩|
as a function of t. By our observations, initially we have S_0 = |R|, and in the end we must have S_T ≤ (17/18)·|R|. Also, crucially, the progress measure is unaffected by each application of a unitary U_t, since each U_t is independent of the input and unitary transformations preserve inner products.
If we can determine an upper bound ∆ on the change |S_{t+1} − S_t| in the progress measure at each step, we can conclude that the number T of queries is at least |R|/(18∆). Ambainis proved the following: suppose that
(i) for each x ∈ f^{−1}(0) occurring in a pair of R, there are at least m_0 inputs y ∈ f^{−1}(1) such that (x, y) ∈ R;

(ii) for each y ∈ f^{−1}(1) occurring in a pair of R, there are at least m_1 inputs x ∈ f^{−1}(0) such that (x, y) ∈ R;

(iii) for each x ∈ f^{−1}(0) and i ∈ [N], there are at most ℓ_0 inputs y ∈ f^{−1}(1) such that (x, y) ∈ R and x_i ≠ y_i;

(iv) for each y ∈ f^{−1}(1) and i ∈ [N], there are at most ℓ_1 inputs x ∈ f^{−1}(0) such that (x, y) ∈ R and x_i ≠ y_i.
Then for all t ≥ 0, |S_{t+1} − S_t| = O(√(ℓ_0 ℓ_1/(m_0 m_1)) · |R|), and therefore

T = Ω(√(m_0 m_1/(ℓ_0 ℓ_1))).    (8.1)
Intuitively, conditions (i)-(iv) imply that |S_{t+1} − S_t| is small relative to |R| by bounding the "distinguishing ability" of any query. The art in applying this technique lies in choosing the relation R carefully to maximize this quantity, i.e., make m_0 and/or m_1 large, while keeping ℓ_0 and ℓ_1 small. Note that for the N-bit OR function this method easily gives the optimal Ω(√N) lower bound, as follows. Choose R = {(x, y) : x = 0^N, y has Hamming weight 1}. Then m_0 = N while m_1 = ℓ_0 = ℓ_1 = 1. Plugging this into Eq. (8.1) gives the right bound.
Let us give another application, a lower bound that we don't know how to get from the polynomial method. Suppose f : {0,1}^N → {0,1} is a 2-level AND-OR tree, with N = k² input bits: f is the AND of k ORs, each of which has its own set of k input bits. By carefully doing 2 levels of Grover search (search for a subtree whose input is 0^k), one can construct a quantum algorithm that computes f with small error probability and O(√k · √k) = O(√N) queries. It is still an open problem to give a matching lower bound on the approximate degree. The adversary method gives the optimal lower bound on the quantum query complexity: choose the relation R as follows:
6 Remember an earlier homework exercise for states |φ⟩ and |ψ⟩: if ‖φ − ψ‖ = ε then the (total variation) distance between the probability distributions you get from measuring |φ⟩ and |ψ⟩, respectively, is at most 2ε. Hence if we know there is a two-outcome measurement that accepts |φ⟩ with probability ≤ 1/3 and accepts |ψ⟩ with probability ≥ 2/3, then we must have total variation distance at least 2/3 and hence ε ≥ 1/3. Via the equation ε² = ‖φ − ψ‖² = 2 − 2Re(⟨φ|ψ⟩), this translates into an upper bound |⟨φ|ψ⟩| ≤ 1 − ε²/2 ≤ 17/18.
R consists of those pairs (x, y) where:

x has one subtree with input 0^k and the other k − 1 subtrees have an arbitrary k-bit input of Hamming weight 1 (note f(x) = 0);

y is obtained from x by changing one of the bits of the 0^k-subtree to 1 (note f(y) = 1).

Then m_0 = m_1 = k and ℓ_0 = ℓ_1 = 1, and we get a lower bound of Ω(√(m_0 m_1/(ℓ_0 ℓ_1))) = Ω(k) = Ω(√N).
Exercises
1. Consider a 2-bit input x = x_0 x_1 with an oracle O_x : |i⟩ ↦ (−1)^{x_i} |i⟩. Write out the final state of the following 1-query quantum algorithm: H O_x H |0⟩. Give a degree-2 polynomial p(x_0, x_1) that equals the probability that this algorithm outputs 1 on input x. What function does this algorithm compute?
2. Consider the polynomial p(x_1, x_2) = 0.3 + 0.4x_1 + 0.5x_2, which approximates the 2-bit OR function. Write down the symmetrized polynomial q(x_1, x_2) = (1/2)(p(x_1, x_2) + p(x_2, x_1)). Give a single-variate polynomial r such that q(x) = r(|x|) for all x ∈ {0,1}².
3. Let f be the N -bit Parity function, which is 1 if its input x ∈ {0, 1}N has odd Hamming
weight, and 0 if the input has even Hamming weight (assume N is an even number).
(a) Give a quantum algorithm that computes Parity with success probability 1 on every
input x, using N/2 queries. Hint: think of Exercise 1.
(b) Show that this is optimal, even for quantum algorithms that have error probability ≤ 1/3 on every input. Hint: show that the symmetrized approximate polynomial r induced by the algorithm has degree at least N.
4. Suppose we have a T -query quantum algorithm that computes the N -bit AND function with
success probability 1 on all inputs x ∈ {0, 1}N . In the lecture we showed that such an
algorithm has T ≥ N/2 (we showed it for OR, but the same argument works for AND).
Improve this lower bound to T ≥ N .
5. Let f be the N-bit Majority function, which is 1 if its input x ∈ {0,1}^N has Hamming weight > N/2, and 0 if the input has Hamming weight ≤ N/2 (assume N is even). Use the adversary method to show that every bounded-error quantum algorithm for computing Majority needs Ω(N) queries. Hint: when defining the relation R, consider that the hardest task for this algorithm is to distinguish inputs of weight N/2 from inputs of weight N/2 + 1.
Chapter 9
Below we will try to classify those using the tools of complexity theory.
• P. The class of problems that can be solved by classical deterministic computers using
polynomial time.
• BPP. The problems that can be solved by classical randomized computers using polynomial
time (and with error probability ≤ 1/3 on every input).
• NP. The problems where the ‘yes’-instances can be verified in polynomial time if some
prover gives us a polynomial-length “witness.” Some problems in this class are NP-complete,
meaning that any other problem in NP can be reduced to it in polynomial time. Hence
the NP-complete problems are the hardest problems in NP. An example is the problem of
satisfiability: we can verify that a given n-variable Boolean formula is satisfiable if a prover
gives us a satisfying assignment, so it is in NP, but one can even show that it is NP-complete.
Other examples are integer linear programming, travelling salesman, graph-colorability, etc.
• PSPACE. The problems that can be solved by classical deterministic computers using
polynomial space.
We can consider quantum analogues of all such classes, an enterprise that was started by Bernstein
and Vazirani [14]:
• EQP. The class of problems that can be solved exactly by quantum computers using poly-
nomial time. This class depends on the set of elementary gates one allows, and is not so
interesting.
• BQP. The problems that can be solved by quantum computers using polynomial time (and
with error probability ≤ 1/3 on every input). This class is the accepted formalization of
“efficiently solvable by quantum computers.”
• QNP. The problems where the ‘yes’-instances can be verified efficiently if some prover gives
us a polynomial-length “quantum witness.” This is again dependent on the elementary gates
one allows, and not so interesting. Allowing error probability ≤ 1/3 on every input, we get to
a class called QMA (“quantum Merlin-Arthur”). This is a more robust and more interesting
quantum version of NP; unfortunately we don’t have time to study it in this course.
• QPSPACE. The problems that can be solved by quantum computers using polynomial
space. This turns out to be the same as classical PSPACE.
In all the above cases, the error probability 1/3 can be reduced efficiently to much smaller constants
ε: just run the computation k = O(log(1/ε)) times and take the majority of the answers given by
the k different runs.
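The error of this majority vote can be computed exactly from the binomial distribution. The sketch below (illustrative, with single-run error probability 1/3 and odd k to avoid ties) shows the decay:

```python
from math import comb

# Exact error of the k-run majority vote when each run independently errs
# with probability 1/3 (k odd, so there are no ties).
def majority_error(k, p=1/3):
    return sum(comb(k, j) * p**j * (1 - p)**(k - j)
               for j in range(k // 2 + 1, k + 1))

for k in [1, 5, 15, 25]:
    print(k, majority_error(k))
```

The error decreases exponentially in k, which is why k = O(log(1/ε)) runs suffice for error ε.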
We should be a bit careful about what we mean by a “polynomial time [or space] quantum
algorithm." Our model for computation has been quantum circuits, and we need a separate quantum circuit for each new input length. So a quantum algorithm of time p(n) would correspond to a family of quantum circuits {C_n}, where C_n is the circuit that is used for inputs of length n; it should have at most p(n) elementary gates.1
In the next section we will prove that BQP ⊆ PSPACE. We have BPP ⊆ BQP, because
a BPP-machine on a fixed input length n can be written as a polynomial-size reversible circuit
(i.e., consisting of Toffoli gates) that starts from a state that involves some coin flips. Quantum
computers can generate those coin flips using Hadamard transforms, then run the reversible circuit,
and measure the final answer bit. It is believed that BQP contains problems that aren’t in BPP,
for example factoring large integers: this problem (or rather decision-version thereof) is in BQP
because of Shor’s algorithm, and is generally believed not to be in BPP. Thus we have the following
sequence of inclusions:
P ⊆ BPP ⊆ BQP ⊆ PSPACE.
It is generally believed that P = BPP, while the other inclusions are believed to be strict. Note
that a proof that BQP is strictly greater than BPP (for instance, a proof that factoring cannot
be solved efficiently by classical computers) would imply that P ≠ PSPACE, solving what has been one of the main open problems in computer science since the 1960s. Hence such a proof, if it exists at all, will probably be very hard.
What about the relation between BQP and NP? It is generally believed that NP-complete problems are not in BQP. The main evidence for this is the lower bound for Grover search: a quantum brute-force search on all 2^n possible assignments to an n-variable formula gives
a square-root speed-up, but not more. This is of course not a proof, since there might be clever,
non-brute-force methods to solve satisfiability. There could also be problems in BQP that are not
in NP, so it may well be that BQP and NP are incomparable. Much more can be said about
quantum complexity classes; see for instance Watrous’s survey [70].
The suggestion to devise a quantum computer to simulate quantum physics is of course a brilliant
one, but the main motivation is not quite accurate. As it turns out, it is not necessary to keep
track of all (exponentially many) amplitudes in the state to classically simulate a quantum system.
Here we will show that it can actually be simulated efficiently in terms of space [14], though not
necessarily in terms of time.
Consider a circuit with T = poly(n) gates that acts on S qubits. Assume for simplicity that
all gates are either the 1-qubit Hadamard or the 3-qubit Toffoli gate (as mentioned before, these
two gates suffice for universal quantum computation), and that the classical output (0 or 1) of the
algorithm is determined by a measurement of the first qubit of the final state. Without loss of
generality S ≤ 3T, because T Toffoli gates won't affect more than 3T qubits. Let U_j be the unitary that applies the jth gate to its (1 or 3) qubits, and applies identity to all other qubits. The entries of this matrix are of a simple form (0, 1/√2, or −1/√2 for Hadamard; 0 or 1 for Toffoli) and easy to compute. Let |i_0⟩ = |x⟩|0^{S−n}⟩ be the starting state, where x ∈ {0,1}^n is the classical input, and the second register contains the workspace qubits the algorithm uses. The final state will be |ψ_x⟩ = U_T U_{T−1} · · · U_1 |i_0⟩, and the amplitude of a basis state |i_T⟩ in it can be written as a sum over all sequences of intermediate basis states:

⟨i_T|ψ_x⟩ = Σ_{i_{T−1},...,i_1} Π_{j=1}^{T} ⟨i_j|U_j|i_{j−1}⟩.
The number ⟨i_j|U_j|i_{j−1}⟩ is just one entry of the matrix U_j and hence easy to calculate. Then Π_{j=1}^{T} ⟨i_j|U_j|i_{j−1}⟩ is also easy to compute, in polynomial space (and polynomial time). If ℓ of the T gates are Hadamards, then each such number is either 0 or ±1/√(2^ℓ). Adding up Π_{j=1}^{T} ⟨i_j|U_j|i_{j−1}⟩ for all i_{T−1},...,i_1 is also easy to do in polynomial space if we reuse space for each new i_{T−1},...,i_1. Hence the amplitude ⟨i_T|ψ_x⟩ can be computed exactly using polynomial space.2 We assume that the BQP machine's answer is obtained by measuring the first qubit of the final state. Then its acceptance probability is the sum of squares of all amplitudes of basis states starting with a 1: Σ_{i_T : (i_T)_1 = 1} |⟨i_T|ψ_x⟩|². Since we can compute each ⟨i_T|ψ_x⟩ in polynomial space, the acceptance probability of a BQP-circuit on classical input x can be computed in polynomial space.
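The path summation at the heart of this argument is easy to demonstrate on a toy circuit. The sketch below uses H and CNOT gates on 2 qubits for brevity (the text's argument uses H and Toffoli, but only easy-to-compute matrix entries matter); it evaluates each amplitude ⟨i_T|ψ_x⟩ as a sum over intermediate basis states and compares against direct state-vector simulation:

```python
import itertools
import numpy as np

# Path-sum evaluation of a single amplitude <i_T|psi_x>, as in the proof:
# sum over all sequences of intermediate basis states of a product of
# single-gate matrix entries.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])

gates = [np.kron(H, I2), CNOT, np.kron(I2, H)]   # U_1, U_2, U_3
D = 4                                            # dimension 2^S for S = 2

def amplitude(i_T, i_0=0):
    total = 0.0
    T = len(gates)
    for path in itertools.product(range(D), repeat=T - 1):
        states = (i_0,) + path + (i_T,)
        prod = 1.0
        for j in range(T):                       # <i_j | U_j | i_{j-1}>
            prod *= gates[j][states[j + 1], states[j]]
        total += prod
    return total

psi = np.zeros(D)                                # direct simulation
psi[0] = 1.0
for U in gates:
    psi = U @ psi
assert all(abs(amplitude(i) - psi[i]) < 1e-12 for i in range(D))
```

Each call to `amplitude` needs only one path's worth of memory at a time (polynomial space), while the number of paths, and hence the time, grows exponentially with the number of gates.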
Exercises
1. The following problem is a decision version of the factoring problem:
2 Of course, the calculation will take exponential time, because there are 2^{S(T−1)} different sequences i_{T−1},...,i_1 that we need to go over sequentially.
Given positive integers N and k, decide if N has a prime factor p ∈ {k, . . . , N − 1}.
Show that if you can solve this decision problem efficiently (i.e., in time polynomial in the
input length n = ⌈log N ⌉), then you can also find the prime factors of N efficiently. Hint: use
binary search, running the algorithm with different choices of k to “zoom in” on the largest prime factor.
2. (a) Let U be an S-qubit unitary which applies a Hadamard gate to the kth qubit, and identity gates to the other S − 1 qubits. Let i, j ∈ {0,1}^S. Show an efficient way to calculate the matrix-entry U_{i,j} = ⟨i|U|j⟩ (note: even though U is a tensor product of 2 × 2 matrices, it's still a 2^S × 2^S matrix, so calculating U completely isn't efficient).

(b) Let U be an S-qubit unitary which applies a CNOT gate to the kth and ℓth qubits, and identity gates to the other S − 2 qubits. Let i, j ∈ {0,1}^S. Show an efficient way to calculate the matrix-entry U_{i,j} = ⟨i|U|j⟩.
3. Consider a circuit C with T = poly(n) elementary gates (only Hadamards and Toffolis) acting on S = poly(n) qubits. Suppose this circuit computes f : {0,1}^n → {0,1} with bounded error probability: for every x ∈ {0,1}^n, when we start with basis state |x, 0^{S−n}⟩, run the circuit and measure the first qubit, then the result equals f(x) with probability at least 99/100.

(a) Consider the following quantum algorithm: start with basis state |x, 0^{S−n}⟩, run the above circuit C without the final measurement, apply a Z gate to the first qubit, and reverse the circuit C. Denote the resulting final state by |ψ_x⟩. Show that if f(x) = 0 then the amplitude of basis state |x, 0^{S−n}⟩ in |ψ_x⟩ is in the interval [1/2, 1], while if f(x) = 1 then the amplitude of |x, 0^{S−n}⟩ in |ψ_x⟩ is in [−1, −1/2].
(b) PP is the class of computational decision problems that can be solved by classical
randomized polynomial-time computers with success probability > 1/2 (however, the
success probability could be exponentially close to 1/2, i.e., PP is BPP without the ‘B’
for bounded-error). Show that BQP ⊆ PP.
Hint: use part (a). Analyze the amplitude of |x, 0^{S−n}⟩ in the final state |ψ_x⟩, using ideas from the proof of BQP ⊆ PSPACE that we saw in the lecture. You may assume BQP-algorithms have error at most 1/100 instead of the usual 1/3. Note that you cannot use more than polynomial time.
Chapter 10
|φ⟩⟨φ| ↦ U|φ⟩⟨φ|U*.

By linearity, this actually tells us how a unitary acts on an arbitrary mixed state:

ρ ↦ UρU*.
to get outcome i is given by p_i = Tr(P_i|φ⟩⟨φ|) = |⟨i|φ⟩|² = |α_i|². If we get outcome i then the state collapses to P_i|φ⟩⟨φ|P_i/p_i = α_i|i⟩⟨i|α_i*/p_i = |i⟩⟨i|. This is exactly the measurement in the computational basis as we have used it in the course until now. Similarly, a measurement of the first register of a two-register state corresponds to projectors P_i = |i⟩⟨i| ⊗ I, where i goes over all basis states of the first register.
If we only care about the final probability distribution on the k outcomes, not about the resulting state, then the most general thing we can do is a so-called positive-operator-valued measure (POVM). This is specified by k positive semidefinite matrices E_1, . . . , E_k satisfying Σ_{i=1}^{k} E_i = I. When measuring a state ρ, the probability of outcome i is given by Tr(E_i ρ). A projective measurement is the special case where the measurement elements E_i are projectors.
Holevo’s Theorem: The mother of all such results is Holevo’s theorem from 1973 [38], which
predates the area of quantum computing by several decades. Its proper technical statement is
in terms of a quantum generalization of mutual information, but the following consequence of it
(derived by Cleve et al. [23]) about two communicating parties, suffices for our purposes.
Theorem 1 (Holevo, CDNT) If Alice wants to send n bits of information to Bob via a quantum
channel (i.e., by exchanging quantum systems), and they do not share an entangled state, then they
have to exchange at least n qubits. If they share unlimited prior entanglement, then Alice has to
send at least n/2 qubits to Bob, no matter how many qubits Bob sends to Alice.
This theorem is slightly imprecisely stated here, but the intuition is very clear: the first part of
the theorem says that if we encode some classical random variable X in an m-qubit state2 , then
no measurement on the quantum state can give more than m bits of information about X. More
precisely: the classical mutual information between X and the classical measurement outcome M
on the m-qubit system is at most m. If we encoded the classical information in an m-bit system
instead of an m-qubit system this would be a trivial statement, but the proof of Holevo’s theorem
is quite non-trivial. Thus we see that an m-qubit state, despite somehow "containing" 2^m complex amplitudes, is no better than m classical bits for the purpose of storing information.3
Low-dimensional encodings: Here we provide a “poor man’s version” of Holevo’s theorem due
to Nayak [53, Theorem 2.4.2], which has a simple proof and often suffices for applications. Suppose
we have a classical random variable X, uniformly distributed over [N ] = {1, . . . , N }. Let x 7→ ρx
2 Via an encoding map x ↦ ρ_x; we generally use capital letters like X to denote random variables, lower case like x to denote specific values.

3 This is in the absence of prior entanglement; if Alice and Bob do share entanglement, then m qubits are no better than 2m classical bits. The factor of 2 is necessary: if Alice and Bob share m EPR-pairs, then Alice can transmit 2m classical bits to Bob by sending only m qubits.
be some encoding of [N ], where ρx is a mixed state in a d-dimensional space. Let E1 , . . . , EN be
the POVM operators applied for decoding; these sum to the d-dimensional identity operator. Then
the probability of correct decoding in case X = x is

p_x = Tr(E_x ρ_x) ≤ Tr(E_x).

Since the E_x sum to the d-dimensional identity, averaging over x gives (1/N) Σ_x p_x ≤ (1/N) Σ_x Tr(E_x) = d/N. In other words, if we are encoding one of N classical values in a d-dimensional quantum state, then any measurement to decode the encoded classical value has average success probability at most d/N (uniformly averaged over all N values that we can encode). For example, if we encode n bits into m qubits, we will have N = 2^n, d = 2^m, and the average success probability of decoding is at most 2^m/2^n.
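This d/N bound can be sanity-checked numerically: whatever POVM we pick, the average success probability cannot exceed d/N. The sketch below (with randomly generated pure-state encodings and a randomly generated POVM; all construction choices are illustrative) checks this for N = 8, d = 2:

```python
import numpy as np

# Numerical sanity check of the d/N bound: for a random encoding of N
# values into d-dimensional pure states and a random POVM {E_x}, the
# average decoding success probability never exceeds d/N.
rng = np.random.default_rng(0)
N, d = 8, 2

def rand_state(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

rhos = [np.outer(v, v.conj()) for v in (rand_state(d) for _ in range(N))]

def rand_psd(d):
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return M @ M.conj().T

# Build a POVM by conjugating random PSD matrices A_x with S^{-1/2},
# where S = sum_x A_x; the resulting E_x are PSD and sum to I.
A = [rand_psd(d) for _ in range(N)]
S = sum(A)
w, V = np.linalg.eigh(S)
S_inv_half = V @ np.diag(w ** -0.5) @ V.conj().T
E = [S_inv_half @ Ax @ S_inv_half for Ax in A]
assert np.allclose(sum(E), np.eye(d))

avg_success = np.mean([np.trace(Ex @ rx).real for Ex, rx in zip(E, rhos)])
assert avg_success <= d / N + 1e-9          # the bound: d/N = 0.25
```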
Random access codes: The previous two results dealt with the situation where we encoded a
classical random variable X in some quantum system, and would like to recover the original value
X by an appropriate measurement on that quantum system. However, suppose X = X1 . . . Xn is
a string of n bits, uniformly distributed and encoded by a map x 7→ ρx , and it suffices for us if
we are able to decode individual bits Xi from this with some probability p > 1/2. More precisely,
for each i ∈ [n] there should exist a measurement {Mi , I − Mi } allowing us to recover xi : for each
x ∈ {0, 1}n we should have Tr(Mi ρx ) ≥ p if xi = 1 and Tr(Mi ρx ) ≤ 1 − p if xi = 0. An encoding
satisfying this is called a quantum random access code, since it allows us to choose which bit of
X we would like to access. Note that the measurement to recover xi can change the state ρx , so
generally we may not be able to decode more than one bit of x (also, we cannot copy ρx because
of the no-cloning theorem – see the homework).
An encoding that allows us to recover (with high success probability) an n-bit string requires
about n qubits by Holevo. Random access codes only allow us to recover each of the n bits. Can
they be much shorter? In small cases they can be: for instance, one can encode two classical bits
into one qubit, in such a way that each of the two bits can be recovered with success probability 85%
from that qubit (see the homework). However, Nayak [53] proved that asymptotically quantum
random access codes cannot be much shorter than classical.
Theorem 2 (Nayak) Let x ↦ ρ_x be a quantum random access encoding of n-bit strings into m-qubit states such that, for each i ∈ [n], we can decode X_i from ρ_X with success probability p (averaged over a uniform choice of x and the measurement randomness). Then m ≥ (1 − H(p))n, where H(p) = −p log p − (1 − p) log(1 − p) is the binary entropy function.
The intuition of the proof is quite simple: since the quantum state allows us to predict the bit X_i with probability p_i, it reduces the "uncertainty" about X_i from 1 bit to H(p_i) bits. Hence it contains at least 1 − H(p_i) bits of information about X_i. Since all n X_i's are independent, the state has to contain at least Σ_{i=1}^{n} (1 − H(p_i)) bits about X in total.
10.3 Lower bounds on locally decodable codes
Here we will give an application of quantum information theory to a classical problem.4
The development of error-correcting codes is one of the success stories of science in the second
half of the 20th century. Such codes are eminently practical, and are widely used to protect
information stored on discs, communication over channels, etc. From a theoretical perspective,
there exist codes that are nearly optimal in a number of different respects simultaneously: they
have constant rate, can protect against a constant noise-rate, and have linear-time encoding and
decoding procedures. We refer to Trevisan’s survey [66] for a complexity-oriented discussion of
codes and their applications.
One drawback of ordinary error-correcting codes is that we cannot efficiently decode small
parts of the encoded information. If we want to learn, say, the first bit of the encoded message
then we usually still need to decode the whole encoded string. This is relevant in situations where
we have encoded a very large string (say, a library of books, or a large database), but are only
interested in recovering small pieces of it at any given time. Dividing the data into small blocks
and encoding each block separately will not work: small chunks will be efficiently decodable but
not error-correcting, since a tiny fraction of well-placed noise could wipe out the encoding of one
chunk completely. There exist, however, error-correcting codes that are locally decodable, in the
sense that we can efficiently recover individual bits of the encoded string.
Definition 1 C : {0,1}^n → {0,1}^N is a (q, δ, ε)-locally decodable code (LDC) if there is a classical randomized decoding algorithm A such that

1. A makes at most q queries to the string y.

2. For all x and i, and all y ∈ {0,1}^N with Hamming distance d(C(x), y) ≤ δN we have Pr[A^y(i) = x_i] ≥ 1/2 + ε.
The notation Ay (i) reflects that the decoder A has two different types of input. On the one
hand there is the (possibly corrupted) codeword y, to which the decoder has oracle access and from
which it can read at most q bits of its choice. On the other hand there is the index i of the bit that
needs to be recovered, which is known fully to the decoder.
The main question about LDCs is the tradeoff between the codelength N and the number of
queries q (which is a proxy for the decoding-time). This tradeoff is still not very well understood.
The only case where we know the answer is the case of q = 2 queries (1-query LDCs don't exist once n is sufficiently large [41]). For q = 2 there is the Hadamard code: given x ∈ {0,1}^n, define a codeword of length N = 2^n by writing down the bits x · z mod 2, for all z ∈ {0,1}^n. One can decode x_i with 2 queries as follows: choose z ∈ {0,1}^n uniformly at random and query the (possibly corrupted) codeword at indices z and z ⊕ e_i, where the latter denotes the string obtained from z by flipping its i-th bit. Individually, each of these two indices is uniformly distributed. Hence for each of them, the probability that the returned bit is corrupted is at most δ. By the union bound, with probability at least 1 − 2δ both queries return the uncorrupted values. Adding these two bits mod 2 gives the correct answer:

C(x)_z ⊕ C(x)_{z⊕e_i} = (x · z) ⊕ (x · (z ⊕ e_i)) = x · e_i = x_i.
4 There is a growing number of such applications of quantum tools to non-quantum problems. See [28] for a survey.
Thus the Hadamard code is a (2, δ, 1/2 − 2δ)-LDC of exponential length.
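The Hadamard code and its 2-query decoder fit in a few lines. The sketch below (illustrative parameters: n = 8, a δ = 5% corruption rate, 2000 decoding trials) checks empirically that the success probability stays near the 1 − 2δ guarantee:

```python
import itertools
import random

# Hadamard code as a 2-query LDC: encode x in the bits x.z mod 2 for all z,
# then decode x_i from a corrupted codeword by querying positions z and
# z + e_i for a uniformly random z.
n = 8
N = 2 ** n
Z = list(itertools.product([0, 1], repeat=n))

def encode(x):
    return {z: sum(xi * zi for xi, zi in zip(x, z)) % 2 for z in Z}

def decode_bit(y, i, rng):
    z = rng.choice(Z)
    z_flip = tuple(b ^ (j == i) for j, b in enumerate(z))
    return (y[z] + y[z_flip]) % 2        # (x.z) + (x.(z+e_i)) = x_i mod 2

rng = random.Random(42)
x = tuple(rng.randint(0, 1) for _ in range(n))
y = encode(x)
for z in rng.sample(Z, int(0.05 * N)):   # corrupt a delta = 5% fraction
    y[z] ^= 1

trials = 2000
ok = 0
for _ in range(trials):
    i = rng.randrange(n)
    ok += decode_bit(y, i, rng) == x[i]
assert ok / trials >= 1 - 2 * 0.05 - 0.03   # theory: success >= 1 - 2*delta
```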
The only superpolynomial lower bound known on the length of LDCs is for the case of 2 queries:
there one needs an exponential codelength and hence the Hadamard code is essentially optimal.
This is shown via a quantum argument [42]—despite the fact that the result is a purely classical
result, about classical codes and classical decoders. The easiest way to present this argument is to
assume the following fact, which states a kind of “normal form” for the decoder.
Fact 1 (Katz & Trevisan [41] + folklore) For every (q, δ, ε)-LDC C : {0,1}^n → {0,1}^N, and for each i ∈ [n], there exists a set M_i of Ω(δεN/q²) disjoint tuples, each of at most q indices from [N], and a bit a_{i,t} for each tuple t ∈ M_i, such that the following holds:

Pr_{x∈{0,1}^n}[ x_i = a_{i,t} ⊕ (⊕_{j∈t} C(x)_j) ] ≥ 1/2 + Ω(ε/2^q),    (10.2)

where the probability is taken uniformly over x. Hence to decode x_i from C(x), the decoder can just query the indices in a randomly chosen tuple t from M_i, outputting the sum of those q bits and a_{i,t}.
Note that the above decoder for the Hadamard code is already of this form, with M_i = {(z, z ⊕ e_i)}. We omit the proof of Fact 1. It uses purely classical ideas and is not hard.
Now suppose C : {0,1}^n → {0,1}^N is a (2, δ, ε)-LDC. We want to show that the codelength N must be exponentially large in n. Our strategy is to show that the following N-dimensional quantum encoding is in fact a quantum random access code for x (with some success probability p > 1/2):

x ↦ |φ_x⟩ = (1/√N) Σ_{j=1}^{N} (−1)^{C(x)_j} |j⟩.

Theorem 2 then implies that the number of qubits of this state (which is ⌈log N⌉) is at least (1 − H(p))n = Ω(n), and we are done.
Suppose we want to recover x_i from |φ_x⟩. We'll do this by a sequence of two measurements, as follows. We turn each M_i from Fact 1 into a measurement: for each pair (j, k) ∈ M_i form the projector P_{jk} = |j⟩⟨j| + |k⟩⟨k|, and let P_rest = Σ_{j ∉ ∪_{t∈M_i} t} |j⟩⟨j| be the projector on the remaining indices. These |M_i| + 1 projectors sum to the N-dimensional identity matrix, so they form a valid projective measurement. Applying this to |φ_x⟩ gives outcome (j, k) with probability ‖P_{jk}|φ_x⟩‖² = 2/N for each (j, k) ∈ M_i. There are |M_i| = Ω(δεN) different (j, k)-pairs in M_i, so the probability to see one of those as outcome of the measurement is |M_i| · 2/N = Ω(δε). With the remaining probability r = 1 − Ω(δε), we'll get "rest" as outcome of the measurement. In the latter case we didn't get anything useful from the measurement, so we'll just output a fair coin flip as our guess for x_i (then the output will equal x_i with probability exactly 1/2). In case we got one of the (j, k) as measurement outcome, the state has collapsed to the following useful superposition:
(1/√2)((−1)^{C(x)_j}|j⟩ + (−1)^{C(x)_k}|k⟩) = ((−1)^{C(x)_j}/√2)(|j⟩ + (−1)^{C(x)_j ⊕ C(x)_k}|k⟩).

We know what j and k are, because it is the outcome of the measurement on |φ_x⟩. Doing a 2-outcome measurement in the basis (1/√2)(|j⟩ ± |k⟩) now gives us the value C(x)_j ⊕ C(x)_k with
probability 1. By Eq. (10.2), if we add the bit a_{i,(j,k)} to this, we get x_i with probability at least 1/2 + Ω(ε). The success probability of recovering x_i, averaged over all x, is

p ≥ (1/2)·r + (1/2 + Ω(ε))·(1 − r) = 1/2 + Ω(δε²).
Thus we have constructed a random access code that encodes n bits into log N qubits, and has success probability at least p. Applying Theorem 2 and using that 1 − H(1/2 + η) = Θ(η²) for η ∈ [0, 1/2], we obtain the following:
Theorem 3 If C : {0,1}^n → {0,1}^N is a (2, δ, ε)-locally decodable code, then N ≥ 2^{Ω(δ²ε⁴n)}.
Exercises
1. Prove the quantum no-cloning theorem: there does not exist a 2-qubit unitary U that maps |φ⟩|0⟩ ↦ |φ⟩|φ⟩ for every qubit |φ⟩. Hint: consider what U has to do when |φ⟩ = |0⟩, when |φ⟩ = |1⟩, and when |φ⟩ is a superposition of these two.
2. (a) Give the density matrix that corresponds to a 50-50 mixture of |0⟩ and |1⟩.

(b) Give the density matrix that corresponds to a 50-50 mixture of |+⟩ = (1/√2)(|0⟩ + |1⟩) and |−⟩ = (1/√2)(|0⟩ − |1⟩).
3. (a) Give a quantum random access code that encodes 2 classical bits into 1 qubit, such that each of the two classical bits can be recovered from the quantum encoding with success probability p ≥ 0.85. Hint: It suffices to use pure states with real amplitudes as encoding. Try to "spread out" the 4 encodings |φ_00⟩, |φ_01⟩, |φ_10⟩, |φ_11⟩ in the 2-dimensional real plane as best as possible.

(b) Give a good upper bound (as a function of n) on the maximal success probability p for a random access code that encodes n classical bits into 1 qubit.
4. Consider the Hadamard code C that encodes n = 2 bits x1 x2 into a codeword of N = 4 bits.
Chapter 11

Quantum Communication Complexity
Communication complexity has been studied extensively in the area of theoretical computer science
and has deep connections with seemingly unrelated areas, such as VLSI design, circuit lower bounds,
lower bounds on branching programs, size of data structures, and bounds on the length of logical
proof systems, to name just a few.
[Figure: Alice (input x) and Bob (input y) exchange messages over a communication channel; at the end one party outputs f (x, y).]
A communication protocol is a distributed algorithm where first Alice does some individual
computation, and then sends a message (of one or more bits) to Bob, then Bob does some compu-
tation and sends a message to Alice, etc. Each message is called a round. After one or more rounds
the protocol terminates and one of the parties (let’s say Bob) outputs some value that should be
1 If the domain D equals X × Y then f is called a total function, otherwise it is called a partial or promise function.
f (x, y). The cost of a protocol is the total number of bits communicated on the worst-case input.
A deterministic protocol for f always has to output the right value f (x, y) for all (x, y) ∈ D. In
a bounded-error protocol, Alice and Bob may flip coins and the protocol has to output the right
value f (x, y) with probability ≥ 2/3 for all (x, y) ∈ D. We could either allow Alice and Bob to toss
coins individually (local randomness, or “private coin”) or jointly (shared randomness, or “public
coin”). A public coin can simulate a private coin and is potentially more powerful. However, Newman’s theorem [54] says that having a public coin can save at most O(log n) bits of communication, compared to a protocol with a private coin.
To illustrate the power of randomness, let us give a simple yet efficient bounded-error protocol
for the equality problem, where the goal for Alice is to determine whether her n-bit input is the
same as Bob’s or not: f (x, y) = 1 if x = y, and f (x, y) = 0 otherwise. Alice and Bob jointly toss a
random string r ∈ {0, 1}n . Alice sends the bit a = x · r to Bob (where ‘·’ is inner product mod 2).
Bob computes b = y · r and compares this with a. If x = y then a = b, but if x ≠ y then a ≠ b with probability 1/2. Repeating this a few times, Alice and Bob can decide equality with small error
using O(n) public coin flips and a constant amount of communication. This protocol uses public
coins, but note that Newman’s theorem implies that there exists an O(log n)-bit protocol that uses
a private coin.
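The public-coin protocol above is easy to simulate classically. A minimal sketch (function names are ours, not from the notes):

```python
import random

def round_agrees(x, y, r):
    """One round: Alice sends a = x.r (mod 2), Bob compares with b = y.r."""
    a = sum(xi & ri for xi, ri in zip(x, r)) % 2
    b = sum(yi & ri for yi, ri in zip(y, r)) % 2
    return a == b

def equality(x, y, reps=20):
    """Output True iff every round agrees.

    If x = y the rounds always agree; if x != y each round disagrees
    with probability 1/2, so the error probability is 2^(-reps)."""
    n = len(x)
    for _ in range(reps):
        r = [random.randint(0, 1) for _ in range(n)]  # shared random string
        if not round_agrees(x, y, r):
            return False   # a != b certifies x != y
    return True
```

Note that only the single bit a travels in each round; the O(n) random bits of r are the shared public coins.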
gain. Suppose that in the classical world k bits have to be communicated in order to compute f .
Since Holevo’s theorem says that k qubits cannot contain more information than k classical bits, it
seems that the quantum communication complexity should be roughly k qubits as well (maybe k/2
to account for superdense coding, but not less). Surprisingly (and fortunately for us), this argument
is false, and quantum communication can sometimes be much less than classical communication
complexity. The information-theoretic argument via Holevo’s theorem fails, because Alice and
Bob do not need to communicate the information in the k bits of the classical protocol; they are
only interested in the value f (x, y), which is just 1 bit. Below we will go over four of the main
examples that have so far been found of differences between quantum and classical communication
complexity.
Note that this promise only makes sense if n is an even number, otherwise n/2 would not be an integer.
In fact it will be convenient to assume n a power of 2. Here is a simple quantum protocol to solve
this promise version of equality using only log n qubits:
1. Alice sends Bob the log n-qubit state (1/√n) Σ_{i=1}^{n} (−1)^{x_i} |ii, which she can prepare unitarily from x and log n |0i-qubits.
2. Bob applies the unitary map |ii 7→ (−1)yi |ii to the state, applies a Hadamard transform to
each qubit (for this it is convenient to view i as a log n-bit string), and measures the resulting
log n-qubit state.
It is clear that this protocol only communicates log n qubits, but why does it work? Note that the
state that Bob measures is
H^{⊗ log n} ( (1/√n) Σ_{i=1}^{n} (−1)^{x_i + y_i} |ii ) = (1/n) Σ_{i=1}^{n} Σ_{j ∈ {0,1}^{log n}} (−1)^{x_i + y_i} (−1)^{i·j} |ji
This superposition looks rather unwieldy, but consider the amplitude of the |0^{log n} i basis state. It is (1/n) Σ_{i=1}^{n} (−1)^{x_i + y_i}, which is 1 if x = y, and 0 otherwise because the promise guarantees that x and y differ in exactly n/2 of the bits! Hence Bob will always give the correct answer.
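This calculation can be verified with a small simulation of the protocol (a sketch assuming numpy; the function name is ours):

```python
import numpy as np

def dj_zero_amplitude_prob(x, y):
    """Simulate the log n-qubit protocol; return Pr[Bob measures |0...0>]."""
    n = len(x)
    k = int(np.log2(n))
    # Alice's state (1/sqrt(n)) sum_i (-1)^{x_i} |i>, after Bob's phase flips
    state = np.array([(-1.0) ** (xi + yi) for xi, yi in zip(x, y)]) / np.sqrt(n)
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    Hk = np.array([[1.0]])
    for _ in range(k):
        Hk = np.kron(Hk, H)        # Hadamard on each of the log n qubits
    return abs((Hk @ state)[0]) ** 2
```

For equal inputs the probability of outcome |0...0> is 1; for inputs differing in exactly n/2 positions it is 0, exactly as the amplitude computation shows.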
What about efficient classical protocols (without entanglement) for this problem? Proving
lower bounds on communication complexity often requires a very technical combinatorial analysis.
Buhrman, Cleve, and Wigderson used a deep combinatorial result from [34] to prove that every
classical errorless protocol for this problem needs to send at least 0.007n bits.
This log n-qubits-vs-0.007n-bits example was the first exponentially large separation of quantum
and classical communication complexity. Notice, however, that the difference disappears if we move
to the bounded-error setting, allowing the protocol to have some small error probability. We can
use the randomized protocol for equality discussed above, or do something even simpler: Alice can just send a few
(i, xi ) pairs to Bob, who then compares the xi ’s with his yi ’s. If x = y he will not see a difference,
but if x and y differ in n/2 positions, then Bob will probably detect this. Hence O(log n) classical
bits of communication suffice in the bounded-error setting, in sharp contrast to the errorless setting.
she tags on her xi in an extra qubit and sends Bob the state
Σ_{i=1}^{n} αi |ii|xi i.
11.5 Example 3: The vector-in-subspace problem
Notice the contrast between the examples of the last two sections. For the Distributed Deutsch-
Jozsa problem we get an exponential quantum-classical separation, but the separation only holds
if we require the classical protocol to be errorless. On the other hand, the gap for the disjointness
function is only quadratic, but it holds even if we allow classical protocols to have some error
probability.
Here is a function where the quantum-classical separation has both features: the quantum
protocol is exponentially better than the classical protocol, even if the latter is allowed some error:
Alice receives a unit vector v ∈ Rm
Bob receives two m-dimensional projectors P0 and P1 such that P0 + P1 = I
Promise: either P0 v = v or P1 v = v.
Question: which of the two?
As stated, this is a problem with continuous input, but it can be discretized in a natural way by
approximating each real number by O(log m) bits. Alice and Bob’s input is now n = O(m² log m)
bits long. There is a simple yet efficient 1-round quantum protocol for this problem: Alice views v
as a log m-qubit state and sends this to Bob; Bob measures with operators P0 and P1 , and outputs
the result. This takes only log m = O(log n) qubits of communication.
The efficiency of this protocol comes from the fact that an m-dimensional unit vector can be
“compressed” or “represented” as a log m-qubit state. Similar compression is not possible with
classical bits, which suggests that any classical protocol will have to send the vector v more or less
literally and hence will require a lot of communication. This turns out to be true but the proof is
quite hard [44]. It shows that any bounded-error protocol needs to send at least roughly n^{1/3} bits.
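The quantum protocol for the vector-in-subspace problem is simple enough to simulate directly (a sketch assuming numpy; the dimension and random construction are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 8
# Bob's projectors: P0 onto a random m/2-dimensional subspace, P1 its complement
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))   # random orthonormal basis
P0 = Q[:, :m // 2] @ Q[:, :m // 2].T
P1 = np.eye(m) - P0

# promise: Alice's unit vector v lies entirely in P0's subspace
v = P0 @ rng.standard_normal(m)
v /= np.linalg.norm(v)

# Bob measures the log(m)-qubit state v with {P0, P1}
p0 = np.linalg.norm(P0 @ v) ** 2
assert np.isclose(p0, 1.0)   # under the promise, outcome 0 occurs with certainty
```

Under the promise the measurement outcome is deterministic, so the protocol never errs.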
exist codes where N = O(n) and any two codewords C(x) and C(y) have Hamming distance close
to N/2, say d(C(x), C(y)) ∈ [0.49N, 0.51N ] (for instance, a random linear code will work). Define
the quantum fingerprint of x as follows:
|φx i = (1/√N) Σ_{j=1}^{N} (−1)^{C(x)_j} |ji.
This is a unit vector in an N -dimensional space, so it corresponds to only ⌈log N ⌉ = log(n) + O(1)
qubits. For distinct x and y, the corresponding fingerprints will have small inner product:
hφx |φy i = (1/N) Σ_{j=1}^{N} (−1)^{C(x)_j + C(y)_j} = (N − 2d(C(x), C(y)))/N ∈ [−0.02, 0.02].
[Figure 11.2: Alice (input x) sends |φx i and Bob (input y) sends |φy i to the Referee, who must decide whether x = y.]
The quantum protocol is very simple (see Figure 11.2): Alice and Bob send quantum fingerprints
of x and y to the Referee, respectively. The referee now has to determine whether x = y (which
corresponds to hφx |φy i = 1) or x 6= y (which corresponds to hφx |φy i ∈ [−0.02, 0.02]). The following
test (Figure 11.3), sometimes called the SWAP-test, accomplishes this with small error probability.
[Circuit: an ancilla qubit |0i passes through H, controls a SWAP of the registers |φx i and |φy i, passes through H again, and is measured.]
Figure 11.3: Quantum circuit to test if |φx i = |φy i or |hφx |φy i| is small
This circuit first applies a Hadamard transform to a qubit that is initially |0i, then SWAPs
the other two registers conditioned on the value of the first qubit being |1i, then applies another
Hadamard transform to the first qubit and measures it. Here SWAP is the operation that swaps the
two registers: |φx i|φy i 7→ |φy i|φx i. The Referee receives |φx i from Alice and |φy i from Bob and ap-
plies the test to these two states. An easy calculation reveals that the outcome of the measurement
is 1 with probability (1 − |hφx |φy i|2 )/2. Hence if |φx i = |φy i then we observe a 1 with probability 0,
but if |hφx |φy i| is close to 0 then we observe a 1 with probability close to 1/2. Repeating this
procedure with several individual fingerprints can make the error probability arbitrarily close to 0.
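The claimed outcome probability of the SWAP-test is easy to verify numerically (a sketch assuming numpy; the function name is ours):

```python
import numpy as np

def swap_test_p1(phi, psi):
    """Simulate the SWAP-test; return Pr[ancilla measures 1].

    Analytically this equals (1 - |<phi|psi>|^2) / 2."""
    d = len(phi)
    I2 = np.eye(d * d)
    # SWAP operator on the two d-dimensional registers: |i>|j> -> |j>|i>
    SWAP = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            SWAP[i * d + j, j * d + i] = 1
    Z = np.zeros((d * d, d * d))
    CSWAP = np.block([[I2, Z], [Z, SWAP]])   # SWAP only if ancilla is |1>
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

    state = np.kron([1.0, 0.0], np.kron(phi, psi))   # |0>|phi>|psi>
    state = np.kron(H, I2) @ state                   # Hadamard on ancilla
    state = CSWAP @ state                            # conditional SWAP
    state = np.kron(H, I2) @ state                   # Hadamard again
    return float(np.sum(np.abs(state[d * d:]) ** 2)) # Pr[ancilla outcome 1]
```

Equal fingerprints give outcome 1 with probability 0; (near-)orthogonal fingerprints give outcome 1 with probability (close to) 1/2.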
Exercises
1. Prove that classical deterministic protocols with one message (from Alice to Bob), need to
send n bits to solve the equality problem. Hint: Argue that if Alice sends the same message for distinct
inputs x and x′ , then Bob doesn’t know what to output if his input is y = x.
2. (a) Show that if |φi and |ψi are non-orthogonal states (i.e., hφ|ψi ≠ 0), then there is no
two-outcome projective measurement that perfectly distinguishes these two states, in
the sense that applying the measurement on |φi always gives a different outcome from
applying the same measurement to |ψi. Hint: Argue that if P is a projector then we can’t have
both P |φi = |φi and P |ψi = 0.
(b) Prove that quantum protocols with one message (from Alice to Bob), need to send log n
qubits to solve the distributed Deutsch-Jozsa problem with success probability 1 on every
input. Hint: Observe that among Alice’s possible n-bit inputs are the n codewords of the Hadamard
code that encodes log n bits; each pair of distinct Hadamard codewords is at Hamming distance exactly
n/2. Use part (a) to argue that Alice needs to send pairwise orthogonal states for those n inputs, and
hence her message-space must have dimension at least n.
3. Consider an error-correcting code C : {0, 1}n → {0, 1}N where N = O(n), N is a square, and
any two distinct codewords are at Hamming distance d(C(x), C(y)) ∈ [0.49N, 0.51N ] (such
codes exist, but you don’t have to prove that).
(a) View the codeword C(x) as a √N × √N matrix. Show that if you choose a row uniformly at random, and choose a column uniformly at random, then these intersect in a bit C(x)_i for uniformly random i ∈ {1, . . . , N }.
(b) Give a classical bounded-error SMP-protocol for the equality problem where Alice and Bob each send O(√n) bits to the Referee. Hint: Let Alice send a random row of C(x) (with the row-index) and let Bob send a random column of C(y) (with the column-index).
4. Suppose Alice and Bob each have n-bit agendas, and they know that for exactly 25% of
the timeslots they are both free. Give a quantum protocol that finds such a timeslot with
probability 1, using only O(log n) qubits of communication.
Chapter 12

Entanglement and Non-locality
dicted by quantum mechanics, cannot be obtained from any local realist theory. This phenomenon
is known as “quantum non-locality.” It could of course be that the quantum mechanical predic-
tions of the resulting correlations are just wrong. However, in the early 1980s, such experiments
were actually done by Aspect and others, and they gave the outcomes that quantum mechanics
predicted.1 Note that such experiments don’t prove quantum mechanics, but they disprove any
local realist physical theory.2
Such experiments, which realize correlations that are provably impossible to realize with local
realist models, are among the deepest and most philosophical results of 20th century physics: the
commonsense idea of local realism is most probably false! Since Bell’s seminal work, the concept of
quantum non-locality has been extensively studied, by physicists, philosophers, and more recently
by computer scientists.
In the next sections we review some interesting examples. The two-party setting of these
examples is illustrated in Fig. 12.1: Alice receives input x and Bob receives input y, and they
produce outputs a and b, respectively, that have to be correlated in a certain way (which depends
on the game). They are not allowed to communicate. In physics language, we could assume they
are “space-like separated”, which means that they are so far apart that they cannot influence each
other during the course of the experiment (assuming information doesn’t travel faster than the
speed of light). In the classical scenario they are allowed to share a random variable. Physicists
would call this the “local hidden variable” that gives properties their definite value (that value may
be unknown to the experimenter). This setting captures all local realist models. In the quantum
model Alice and Bob are allowed to share entangled states, such as EPR-pairs. The goal is to show
that entanglement-based strategies can do things that local realist strategies cannot.
Figure 12.1: The non-locality scenario involving two parties: Alice and Bob receive inputs x and
y, respectively, and are required to produce outputs a and b that satisfy certain conditions. Once
the inputs are received, no communication is permitted between the parties.
1 Modulo some technical “loopholes” due to imperfect photon sources, measurement devices, etc. These are still hotly debated, but most people accept that Aspect’s and later experiments are convincing, and kill any hope of a complete local-realist explanation of nature.
2 Despite its name, non-locality doesn’t disprove locality, but rather disproves the conjunction of locality and realism—at least one of the two assumptions has to fail.
12.2 CHSH: Clauser-Horne-Shimony-Holt
In the CHSH game [20] Alice and Bob receive input bits x and y, and their goal is to output bits
a and b, respectively, such that
a ⊕ b = x ∧ y, (12.1)
(‘∧’ is logical AND; ‘⊕’ is parity, i.e. addition mod 2) or, failing that, to satisfy this condition with
as high a probability as possible.
First consider the case of classical deterministic strategies, so without any randomness. For
these, Alice’s output bit depends solely on her input bit x, and similarly for Bob. Let a0 be the
bit that Alice outputs if her input is x = 0, and a1 the bit she outputs if x = 1. Let b0 , b1 be the
outputs Bob gives on inputs y = 0 and y = 1, respectively. These four bits completely characterize
any deterministic strategy. Condition (12.1) becomes
a0 ⊕ b0 = 0,
a0 ⊕ b1 = 0,
a1 ⊕ b0 = 0,
a1 ⊕ b1 = 1. (12.2)
It is impossible to satisfy all four equations simultaneously, since summing them modulo 2 yields
0 = 1. Therefore it is impossible to satisfy Condition (12.1) perfectly. Since a probabilistic strategy
(where Alice and Bob share randomness) is a probability distribution over deterministic strategies,
it follows that no probabilistic strategy can have success probability better than 3/4 on every
possible input (the 3/4 can be achieved simultaneously for every input, see homework).3
Now consider the same problem, but where Alice and Bob are supplied with a shared two-qubit system initialized to the entangled state

(1/√2)(|00i − |11i).

Such a state can easily be obtained from an EPR-pair, for instance if Alice applies a Z to her qubit.
Now the parties can produce outputs that satisfy Condition (12.1) with probability cos(π/8)² ≈ 0.85
(higher than what is possible in the classical case), as follows. Recall the unitary operation that rotates the qubit by angle θ:

R(θ) = ( cos θ  −sin θ ; sin θ  cos θ ).

If x = 0 then Alice applies R(−π/16) to her qubit; and if x = 1 she applies R(3π/16). Then Alice measures her qubit in the computational
basis and outputs the resulting bit a. Bob’s procedure is the same, depending on his input bit y. It
is straightforward to calculate that if Alice rotates by θ1 and Bob rotates by θ2 , the state becomes
(1/√2) ( cos(θ1 + θ2 )(|00i − |11i) + sin(θ1 + θ2 )(|01i + |10i) ).
After the measurements, the probability that a ⊕ b = 0 is cos(θ1 + θ2 )². It is now easy to see that Condition (12.1) is satisfied with probability cos(π/8)² for all four input possibilities.
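The quantum CHSH strategy above can be simulated in a few lines (a sketch assuming numpy; the function names are ours):

```python
import numpy as np

def R(t):
    """Rotation of a qubit by angle t."""
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def chsh_success(x, y):
    """Pr[a XOR b = x AND y] for the rotate-and-measure strategy."""
    psi = np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2)   # (|00> - |11>)/sqrt(2)
    tA = -np.pi / 16 if x == 0 else 3 * np.pi / 16       # Alice's rotation
    tB = -np.pi / 16 if y == 0 else 3 * np.pi / 16       # Bob's rotation
    state = np.kron(R(tA), R(tB)) @ psi
    target = x & y
    # sum |amplitude|^2 over outcomes (a, b) with a XOR b = x AND y
    return sum(abs(state[2 * a + b]) ** 2
               for a in range(2) for b in range(2) if a ^ b == target)
```

On every one of the four inputs the success probability comes out to cos(π/8)² ≈ 0.854, beating the classical bound of 3/4.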
3 Such statements, upper bounding the optimal success probability of classical strategies for a specific game, are known as Bell inequalities. This specific one is called the CHSH inequality.
12.3 Magic square game
Tsirelson [67] showed that cos(π/8)² is the best that quantum strategies can do for CHSH, even if
they are allowed to use much more entanglement than one EPR-pair. Is there a game where the
quantum protocol always succeeds, while the best classical success probability is bounded below 1?
A particularly elegant example is the following magic square game [8]. Consider the problem of
labeling the entries of a 3 × 3 matrix with bits so that the parity of each row is even, whereas the
parity of each column is odd. This is clearly impossible: if the parity of each row is even then the
sum of the 9 bits is 0 mod 2, but if the parity of each column is odd then the sum of the 9 bits is
1 mod 2. The two matrices
0 0 0     0 0 0
0 0 0     0 0 0
1 1 0     1 1 1
each satisfy five out of the six constraints. For the first matrix, all rows have even parity, but only
the first two columns have odd parity. For the second matrix, the first two rows have even parity,
and all columns have odd parity.
Consider the game where Alice receives x ∈ {1, 2, 3} as input (specifying the number of a row),
and Bob receives y ∈ {1, 2, 3} as input (specifying the number of a column). Their goal is to each
produce 3-bit outputs, a1 a2 a3 for Alice and b1 b2 b3 for Bob, such that

a1 ⊕ a2 ⊕ a3 = 0,  b1 ⊕ b2 ⊕ b3 = 1,  and  ay = bx . (12.3)

As usual, Alice and Bob are forbidden from communicating once the game starts, so Alice does not
know y and Bob does not know x. We shall show the best classical strategy has success probability
8/9, while there is a quantum strategy that always succeeds.
An example of a deterministic strategy that attains success probability 8/9 (when the input xy
is uniformly distributed) is where Alice plays according to the rows of the first matrix above and
Bob plays according to the columns of the second matrix above. This succeeds in all cases, except
where x = y = 3. To see why this is optimal, note that for any other classical strategy, it is possible
to represent it as two matrices as above but with different entries. Alice plays according to the
rows of the first matrix and Bob plays according to the columns of the second matrix. We can
assume that the rows of Alice’s matrix all have even parity; if she outputs a row with odd parity
then they immediately lose, regardless of Bob’s output. Similarly, we can assume that all columns
of Bob’s matrix have odd parity.4 Considering such a pair of matrices, the players lose at each
entry where they differ. There must be such an entry, since otherwise it would be possible to have
all rows even and all columns odd with one matrix. Thus, when the input xy is chosen uniformly
from {1, 2, 3} × {1, 2, 3}, the success probability of any classical strategy is at most 8/9.
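The 8/9 bound can also be confirmed by brute force over deterministic strategies (a sketch; as argued above we may restrict Alice to even-parity rows and Bob to odd-parity columns, since any other answer loses immediately):

```python
from itertools import product

# rows with even parity (Alice's answers), columns with odd parity (Bob's)
even = [r for r in product([0, 1], repeat=3) if sum(r) % 2 == 0]
odd = [c for c in product([0, 1], repeat=3) if sum(c) % 2 == 1]

best = 0
for rows in product(even, repeat=3):       # Alice's matrix, row by row
    for cols in product(odd, repeat=3):    # Bob's matrix, column by column
        # they win input (x, y) iff they agree on the shared entry
        w = sum(1 for x in range(3) for y in range(3)
                if rows[x][y] == cols[y][x])
        best = max(best, w)

assert best == 8   # no deterministic strategy wins more than 8 of the 9 inputs
```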
We now give the quantum strategy for this game. Let I, X, Y , Z be the 2 × 2 Pauli matrices:
I = ( 1 0 ; 0 1 ),  X = ( 0 1 ; 1 0 ),  Y = ( 0 −i ; i 0 ),  and  Z = ( 1 0 ; 0 −1 ). (12.4)
4 In fact, the game can be simplified so that Alice and Bob each output just two bits, since the parity constraint determines the third bit.
Each is an observable with eigenvalues in {+1, −1}. That is, each can be written as P+ − P−
where P+ and P− are orthogonal projectors that sum to identity, and hence define a two-outcome
measurement with outcomes +1 and −1.⁵ For example, Z = |0ih0| − |1ih1|, corresponding to a
measurement in the computational basis (with |bi corresponding to outcome (−1)b ). And X =
|+ih+| − |−ih−|, corresponding to a measurement in the Hadamard basis. The Pauli matrices are
self-inverse, they anti-commute (e.g., XY = −Y X), and X = iZY , Y = iXZ, and Z = iY X.
Consider the following table, where each entry is a tensor product of two Paulis:
X ⊗X Y ⊗Z Z⊗Y
Y ⊗Y Z⊗X X ⊗Z
Z⊗Z X ⊗Y Y ⊗X
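The key algebraic property of this table can be verified numerically: the three observables in each row multiply to +I ⊗ I, while the three in each column multiply to −I ⊗ I (a sketch assuming numpy):

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

def T(a, b):
    return np.kron(a, b)   # tensor product of two Paulis

square = [[T(X, X), T(Y, Z), T(Z, Y)],
          [T(Y, Y), T(Z, X), T(X, Z)],
          [T(Z, Z), T(X, Y), T(Y, X)]]

for r in range(3):   # each row multiplies to +I (4x4 identity)
    assert np.allclose(square[r][0] @ square[r][1] @ square[r][2], np.eye(4))
for c in range(3):   # each column multiplies to -I
    assert np.allclose(square[0][c] @ square[1][c] @ square[2][c], -np.eye(4))
```

These sign constraints are exactly what makes the square "magic": measuring commuting observables from a row (or column) yields outcomes whose product is forced to +1 (resp. −1).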
the outcomes will be distinct: a′ ⊕ b′ = 1 and a′′ ⊕ b′′ = 1. We now have ay = bx , because
Non-local DJ problem: Alice and Bob receive n-bit inputs x and y that satisfy the
DJ promise: either x = y, or x and y differ in exactly n/2 positions. The task is for
Alice and Bob to provide outputs a, b ∈ {0, 1}log n such that if x = y then a = b, and if
x and y differ in exactly n/2 positions then a 6= b.
4. They measure in the computational basis and output the results a and b, respectively.
For every a, the probability that both Alice and Bob obtain the same result a is:
( (1/(n√n)) Σ_{i=0}^{n−1} (−1)^{x_i + y_i} )²,
which is 1/n if x = y, and 0 otherwise. This solves the problem perfectly using prior entanglement.
6 Note that k EPR-pairs ((1/√2)(|00i + |11i))^{⊗k} can also be written as (1/√(2^k)) Σ_{i∈{0,1}^k} |ii|ii if we reorder the qubits, putting Alice’s k qubits on the left and Bob’s on the right. While these two ways of writing the state strictly speaking correspond to two different vectors of amplitudes, they still represent the same bipartite physical state, and we will typically view them as equal.
What about classical protocols? Suppose there is a classical protocol that uses C bits of com-
munication. If they ran this protocol, and then Alice communicated her output a to Bob (using
an additional log n bits), he could solve the distributed Deutsch-Jozsa problem since he could then
check whether a = b or a 6= b. But we know that solving the distributed Deutsch-Jozsa problem re-
quires at least 0.007n bits of communication. Hence C +log n ≥ 0.007n, so C ≥ 0.007n−log n. Thus
we have a non-locality problem that can be solved perfectly if Alice and Bob share log n EPR-pairs,
while classically it needs not just some communication, but actually a lot of communication.
Exercises
1. Suppose Alice and Bob share an EPR-pair (1/√2)(|00i + |11i).
(a) Let U be a unitary with real entries. Show that the following two states are the same:
(1) the state obtained if Alice applies U to her qubit of the EPR-pair;
(2) the state obtained if Bob applies the transpose U T to his qubit of the EPR-pair.
(b) What state do you get if both Alice and Bob apply a Hadamard transform to their qubit
of the EPR-pair? Hint: you could write this out, but you can also get the answer almost immediately
from part (a) and the fact that H T = H −1 .
2. Give a classical strategy using shared randomness for the CHSH game, such that Alice and
Bob win the game with probability at least 3/4 for every possible input x, y (note the order
of quantification: the same strategy has to work for every x, y). Hint: For every fixed input x, y,
there is a classical strategy that gives a wrong output only on that input, and that gives a correct output on all
other possible inputs. Use the shared randomness to randomly choose one of those deterministic strategies.
3. Consider three space-like separated players: Alice, Bob, and Charlie. Alice receives input bit
x, Bob receives input bit y, and Charlie receives input bit z. The input satisfies the promise
that x ⊕ y ⊕ z = 0. The goal of the players is to output bits a, b, c, respectively, such that
a⊕ b⊕ c = OR(x, y, z). In other words, the outputs should sum to 0 (mod 2) if x = y = z = 0,
and should sum to 1 (mod 2) if x + y + z = 2.
(a) Show that every classical deterministic strategy will fail on at least one of the 4 allowed
inputs.
(b) Show that every classical randomized strategy has success probability at most 3/4 under
the uniform distribution on the four allowed inputs xyz.
(c) Suppose the players share the following entangled 3-qubit state:
(1/2)(|000i − |011i − |101i − |110i).
Suppose each player does the following: if his/her input bit is 1, apply H to his/her
qubit, otherwise do nothing. Describe the resulting 3-qubit superposition.
(d) Using (c), give a quantum strategy that wins the above game with probability 1 on every
possible input.
Chapter 13
Quantum Cryptography
1. Alice chooses n random bits a1 , . . . , an and n random bases b1 , . . . , bn . She sends ai to Bob
in basis bi over the public quantum channel. For example, if ai = 0 and bi = 1 then the ith
qubit that she sends is in state |+i.
2. Bob chooses random bases b′1 , . . . , b′n and measures the qubits he received in those bases,
yielding bits a′1 , . . . , a′n .
1 Quantum key distribution might in fact better be called “quantum eavesdropper detection.” There are some more
assumptions underlying BB84 that should be made explicit: we assume that the classical channel used in steps 3–5
is “authenticated”, meaning that Alice and Bob know they are talking to each other, and Eve can listen but not
change the bits sent over the classical channel (in contrast to the qubits sent during step 1 of the protocol, which Eve
is allowed to manipulate in any way she wants).
3. Bob sends Alice all b′i , and Alice sends Bob all bi . Note that for roughly n/2 of the i’s, Alice
and Bob used the same basis bi = b′i . For those i’s Bob should have a′i = ai (if there was no
noise and Eve didn’t tamper with the ith qubit on the channel). Both Alice and Bob know
for which i’s this holds. Let’s call these roughly n/2 positions the “shared string.”
4. Alice randomly selects n/4 locations in the shared string, and sends Bob those locations as well as the value of ai at those locations. Bob then checks whether they have the same
bits in those positions. If the fraction of errors is bigger than some number p, then they
suspect some eavesdropper was messing with the channel, and they abort.2
5. If the test is passed, then they discard the n/4 test-bits, and have roughly n/4 bits left in their
shared string. This is called the “raw key.” Now they do some classical postprocessing on the
raw key: “information reconciliation” to ensure they end up with exactly the same shared
string, and “privacy amplification” to ensure that Eve has negligible information about that
shared string.3
The communication is n qubits in step 1, 2n bits in step 3, O(n) bits in step 4, and O(n) bits in
step 5. So the required amount of communication is linear in the length of the shared secret key
that Alice and Bob end up with.
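The sifting in steps 1–3 is easy to simulate classically in the honest, noiseless case (a sketch; basis 0 is computational, basis 1 is Hadamard, and the function names are ours):

```python
import random

def measure(bit, basis_enc, basis_meas):
    """Outcome of measuring a BB84 qubit.

    If Bob's basis matches Alice's encoding basis he recovers the bit;
    otherwise his outcome is a uniformly random bit."""
    return bit if basis_enc == basis_meas else random.randint(0, 1)

def bb84_sift(n):
    a = [random.randint(0, 1) for _ in range(n)]    # Alice's bits a_i
    b = [random.randint(0, 1) for _ in range(n)]    # Alice's bases b_i
    bp = [random.randint(0, 1) for _ in range(n)]   # Bob's bases b'_i
    ap = [measure(a[i], b[i], bp[i]) for i in range(n)]
    shared = [i for i in range(n) if b[i] == bp[i]] # roughly n/2 positions
    return [a[i] for i in shared], [ap[i] for i in shared]
```

Without noise or eavesdropping the two sifted strings agree on every position, which is why a high error rate in step 4 signals Eve's interference.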
It’s quite hard to formally prove that this protocol yields (with high probability) a shared key
about which Eve has negligible information. In fact it took more than 12 years before BB84 was
finally proven secure [51, 49]. The main reason it works is that when the encoded qubits a1 , . . . , an
are going over the public channel, Eve doesn’t know yet in which bases b1 , . . . , bn these are encoded
(she will learn the bi later from tapping the classical communication in step 3, but at that point this
information is not of much use to her anymore). She could try to get as much information as she
can about a1 , . . . , an by some measurement, but there’s an information-disturbance tradeoff : the
more information Eve learns about a1 , . . . , an by measuring the qubits, the more she will disturb
the state, and the more likely it is that Alice and Bob will detect her presence in step 4.
We won’t go into the full proof details here, but just make the information-disturbance tradeoff plausible for the case where Eve individually attacks the qubits encoding each bit in step 1 of the protocol.4 In Fig. 13.1 we give the four possible states for one BB84-qubit. If Alice wants to send
ai = 0, then she sends a uniform mixture of |0i and |+i across the channel; if Alice wants to send
ai = 1 she sends a uniform mixture of |1i and |−i. Suppose Eve tries to learn ai from the qubit on
the channel. The best way for her to do this is to measure in the orthonormal basis corresponding to
state cos(π/8)|0i + sin(π/8)|1i and − sin(π/8)|0i + cos(π/8)|1i. Note that the first state is halfway
between the two encodings of 0, and the second state is halfway between the two encodings of 1.
This will give her the value of ai with probability cos(π/8)² ≈ 0.85 (remember the 2-to-1 quantum random access code from earlier homework). However, this measurement will change the state of the qubit by an angle of at least π/8, so if Bob now measures the received qubit in the same basis as Alice, the probability that he recovers an incorrect value of ai is at least sin(π/8)² ≈ 0.15 (if Bob
measures in a different basis, the result will be discarded anyway). If this i is among the test-bits
2
The number p can for instance be set to the natural error-rate that the quantum channel would have if there
were no eavesdropper.
3
This can be done for instance by something called the “leftover hash lemma.”
4
The more complicated situation where Eve does an n-qubit measurement on all qubits of step 1 simultaneously
can be reduced to the case of individual-qubit measurements by something called the quantum De Finetti theorem,
but we won’t go into the details here.
Alice and Bob use in step 4 of the protocol (which happens with probability 1/2), then they will
detect an error. Eve can of course try a less disturbing measurement to reduce the probability of
being detected, but such a measurement will also have a lower probability of telling her ai .
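Eve's intercept measurement and its cos(π/8)² success probability can be checked numerically (a sketch assuming numpy; variable names are ours):

```python
import numpy as np

t = np.pi / 8
# Eve's basis: e0 lies halfway between |0> and |+>, e1 halfway between |1> and |->
e0 = np.array([np.cos(t), np.sin(t)])     # outcome "encoded bit is 0"
e1 = np.array([-np.sin(t), np.cos(t)])    # outcome "encoded bit is 1"

# the four BB84 states, each paired with the bit it encodes
encodings = [(np.array([1.0, 0.0]), 0),                  # |0>
             (np.array([1.0, 1.0]) / np.sqrt(2), 0),     # |+>
             (np.array([0.0, 1.0]), 1),                  # |1>
             (np.array([1.0, -1.0]) / np.sqrt(2), 1)]    # |->

# Eve's average probability of guessing the encoded bit correctly
p = np.mean([abs((e0 if bit == 0 else e1) @ psi) ** 2
             for psi, bit in encodings])
assert np.isclose(p, np.cos(np.pi / 8) ** 2)   # ~ 0.854 on every state
```

Each of the four states is at angle π/8 from Eve's nearest basis vector, which is exactly why her measurement must rotate the state by at least π/8 and thereby risk detection.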
Figure 13.1: The four possible states in BB84 encoding: |0i and |+i are two different encodings
of 0, and |1i and |−i are two different encodings of 1.
The existence of the Schmidt decomposition is shown as follows. Let ρA = TrB (|φihφ|) be Alice’s local density matrix. This is Hermitian, so it has a spectral decomposition ρA = Σi µi |ai ihai | with orthonormal eigenvectors |ai i and nonnegative real eigenvalues µi . Note Σi µi = Tr(ρA ) = 1.
Since the {|ai i} form an orthonormal set, we can extend it to an orthonormal basis for Alice’s
d-dimensional space (adding additional orthonormal |ai i and µi = 0). Hence there are cij such that
|φi = Σ_{i,j=1}^d √µi cij |ai i|ji,
where the |ji are the computational basis states for Bob’s space. Define λi = √µi and |bi i = Σj cij |ji. This gives the decomposition of |φi of Eq. (13.1). It only remains to show that {|bi i} is
an orthonormal set, which we do as follows. The density matrix version of Eq. (13.1) is
|φihφ| = Σ_{i,j=1}^d λi λj |ai ihaj | ⊗ |bi ihbj |.
We know that if we trace out the B-part from |φihφ|, then we should get ρA = Σi λi² |ai ihai |, but that can only happen if hbj |bi i = Tr(|bi ihbj |) = 1 for i = j and hbj |bi i = 0 for i ≠ j. Hence the |bi i form an orthonormal set. Note that from Eq. (13.1) it easily follows that Bob’s local density matrix is ρB = Σi λi² |bi ihbi |.
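Numerically, the Schmidt decomposition is exactly the singular value decomposition of the state's amplitude matrix (a sketch assuming numpy, shown here for real states; the function name is ours):

```python
import numpy as np

def schmidt(phi, dA, dB):
    """Schmidt decomposition of a bipartite pure state via the SVD.

    Returns (lam, U, Vh): Schmidt coefficients lam[i] >= 0,
    Alice's vectors |a_i> = U[:, i], Bob's vectors |b_i> = Vh[i, :]."""
    M = phi.reshape(dA, dB)          # amplitudes as a dA x dB matrix
    U, lam, Vh = np.linalg.svd(M)    # SVD gives exactly the Schmidt form
    return lam, U, Vh

# check on a random real state of a qubit paired with a qutrit
rng = np.random.default_rng(0)
phi = rng.standard_normal(6)
phi /= np.linalg.norm(phi)
lam, U, Vh = schmidt(phi, 2, 3)
rebuilt = sum(lam[i] * np.kron(U[:, i], Vh[i, :]) for i in range(2))
assert np.allclose(rebuilt, phi)            # phi = sum_i lam_i |a_i>|b_i>
assert np.isclose(np.sum(lam ** 2), 1.0)    # Tr(rho_A) = 1
```

The orthonormality of the |a_i i and |b_i i comes for free from the SVD, mirroring the proof above.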
1. In the “commit” phase Alice gives Bob a state which is supposed to commit her to the value
of b (without informing Bob about the value of b).
2. In the “reveal” phase Alice sends b to Bob, and possibly some other information to allow him
to check that this is indeed the same value b that Alice committed to before.
A protocol is binding if Alice can’t change her mind, meaning she can’t get Bob to “open” 1 − b.
A protocol is concealing if Bob cannot get any information about b before the “reveal” phase.5
A good protocol for bit commitment would be a very useful building block for many other
cryptographic applications. For instance, it would allow Alice and Bob (who still don’t trust each
other) to jointly flip a fair coin. Maybe they’re going through a divorce, and need to decide who
gets to keep their joint car. Alice can’t just flip the coin by herself because Bob doesn’t trust her
to do this honestly, and vice versa. Instead, Alice would pick a random coin b and commit to it.
5
A good metaphor to think about this: in the commit phase Alice locks $b$ inside a safe which she sends to Bob. This commits her to the value of $b$, since the safe is no longer in her hands. During the reveal phase she sends Bob the key to the safe; Bob can then open it and learn $b$.
Bob would then pick a random coin c and send it to Alice. Alice then reveals b, and the outcome of
the coin flip is defined to be b ⊕ c. As long as at least one of the two parties follows this protocol,
the result will be a fair coin flip.
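For contrast, here is a minimal classical sketch of this coin-flipping protocol (our own illustration, not from the notes), built on a hash-based commitment; such a commitment is only *computationally* binding and concealing, consistent with the classical impossibility of perfect schemes:

```python
import hashlib
import secrets

def commit(b: int):
    """Commit to bit b: publish a hash, keep the opening (nonce, b) private."""
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + bytes([b])).hexdigest()
    return digest, (nonce, b)

def reveal(digest, opening):
    """Bob checks that the revealed (nonce, b) matches the earlier digest."""
    nonce, b = opening
    if hashlib.sha256(nonce + bytes([b])).hexdigest() != digest:
        raise ValueError("opening does not match commitment")
    return b

# Coin-flipping: Alice commits to b, Bob replies with c, outcome is b XOR c.
digest, opening = commit(secrets.randbelow(2))   # Alice -> Bob: digest
c = secrets.randbelow(2)                         # Bob -> Alice: c
b = reveal(digest, opening)                      # Alice opens; Bob verifies
outcome = b ^ c
```

As long as one party is honest, the outcome is uniformly random, assuming Alice cannot find hash collisions and the digest leaks nothing about $b$.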
It is easy to see that perfect bit commitment is impossible in the classical world: the person who
communicates last can always cheat. After BB84 there was some hope that perfect bit commitment
would be possible in the quantum world, and there were some seemingly-secure proposals for
quantum protocols to achieve this. Unfortunately it turns out that there is no quantum protocol
for bit commitment that is both perfectly binding and perfectly concealing.
To show that a protocol for perfect bit commitment is impossible, consider the joint pure state $|\phi_b\rangle$ that Alice and Bob would have if Alice wants to commit to bit-value $b$, and they both honestly followed the protocol.6 If the protocol is perfectly concealing, then the reduced density matrix on Bob's side should be independent of $b$, i.e., $\mathrm{Tr}_A(|\phi_0\rangle\langle\phi_0|) = \mathrm{Tr}_A(|\phi_1\rangle\langle\phi_1|)$. The way we constructed the Schmidt decomposition in the previous section now implies that there exist Schmidt decompositions of $|\phi_0\rangle$ and $|\phi_1\rangle$ with the same $\lambda_i$'s and the same $|b_i\rangle$'s: there exist orthonormal sets $\{|a_i\rangle\}$ and $\{|a_i'\rangle\}$ such that
$$|\phi_0\rangle = \sum_{i=1}^{d} \lambda_i |a_i\rangle|b_i\rangle \qquad\text{and}\qquad |\phi_1\rangle = \sum_{i=1}^{d} \lambda_i |a_i'\rangle|b_i\rangle.$$
Now Alice can locally switch from $|\phi_0\rangle$ to $|\phi_1\rangle$ by just applying to her part of the state the map $|a_i\rangle \mapsto |a_i'\rangle$. Alice's map is unitary because it takes one orthonormal basis to another orthonormal basis. But then the protocol is not binding at all: Alice can still freely change her mind about the value of $b$ after the "commit" phase is over! Accordingly, if a quantum protocol for bit commitment is perfectly concealing, it cannot be binding at all.
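This cheating strategy can be made concrete in a toy example (our own sketch, not from the notes): two "commitment" states with identical reduced state on Bob's side, and a local unitary on Alice's side that maps one to the other:

```python
import numpy as np

# Two candidate commitment states with the same reduced state on Bob's side:
phi0 = np.array([1, 0, 0, 1]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
phi1 = np.array([0, 1, 1, 0]) / np.sqrt(2)   # (|01> + |10>)/sqrt(2)

def rho_B(phi, dA=2, dB=2):
    """Bob's reduced density matrix Tr_A(|phi><phi|)."""
    M = phi.reshape(dA, dB)          # coefficient matrix c_{ij}
    return M.T @ M.conj()

assert np.allclose(rho_B(phi0), rho_B(phi1))   # perfectly concealing

# Alice's local cheating map |a_i> -> |a_i'>; here it is simply X on her qubit.
X = np.array([[0, 1], [1, 0]])
U_cheat = np.kron(X, np.eye(2))                # acts on Alice's side only
assert np.allclose(U_cheat @ phi0, phi1)       # switches commitment 0 -> 1
```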
• There are quantum protocols for bit commitment that are partially concealing and partially
binding—something which is still impossible in the classical world. A primitive called “weak
coin flipping” can be implemented almost perfectly in the quantum world, and cannot be
implemented at all in the classical world.
• Under assumptions on the fraction of dishonest players among a set of k parties, it is possible
to implement secure multi-party quantum computation. This is a primitive that allows the
players to compute any function of their k inputs, without revealing more information to
player i than can be inferred from i’s input plus the function value.
• One can actually do nearly perfect bit commitment, coin flipping, etc., assuming the dishonest party has bounded quantum storage, meaning that it can't keep large quantum states coherent for long times. At the present state of quantum technology this is a very reasonable assumption (though a breakthrough in the physical realization of quantum computers would wipe this approach out).
6
The assumption that the state is pure rather than mixed is without loss of generality.
• In device-independent cryptography, Alice and Bob want to solve certain cryptographic tasks
like key distribution or randomness generation without trusting their own devices (for instance
because they don’t trust the vendor of their apparatuses). Roughly speaking, the idea here
is to use Bell-inequality violations to prove the presence of entanglement, and then use this
entanglement for cryptographic purposes. Even if Alice or Bob’s apparatuses have been
tampered with, they can still only violate things like the CHSH inequality if they actually
share an entangled state.
• Experimentally it is much easier to realize quantum key distribution than general quantum computation, because you basically just need to prepare qubits (usually photons) in either the computational or the Hadamard basis, send them across a channel (usually an optical fibre, but sometimes free space), and measure them in either the computational or the Hadamard basis. Many sophisticated experiments have already been done. Somewhat surprisingly, you can already buy quantum key distribution machinery commercially. Unfortunately the implementations are typically not perfect (for instance, we don't have perfect photon counters), and once in a while another loophole is exposed in the implementation, which the vendor then tries to patch, etc.
Exercises
1. Here we will consider in more detail the information-disturbance tradeoff for measuring a
qubit in one of the four BB84 states (each of which occurs with probability 25%).
(a) Suppose Eve measures the qubit in the orthonormal basis given by $\cos(\theta)|0\rangle + \sin(\theta)|1\rangle$ and $-\sin(\theta)|0\rangle + \cos(\theta)|1\rangle$, for some parameter $\theta \in [0, \pi/4]$. For each of the four possible BB84 states, give the probabilities of outcome 0 and outcome 1 (so the answer consists of 8 numbers, each of which is a function of $\theta$).
(b) What is the average probability that Eve's measurement outcome equals the encoded bit $a_i$, as a function of $\theta$? (the average is taken both over the uniform distribution over the four BB84 states, and over the probabilities calculated in part (a))
(c) What is the average absolute value of the angle by which the state is changed if Eve's outcome is the encoded bit $a_i$? Again, the answer should be a function of $\theta$.
2. (a) What is the Schmidt rank of the state $\frac{1}{2}(|00\rangle + |01\rangle + |10\rangle + |11\rangle)$?
(b) Suppose Alice and Bob share $k$ EPR-pairs. What is the Schmidt rank of their joint state?
(c) Prove that a pure state $|\phi\rangle$ is entangled if, and only if, its Schmidt rank is greater than 1.
3. Prove that Alice cannot give information to Bob by doing a unitary operation on her part of
an entangled pure state. Hint: Show that a unitary on Alice’s side of the state won’t change Bob’s local
density matrix ρB .
4. Suppose Alice sends two n-bit messages M1 and M2 with the one-time pad scheme, reusing
the same n-bit key K. Show that Eve can now get some information about M1 , M2 from
tapping the classical channel.
Chapter 14
Error-Correction
14.1 Introduction
When Shor's algorithm had just appeared in 1994, most people (especially physicists) were extremely skeptical about the prospects of actually building a quantum computer. In their view, it would be impossible to avoid errors when manipulating small quantum systems, and such errors would very quickly overwhelm the computation, rendering it no more useful than classical computation. However, in the few years that followed, the theory of quantum error-correction and fault-tolerant computation was developed. This shows, roughly speaking, that if the error-rate per operation can be brought down to something reasonably small (say 1%), then under some reasonable assumptions we can actually do near-perfect quantum computing for as long as we want. Below we give a succinct and somewhat sketchy introduction to this important but complex area, just explaining the main ideas. See Daniel Gottesman's survey for more details [35].
has been reduced from $p$ to less than $3p^2$. If the initial error-rate $p_0$ was $< 1/3$, then the new error-rate $p_1 < 3p_0^2$ is less than $p_0$ and we have made progress: the error-rate on the encoded bit is smaller than before. If we'd like it to be even smaller, we could concatenate the code with itself, i.e., repeat each of the three bits in the code three times, so the codelength becomes 9. This would give error-rate $p_2 = 3p_1^2(1-p_1) + p_1^3 < 3p_1^2 < 27p_0^4$, giving a further improvement. As we can see, as long as the initial error-rate $p_0$ was at most 1/3, we can reduce the error-rate to whatever we want: $k$ levels of concatenation encode one "logical bit" into $3^k$ "physical bits", but the error-rate for each logical bit has been reduced to $\frac{1}{3}(3p_0)^{2^k}$. This is a very good thing: if the initial error is below the threshold of 1/3, then $k$ levels of concatenation increases the number of bits exponentially (in $k$), but reduces the error-rate double-exponentially fast!
Typically, already a small choice of $k$ gets the error-rate down to negligible levels. For example, suppose we want to protect some polynomial (in some $n$) number of bits for some polynomial number of time-steps, and our physical error-rate is some fixed $p_0 < 1/3$. Choosing $k = 2\log\log n$ levels of concatenation already suffices for this, because then $p_k \le \frac{1}{3}(3p_0)^{2^k} \sim 2^{-(\log n)^2} = n^{-\log n}$ goes to 0 faster than any polynomial. With this choice of $k$, each logical bit would be encoded in $3^k = (\log n)^{2\log_2(3)}$ physical bits, so we only increase the number of bits by a polylogarithmic factor.
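The recurrence behind these numbers is easy to check numerically. The sketch below (our own, not from the notes) iterates $p_{k+1} = 3p_k^2(1-p_k) + p_k^3$ and compares against the bound $\frac{1}{3}(3p_0)^{2^k}$:

```python
# Error-rate recurrence for the concatenated 3-bit repetition code: a logical
# bit at level k+1 fails iff at least 2 of its three level-k blocks fail.
def next_rate(p):
    return 3 * p**2 * (1 - p) + p**3

p0 = 0.1                       # some physical error-rate below the 1/3 threshold
rates = [p0]
for _ in range(4):             # k = 1, ..., 4 levels of concatenation
    rates.append(next_rate(rates[-1]))

# Check the double-exponential bound p_k <= (1/3) * (3*p0)^(2^k)
for k, pk in enumerate(rates):
    assert pk <= (1/3) * (3 * p0)**(2**k) + 1e-15
    print(f"k={k}: error-rate {pk:.3e}, physical bits 3^{k} = {3**k}")
```

Already at $k = 4$ (81 physical bits) the logical error-rate has dropped below $10^{-9}$ for this choice of $p_0$.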
• The classical solution of just repeating a state is not available in general in the quantum
world, because of the no-cloning theorem.
• The classical world has basically only bitflip-errors, while the quantum world is continuous
and hence has infinitely many different possible errors.
• Measurements that test whether a state is correct can collapse the state, losing information.
Depending on the specific model of errors that one adopts, it is possible to deal with all of these
issues. We will consider the following simple error model. Consider quantum circuits with S
qubits, and T time-steps; in each time-step, several gates on disjoint sets of qubits may be applied
in parallel. After each time-step, at each qubit, independently from the other qubits, some unitary
error hits that qubit with probability p. Note that we assume the gates themselves to operate
perfectly; this is just a convenient technical assumption, since a perfect gate followed by errors on
its outgoing qubits is the same as an imperfect gate.
Let's investigate what kind of (unitary) errors we could get on one qubit. Consider the four Pauli matrices:
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
$X$ is a bitflip-error, $Z$ is a phaseflip-error, and $Y = iXZ$ is the product of a bitflip and a phaseflip (up to the global phase factor of $i$, which doesn't matter). These four matrices span the space of all possible $2 \times 2$ matrices, so every possible error-operation $E$ on a qubit is some linear combination of the 4 Pauli matrices. More generally, every $2^k \times 2^k$ matrix can be written uniquely as a linear combination of matrices that each are the tensor product of $k$ Pauli matrices.
Consider for example the error which puts a small phase $\phi$ on $|1\rangle$:
$$E = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\phi} \end{pmatrix} = e^{i\phi/2}\cos(\phi/2)\,I - ie^{i\phi/2}\sin(\phi/2)\,Z.$$
Note that for small $\phi$ most of the weight in this linear combination sits on $I$, which corresponds to the fact that $E$ is close to $I$. The sum of the squared moduli of the two coefficients is 1 in this case. That's not a coincidence: whenever we write a unitary as a linear combination of Pauli matrices, the sum of the squared moduli of the coefficients will be 1 (see homework).
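This decomposition is easy to verify numerically. The sketch below (our own; `pauli_coeffs` is a hypothetical helper) extracts the coefficients via $\alpha_P = \mathrm{Tr}(PE)/2$, which works because distinct Paulis are trace-orthogonal:

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])
paulis = [I, X, Y, Z]

def pauli_coeffs(E):
    """Coefficients alpha_P in E = sum_P alpha_P P, via alpha_P = Tr(P E)/2."""
    return [np.trace(P @ E) / 2 for P in paulis]

phi = 0.1
E = np.diag([1, np.exp(1j * phi)])       # small phase error on |1>
alpha = pauli_coeffs(E)

# The decomposition reproduces E, and the squared moduli sum to 1 (E unitary):
assert np.allclose(sum(a * P for a, P in zip(alpha, paulis)), E)
assert abs(sum(abs(a)**2 for a in alpha) - 1) < 1e-12
```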
The fact that all one-qubit errors are linear combinations of I, X, Y, Z, together with the linearity
of quantum mechanics, implies that if we can correct bitflip-errors, phaseflip-errors, and their
product, then we can correct all possible unitary errors on a qubit.2 So typically, quantum error-
correcting codes are designed to correct bitflip and phaseflip-errors (their product is then typically
also correctable), and all other possible errors are then also handled without further work.
Our noise model does not explicitly consider errors on multiple qubits that are not a product
of errors on individual qubits. However, even such a joint error on, say, k qubits simultaneously
can still be written as a linear combination of products of k Pauli matrices. So also here the main
observation applies: if we can just correct bitflip and phaseflip-errors on individual qubits, then we
can correct all possible errors!
Detecting a bitflip-error. If a bitflip-error occurs on one of the first three qubits, we can detect its location by noting which of the 3 positions is the minority bit. We can do this for each of the three 3-qubit blocks. Hence there is a unitary that writes down in 4 ancilla qubits (which are all initially $|0\rangle$) a number $e_b \in \{0, 1, \ldots, 9\}$. Here $e_b = 0$ means that no bitflip-error was detected, and $e_b \in \{1, \ldots, 9\}$ means that a bitflip-error was detected on qubit number $e_b$. Note that we don't specify what should happen if more than one bitflip-error occurred.
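The classical core of this syndrome computation can be sketched as follows (our own illustration; `bitflip_syndrome` is a hypothetical helper that returns the 1-based position of a single detected flip, or 0):

```python
# Locating a bitflip in a 3-bit repetition block: a single flipped position
# is the one holding the minority bit (0 if all three positions agree).
def bitflip_syndrome(block):
    """Return 0 if no bitflip is detected, else the 1-based flipped position."""
    ones = sum(block)
    if ones in (0, 3):              # all bits agree: no error detected
        return 0
    minority = 1 if ones == 1 else 0
    return block.index(minority) + 1

assert bitflip_syndrome([0, 0, 0]) == 0
assert bitflip_syndrome([0, 1, 0]) == 2   # flip detected on position 2
assert bitflip_syndrome([1, 0, 1]) == 2
```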
Together, $e_b = i$ and $e_p = j'$ form the "error syndrome"; this tells us which error occurred where. The error-correction procedure can now measure this syndrome in the computational basis, and take corrective action depending on the classical outcomes $e_b$ and $e_p$: apply an $X$ to qubit $e_b$ (or no $X$ if $e_b = 0$), and apply a $Z$ to one qubit in the $e_p$-th block (or no $Z$ if $e_p = 0$). The case of a $Y$-error on the $i$th qubit corresponds to the case where $i = j$ (i.e., the $i$th qubit is hit by both a phaseflip and a bitflip); our procedure still works in this case. Hence we can perfectly correct one Pauli error on any one of the 9 codeword qubits.
As we argued before, the ability to correct Pauli-errors suffices to correct all possible errors. Let's see in more detail how this works. Consider for instance some 9-qubit unitary error $E$. Assume it can be decomposed as a linear combination of 9-qubit products of Paulis, each having at most one bitflip-error and one phaseflip-error:
$$E = \sum_{i,j} \alpha_{ij} X_i Z_j.$$
If we now add ancillas $|0^4\rangle|0^2\rangle$ and apply the above unitary $U$, then we go into a superposition of error syndromes:
$$U(E \otimes I^{\otimes 6})|0\rangle|0^4\rangle|0^2\rangle = \sum_{i,j} \alpha_{ij} X_i Z_j |0\rangle|i\rangle|j'\rangle.$$
3
Note that we are not discovering on which of the 9 qubits the phaseflip-error happened (in contrast to the case of bitflips), but that's OK: we can correct it nonetheless.
Measuring the ancillas will now probabilistically give us one of the syndromes $|i\rangle|j'\rangle$, and collapse the state to
$$X_i Z_j |0\rangle|i\rangle|j'\rangle.$$
In a way, this measurement of the syndrome "discretizes" the continuum of possible errors to the finite set of Pauli errors. Once the syndrome has been measured, we can apply a corrective $X$ and/or $Z$ to the first 9 qubits to undo the specific error corresponding to the specific syndrome we got as outcome of our measurement.
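The discretization argument can be simulated directly. The sketch below (our own, using the simpler 3-qubit bitflip code rather than the 9-qubit code) applies a continuous rotation error, projects onto the syndrome subspaces, and checks that each observed syndrome branch is corrected back to the encoded state exactly:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])

def on_qubit(gate, q, n=3):
    """Embed a 1-qubit gate on qubit q of an n-qubit register (qubit 0 leftmost)."""
    ops = [np.eye(2)] * n
    ops[q] = gate
    U = ops[0]
    for O in ops[1:]:
        U = np.kron(U, O)
    return U

# Encoded state alpha|000> + beta|111> of the 3-qubit bitflip code
alpha, beta = 0.6, 0.8
logical = np.zeros(8, dtype=complex)
logical[0b000], logical[0b111] = alpha, beta

# A *continuous* error: rotation exp(-i theta X/2) on qubit 1
theta = 0.3
Rx = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * X
noisy = on_qubit(Rx, 1) @ logical

# Syndrome measurement projects onto "no flip" / "flip on qubit q" subspaces;
# the collapsed state carries an exact Pauli error, undone by a corrective X.
subspaces = {None: [0b000, 0b111], 0: [0b100, 0b011],
             1: [0b010, 0b101], 2: [0b001, 0b110]}
overlaps = {}
for flipped, basis in subspaces.items():
    mask = np.zeros(8)
    mask[basis] = 1
    branch = noisy * mask                       # unnormalized measurement branch
    prob = np.linalg.norm(branch) ** 2
    if prob < 1e-12:
        continue
    branch /= np.linalg.norm(branch)            # collapse
    if flipped is not None:
        branch = on_qubit(X, flipped) @ branch  # corrective bitflip
    overlaps[flipped] = abs(np.vdot(logical, branch))   # 1.0: perfect recovery
```

Only two syndromes occur (no flip, or flip on qubit 1), and in both branches the recovered state has overlap 1 with the encoded state, up to a global phase.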
So now we can correct an error on one qubit. To achieve this, however, we have substantially
increased the number of locations where such an error could occur: the number of qubits has gone
from 1 to 9 (even to 15 if we count the ancilla as well), and we need a number of time-steps to
compute and measure the syndrome, and to correct a detected error. Hence this procedure only
gains us something if the error-rate p is so small that the probability of 2 or more errors on the
larger encoded system is smaller than the probability of 1 error in the unencoded qubit. We will
get back to this issue below, when talking about the threshold theorem. Note also that each new application of the correction-procedure needs a new, fresh 6-qubit ancilla initialized to $|0^4\rangle|0^2\rangle$. After one run of the error-correction procedure these ancillas will contain the measured error syndrome, and we can just discard them. In a way, error-correction acts like a refrigerator: a fridge pumps heat out of its system and dumps it into the environment, and error-correction pumps noise out of its system and dumps it into the environment in the form of the discarded ancilla qubits.
The above 9-qubit code is just one example of a quantum error-correcting code. Better codes
exist, and a lot of work has gone into simultaneously optimizing the different parameters: we
want to encode a large number of logical qubits into a not-much-larger number of physical qubits,
while being able to correct as many errors as possible. The shortest code that encodes one logical qubit and protects against one error uses five physical qubits. There are also asymptotically good
quantum error-correcting codes, which encode k logical qubits into O(k) physical qubits and can
correct errors on a constant fraction of the physical qubits (rather than just an error on one of the
qubits).
When designing schemes for fault-tolerant computing, it is very important to ensure that errors do not spread too quickly. Consider for instance a CNOT: if its control qubit is erroneous, then after doing the CNOT its target qubit will be erroneous as well. The trick is to keep this under control in such a way that regular phases of error-correction don't get overwhelmed by the errors. In addition, we need to be able to fault-tolerantly prepare states, and measure logical qubits in the computational basis. We won't go into the (many) further details of fault-tolerant quantum computing (see [35]).
Exercises
1. Let $E$ be an arbitrary 1-qubit unitary. We know that it can be written as
$$E = \alpha_0 I + \alpha_1 X + \alpha_2 Y + \alpha_3 Z,$$
for some complex coefficients $\alpha_i$. Show that $\sum_{i=0}^{3} |\alpha_i|^2 = 1$. Hint: Compute the trace $\mathrm{Tr}(E^*E)$ in two ways, and use the fact that $\mathrm{Tr}(AB) = 0$ if $A$ and $B$ are distinct Paulis, and $\mathrm{Tr}(AB) = \mathrm{Tr}(I) = 2$ if $A$ and $B$ are the same Pauli.
2. (a) Write the 1-qubit Hadamard transform $H$ as a linear combination of the four Pauli matrices.
(b) Suppose an $H$-error happens on the first qubit of $\alpha|0\rangle + \beta|1\rangle$ using the 9-qubit code. Give the various steps in the error-correction procedure that corrects this error.
3. Give a quantum circuit for the encoding of Shor's 9-qubit code, i.e., a circuit that maps $|0\rangle|0^8\rangle \mapsto |\overline{0}\rangle$ and $|1\rangle|0^8\rangle \mapsto |\overline{1}\rangle$. Explain why the circuit works.
4. Show that there cannot be a quantum code that encodes one logical qubit into $2k$ physical qubits while being able to correct errors on up to $k$ of the qubits. Hint: Proof by contradiction. Given an unknown qubit $\alpha|0\rangle + \beta|1\rangle$, encode it using this code. Split the $2k$ qubits into two sets of $k$ qubits, and use each to get a copy of the unknown qubit. Then invoke the no-cloning theorem.
Appendix A
In this appendix we sketch some useful parts of linear algebra, most of which will be used somewhere
or other in these notes.
The following four properties are equivalent:
1. $A$ is unitary
2. $\langle Av|Aw\rangle = \langle v|w\rangle$ for all $v, w$ ($A$ preserves inner products)
3. $\|Av\| = \|v\|$ for all $v$ ($A$ preserves norms)
4. $\|Av\| = 1$ if $\|v\| = 1$
(1) implies (2) because if $A$ is unitary then $A^*A = I$, and hence $\langle Av|Aw\rangle = (v^*A^*)Aw = \langle v|w\rangle$. (2) implies (1) as follows: if $A$ is not unitary then $A^*A \neq I$, so then there is a $w$ such that $A^*Aw \neq w$ and, hence, a $v$ such that $\langle v|w\rangle \neq \langle v|A^*Aw\rangle = \langle Av|Aw\rangle$, contradicting (2). Clearly (2) implies (3). Moreover, it is easy to show that (3) implies (2) using the following polarization identity (stated here for the real case; the complex case uses a four-term analogue):
$$\langle v|w\rangle = \frac{1}{4}\left(\|v+w\|^2 - \|v-w\|^2\right).$$
The equivalence of (3) and (4) is obvious. Note that by (4), the eigenvalues of a unitary matrix have absolute value 1.
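These equivalences are easy to check numerically for a random unitary (our own sketch, generating a unitary as the Q-factor of a QR decomposition):

```python
import numpy as np

# A random unitary via QR decomposition; check the listed properties numerically.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(M)                   # the Q-factor is unitary

v = rng.standard_normal(4)
w = rng.standard_normal(4)

assert np.allclose(U.conj().T @ U, np.eye(4))                # (1) U*U = I
assert np.isclose(np.vdot(U @ v, U @ w), np.vdot(v, w))      # (2) inner products
assert np.isclose(np.linalg.norm(U @ v), np.linalg.norm(v))  # (3) norms
assert np.allclose(np.abs(np.linalg.eigvals(U)), 1)          # eigenvalues on unit circle
```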
$T$ is diagonal. This shows that normal matrices are unitarily diagonalizable. Conversely, if $A$ is unitarily diagonalizable as $U^{-1}DU$ (with $U$ unitary and $D$ diagonal), then $AA^* = U^{-1}DD^*U = U^{-1}D^*DU = A^*A$, so then $A$ is normal. Thus a matrix is normal iff it is unitarily diagonalizable. If $A$ is not normal, it may still be diagonalizable via a non-unitary $S$, for example:
$$\underbrace{\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}}_{A} = \underbrace{\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}}_{S} \cdot \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}}_{D} \cdot \underbrace{\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}}_{S^{-1}}.$$
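A quick numerical check of this example (our own sketch):

```python
import numpy as np

A = np.array([[1, 1], [0, 2]])
S = np.array([[1, 1], [0, 1]])
D = np.diag([1, 2])
S_inv = np.array([[1, -1], [0, 1]])

assert np.allclose(S @ D @ S_inv, A)        # A is diagonalizable ...
assert not np.allclose(A @ A.T, A.T @ A)    # ... but not normal,
assert not np.allclose(S @ S.T, np.eye(2))  # and S is indeed not unitary
```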
A.4 Trace
The trace of a matrix $A$ is the sum of its diagonal entries: $\mathrm{Tr}(A) = \sum_i A_{ii}$. Some important and easily verified properties of $\mathrm{Tr}(A)$ are:
• $\mathrm{Tr}(A + B) = \mathrm{Tr}(A) + \mathrm{Tr}(B)$
• $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$
• $\mathrm{Tr}(A)$ is the sum of the eigenvalues of $A$
(This follows from the Schur decomposition $A = U^{-1}TU$ and the previous item: $\mathrm{Tr}(A) = \mathrm{Tr}(U^{-1}TU) = \mathrm{Tr}(UU^{-1}T) = \mathrm{Tr}(T) = \sum_i \lambda_i$.)
• c(A ⊗ B) = (cA) ⊗ B = A ⊗ (cB) for all scalars c
• A ⊗ (B + C) = (A ⊗ B) + (A ⊗ C)
• A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C
Different vector spaces can also be combined using tensor products. If $V$ and $V'$ are vector spaces of dimension $d$ and $d'$ with bases $\{v_1, \ldots, v_d\}$ and $\{v_1', \ldots, v_{d'}'\}$, respectively, then their tensor product space is the $d \cdot d'$-dimensional space $W = V \otimes V'$ spanned by $\{v_i \otimes v_j' \mid 1 \le i \le d, 1 \le j \le d'\}$. Applying a linear operation $A$ to $V$ and $B$ to $V'$ corresponds to applying the tensor product $A \otimes B$ to the tensor product space $W$.
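The rule that $A \otimes B$ acts factor-wise, $(A \otimes B)(v \otimes w) = (Av) \otimes (Bw)$, can be checked numerically (our own sketch using numpy's `kron`):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((3, 3))
v = rng.standard_normal(2)
w = rng.standard_normal(3)

# (A ⊗ B)(v ⊗ w) = (Av) ⊗ (Bw): the operators act independently on the factors.
lhs = np.kron(A, B) @ np.kron(v, w)
rhs = np.kron(A @ v, B @ w)
assert np.allclose(lhs, rhs)

# Associativity from the list above:
C = rng.standard_normal((2, 2))
assert np.allclose(np.kron(A, np.kron(B, C)), np.kron(np.kron(A, B), C))
```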
A.6 Rank
The rank of a matrix A (over a field F) is the size of the largest linearly independent set of rows of
A (linear independence taken over F). Unless mentioned otherwise, we take F to be the field of real
numbers. We say that A has full rank if its rank equals its dimension. The following properties
are all easy to show:
• $\mathrm{rank}(A) = \mathrm{rank}(A^*)$
• $\langle v|w\rangle = \langle v|\,|w\rangle$: the inner product of $|v\rangle$ and $|w\rangle$ equals the bra $\langle v|$ applied to the ket $|w\rangle$
• If $A$ is unitarily diagonalizable, then $A = \sum_i \lambda_i |v_i\rangle\langle v_i|$ for some orthonormal set of eigenvectors $\{v_i\}$
Appendix B
Here we collect various identities and other useful mathematical facts needed in parts of the lecture
notes.
Equivalently, written in terms of inner products and norms of vectors: $|\langle a, b\rangle| \le \|a\| \cdot \|b\|$. It is proved as follows: for every real $\lambda$ we have $0 \le \langle a - \lambda b, a - \lambda b\rangle = \|a\|^2 + \lambda^2\|b\|^2 - 2\lambda\langle a, b\rangle$. Now set $\lambda = \|a\|/\|b\|$ and rearrange.
• A complex number is of the form $c = a + bi$, where $a, b \in \mathbb{R}$, and $i$ is the imaginary unit, which satisfies $i^2 = -1$. Such a $c$ can also be written as $c = re^{i\phi}$, where $r = |c| = \sqrt{a^2 + b^2}$ is the magnitude of $c$, and $\phi \in [0, 2\pi)$ is the angle that $c$ makes with the positive horizontal axis when we view it as a point $(a, b)$ in the plane. Note that complex numbers of magnitude 1 lie on the unit circle in this plane. We can also write those as $e^{i\phi} = \cos(\phi) + i\sin(\phi)$. The complex conjugate $c^*$ is $a - ib$, equivalently $c^* = re^{-i\phi}$.
• $\displaystyle\sum_{j=0}^{m-1} a^j = \begin{cases} m & \text{if } a = 1 \\[4pt] \dfrac{1-a^m}{1-a} & \text{if } a \neq 1 \end{cases}$

The case $a = 1$ is obvious; the case $a \neq 1$ may be seen by observing that $(1-a)\sum_{j=0}^{m-1} a^j = \sum_{j=0}^{m-1} a^j - \sum_{j=1}^{m} a^j = 1 - a^m$. For example, if $a = e^{2\pi i r/N}$ is a root of unity, with $r$ an integer in $\{1, \ldots, N-1\}$, then $\sum_{j=0}^{N-1} a^j = \frac{1 - e^{2\pi i r}}{1 - e^{2\pi i r/N}}$.
• The above ratio can be rewritten using the identity $|1 - e^{i\theta}| = 2|\sin(\theta/2)|$; this identity can be seen by drawing the numbers 1 and $e^{i\theta}$ as vectors from the origin in the complex plane, and dividing their angle $\theta$ in two. Some other useful trigonometric identities: $\cos(\theta)^2 + \sin(\theta)^2 = 1$, $\sin(2\theta) = 2\sin(\theta)\cos(\theta)$.
• If $\varepsilon_j \in [0, 1]$ then $\displaystyle 1 - \sum_{j=1}^{k} \varepsilon_j \le \prod_{j=1}^{k} (1 - \varepsilon_j) \le e^{-\sum_{j=1}^{k} \varepsilon_j}$.

The upper bound comes from the preceding item. The lower bound follows easily by induction, using the fact that $(1 - \varepsilon_1)(1 - \varepsilon_2) = 1 - \varepsilon_1 - \varepsilon_2 + \varepsilon_1\varepsilon_2 \ge 1 - \varepsilon_1 - \varepsilon_2$.
• When we don't care about constant factors, we'll often use big-Oh notation: $T(n) = O(f(n))$ means there exist constants $c, d \ge 0$ such that for all integers $n$ we have $T(n) \le cf(n) + d$. Similarly, big-Omega notation is used for lower bounds: $T(n) = \Omega(f(n))$ means there exist positive constants $c$ and $d$ such that $T(n) \ge cf(n) - d$ for all $n$.
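Several of the identities above are easy to sanity-check numerically; a small sketch (our own) for the geometric series and the root-of-unity case:

```python
import cmath

# Geometric-series formula, checked against a direct sum, plus the
# root-of-unity special case a = e^{2*pi*i*r/N} with integer 0 < r < N.
def geom_sum(a, m):
    return m if a == 1 else (1 - a**m) / (1 - a)

assert geom_sum(1, 5) == 5
assert abs(geom_sum(0.5, 10) - sum(0.5**j for j in range(10))) < 1e-12

N, r = 8, 3
a = cmath.exp(2j * cmath.pi * r / N)
assert abs(sum(a**j for j in range(N))) < 1e-9   # roots of unity sum to 0
```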
Bibliography
[1] S. Aaronson and A. Ambainis. Quantum search of spatial regions. In Proceedings of 44th IEEE
FOCS, pages 200–209, 2003. quant-ph/0303041.
[2] S. Aaronson and Y. Shi. Quantum lower bounds for the collision and the element distinctness
problems. Journal of the ACM, 51(4):595–605, 2004.
[3] D. Aharonov and M. Ben-Or. Fault tolerant quantum computation with constant error. In
Proceedings of 29th ACM STOC, pages 176–188, 1997. quant-ph/9611025.
[5] A. Ambainis. Quantum lower bounds by quantum arguments. Journal of Computer and
System Sciences, 64(4):750–767, 2002. Earlier version in STOC’00. quant-ph/0002066.
[6] A. Ambainis. Polynomial degree vs. quantum query complexity. In Proceedings of 44th IEEE
FOCS, pages 230–239, 2003. quant-ph/0305028.
[7] A. Ambainis. Quantum walk algorithm for element distinctness. In Proceedings of 45th IEEE
FOCS, pages 22–31, 2004. quant-ph/0311001.
[8] P. K. Aravind. A simple demonstration of Bell’s theorem involving two observers and no
probabilities or inequalities. quant-ph/0206070, 2002.
[9] R. Beals, H. Buhrman, R. Cleve, M. Mosca, and R. de Wolf. Quantum lower bounds by
polynomials. Journal of the ACM, 48(4):778–797, 2001. Earlier version in FOCS’98. quant-ph/9802049.
[13] C. H. Bennett and G. Brassard. Quantum cryptography: Public key distribution and coin
tossing. In Proceedings of the IEEE International Conference on Computers, Systems and
Signal Processing, pages 175–179, 1984.
[14] E. Bernstein and U. Vazirani. Quantum complexity theory. SIAM Journal on Computing,
26(5):1411–1473, 1997. Earlier version in STOC’93.
[15] M. Boyer, G. Brassard, P. Høyer, and A. Tapp. Tight bounds on quantum searching.
Fortschritte der Physik, 46(4–5):493–505, 1998. Earlier version in Physcomp’96. quant-ph/9605034.
[16] G. Brassard, R. Cleve, and A. Tapp. The cost of exactly simulating quantum entanglement with
classical communication. Physical Review Letters, 83(9):1874–1877, 1999. quant-ph/9901035.
[17] H. Buhrman, R. Cleve, S. Massar, and R. de Wolf. Non-locality and communication complexity.
Reviews of Modern Physics, 82:665–698, 2010. arXiv:0907.3584.
[18] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf. Quantum fingerprinting. Physical Review
Letters, 87(16), September 26, 2001. quant-ph/0102001.
[19] H. Buhrman, R. Cleve, and A. Wigderson. Quantum vs. classical communication and computation. In Proceedings of 30th ACM STOC, pages 63–68, 1998. quant-ph/9802040.
[20] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt. Proposed experiment to test local
hidden-variable theories. Physical Review Letters, 23(15):880–884, 1969.
[21] R. Cleve. The query complexity of order-finding. In Proceedings of 15th IEEE Conference on
Computational Complexity, pages 54–59, 2000. quant-ph/9911124.
[22] R. Cleve and H. Buhrman. Substituting quantum entanglement for communication. Physical
Review A, 56(2):1201–1204, 1997. quant-ph/9704026.
[23] R. Cleve, W. van Dam, M. Nielsen, and A. Tapp. Quantum entanglement and the communication complexity of the inner product function. In Proceedings of 1st NASA QCQC conference, volume 1509 of Lecture Notes in Computer Science, pages 61–74. Springer, 1998. quant-ph/9708019.
[24] R. Cleve, A. Ekert, C. Macchiavello, and M. Mosca. Quantum algorithms revisited. In Proceedings of the Royal Society of London, volume A454, pages 339–354, 1998. quant-ph/9708016.
[25] D. Deutsch. Quantum theory, the Church-Turing principle, and the universal quantum Turing
machine. In Proceedings of the Royal Society of London, volume A400, pages 97–117, 1985.
[26] D. Deutsch. Quantum computational networks. In Proceedings of the Royal Society of London,
volume A425, 1989.
[27] D. Deutsch and R. Jozsa. Rapid solution of problems by quantum computation. In Proceedings
of the Royal Society of London, volume A439, pages 553–558, 1992.
[28] A. Drucker and R. de Wolf. Quantum proofs for classical theorems. Theory of Computing,
2011. ToC Library, Graduate Surveys 2.
[30] P. van Emde Boas. Machine models and simulations. In van Leeuwen [68], pages 1–66.
[31] R. Feynman. Simulating physics with computers. International Journal of Theoretical Physics,
21(6/7):467–488, 1982.
[34] P. Frankl and V. Rödl. Forbidden intersections. Transactions of the American Mathematical
Society, 300(1):259–286, 1987.
[35] D. Gottesman. An introduction to quantum error correction and fault-tolerant quantum com-
putation. arXiv:0904.2557, 16 Apr 2009.
[36] L. K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of
28th ACM STOC, pages 212–219, 1996. quant-ph/9605043.
[37] G. H. Hardy and E. M. Wright. An Introduction to the Theory of Numbers. Oxford University
Press, New York, fifth edition, 1979.
[38] A. S. Holevo. Bounds for the quantity of information transmitted by a quantum communication
channel. Problemy Peredachi Informatsii, 9(3):3–11, 1973. English translation in Problems of
Information Transmission, 9:177–183, 1973.
[39] P. Høyer, T. Lee, and R. Špalek. Negative weights make adversaries stronger. In Proceedings
of 39th ACM STOC, pages 526–535, 2007. quant-ph/0611054.
[40] P. Høyer and R. Špalek. Lower bounds on quantum query complexity. Bulletin of the EATCS,
87:78–103, October 2005.
[41] J. Katz and L. Trevisan. On the efficiency of local decoding procedures for error-correcting
codes. In Proceedings of 32nd ACM STOC, pages 80–86, 2000.
[42] I. Kerenidis and R. de Wolf. Exponential lower bound for 2-query locally decodable codes via
a quantum argument. Journal of Computer and System Sciences, 69(3):395–420, 2004. Earlier
version in STOC’03. quant-ph/0208062.
[43] A. Yu. Kitaev. Quantum measurements and the Abelian stabilizer problem. quant-ph/9511026,
12 Nov 1995.
[44] B. Klartag and O. Regev. Quantum one-way communication is exponentially stronger than
classical communication. In Proceedings of 43rd ACM STOC, 2011. arXiv:1009.3640.
[45] E. Knill, R. Laflamme, and W. Zurek. Threshold accuracy for quantum computation. quant-ph/9610011, 15 Oct 1996.
[47] A. K. Lenstra and H. W. Lenstra, Jr. The Development of the Number Field Sieve, volume
1554 of Lecture Notes in Mathematics. Springer, 1993.
[48] H. W. Lenstra and C. Pomerance. A rigorous time bound for factoring integers. Journal of
the American Mathematical Society, 5:483–516, 1992.
[49] H-K. Lo and H. F. Chau. Unconditional security of quantum key distribution over arbitrarily
long distances. quant-ph/9803006, 3 Mar 1998.
[50] F. Magniez, M. Santha, and M. Szegedy. Quantum algorithms for the triangle problem. In
Proceedings of 16th ACM-SIAM SODA, pages 1109–1117, 2005. quant-ph/0310134.
[52] M. Mosca and A. Ekert. The hidden subgroup problem and eigenvalue estimation on a quantum
computer. In Proceedings of 1st NASA QCQC conference, volume 1509 of Lecture Notes in
Computer Science, pages 174–188. Springer, 1998. quant-ph/9903071.
[53] A. Nayak. Optimal lower bounds for quantum automata and random access codes. In Proceedings of 40th IEEE FOCS, pages 369–376, 1999. quant-ph/9904093.
[54] I. Newman. Private vs. common random bits in communication complexity. Information
Processing Letters, 39(2):67–71, 1991.
[55] I. Newman and M. Szegedy. Public vs. private coin flips in one round communication games.
In Proceedings of 28th ACM STOC, pages 561–570, 1996.
[56] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cambridge
University Press, 2000.
[59] B. Reichardt. Span programs and quantum query complexity: The general adversary bound
is nearly tight for every Boolean function. In Proceedings of 50th IEEE FOCS, pages 544–551,
2009.
[60] R. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and public
key cryptosystems. Communications of the ACM, 21:120–126, 1978.
[62] M. Santha. Quantum walk based search algorithms. In Proceedings of 5th TAMC, pages 31–46,
2008. arXiv:0808.0059.
[63] P. W. Shor. Scheme for reducing decoherence in quantum memory. Physical Review A, 52:2493,
1995.
[64] P. W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a
quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997. Earlier version in
FOCS’94. quant-ph/9508027.
[65] D. Simon. On the power of quantum computation. SIAM Journal on Computing, 26(5):1474–
1483, 1997. Earlier version in FOCS’94.
[67] B. S. Tsirelson. Quantum analogues of the Bell inequalities. The case of two spatially separated domains. Journal of Soviet Mathematics, 36:557–570, 1987.
[68] J. van Leeuwen, editor. Handbook of Theoretical Computer Science. Volume A: Algorithms
and Complexity. MIT Press, Cambridge, MA, 1990.
[71] R. de Wolf. Quantum Computing and Communication Complexity. PhD thesis, University of
Amsterdam, 2001.
[72] W. K. Wootters and W. H. Zurek. A single quantum cannot be copied. Nature, 299:802–803,
1982.
[73] A. C-C. Yao. Quantum circuit complexity. In Proceedings of 34th IEEE FOCS, pages 352–360,
1993.