
Quantum circuits


Introduction

This lesson introduces the quantum circuit model of computation, which is a standard description of quantum computations that we'll use throughout this series.

We'll also introduce a few important mathematical concepts, including inner products between vectors, the notions of orthogonality and orthonormality, and projections and projective measurements, which generalize standard basis measurements. Through these concepts, we'll derive fundamental limitations on quantum information, including the no-cloning theorem and the impossibility of perfectly discriminating non-orthogonal quantum states.

Circuits

In computer science, circuits are models of computation in which information is carried by wires through a network of gates, which represent operations that transform the information carried by the wires. Quantum circuits are just one example of a model of computation based on this more general concept.

Although the word "circuit" often refers to a circular path, circular paths aren't actually allowed in the most common circuit models of computation. That is to say, we usually study acyclic circuits when we're thinking about circuits as computational models. Quantum circuits follow this pattern; although we could run a quantum circuit as many times as we like, a quantum circuit itself represents a finite sequence of operations that cannot contain feedback loops.

Boolean circuits

Here is an example of a (classical) Boolean circuit, where the wires carry binary values and the gates represent Boolean logic operations:

Example of a Boolean circuit

The flow of information along the wires goes from left to right: the wires on the left-hand side of the figure labeled \mathsf{X} and \mathsf{Y} are input bits, which can each be set to whatever binary value we choose, and the wire on the right-hand side is the output. The intermediate wires take whatever values are determined by the gates, which are evaluated from left to right.

The gates are AND gates (labeled \wedge), OR gates (labeled \vee), and NOT gates (labeled \neg). The functions computed by these gates will likely be familiar to many readers, but here they are represented by tables of values:

\begin{array}{c|c} a & \neg a\\ \hline 0 & 1\\ 1 & 0 \end{array} \qquad \begin{array}{c|c} ab & a \wedge b\\ \hline 00 & 0\\ 01 & 0\\ 10 & 0\\ 11 & 1 \end{array} \qquad \begin{array}{c|c} ab & a \vee b\\ \hline 00 & 0\\ 01 & 1\\ 10 & 1\\ 11 & 1 \end{array}

The two small circles on the wires just to the right of the names \mathsf{X} and \mathsf{Y} represent fanout operations, which simply create a copy of whatever value is carried on the wire on which they appear, so that this value can be input into multiple gates. Fanout operations are not always considered to be gates in the classical setting — sometimes they are treated as if they are "free" in some sense — but when we discuss how ordinary Boolean circuits can be converted into equivalent quantum circuits, we must classify fanout operations explicitly as gates and account for them correctly.

Here is the same circuit illustrated in a style more common in electrical engineering, which uses conventional symbols for the AND, OR, and NOT gates:

Boolean circuit in a classic style

We won't use this style or these particular gate symbols further, but we do use different symbols to represent gates in quantum circuits, which we'll explain as we encounter them.

The particular circuit in this example computes the exclusive-OR (or XOR for short), which is denoted by the symbol \oplus:

\begin{array}{c|c} ab & a \oplus b\\ \hline 00 & 0\\ 01 & 1\\ 10 & 1\\ 11 & 0 \end{array}

In the next diagram we consider just one choice for the inputs: \mathsf{X}=0 and \mathsf{Y}=1. Each wire is labeled by the value it carries so you can follow the operations. The output value is 1 in this case, which is the correct value for the XOR: 0 \oplus 1 = 1.

Evaluating a Boolean circuit

You can check the other three possible input settings in a similar way.
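Rather than checking the remaining input settings by hand, we can evaluate the gate logic in code. The sketch below implements one standard AND/OR/NOT realization of XOR, which may differ from the exact wiring in the figure but computes the same function:

```python
def xor_circuit(x, y):
    # One standard AND/OR/NOT realization of XOR (the figure's wiring may differ):
    # x XOR y = (x AND NOT y) OR (NOT x AND y)
    return (x and not y) or (not x and y)

# Check all four input settings against the XOR table
for a in (0, 1):
    for b in (0, 1):
        print(a, b, int(xor_circuit(a, b)))
```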

Other types of circuits

As suggested above, the notion of a circuit in computer science is very general. For example, circuits whose wires carry values other than 0 and 1 are sometimes studied, as are gates representing different choices of operations.

In arithmetic circuits, for instance, the wires may carry integer values and the gates may represent arithmetic operations, such as addition and multiplication. The following figure depicts an arithmetic circuit that takes two variable input values (x and y) as well as a third input set to the value 1. The values carried by the wires, as functions of x and y, are shown in the figure.

Example arithmetic circuit

We can also consider circuits that incorporate randomness, such as ones where gates represent probabilistic operations.

Quantum circuits

In the quantum circuit model, wires represent qubits and gates represent operations acting on these qubits. We'll focus for now on operations we've encountered so far, namely unitary operations and standard basis measurements. As we learn about other sorts of quantum operations and measurements, we'll enhance our model accordingly.

Here's a simple example of a quantum circuit:

Simple quantum circuit

In this circuit, we have a single qubit named \mathsf{X}, which is represented by the horizontal line, and a sequence of gates representing unitary operations on this qubit. Just like in the examples above, the flow of information goes from left to right — so the first operation performed is a Hadamard, the second is an S operation, the third is another Hadamard, and the final operation is a T operation. Applying the entire circuit therefore applies the composition of these operations, THSH, to the qubit \mathsf{X}.

Sometimes we wish to explicitly indicate the input or output states to a circuit. For example, if we apply the operation THSH to the state \vert 0\rangle, we obtain the state \frac{1+i}{2}\vert 0\rangle + \frac{1}{\sqrt{2}} \vert 1 \rangle. We may indicate this as follows:

Simple quantum circuit evaluated

Quantum circuits often have all qubits initialized to \vert 0\rangle, as we have in this case, but there are also cases where we wish to set the input qubits to different states.
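The output state claimed above can be verified numerically. Here is a minimal NumPy sketch (NumPy isn't otherwise needed for this circuit) that multiplies the standard matrices for H, S, and T against \vert 0\rangle:

```python
import numpy as np

# Standard single-qubit gate matrices
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]])
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])

ket0 = np.array([1, 0])

# The circuit applies H, then S, then H, then T, i.e. the product THSH
state = T @ H @ S @ H @ ket0

expected = np.array([(1 + 1j) / 2, 1 / np.sqrt(2)])
print(np.allclose(state, expected))  # True
```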

Now let's see how we can specify this circuit in Qiskit, beginning with the imports needed for the current section.


To begin, we can create the circuit as follows, by sequentially adding gates from left to right.


The default names for qubits in Qiskit are \mathsf{q_0}, \mathsf{q_1}, \mathsf{q_2}, etc., and when there is just a single qubit like in our example, the default name is \mathsf{q} rather than \mathsf{q_0}. If we wish to choose our own name we can do this using the QuantumRegister class like this:


Here is another example of a quantum circuit, this time with two qubits:

Quantum circuit that creates an ebit

As always, the gate labeled H refers to a Hadamard operation, while the second gate is a two-qubit gate: it's the controlled-NOT operation, where the solid circle represents the control qubit and the circle resembling the symbol \oplus denotes the target qubit.

Before examining this circuit in greater detail and explaining what it does, we should clarify how qubits are ordered in quantum circuits, which connects with the convention that Qiskit uses for naming and ordering systems, mentioned briefly in the previous lesson.

Qiskit's qubit ordering convention for circuits

In Qiskit, the topmost qubit in a circuit diagram has index 0 and corresponds to the rightmost position in a tuple of qubits (or in a string, Cartesian product, or tensor product corresponding to this tuple). The second-from-top qubit has index 1, and corresponds to the position second-from-right in a tuple, and so on, down to the bottommost qubit, which has the highest index, and corresponds to the leftmost position in a tuple. In particular, Qiskit's default names for the qubits in an n-qubit circuit are represented by the n-tuple (\mathsf{q_{n-1}},\ldots,\mathsf{q_{0}}), with \mathsf{q_{0}} being the qubit on the top and \mathsf{q_{n-1}} on the bottom in quantum circuit diagrams.

Further information on this ordering convention can be found on the Bit-ordering in Qiskit documentation page.

Although we will often deviate from the specific default names \mathsf{q_{n-1}},\ldots,\mathsf{q_{0}} used for qubits by Qiskit, we will follow this ordering convention throughout this course for interpreting circuit diagrams.

Thus, our interpretation of the circuit above is that it describes an operation on a pair of qubits (\mathsf{X},\mathsf{Y}) — and if the input to the circuit is a quantum state \vert \psi\rangle \vert \phi\rangle, then this means that the lower qubit \mathsf{X} starts in the state \vert \psi\rangle and the upper qubit \mathsf{Y} starts in the state \vert \phi\rangle. To understand what the circuit does, we can go from left to right through its operations.

  1. The first operation is a Hadamard operation on \mathsf{Y}:

    First operation e-bit creator

    When applying a gate to a single qubit like this, nothing happens to the other qubits — and nothing happening is equivalent to the identity operation being performed. In our circuit there is just one other qubit, \mathsf{X}, so the dotted rectangle in the figure above represents this operation:

    \mathbb{I}\otimes H = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}.

    Note that the identity matrix is on the left of the tensor product and H is on the right, which is consistent with Qiskit's ordering convention.

  2. The second operation is the controlled-NOT operation, where \mathsf{Y} is the control and \mathsf{X} is the target:

    Second operation e-bit creator

    The controlled-NOT gate's action on standard basis states is as follows:

    Controlled-NOT gate

    Given that we order the qubits as (\mathsf{X}, \mathsf{Y}), the matrix representation of the controlled-NOT gate is this:

    \begin{pmatrix} 1 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 1\\[2mm] 0 & 0 & 1 & 0\\[2mm] 0 & 1 & 0 & 0 \end{pmatrix}.

The unitary operation of the entire circuit, which we'll call U, is the composition of the operations:

U = \begin{pmatrix} 1 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 1\\[2mm] 0 & 0 & 1 & 0\\[2mm] 0 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \end{pmatrix}.

In particular, recalling our notation for the Bell states,

\begin{aligned} \vert \phi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 0 0 \rangle + \frac{1}{\sqrt{2}} \vert 1 1 \rangle \\[2mm] \vert \phi^- \rangle & = \frac{1}{\sqrt{2}} \vert 0 0 \rangle - \frac{1}{\sqrt{2}} \vert 1 1 \rangle \\[2mm] \vert \psi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 0 1 \rangle + \frac{1}{\sqrt{2}} \vert 1 0 \rangle \\[2mm] \vert \psi^- \rangle & = \frac{1}{\sqrt{2}} \vert 0 1 \rangle - \frac{1}{\sqrt{2}} \vert 1 0 \rangle, \end{aligned}

we get that

\begin{aligned} U \vert 00\rangle & = \vert \phi^+\rangle\\ U \vert 01\rangle & = \vert \phi^-\rangle\\ U \vert 10\rangle & = \vert \psi^+\rangle\\ U \vert 11\rangle & = -\vert \psi^-\rangle. \end{aligned}

So, this circuit gives us a way to create the state \vert\phi^+\rangle if we run it on two qubits initialized to \vert 00\rangle. More generally, it gives us a way to convert the standard basis to the Bell basis. (The -1 phase factor on the last state, -\vert \psi^-\rangle, could be eliminated if we wanted, but this would require a change to the circuit. For instance, we could add a controlled-Z gate at the beginning or a swap gate at the end to eliminate the minus sign without affecting the circuit's action on the other three standard basis states.)
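The matrix computation and the action on the standard basis can both be checked numerically. This NumPy sketch builds U exactly as in the derivation above and compares its columns against the Bell states:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)

# CNOT with qubit ordering (X, Y): Y (rightmost) is the control, X the target
CNOT = np.array([[1, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0]])

U = CNOT @ np.kron(I, H)

s = 1 / np.sqrt(2)
phi_plus  = np.array([s, 0, 0, s])
phi_minus = np.array([s, 0, 0, -s])
psi_plus  = np.array([0, s, s, 0])
psi_minus = np.array([0, s, -s, 0])

# The columns of U are its action on the standard basis states
print(np.allclose(U[:, 0], phi_plus))    # U|00> = |phi+>
print(np.allclose(U[:, 1], phi_minus))   # U|01> = |phi->
print(np.allclose(U[:, 2], psi_plus))    # U|10> = |psi+>
print(np.allclose(U[:, 3], -psi_minus))  # U|11> = -|psi->
```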

In general, quantum circuits can contain any number of qubit wires. We may also include classical bit wires, which are indicated by double lines like in this example:

Example circuit with measurements

In this circuit we have a Hadamard gate and a controlled-NOT gate on two qubits \mathsf{X} and \mathsf{Y}, just like in the previous example. We also have two classical bits, \mathsf{A} and \mathsf{B}, as well as two measurement gates. The measurement gates represent standard basis measurements: the qubits are changed into their post-measurement states, while the measurement outcomes are overwritten onto the classical bits to which the arrows point.

Here is an implementation of this circuit using Qiskit:


The circuit can be simulated using the Sampler primitive.


Sometimes it is convenient to depict a measurement as a gate that takes a qubit as input and outputs a classical bit (as opposed to outputting the qubit in its post-measurement state and writing the result to a separate classical bit). This means the measured qubit has been discarded and can safely be ignored thereafter.

For example, the following circuit diagram represents the same process as the one in the previous diagram, but where we ignore \mathsf{X} and \mathsf{Y} after measuring them:

Example circuit with measurements compact

As the series continues, we will see many more examples of quantum circuits, which will usually be much more complicated than the simple examples above. Here are some symbols for common gates:

  • Single-qubit gates are generally shown as squares labeled by a letter indicating the operation, like this:

    Single-qubit gates

    Not gates (also known as X gates) are also sometimes denoted by a circle around a plus sign:

    Not gate

  • Swap gates are denoted as follows:

    Swap gate

    Controlled gates, meaning gates that describe controlled-unitary operations, are denoted by a filled-in circle (indicating the control) connected by a vertical line to whatever operation is being controlled. For instance, controlled-NOT gates, controlled-controlled-NOT (or Toffoli) gates, and controlled-swap (Fredkin) gates are denoted like this:

    Controlled gate

  • Arbitrary unitary operations on multiple qubits may be viewed as gates. They are depicted by rectangles labeled by the name of the unitary operation. For instance, here is a depiction of an (unspecified) unitary operation U as a gate, along with a controlled version of this gate:

    Arbitrary unitary gate together with controlled version

Inner products, orthonormality, and projections

To better prepare ourselves to explore the capabilities and limitations of quantum circuits, we now introduce some additional mathematical concepts — namely the inner product between vectors (and its connection to the Euclidean norm), the notions of orthogonality and orthonormality for sets of vectors, and projection matrices, which will allow us to introduce a handy generalization of standard basis measurements.

Inner products

Recall from Lesson 1 that when we use the Dirac notation to refer to an arbitrary column vector as a ket, such as

\vert \psi \rangle = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n \end{pmatrix},

the corresponding bra vector is the conjugate transpose of this vector:

\langle \psi \vert = \bigl(\vert \psi \rangle \bigr)^{\dagger} = \begin{pmatrix} \overline{\alpha_1} & \overline{\alpha_2} & \cdots & \overline{\alpha_n} \end{pmatrix}. \tag{1}

Alternatively, if we have some classical state set Σ\Sigma in mind, and we express a column vector as a ket, such as

\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle,

then the corresponding row (or bra) vector is the conjugate transpose

\langle \psi \vert = \sum_{a\in\Sigma} \overline{\alpha_a} \langle a \vert. \tag{2}

We also observed that the product of a bra vector and a ket vector, viewed as matrices either having a single row or a single column, results in a scalar. Specifically, if we have two (column) vectors

\vert \psi \rangle = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n \end{pmatrix} \quad\text{and}\quad \vert \phi \rangle = \begin{pmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_n \end{pmatrix},

so that the row vector \langle \psi \vert is as in equation (1), then

\langle \psi \vert \phi \rangle = \langle \psi \vert \vert \phi \rangle = \begin{pmatrix} \overline{\alpha_1} & \overline{\alpha_2} & \cdots & \overline{\alpha_n} \end{pmatrix} \begin{pmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_n \end{pmatrix} = \overline{\alpha_1} \beta_1 + \cdots + \overline{\alpha_n}\beta_n.

Alternatively, if we have two column vectors that we have written as

\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \phi \rangle = \sum_{b\in\Sigma} \beta_b \vert b \rangle,

so that \langle \psi \vert is the row vector (2), we find that

\begin{aligned} \langle \psi \vert \phi \rangle & = \langle \psi \vert \vert \phi \rangle\\ & = \Biggl(\sum_{a\in\Sigma} \overline{\alpha_a} \langle a \vert\Biggr) \Biggl(\sum_{b\in\Sigma} \beta_b \vert b\rangle\Biggr)\\ & = \sum_{a\in\Sigma}\sum_{b\in\Sigma} \overline{\alpha_a} \beta_b \langle a \vert b \rangle\\ & = \sum_{a\in\Sigma} \overline{\alpha_a} \beta_a, \end{aligned}

where the last equality follows from the observation that \langle a \vert a \rangle = 1 and \langle a \vert b \rangle = 0 for classical states a and b satisfying a\neq b.

The value \langle \psi \vert \phi \rangle is called the inner product between the vectors \vert \psi\rangle and \vert \phi \rangle. Inner products are critically important in quantum information and computation. We would not get far in understanding quantum information at a mathematical level without this fundamental notion.
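In code, the formula \sum_a \overline{\alpha_a}\beta_a is exactly what NumPy's vdot computes, since vdot conjugates its first argument. A small sketch with made-up example vectors:

```python
import numpy as np

# Example unit vectors (made up for illustration)
psi = np.array([1 + 1j, 2, 1j]) / np.sqrt(7)
phi = np.array([1, 1j, -1]) / np.sqrt(3)

# np.vdot conjugates its first argument, matching <psi|phi>
inner = np.vdot(psi, phi)
same = np.conj(psi) @ phi
print(np.isclose(inner, same))  # True
```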

Let us now collect together some basic facts about inner products of vectors.

  1. Relationship to the Euclidean norm. The inner product of any vector

    \vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle

    with itself is

    \langle \psi \vert \psi \rangle = \sum_{a\in\Sigma} \overline{\alpha_a} \alpha_a = \sum_{a\in\Sigma} \vert\alpha_a\vert^2 = \bigl\| \vert \psi \rangle \bigr\|^2.

    Thus, the Euclidean norm of a vector may alternatively be expressed as

    \bigl\| \vert \psi \rangle \bigr\| = \sqrt{ \langle \psi \vert \psi \rangle }.

    Notice that the Euclidean norm of a vector must always be a nonnegative real number. Moreover, the only way the Euclidean norm of a vector can be equal to zero is if every one of the entries is equal to zero, which is to say that the vector is the zero vector.

    We can summarize these observations like this: for every vector \vert \psi \rangle we have

    \langle \psi \vert \psi \rangle \geq 0,

    with \langle \psi \vert \psi \rangle = 0 if and only if \vert \psi \rangle = 0. This property of the inner product is sometimes referred to as positive definiteness.

  2. Conjugate symmetry. For any two vectors

    \vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \phi \rangle = \sum_{b\in\Sigma} \beta_b \vert b \rangle,

    we have

    \langle \psi \vert \phi \rangle = \sum_{a\in\Sigma} \overline{\alpha_a} \beta_a \quad\text{and}\quad \langle \phi \vert \psi \rangle = \sum_{a\in\Sigma} \overline{\beta_a} \alpha_a,

    and therefore

    \overline{\langle \psi \vert \phi \rangle} = \langle \phi \vert \psi \rangle.
  3. Linearity in the second argument (and conjugate linearity in the first). Let us suppose that \vert \psi \rangle, \vert \phi_1 \rangle, and \vert \phi_2 \rangle are vectors and \alpha_1 and \alpha_2 are complex numbers. If we define a new vector

    \vert \phi\rangle = \alpha_1 \vert \phi_1\rangle + \alpha_2 \vert \phi_2\rangle,

    then

    \langle \psi \vert \phi \rangle = \langle \psi \vert \bigl( \alpha_1\vert \phi_1 \rangle + \alpha_2\vert \phi_2 \rangle\bigr) = \alpha_1 \langle \psi \vert \phi_1 \rangle + \alpha_2 \langle \psi \vert \phi_2 \rangle.

    That is to say, the inner product is linear in the second argument. This can be verified either through the formulas above or simply by noting that matrix multiplication is linear in each argument (and specifically in the second argument).

    Combining this fact with conjugate symmetry reveals that the inner product is conjugate linear in the first argument. That is, if \vert \psi_1 \rangle, \vert \psi_2 \rangle, and \vert \phi \rangle are vectors and \alpha_1 and \alpha_2 are complex numbers, and we define

    \vert \psi \rangle = \alpha_1 \vert \psi_1\rangle + \alpha_2 \vert \psi_2 \rangle,

    then

    \langle \psi \vert \phi \rangle = \bigl( \overline{\alpha_1} \langle \psi_1 \vert + \overline{\alpha_2} \langle \psi_2 \vert \bigr) \vert\phi\rangle = \overline{\alpha_1} \langle \psi_1 \vert \phi \rangle + \overline{\alpha_2} \langle \psi_2 \vert \phi \rangle.
  4. The Cauchy–Schwarz inequality. For every choice of vectors \vert \phi \rangle and \vert \psi \rangle having the same number of entries, we have

    \bigl\vert \langle \psi \vert \phi \rangle\bigr\vert \leq \bigl\| \vert\psi \rangle \bigr\| \bigl\| \vert \phi \rangle \bigr\|.

    This is an incredibly handy inequality that gets used quite extensively in quantum information (and in many other fields of study).
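The four properties above can be sanity-checked numerically on randomly chosen complex vectors:

```python
import numpy as np

rng = np.random.default_rng(7)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)

# Positive definiteness: <psi|psi> = ||psi||^2 >= 0
assert np.isclose(np.vdot(psi, psi).real, np.linalg.norm(psi) ** 2)

# Conjugate symmetry: conj(<psi|phi>) = <phi|psi>
assert np.isclose(np.conj(np.vdot(psi, phi)), np.vdot(phi, psi))

# Linearity in the second argument
a1, a2 = 0.3 - 1j, 2 + 0.5j
combo = a1 * phi + a2 * psi
assert np.isclose(np.vdot(psi, combo),
                  a1 * np.vdot(psi, phi) + a2 * np.vdot(psi, psi))

# Cauchy-Schwarz: |<psi|phi>| <= ||psi|| ||phi||
assert abs(np.vdot(psi, phi)) <= np.linalg.norm(psi) * np.linalg.norm(phi)

print("all four properties hold")
```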

Orthogonal and orthonormal sets

Two vectors \vert \phi \rangle and \vert \psi \rangle are said to be orthogonal if their inner product is zero:

\langle \psi \vert \phi \rangle = 0.

Geometrically, we can think about orthogonal vectors as vectors at right angles to each other.

A set of vectors \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} is called an orthogonal set if every vector in the set is orthogonal to every other vector in the set. That is, this set is orthogonal if

\langle \psi_j \vert \psi_k\rangle = 0

for all choices of j,k\in\{1,\ldots,m\} for which j\neq k.

A set of vectors \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} is called an orthonormal set if it is an orthogonal set and, in addition, every vector in the set is a unit vector. Alternatively, this set is an orthonormal set if we have

\langle \psi_j \vert \psi_k\rangle = \begin{cases} 1 & j = k\\ 0 & j\neq k \end{cases} \tag{3}

for all choices of j,k\in\{1,\ldots,m\}.

Finally, a set \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} is an orthonormal basis if, in addition to being an orthonormal set, it forms a basis. This is equivalent to \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} being an orthonormal set and m being equal to the dimension of the space from which \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle are drawn.

For example, for any classical state set \Sigma, the set of all standard basis vectors

\bigl\{ \vert a \rangle \,:\, a\in\Sigma\bigr\}

is an orthonormal basis. The set \{\vert+\rangle,\vert-\rangle\} is an orthonormal basis for the 2-dimensional space corresponding to a single qubit, and the Bell basis \{\vert\phi^+\rangle, \vert\phi^-\rangle, \vert\psi^+\rangle, \vert\psi^-\rangle\} is an orthonormal basis for the 4-dimensional space corresponding to two qubits.
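As a check, the condition (3) for the Bell basis amounts to its Gram matrix of inner products being the identity, which we can compute directly:

```python
import numpy as np

s = 1 / np.sqrt(2)
bell = np.array([
    [s, 0, 0, s],   # |phi+>
    [s, 0, 0, -s],  # |phi->
    [0, s, s, 0],   # |psi+>
    [0, s, -s, 0],  # |psi->
])

# Gram matrix of pairwise inner products; identity iff the set is orthonormal
gram = np.conj(bell) @ bell.T
print(np.allclose(gram, np.eye(4)))  # True
```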

Extending orthonormal sets to orthonormal bases

Suppose that \vert\psi_1\rangle,\ldots,\vert\psi_m\rangle are vectors that live in an n-dimensional space, and assume moreover that \{\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle\} is an orthonormal set. Orthonormal sets are always linearly independent sets, so these vectors necessarily span a subspace of dimension m. From this we immediately conclude that m\leq n, because the dimension of the subspace spanned by these vectors cannot be larger than the dimension of the entire space from which they're drawn.

If it is the case that m<n, then it is always possible to choose an additional n-m vectors \vert \psi_{m+1}\rangle,\ldots,\vert\psi_n\rangle so that \{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\} forms an orthonormal basis. A procedure known as the Gram–Schmidt orthogonalization process can be used to construct these vectors.

Orthonormal sets and unitary matrices

Orthonormal sets of vectors are closely connected with unitary matrices. One way to express this connection is to say that the following three statements are logically equivalent (meaning that they are all true or all false) for any choice of a square matrix U:

  1. The matrix U is unitary (i.e., U^{\dagger} U = \mathbb{I} = U U^{\dagger}).
  2. The rows of U form an orthonormal set.
  3. The columns of U form an orthonormal set.

This equivalence is actually pretty straightforward when we think about how matrix multiplication and the conjugate transpose work. Suppose, for instance, that we have a 3\times 3 matrix like this:

U = \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\ \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix}

The conjugate transpose of U looks like this:

U^{\dagger} = \begin{pmatrix} \overline{\alpha_{1,1}} & \overline{\alpha_{2,1}} & \overline{\alpha_{3,1}} \\ \overline{\alpha_{1,2}} & \overline{\alpha_{2,2}} & \overline{\alpha_{3,2}} \\ \overline{\alpha_{1,3}} & \overline{\alpha_{2,3}} & \overline{\alpha_{3,3}} \end{pmatrix}

Multiplying the two matrices, with the conjugate transpose on the left-hand side, gives us this matrix:

\begin{aligned} &\begin{pmatrix} \overline{\alpha_{1,1}} & \overline{\alpha_{2,1}} & \overline{\alpha_{3,1}} \\ \overline{\alpha_{1,2}} & \overline{\alpha_{2,2}} & \overline{\alpha_{3,2}} \\ \overline{\alpha_{1,3}} & \overline{\alpha_{2,3}} & \overline{\alpha_{3,3}} \end{pmatrix} \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\ \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix}\\[2mm] \quad &= \begin{pmatrix} \overline{\alpha_{1,1}}\alpha_{1,1} + \overline{\alpha_{2,1}}\alpha_{2,1} + \overline{\alpha_{3,1}}\alpha_{3,1} & \overline{\alpha_{1,1}}\alpha_{1,2} + \overline{\alpha_{2,1}}\alpha_{2,2} + \overline{\alpha_{3,1}}\alpha_{3,2} & \overline{\alpha_{1,1}}\alpha_{1,3} + \overline{\alpha_{2,1}}\alpha_{2,3} + \overline{\alpha_{3,1}}\alpha_{3,3} \\[2mm] \overline{\alpha_{1,2}}\alpha_{1,1} + \overline{\alpha_{2,2}}\alpha_{2,1} + \overline{\alpha_{3,2}}\alpha_{3,1} & \overline{\alpha_{1,2}}\alpha_{1,2} + \overline{\alpha_{2,2}}\alpha_{2,2} + \overline{\alpha_{3,2}}\alpha_{3,2} & \overline{\alpha_{1,2}}\alpha_{1,3} + \overline{\alpha_{2,2}}\alpha_{2,3} + \overline{\alpha_{3,2}}\alpha_{3,3} \\[2mm] \overline{\alpha_{1,3}}\alpha_{1,1} + \overline{\alpha_{2,3}}\alpha_{2,1} + \overline{\alpha_{3,3}}\alpha_{3,1} & \overline{\alpha_{1,3}}\alpha_{1,2} + \overline{\alpha_{2,3}}\alpha_{2,2} + \overline{\alpha_{3,3}}\alpha_{3,2} & \overline{\alpha_{1,3}}\alpha_{1,3} + \overline{\alpha_{2,3}}\alpha_{2,3} + \overline{\alpha_{3,3}}\alpha_{3,3} \end{pmatrix} \end{aligned}

If we form three vectors from the columns of U,

\vert \psi_1\rangle = \begin{pmatrix} \alpha_{1,1}\\ \alpha_{2,1}\\ \alpha_{3,1} \end{pmatrix}, \quad \vert \psi_2\rangle = \begin{pmatrix} \alpha_{1,2}\\ \alpha_{2,2}\\ \alpha_{3,2} \end{pmatrix}, \quad \vert \psi_3\rangle = \begin{pmatrix} \alpha_{1,3}\\ \alpha_{2,3}\\ \alpha_{3,3} \end{pmatrix},

then we can alternatively express the product above as follows:

U^{\dagger} U = \begin{pmatrix} \langle \psi_1\vert \psi_1 \rangle & \langle \psi_1\vert \psi_2 \rangle & \langle \psi_1\vert \psi_3 \rangle \\ \langle \psi_2\vert \psi_1 \rangle & \langle \psi_2\vert \psi_2 \rangle & \langle \psi_2\vert \psi_3 \rangle \\ \langle \psi_3\vert \psi_1 \rangle & \langle \psi_3\vert \psi_2 \rangle & \langle \psi_3\vert \psi_3 \rangle \end{pmatrix}

Referring to equation (3), we now see that the condition that this matrix is equal to the identity matrix is equivalent to the orthonormality of the set \{\vert\psi_1\rangle,\vert\psi_2\rangle,\vert\psi_3\rangle\}.

This argument generalizes to unitary matrices of any size. The fact that the rows of a matrix form an orthonormal basis if and only if the matrix is unitary then follows from the fact that a matrix is unitary if and only if its transpose is unitary.

Given the equivalence described above, together with the fact that every orthonormal set can be extended to form an orthonormal basis, we conclude the following useful fact: given any orthonormal set of vectors \{\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle\} drawn from an n-dimensional space, there exists a unitary matrix U whose first m columns are the vectors \vert\psi_1\rangle,\ldots,\vert\psi_m\rangle. Pictorially, we can always find a unitary matrix having this form:

U = \left( \begin{array}{ccccccc} \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt}\\ \vert\psi_1\rangle & \vert\psi_2\rangle & \cdots & \vert\psi_m\rangle & \vert\psi_{m+1}\rangle & \cdots & \vert\psi_n\rangle\\ \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} \end{array} \right).

Here, the last n-m columns are filled in with any choice of vectors \vert\psi_{m+1}\rangle,\ldots,\vert\psi_n\rangle that make \{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\} an orthonormal basis.
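This completion can be carried out in code by the Gram–Schmidt process mentioned earlier. The sketch below (the helper `complete_to_unitary` is my own, not a library function) extends the orthonormal pair \{\vert\phi^+\rangle, \vert\phi^-\rangle\} to a full unitary:

```python
import numpy as np

def complete_to_unitary(vectors):
    """Extend an orthonormal set (given as columns) to a unitary matrix
    via Gram-Schmidt, using standard basis vectors as candidates."""
    n, m = vectors.shape
    cols = [vectors[:, k] for k in range(m)]
    for e in np.eye(n):
        v = e.astype(complex)
        for c in cols:
            v = v - np.vdot(c, v) * c  # remove the component along c
        norm = np.linalg.norm(v)
        if norm > 1e-10:  # skip candidates already in the span
            cols.append(v / norm)
    return np.column_stack(cols[:n])

s = 1 / np.sqrt(2)
pair = np.column_stack([[s, 0, 0, s], [s, 0, 0, -s]]).astype(complex)
U = complete_to_unitary(pair)

print(np.allclose(U[:, :2], pair))             # first columns preserved
print(np.allclose(U.conj().T @ U, np.eye(4)))  # U is unitary
```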

Projections and projective measurements

Projection matrices

A square matrix \Pi is called a projection if it satisfies two properties:

  1. \Pi = \Pi^{\dagger}.
  2. \Pi^2 = \Pi.

Matrices that satisfy the first condition — that they are equal to their own conjugate transpose — are called Hermitian matrices, and matrices that satisfy the second condition — that squaring them leaves them unchanged — are called idempotent matrices.

As a word of caution, the word projection is sometimes used to refer to any matrix that satisfies just the second condition but not necessarily the first, and when this is done the term orthogonal projection is typically used to refer to matrices satisfying both properties. In this series, however, we will use the terms projection and projection matrix to mean matrices satisfying both conditions.

An example of a projection is the matrix

\Pi = \vert \psi \rangle \langle \psi \vert \tag{4}

for any unit vector ψ.\vert \psi\rangle. We can see that this matrix is Hermitian as follows:

\Pi^{\dagger} = \bigl( \vert \psi \rangle \langle \psi \vert \bigr)^{\dagger} = \bigl( \langle \psi \vert \bigr)^{\dagger}\bigl( \vert \psi \rangle \bigr)^{\dagger} = \vert \psi \rangle \langle \psi \vert = \Pi.

Here, to obtain the second equality, we have used the formula

(A B)^{\dagger} = B^{\dagger} A^{\dagger},

which is always true (for any two matrices AA and BB for which the product ABAB makes sense).

To see that the matrix Π\Pi in (4)(4) is idempotent, we can use the assumption that ψ\vert\psi\rangle is a unit vector, so that it satisfies ψψ=1.\langle \psi \vert \psi\rangle = 1. Thus, we have

\Pi^2 = \bigl( \vert\psi\rangle\langle \psi\vert \bigr)^2 = \vert\psi\rangle\langle \psi\vert\psi\rangle\langle\psi\vert = \vert\psi\rangle\langle\psi\vert = \Pi.

More generally, if {ψ1,,ψm}\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\} is any orthonormal set of vectors, then the matrix

\Pi = \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert \tag{5}

is a projection. Specifically, we have

\begin{aligned} \Pi^{\dagger} &= \biggl(\sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\biggr)^{\dagger} \\ &= \sum_{k = 1}^m \bigl(\vert\psi_k\rangle\langle\psi_k\vert\bigr)^{\dagger} \\ &= \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\\ &= \Pi, \end{aligned}

and

\begin{aligned} \Pi^2 & = \biggl( \sum_{j = 1}^m \vert \psi_j\rangle \langle \psi_j \vert\biggr)\biggl(\sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\biggr) \\ & = \sum_{j = 1}^m\sum_{k = 1}^m \vert \psi_j\rangle \langle \psi_j \vert \psi_k\rangle \langle \psi_k \vert \\ & = \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\\ & = \Pi, \end{aligned}

where the orthonormality of {ψ1,,ψm}\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\} is used just for the second-to-last equality.

In fact, this exhausts all of the possibilities: every projection Π\Pi can be written in the form (5)(5) for some choice of an orthonormal set {ψ1,,ψm}.\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\}. (The zero matrix Π=0,\Pi=0, which is a projection, is a special case: to fit it into the general form (5) we have to allow the possibility that the sum is empty, resulting in the zero matrix.)
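The two defining properties are easy to check numerically. Here is a small sketch using NumPy, with an arbitrarily chosen orthonormal pair plugged into the form of equation (5):

```python
import numpy as np

# Two orthonormal vectors (an arbitrary choice for illustration).
psi1 = np.array([1, 0, 0], dtype=complex)
psi2 = np.array([0, 1, 1], dtype=complex) / np.sqrt(2)

# The projection onto the span of {psi1, psi2}, as in eq. (5).
Pi = np.outer(psi1, psi1.conj()) + np.outer(psi2, psi2.conj())

assert np.allclose(Pi, Pi.conj().T)   # Pi = Pi^dagger (Hermitian)
assert np.allclose(Pi @ Pi, Pi)       # Pi^2 = Pi (idempotent)
```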

Projective measurements

As has already been mentioned, the notion of a measurement of a quantum system is more general than just standard basis measurements. Projective measurements are measurements that are described by a collection of projections whose sum is equal to the identity matrix. In symbols, a collection {Π0,,Πm1}\{\Pi_0,\ldots,\Pi_{m-1}\} of projection matrices describes a projective measurement if

\Pi_0 + \cdots + \Pi_{m-1} = \mathbb{I}.

When such a measurement is performed on a system X\mathsf{X} while it is in some state ψ,\vert\psi\rangle, two things happen:

  1. For each k\in\{0,\ldots,m-1\}, the outcome of the measurement is k with probability equal to

    \operatorname{Pr}\bigl(\text{outcome is $k$}\bigr) = \bigl\| \Pi_k \vert \psi \rangle \bigr\|^2.
  2. For whichever outcome k the measurement produces, the state of \mathsf{X} becomes

    \frac{\Pi_k \vert\psi\rangle}{\bigl\|\Pi_k \vert\psi\rangle\bigr\|}.

We can also choose outcomes other than {0,,m1}\{0,\ldots,m-1\} for projective measurements if we wish. More generally, for any finite and nonempty set Σ,\Sigma, if we have a collection of projection matrices

\{\Pi_a:a\in\Sigma\}

that satisfies the condition

\sum_{a\in\Sigma} \Pi_a = \mathbb{I},

then this collection describes a projective measurement whose possible outcomes coincide with the set Σ,\Sigma, where the rules are the same as before:

  1. For each a\in\Sigma, the outcome of the measurement is a with probability equal to

    \operatorname{Pr}\bigl(\text{outcome is $a$}\bigr) = \bigl\| \Pi_a \vert \psi \rangle \bigr\|^2.
  2. For whichever outcome a the measurement produces, the state of \mathsf{X} becomes

    \frac{\Pi_a \vert\psi\rangle}{\bigl\|\Pi_a \vert\psi\rangle\bigr\|}.

For example, standard basis measurements are equivalent to projective measurements, where Σ\Sigma is the set of classical states of whatever system X\mathsf{X} we're talking about and our set of projection matrices is {aa:aΣ}.\{\vert a\rangle\langle a\vert:a\in\Sigma\}.

Another example of a projective measurement, this time on two qubits (X,Y),(\mathsf{X},\mathsf{Y}), is given by the set {Π0,Π1},\{\Pi_0,\Pi_1\}, where

\Pi_0 = \vert \phi^+\rangle\langle \phi^+ \vert + \vert \phi^-\rangle\langle \phi^- \vert + \vert \psi^+\rangle\langle \psi^+ \vert \quad\text{and}\quad \Pi_1 = \vert\psi^-\rangle\langle\psi^-\vert.
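This two-outcome measurement can be simulated directly. The following sketch (our own illustration, not code from the lesson) builds the two projections out of the Bell states and computes the outcome probabilities for the state |01⟩:

```python
import numpy as np

def ket(bits):
    # Computational basis vector for a bit string, e.g. ket("01") = |01>.
    v = np.zeros(2 ** len(bits))
    v[int(bits, 2)] = 1.0
    return v

# The four Bell states.
phi_plus  = (ket("00") + ket("11")) / np.sqrt(2)
phi_minus = (ket("00") - ket("11")) / np.sqrt(2)
psi_plus  = (ket("01") + ket("10")) / np.sqrt(2)
psi_minus = (ket("01") - ket("10")) / np.sqrt(2)

# The two projections described above.
Pi0 = sum(np.outer(v, v) for v in (phi_plus, phi_minus, psi_plus))
Pi1 = np.outer(psi_minus, psi_minus)

# Their sum is the identity, so {Pi0, Pi1} is a projective measurement.
assert np.allclose(Pi0 + Pi1, np.eye(4))

# Measuring the state |01> yields each outcome with probability 1/2.
state = ket("01")
p0 = np.linalg.norm(Pi0 @ state) ** 2
p1 = np.linalg.norm(Pi1 @ state) ** 2
assert np.isclose(p0, 0.5) and np.isclose(p1, 0.5)
```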

If we have multiple systems that are jointly in some quantum state and a projective measurement is performed on just one of the systems, the action is similar to what we had for standard basis measurements — and in fact we can now describe this action in much simpler terms than we could before. To be precise, let us suppose that we have two systems (X,Y)(\mathsf{X},\mathsf{Y}) in a quantum state ψ,\vert\psi\rangle, and a projective measurement described by a collection {Πa:aΣ}\{\Pi_a : a\in\Sigma\} is performed on the system X,\mathsf{X}, while nothing is done to Y.\mathsf{Y}. Doing this is then equivalent to performing the projective measurement described by the collection

\bigl\{ \Pi_a \otimes \mathbb{I} \,:\, a\in\Sigma\bigr\}

on the joint system (X,Y).(\mathsf{X},\mathsf{Y}). Each measurement outcome aa results with probability

\bigl\| (\Pi_a \otimes \mathbb{I})\vert \psi\rangle \bigr\|^2,

and conditioned on the result aa appearing, the state of the joint system (X,Y)(\mathsf{X},\mathsf{Y}) becomes

\frac{(\Pi_a \otimes \mathbb{I})\vert \psi\rangle}{\bigl\| (\Pi_a \otimes \mathbb{I})\vert \psi\rangle \bigr\|}.

Implementing projective measurements using standard basis measurements

Arbitrary projective measurements can be implemented using unitary operations, standard basis measurements, and an extra workspace system, as will now be explained.

Let us suppose that \mathsf{X} is a system and \{\Pi_0,\ldots,\Pi_{m-1}\} is a projective measurement on \mathsf{X}. We can easily generalize this discussion to projective measurements having different sets of outcomes, but in the interest of convenience and simplicity we will assume the set of possible outcomes for our measurement is \{0,\ldots,m-1\}. Let us note explicitly that m is not necessarily equal to the number of classical states of \mathsf{X} — we'll let n be the number of classical states of \mathsf{X}, which means that each matrix \Pi_k is an n\times n projection matrix. Because we assume that \{\Pi_0,\ldots,\Pi_{m-1}\} represents a projective measurement, it is necessarily the case that

\sum_{k = 0}^{m-1} \Pi_k = \mathbb{I}_n.

Our goal is to perform a process that has the same effect as performing this projective measurement on X,\mathsf{X}, but to do this using only unitary operations and standard basis measurements.

We will make use of an extra workspace system Y\mathsf{Y} to do this, and specifically we take the classical state set of Y\mathsf{Y} to be {0,,m1},\{0,\ldots,m-1\}, which is the same as the set of outcomes of the projective measurement. The idea is that we will perform a standard basis measurement on Y,\mathsf{Y}, and interpret the outcome of this measurement as being equivalent to the outcome of the projective measurement on X.\mathsf{X}. We will need to assume that Y\mathsf{Y} is initialized to some fixed state, which we'll choose to be 0.\vert 0\rangle. (Any other choice of fixed quantum state vector could be made to work, but choosing 0\vert 0\rangle makes the explanation to follow much simpler.)

Of course, in order for a standard basis measurement of Y\mathsf{Y} to tell us anything about X,\mathsf{X}, we will need to allow X\mathsf{X} and Y\mathsf{Y} to interact somehow before measuring Y,\mathsf{Y}, by performing a unitary operation on the system (Y,X).(\mathsf{Y},\mathsf{X}). First consider this matrix:

M = \sum_{k = 0}^{m-1} \vert k \rangle \langle 0 \vert \otimes \Pi_k.

Expressed explicitly as a block matrix, this matrix looks like this:

M = \begin{pmatrix} \Pi_0 & 0 & \cdots & 0\\[1mm] \Pi_1 & 0 & \cdots & 0\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \Pi_{m-1} & 0 & \cdots & 0 \end{pmatrix}.

(Each 00 in this matrix represents an n×nn\times n matrix filled entirely with zeros.)

Now, MM is certainly not a unitary matrix (unless m=1,m=1, in which case Π0=I,\Pi_0 = \mathbb{I}, giving M=IM = \mathbb{I} in this trivial case) because unitary matrices cannot have any columns (or rows) that are entirely 0;0; unitary matrices have columns that form orthonormal bases, and the all-zero vector is not a unit vector. However, it is the case that the first nn columns of MM are orthonormal, and we get this from the assumption that {Π0,,Πm1}\{\Pi_0,\ldots,\Pi_{m-1}\} is a measurement. To verify this claim, notice that for each j{0,,n1},j\in\{0,\ldots,n-1\}, column number jj of MM is this vector:

\vert \psi_j\rangle = M \vert 0, j\rangle = \sum_{k = 0}^{m-1} \vert k \rangle \otimes \Pi_k \vert j\rangle.

Taking the inner product of column ii with column jj (still assuming we're talking about the first nn columns, so i,j{0,,n1}i,j\in\{0,\ldots,n-1\}) gives

\begin{aligned} \langle \psi_i \vert \psi_j \rangle & = \biggl(\sum_{k = 0}^{m-1} \vert k \rangle \otimes \Pi_k \vert i\rangle\biggr)^{\dagger} \biggl(\sum_{l = 0}^{m-1} \vert l \rangle \otimes \Pi_l \vert j\rangle\biggr) \\ & = \sum_{k = 0}^{m-1} \sum_{l = 0}^{m-1} \langle k \vert l \rangle \langle i \vert \Pi_k \Pi_l \vert j\rangle\\ & = \sum_{k = 0}^{m-1} \langle i \vert \Pi_k \Pi_k \vert j\rangle\\ & = \sum_{k = 0}^{m-1} \langle i \vert \Pi_k \vert j\rangle\\ & = \langle i \vert \mathbb{I} \vert j \rangle\\ & = \begin{cases} 1 & i = j\\ 0 & i\neq j, \end{cases} \end{aligned}

which is what we needed to show.

Thus, because the first nn columns of the matrix MM are orthonormal, we can replace all of the remaining zero entries by some different choice of complex number entries so that the entire matrix is unitary:

U = \begin{pmatrix} \Pi_0 & \fbox{?} & \cdots & \fbox{?}\\[1mm] \Pi_1 & \fbox{?} & \cdots & \fbox{?}\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \Pi_{m-1} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}

(If we are given the matrices Π0,,Πm1,\Pi_0,\ldots,\Pi_{m-1}, we can compute suitable matrices to fill in for the blocks marked ?\fbox{?} in the equation — using the Gram–Schmidt process — but it will not matter specifically what these matrices are for the sake of this discussion.)

Finally we can describe the measurement process: we first perform UU on the joint system (Y,X)(\mathsf{Y},\mathsf{X}) and then measure Y\mathsf{Y} with respect to a standard basis measurement. For an arbitrary state ϕ\vert \phi \rangle of X,\mathsf{X}, we obtain the state

U \bigl( \vert 0\rangle \vert \phi\rangle\bigr) = M \bigl( \vert 0\rangle \vert \phi\rangle\bigr) = \sum_{k = 0}^{m-1} \vert k\rangle \otimes \Pi_k \vert\phi\rangle,

where the first equality follows from the fact that U and M agree on their first n columns. When we then perform a standard basis measurement of \mathsf{Y}, we obtain each outcome k with probability

\bigl\| \Pi_k \vert \phi\rangle \bigr\|^2,

in which case the state of (Y,X)(\mathsf{Y},\mathsf{X}) becomes

\vert k\rangle \otimes \frac{\Pi_k \vert \phi\rangle}{\bigl\| \Pi_k \vert \phi\rangle \bigr\|}.

Thus, Y\mathsf{Y} stores a copy of the measurement outcome and X\mathsf{X} changes precisely as it would had the projective measurement described by {Π0,,Πm1}\{\Pi_0,\ldots,\Pi_{m-1}\} been performed directly on X.\mathsf{X}.
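The whole construction can be carried out numerically. The sketch below (our own illustration, using the |+⟩/|−⟩ measurement on a single qubit as the example, so n = 2 and m = 2) builds M, completes its first n columns to a unitary U with a QR decomposition, and checks the resulting outcome probabilities:

```python
import numpy as np

np.random.seed(0)

# Hypothetical example: the projective measurement {|+><+|, |-><-|}.
plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)
projs = [np.outer(plus, plus), np.outer(minus, minus)]
n, m = 2, 2

# M = sum_k |k><0| (x) Pi_k : the projections stacked in the first n columns.
M = np.zeros((m * n, m * n))
for k, Pi in enumerate(projs):
    M[k * n:(k + 1) * n, :n] = Pi

# Complete the first n (orthonormal) columns of M to a unitary U via QR.
Q, _ = np.linalg.qr(
    np.column_stack([M[:, :n], np.random.randn(m * n, m * n - n)]))
Q[:, :n] = M[:, :n]      # QR may flip signs; restore the original columns
U = Q
assert np.allclose(U.T @ U, np.eye(m * n))

# Apply U to |0>|phi> with phi = |0>; measuring the top system then yields
# each outcome k with probability ||Pi_k |phi>||^2 = 1/2.
phi = np.array([1.0, 0.0])
out = U @ np.kron(np.array([1.0, 0.0]), phi)
probs = [np.linalg.norm(out[k * n:(k + 1) * n]) ** 2 for k in range(m)]
assert np.allclose(probs, [0.5, 0.5])
```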

Limitations on quantum information

Despite sharing a common underlying mathematical structure, quantum and classical information have some key differences. As we continue on in this series, we will see many examples of tasks that quantum information allows, but classical information does not.

Before doing this, however, we should note some important limitations on quantum information. Understanding the things quantum information can't do helps us identify the things it can do.

Irrelevance of global phases

The first limitation we'll cover — which is really more of a slight degeneracy in the way that quantum states are represented by quantum state vectors, as opposed to an actual limitation — concerns the notion of a global phase.

What we mean by a global phase is this. Suppose that ψ\vert \psi \rangle and ϕ\vert \phi \rangle are unit vectors representing quantum states of some system, and assume moreover that there exists a complex number α\alpha on the unit circle (meaning that α=1,\vert \alpha \vert = 1, or alternatively α=eiθ\alpha = e^{i\theta} for some real number θ\theta) such that

\vert \phi \rangle = \alpha \vert \psi \rangle.

The vectors ψ\vert \psi \rangle and ϕ\vert \phi \rangle are then said to differ by a global phase. We also sometimes refer to α\alpha as a global phase, although this is context-dependent: any number on the unit circle can be thought of as a global phase when multiplied to a unit vector.

Now consider what happens when a system is in one of two quantum states that differ by a global phase, ψ\vert\psi\rangle and ϕ,\vert\phi\rangle, and the system undergoes a standard basis measurement. In the first case, in which the system is in the state ψ,\vert\psi\rangle, the probability of measuring any classical state aa is

\bigl\vert \langle a \vert \psi \rangle \bigr\vert^2.

In the second case, in which the system is in the state ϕ,\vert\phi\rangle, the probability of measuring any classical state aa is

\bigl\vert \langle a \vert \phi \rangle \bigr\vert^2 = \bigl\vert \alpha \langle a \vert \psi \rangle \bigr\vert^2 = \vert \alpha \vert^2 \bigl\vert \langle a \vert \psi \rangle \bigr\vert^2 = \bigl\vert \langle a \vert \psi \rangle \bigr\vert^2,

because α=1.\vert\alpha\vert = 1. That is, the probability of an outcome appearing is the same for both states.

Let's consider what happens when we apply an arbitrary unitary operation UU to both states. In the first case, in which the initial state is ψ,\vert \psi \rangle, the state becomes

U \vert \psi \rangle,

and in the second case, in which the initial state is ϕ,\vert \phi\rangle, it becomes

U \vert \phi \rangle = \alpha U \vert \psi \rangle.

That is, the two resulting states still differ by the same global phase α.\alpha.

Consequently, two quantum states \vert\psi\rangle and \vert\phi\rangle that differ by a global phase are completely indistinguishable: no matter what operation, or sequence of operations, we apply to the two states, they will always differ by a global phase, and a standard basis measurement will produce outcomes with precisely the same probabilities for both. For this reason, two quantum state vectors that differ by a global phase are considered to be equivalent, and are effectively viewed as being the same state.

For example, the quantum states

\vert - \rangle = \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{1}{\sqrt{2}} \vert 1 \rangle \quad\text{and}\quad -\vert - \rangle = -\frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{1}{\sqrt{2}} \vert 1 \rangle

differ by a global phase (which is 1-1 in this example), and are therefore considered to be the same state.

On the other hand, the quantum states

\vert + \rangle = \frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{1}{\sqrt{2}} \vert 1 \rangle \quad\text{and}\quad \vert - \rangle = \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{1}{\sqrt{2}} \vert 1 \rangle

do not differ by a global phase. Although the only difference between the two states is that a plus sign turns into a minus sign, this is not a global phase difference but a relative phase difference, because it does not affect every vector entry, only a proper subset of the entries. This is consistent with what we have already observed in Lesson 1, which is that the states \vert + \rangle and \vert - \rangle can be discriminated perfectly: performing a Hadamard operation and then measuring yields outcome probabilities as follows:

\begin{aligned} \bigl\vert \langle 0 \vert H \vert + \rangle \bigr\vert^2 = 1 & \hspace{1cm} \bigl\vert \langle 0 \vert H \vert - \rangle \bigr\vert^2 = 0 \\[1mm] \bigl\vert \langle 1 \vert H \vert + \rangle \bigr\vert^2 = 0 & \hspace{1cm} \bigl\vert \langle 1 \vert H \vert - \rangle \bigr\vert^2 = 1. \end{aligned}
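Both observations are easy to verify numerically. The sketch below (our own illustration) checks that a global phase leaves measurement statistics unchanged after a Hadamard, while the relative phase between |+⟩ and |−⟩ does not:

```python
import numpy as np

psi = np.array([1.0, 1.0j]) / np.sqrt(2)
phi = np.exp(0.73j) * psi      # differs from psi by a global phase

# Apply the same unitary (a Hadamard here) to both states and compare
# the standard basis measurement probabilities: they are identical.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
assert np.allclose(np.abs(H @ psi) ** 2, np.abs(H @ phi) ** 2)

# A relative phase, by contrast, changes the statistics: H perfectly
# discriminates |+> and |->.
plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)
assert np.allclose(np.abs(H @ plus) ** 2,  [1.0, 0.0])
assert np.allclose(np.abs(H @ minus) ** 2, [0.0, 1.0])
```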

As an aside, here we find another advantage of the general description of quantum information based on density matrices over the simplified description based on quantum state vectors. In the general description of quantum information, the degeneracy in which two quantum state vectors can differ by a global phase, and hence effectively represent the same quantum state, disappears. That is, two density matrices that differ necessarily represent two distinct quantum states that can be discriminated in a statistical sense.

No-cloning theorem

The no-cloning theorem shows that it is impossible to create a perfect copy of an unknown quantum state.

Theorem (No-cloning theorem). Let X\mathsf{X} and Y\mathsf{Y} be systems sharing the same classical state set Σ\Sigma having at least two elements. There does not exist a quantum state ϕ\vert \phi\rangle of Y\mathsf{Y} and a unitary operation UU on the pair (X,Y)(\mathsf{X},\mathsf{Y}) such that U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every state ψ\vert \psi \rangle of X.\mathsf{X}.

That is, there is no way to initialize the system Y\mathsf{Y} (to any state ϕ\vert\phi\rangle whatsoever) and perform a unitary operation UU on the joint system (X,Y)(\mathsf{X},\mathsf{Y}) so that the effect is for the state ψ\vert\psi\rangle of X\mathsf{X} to be cloned — resulting in (X,Y)(\mathsf{X},\mathsf{Y}) being in the state ψψ.\vert \psi \rangle \otimes \vert\psi\rangle.

The proof of this theorem is actually quite simple: it boils down to the observation that the mapping

\vert\psi\rangle \otimes \vert \phi\rangle\mapsto\vert\psi\rangle \otimes \vert \psi\rangle

is not linear in ψ.\vert\psi\rangle.

In particular, because Σ\Sigma has at least two elements, we may choose a,bΣa,b\in\Sigma with ab.a\neq b. If there did exist a quantum state ϕ\vert \phi\rangle of Y\mathsf{Y} and a unitary operation UU on the pair (X,Y)(\mathsf{X},\mathsf{Y}) for which U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every quantum state ψ\vert\psi\rangle of X,\mathsf{X}, then it would be the case that

U \bigl( \vert a \rangle \otimes \vert\phi\rangle\bigr) = \vert a \rangle \otimes \vert a\rangle \quad\text{and}\quad U \bigl( \vert b \rangle \otimes \vert\phi\rangle\bigr) = \vert b \rangle \otimes \vert b\rangle.

By linearity, meaning specifically the linearity of the tensor product in the first argument and the linearity of matrix-vector multiplication in the second (vector) argument, we must therefore have

U \biggl(\biggl( \frac{1}{\sqrt{2}}\vert a \rangle + \frac{1}{\sqrt{2}} \vert b\rangle \biggr) \otimes \vert\phi\rangle\biggr) = \frac{1}{\sqrt{2}} \vert a \rangle \otimes \vert a\rangle + \frac{1}{\sqrt{2}} \vert b \rangle \otimes \vert b\rangle.

However, the requirement that U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every quantum state ψ\vert\psi\rangle demands that

\begin{aligned} & U \biggl(\biggl( \frac{1}{\sqrt{2}}\vert a \rangle + \frac{1}{\sqrt{2}} \vert b\rangle \biggr) \otimes \vert\phi\rangle\biggr)\\ & \qquad = \biggl(\frac{1}{\sqrt{2}} \vert a \rangle + \frac{1}{\sqrt{2}} \vert b \rangle\biggr) \otimes \biggl(\frac{1}{\sqrt{2}} \vert a \rangle + \frac{1}{\sqrt{2}} \vert b \rangle\biggr)\\ & \qquad = \frac{1}{2} \vert a \rangle \otimes \vert a\rangle + \frac{1}{2} \vert a \rangle \otimes \vert b\rangle + \frac{1}{2} \vert b \rangle \otimes \vert a\rangle + \frac{1}{2} \vert b \rangle \otimes \vert b\rangle\\ & \qquad \neq \frac{1}{\sqrt{2}} \vert a \rangle \otimes \vert a\rangle + \frac{1}{\sqrt{2}} \vert b \rangle \otimes \vert b\rangle. \end{aligned}

Therefore there cannot exist a state ϕ\vert \phi\rangle and a unitary operation UU for which U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every quantum state vector ψ.\vert \psi\rangle.

A few remarks concerning the no-cloning theorem are in order. The first one is that the statement of the no-cloning theorem above is absolute, in the sense that it states that perfect cloning is impossible — but it does not say anything about possibly cloning with limited accuracy, where we might succeed in producing an approximate clone (with respect to some way of measuring how similar two different quantum states might be). There are, in fact, statements of the no-cloning theorem that place limitations on approximate cloning, as well as methods to achieve approximate cloning (with limited accuracy), but we will delay this discussion to a later lesson when the pieces needed to explain approximate cloning are in place.

The second remark is that the no-cloning theorem is a statement about the impossibility of cloning an arbitrary state ψ.\vert\psi\rangle. We can easily create a clone of any standard basis state, for instance. For example, we can clone a qubit standard basis state using a controlled-NOT operation:

Classical copy

While there is no difficulty in creating a clone of a standard basis state, this does not contradict the no-cloning theorem — this approach of using a controlled-NOT gate would not succeed in creating a clone of the state +,\vert + \rangle, for instance.
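The following sketch (our own illustration) verifies both claims numerically: a controlled-NOT gate clones the standard basis states, but it maps |+⟩|0⟩ to the entangled state |φ⁺⟩ rather than to the product |+⟩|+⟩:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)

# Controlled-NOT with the first qubit as control.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

# Standard basis states are cloned ...
assert np.allclose(CNOT @ np.kron(ket0, ket0), np.kron(ket0, ket0))
assert np.allclose(CNOT @ np.kron(ket1, ket0), np.kron(ket1, ket1))

# ... but |+> is not: CNOT(|+>|0>) is the entangled state |phi+>,
# not the product state |+>|+>.
out = CNOT @ np.kron(plus, ket0)
phi_plus = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)
assert np.allclose(out, phi_plus)
assert not np.allclose(out, np.kron(plus, plus))
```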

One final remark about the no-cloning theorem is that it really isn't unique to quantum information: it's also impossible to clone an arbitrary probabilistic state using a classical (deterministic or probabilistic) process. This is pretty intuitive. Imagine someone hands you a system in some probabilistic state, but you're not sure what that probabilistic state is. For example, maybe they randomly generated a number between 1 and 10, but they didn't tell you how they generated that number. There's certainly no physical process through which you can obtain two independent copies of that same probabilistic state: all you have in your hands is a number between 1 and 10, and there just isn't enough information present for you to somehow reconstruct the probabilities for all of the other outcomes to appear. Mathematically speaking, a version of the no-cloning theorem for probabilistic states can be proved in exactly the same way as the regular no-cloning theorem (for quantum states). That is, cloning an arbitrary probabilistic state is a non-linear process, so it cannot possibly be represented by a stochastic matrix.

Non-orthogonal states cannot be perfectly discriminated

For the final limitation covered in this lesson, we'll show that if we have two quantum states ψ\vert\psi\rangle and ϕ\vert\phi\rangle that are not orthogonal, which means that ϕψ0,\langle \phi\vert\psi\rangle \neq 0, then it's impossible to discriminate them (or, in other words, to tell them apart) perfectly.

In fact, we'll show something logically equivalent: if we do have a way to discriminate two states perfectly, without any error, then they must be orthogonal.

We will restrict our attention to quantum circuits that consist of any number of unitary gates, followed by a single standard basis measurement of the top qubit. What we require of a quantum circuit, to say that it perfectly discriminates the states ψ\vert\psi\rangle and ϕ,\vert\phi\rangle, is that the measurement always yields the value 00 for one of the two states and always yields 11 for the other state. To be precise, we shall assume that we have a quantum circuit that operates as the following diagrams suggest:

Discriminate psi

The box labeled UU denotes the unitary operation representing the combined action of all of the unitary gates in our circuit, but not including the final measurement. There is no loss of generality in assuming that the measurement outputs 00 for ψ\vert\psi\rangle and 11 for ϕ;\vert\phi\rangle; the analysis would not differ fundamentally if these output values were reversed.

Notice that in addition to the qubits that initially store either ψ\vert\psi\rangle or ϕ,\vert\phi\rangle, the circuit is free to make use of any number of additional workspace qubits. These qubits are initially each set to the 0\vert 0\rangle state — so their combined state is denoted 00\vert 0\cdots 0\rangle in the figures — and these qubits can be used by the circuit in any way that might be beneficial. It is very common to make use of workspace qubits in quantum circuits like this, as we will see in the next unit.

Now, consider what happens when we run our circuit on the state ψ\vert\psi\rangle (along with the initialized workspace qubits). The resulting state, immediately prior to the measurement being performed, can be written as

U \bigl( \vert 0\cdots 0 \rangle \vert \psi \rangle\bigr) = \vert \gamma_0\rangle\vert 0 \rangle + \vert \gamma_1 \rangle\vert 1 \rangle

for two vectors γ0\vert \gamma_0\rangle and γ1\vert \gamma_1\rangle that correspond to all of the qubits except the top qubit. In general, for such a state the probabilities that a measurement of the top qubit yields the outcomes 00 and 11 are as follows:

\operatorname{Pr}(\text{outcome is $0$}) = \bigl\| \vert\gamma_0\rangle \bigr\|^2 \qquad\text{and}\qquad \operatorname{Pr}(\text{outcome is $1$}) = \bigl\| \vert\gamma_1\rangle \bigr\|^2.

Because our circuit always outputs 00 for the state ψ,\vert\psi\rangle, it must be that γ1=0,\vert\gamma_1\rangle = 0, and so

U \bigl( \vert 0\cdots 0\rangle\vert \psi \rangle \bigr) = \vert\gamma_0\rangle\vert 0 \rangle.

Multiplying both sides of this equation by UU^{\dagger} yields this equation:

\vert 0\cdots 0\rangle\vert \psi \rangle = U^{\dagger} \bigl( \vert \gamma_0\rangle\vert 0 \rangle \bigr). \tag{7}

Reasoning similarly for ϕ\vert\phi\rangle in place of ψ,\vert\psi\rangle, we conclude that

U \bigl( \vert 0\cdots 0\rangle\vert \phi \rangle \bigr) = \vert \delta_1\rangle\vert 1 \rangle

for some vector δ1,\vert\delta_1\rangle, and therefore

\vert 0\cdots 0\rangle\vert \phi \rangle = U^{\dagger} \bigl( \vert \delta_1\rangle\vert 1 \rangle\bigr). \tag{8}

Now let us take the inner product of the vectors represented by the equations (7)(7) and (8),(8), starting with the representations on the right-hand side of each equation. We have

\bigl(U^{\dagger} \bigl( \vert \gamma_0\rangle\vert 0 \rangle \bigr)\bigr)^{\dagger} = \bigl( \langle\gamma_0\vert\langle 0\vert \bigr)U,

so the inner product of the vector (7)(7) with the vector (8)(8) is

\bigl( \langle\gamma_0\vert\langle 0\vert \bigr)U U^{\dagger} \bigl( \vert \delta_1\rangle\vert 1 \rangle\bigr) = \bigl( \langle\gamma_0\vert\langle 0\vert \bigr) \bigl( \vert \delta_1\rangle\vert 1 \rangle\bigr) = \langle \gamma_0 \vert \delta_1\rangle \langle 0 \vert 1 \rangle = 0.

Here we have used the fact that UU=I,U U^{\dagger} = \mathbb{I}, as well as the fact that the inner product of tensor products is the product of the inner products:

\langle u \otimes v \vert w \otimes x\rangle = \langle u \vert w\rangle \langle v \vert x\rangle

for any choices of these vectors (assuming u\vert u\rangle and w\vert w\rangle have the same number of entries and v\vert v\rangle and x\vert x\rangle have the same number of entries, so that it makes sense to form the inner products uw\langle u\vert w\rangle and vx\langle v\vert x \rangle). Notice that the value of the inner product γ0δ1\langle \gamma_0 \vert \delta_1\rangle is irrelevant because it is multiplied by 01=0.\langle 0 \vert 1 \rangle = 0. This is fortunate because we really don't know much about these two vectors.

Finally, taking the inner product of the vectors (7)(7) and (8)(8) in terms of the left-hand side of the equations must result in the same zero value, and so

0 = \bigl( \vert 0\cdots 0\rangle \vert \psi\rangle\bigr)^{\dagger} \bigl(\vert 0\cdots 0\rangle\vert \phi\rangle\bigr) = \langle 0\cdots 0 \vert 0\cdots 0 \rangle \langle \psi \vert \phi \rangle = \langle \psi \vert \phi \rangle.

We have concluded what we wanted, which is that ψ\vert \psi\rangle and ϕ\vert\phi\rangle are orthogonal: ψϕ=0.\langle \psi \vert \phi \rangle = 0.

It is possible, by the way, to perfectly discriminate any two states that are orthogonal. Suppose that the two states to be discriminated are ϕ\vert \phi\rangle and ψ,\vert \psi\rangle, where ϕψ=0.\langle \phi\vert\psi\rangle = 0. We can then perfectly discriminate these states by performing the projective measurement described by these matrices, for instance:

\bigl\{ \vert\phi\rangle\langle\phi\vert,\,\mathbb{I} - \vert\phi\rangle\langle\phi\vert \bigr\}.

For the state ϕ,\vert\phi\rangle, the first outcome is always obtained:

\begin{aligned} & \bigl\| \vert\phi\rangle\langle\phi\vert \vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle\langle\phi\vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle \bigr\|^2 = 1,\\[1mm] & \bigl\| (\mathbb{I} - \vert\phi\rangle\langle\phi\vert) \vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle - \vert\phi\rangle\langle\phi\vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle - \vert\phi\rangle \bigr\|^2 = 0. \end{aligned}

And, for the state ψ,\vert\psi\rangle, the second outcome is always obtained:

\begin{aligned} & \bigl\| \vert\phi\rangle\langle\phi\vert \vert\psi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle\langle\phi\vert\psi\rangle \bigr\|^2 = \bigl\| 0 \bigr\|^2 = 0,\\[1mm] & \bigl\| (\mathbb{I} - \vert\phi\rangle\langle\phi\vert) \vert\psi\rangle \bigr\|^2 = \bigl\| \vert\psi\rangle - \vert\phi\rangle\langle\phi\vert\psi\rangle \bigr\|^2 = \bigl\| \vert\psi\rangle \bigr\|^2 = 1. \end{aligned}
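This projective measurement is easy to verify numerically. Here is a sketch for the orthogonal pair |+⟩ and |−⟩ (our own choice of example states):

```python
import numpy as np

plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)   # <plus|minus> = 0

Pi_first  = np.outer(plus, plus)             # |phi><phi| with phi = |+>
Pi_second = np.eye(2) - Pi_first

# For |+> the first outcome is certain; for |-> the second is.
assert np.isclose(np.linalg.norm(Pi_first  @ plus)  ** 2, 1.0)
assert np.isclose(np.linalg.norm(Pi_second @ plus)  ** 2, 0.0)
assert np.isclose(np.linalg.norm(Pi_first  @ minus) ** 2, 0.0)
assert np.isclose(np.linalg.norm(Pi_second @ minus) ** 2, 1.0)
```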
