
Quantum circuits


Introduction

This lesson introduces the quantum circuit model of computation, which is a standard description of quantum computations that we'll use throughout this series.

We'll also introduce a few important mathematical concepts, including inner products between vectors, the notions of orthogonality and orthonormality, and projections and projective measurements, which generalize standard basis measurements. Through these concepts, we'll derive fundamental limitations on quantum information, including the no-cloning theorem and the impossibility of perfectly discriminating non-orthogonal quantum states.

Circuits

In computer science, circuits are models of computation in which information is carried by wires through a network of gates, which represent operations that transform the information carried by the wires. Quantum circuits are just one example of a model of computation based on this more general concept.

Although the word "circuit" often refers to a circular path, circular paths aren't actually allowed in the most common circuit models of computation. That is to say, we usually study acyclic circuits when we're thinking about circuits as computational models. Quantum circuits follow this pattern; although we could run a quantum circuit as many times as we like, a quantum circuit itself represents a finite sequence of operations that cannot contain feedback loops.

Boolean circuits

Here is an example of a (classical) Boolean circuit, where the wires carry binary values and the gates represent Boolean logic operations:

Example of a Boolean circuit

The flow of information along the wires goes from left to right: the wires on the left-hand side of the figure labeled \mathsf{X} and \mathsf{Y} are input bits, which can each be set to whatever binary value we choose, and the wire on the right-hand side is the output. The intermediate wires take whatever values are determined by the gates, which are evaluated from left to right.

The gates are AND gates (labeled \wedge), OR gates (labeled \vee), and NOT gates (labeled \neg). The functions computed by these gates will likely be familiar to many readers, but here they are represented by tables of values:

\begin{array}{c|c} a & \neg a\\ \hline 0 & 1\\ 1 & 0 \end{array} \qquad \begin{array}{c|c} ab & a \wedge b\\ \hline 00 & 0\\ 01 & 0\\ 10 & 0\\ 11 & 1 \end{array} \qquad \begin{array}{c|c} ab & a \vee b\\ \hline 00 & 0\\ 01 & 1\\ 10 & 1\\ 11 & 1 \end{array}

The two small circles on the wires just to the right of the names \mathsf{X} and \mathsf{Y} represent fanout operations, which simply create a copy of whatever value is carried on the wire on which they appear, so that this value can be input into multiple gates. Fanout operations are not always considered to be gates in the classical setting — sometimes they are treated as if they are "free" in some sense — but when we discuss how ordinary Boolean circuits can be converted into equivalent quantum circuits, we must classify fanout operations explicitly as gates and account for them correctly.

Here is the same circuit illustrated in a style more common in electrical engineering, which uses conventional symbols for the AND, OR, and NOT gates:

Boolean circuit in a classic style

We won't use this style or these particular gate symbols further, but we do use different symbols to represent gates in quantum circuits, which we'll explain as we encounter them.

The particular circuit in this example computes the exclusive-OR (or XOR for short), which is denoted by the symbol \oplus:

\begin{array}{c|c} ab & a \oplus b\\ \hline 00 & 0\\ 01 & 1\\ 10 & 1\\ 11 & 0 \end{array}

In the next diagram we consider just one choice for the inputs: \mathsf{X}=0 and \mathsf{Y}=1. Each wire is labeled by the value it carries so you can follow the operations. The output value is 1 in this case, which is the correct value for the XOR: 0 \oplus 1 = 1.

Evaluating a Boolean circuit

You can check the other three possible input settings in a similar way.
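Rather than checking the remaining input settings by hand, we can evaluate the gate logic in code. The sketch below implements one standard AND/OR/NOT realization of XOR, which may differ from the exact wiring in the figure but computes the same function:

```python
def xor_circuit(x, y):
    # One standard AND/OR/NOT realization of XOR (the figure's wiring may differ):
    # x XOR y = (x AND NOT y) OR (NOT x AND y)
    return (x and not y) or (not x and y)

# Check all four input settings against the XOR table
for a in (0, 1):
    for b in (0, 1):
        print(a, b, int(xor_circuit(a, b)))
```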

Other types of circuits

As suggested above, the notion of a circuit in computer science is very general. For example, circuits whose wires carry values other than 0 and 1 are sometimes studied, as are gates representing different choices of operations.

In arithmetic circuits, for instance, the wires may carry integer values and the gates may represent arithmetic operations, such as addition and multiplication. The following figure depicts an arithmetic circuit that takes two variable input values (x and y) as well as a third input set to the value 1. The values carried by the wires, as functions of x and y, are shown in the figure.

Example arithmetic circuit

We can also consider circuits that incorporate randomness, such as ones where gates represent probabilistic operations.

Quantum circuits

In the quantum circuit model, wires represent qubits and gates represent operations acting on these qubits. We'll focus for now on operations we've encountered so far, namely unitary operations and standard basis measurements. As we learn about other sorts of quantum operations and measurements, we'll enhance our model accordingly.

Here's a simple example of a quantum circuit:

Simple quantum circuit

In this circuit, we have a single qubit named \mathsf{X}, which is represented by the horizontal line, and a sequence of gates representing unitary operations on this qubit. Just like in the examples above, the flow of information goes from left to right — so the first operation performed is a Hadamard, the second is an S operation, the third is another Hadamard, and the final operation is a T operation. Applying the entire circuit therefore applies the composition of these operations, THSH, to the qubit \mathsf{X}.

Sometimes we wish to explicitly indicate the input or output states to a circuit. For example, if we apply the operation THSH to the state \vert 0\rangle, we obtain the state \frac{1+i}{2}\vert 0\rangle + \frac{1}{\sqrt{2}} \vert 1 \rangle. We may indicate this as follows:

Simple quantum circuit evaluated

Quantum circuits often have all qubits initialized to \vert 0\rangle, as we have in this case, but there are also cases where we wish to set the input qubits to different states.
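The output state claimed above can be verified numerically. Here is a minimal NumPy sketch (NumPy isn't otherwise needed for this circuit) that multiplies the standard matrices for H, S, and T against \vert 0\rangle:

```python
import numpy as np

# Standard single-qubit gate matrices
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]])
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])

ket0 = np.array([1, 0])

# The circuit applies H, then S, then H, then T, i.e. the product THSH
state = T @ H @ S @ H @ ket0

expected = np.array([(1 + 1j) / 2, 1 / np.sqrt(2)])
print(np.allclose(state, expected))  # True
```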

Now let's see how we can specify this circuit in Qiskit, beginning with the imports needed for the current section.


To begin, we can create the circuit as follows, by sequentially adding gates from left to right.


The default names for qubits in Qiskit are \mathsf{q_0}, \mathsf{q_1}, \mathsf{q_2}, etc., and when there is just a single qubit like in our example, the default name is \mathsf{q} rather than \mathsf{q_0}. If we wish to choose our own name we can do this using the QuantumRegister class like this:


Here is another example of a quantum circuit, this time with two qubits:

Quantum circuit that creates an ebit

As always, the gate labeled H refers to a Hadamard operation, while the second gate is a two-qubit gate: it's the controlled-NOT operation, where the solid circle represents the control qubit and the circle resembling the symbol \oplus denotes the target qubit.

Before examining this circuit in greater detail and explaining what it does, we should clarify how qubits are ordered in quantum circuits, which connects with the convention that Qiskit uses for naming and ordering systems, mentioned briefly in the previous lesson.

Qiskit's qubit ordering convention for circuits

In Qiskit, the topmost qubit in a circuit diagram has index 0 and corresponds to the rightmost position in a tuple of qubits (or in a string, Cartesian product, or tensor product corresponding to this tuple). The second-from-top qubit has index 1, and corresponds to the position second-from-right in a tuple, and so on, down to the bottommost qubit, which has the highest index, and corresponds to the leftmost position in a tuple. In particular, Qiskit's default names for the qubits in an n-qubit circuit are represented by the n-tuple (\mathsf{q_{n-1}},\ldots,\mathsf{q_{0}}), with \mathsf{q_{0}} being the qubit on the top and \mathsf{q_{n-1}} on the bottom in quantum circuit diagrams.

Further information on this ordering convention can be found on the Bit-ordering in Qiskit documentation page.

Although we will often deviate from the specific default names \mathsf{q_{n-1}},\ldots,\mathsf{q_{0}} used for qubits by Qiskit, we will follow this ordering convention throughout this course for interpreting circuit diagrams.

Thus, our interpretation of the circuit above is that it describes an operation on a pair of qubits (\mathsf{X},\mathsf{Y}) — and if the input to the circuit is a quantum state \vert \psi\rangle \vert \phi\rangle, then this means that the lower qubit \mathsf{X} starts in the state \vert \psi\rangle and the upper qubit \mathsf{Y} starts in the state \vert \phi\rangle. To understand what the circuit does, we can go from left to right through its operations.

  1. The first operation is a Hadamard operation on \mathsf{Y}:

    First operation e-bit creator

    When applying a gate to a single qubit like this, nothing happens to the other qubits — and nothing happening is equivalent to the identity operation being performed. In our circuit there is just one other qubit, \mathsf{X}, so the dotted rectangle in the figure above represents this operation:

    \mathbb{I}\otimes H = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}.

    Note that the identity matrix is on the left of the tensor product and H is on the right, which is consistent with Qiskit's ordering convention.

  2. The second operation is the controlled-NOT operation, where \mathsf{Y} is the control and \mathsf{X} is the target:

    Second operation e-bit creator

    The controlled-NOT gate's action on standard basis states is as follows:

    Controlled-NOT gate

    Given that we order the qubits as (\mathsf{X}, \mathsf{Y}), the matrix representation of the controlled-NOT gate is this:

    \begin{pmatrix} 1 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 1\\[2mm] 0 & 0 & 1 & 0\\[2mm] 0 & 1 & 0 & 0 \end{pmatrix}.

The unitary operation of the entire circuit, which we'll call U, is the composition of the operations:

U = \begin{pmatrix} 1 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 1\\[2mm] 0 & 0 & 1 & 0\\[2mm] 0 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \end{pmatrix}.

In particular, recalling our notation for the Bell states,

\begin{aligned} \vert \phi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 0 0 \rangle + \frac{1}{\sqrt{2}} \vert 1 1 \rangle \\[2mm] \vert \phi^- \rangle & = \frac{1}{\sqrt{2}} \vert 0 0 \rangle - \frac{1}{\sqrt{2}} \vert 1 1 \rangle \\[2mm] \vert \psi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 0 1 \rangle + \frac{1}{\sqrt{2}} \vert 1 0 \rangle \\[2mm] \vert \psi^- \rangle & = \frac{1}{\sqrt{2}} \vert 0 1 \rangle - \frac{1}{\sqrt{2}} \vert 1 0 \rangle, \end{aligned}

we get that

\begin{aligned} U \vert 00\rangle & = \vert \phi^+\rangle\\ U \vert 01\rangle & = \vert \phi^-\rangle\\ U \vert 10\rangle & = \vert \psi^+\rangle\\ U \vert 11\rangle & = -\vert \psi^-\rangle. \end{aligned}

So, this circuit gives us a way to create the state \vert\phi^+\rangle if we run it on two qubits initialized to \vert 00\rangle. More generally, it gives us a way to convert the standard basis to the Bell basis. (The -1 phase factor on the last state, -\vert \psi^-\rangle, could be eliminated if we wanted, but this would require a change to the circuit. For instance, we could add a controlled-Z gate at the beginning or a swap gate at the end to eliminate the minus sign without affecting the circuit's action on the other three standard basis states.)
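The matrix computation and the action on the standard basis can both be checked numerically. This NumPy sketch builds U exactly as in the derivation above and compares its columns against the Bell states:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)

# CNOT with qubit ordering (X, Y): Y (rightmost) is the control, X the target
CNOT = np.array([[1, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0]])

U = CNOT @ np.kron(I, H)

s = 1 / np.sqrt(2)
phi_plus  = np.array([s, 0, 0, s])
phi_minus = np.array([s, 0, 0, -s])
psi_plus  = np.array([0, s, s, 0])
psi_minus = np.array([0, s, -s, 0])

# The columns of U are its action on the standard basis states
print(np.allclose(U[:, 0], phi_plus))    # U|00> = |phi+>
print(np.allclose(U[:, 1], phi_minus))   # U|01> = |phi->
print(np.allclose(U[:, 2], psi_plus))    # U|10> = |psi+>
print(np.allclose(U[:, 3], -psi_minus))  # U|11> = -|psi->
```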

In general, quantum circuits can contain any number of qubit wires. We may also include classical bit wires, which are indicated by double lines like in this example:

Example circuit with measurements

In this circuit we have a Hadamard gate and a controlled-NOT gate on two qubits \mathsf{X} and \mathsf{Y}, just like in the previous example. We also have two classical bits, \mathsf{A} and \mathsf{B}, as well as two measurement gates. The measurement gates represent standard basis measurements: the qubits are changed into their post-measurement states, while the measurement outcomes are overwritten onto the classical bits to which the arrows point.

Here is an implementation of this circuit using Qiskit:


The circuit can be simulated using the Sampler primitive.


Sometimes it is convenient to depict a measurement as a gate that takes a qubit as input and outputs a classical bit (as opposed to outputting the qubit in its post-measurement state and writing the result to a separate classical bit). This means the measured qubit has been discarded and can safely be ignored thereafter.

For example, the following circuit diagram represents the same process as the one in the previous diagram, but where we ignore \mathsf{X} and \mathsf{Y} after measuring them:

Example circuit with measurements compact

As the series continues, we will see many more examples of quantum circuits, which will usually be much more complicated than the simple examples above. Here are some symbols for common gates:

  • Single-qubit gates are generally shown as squares labeled by a letter indicating the operation, like this:

    Single-qubit gates

    Not gates (also known as X gates) are also sometimes denoted by a circle around a plus sign:

    Not gate

  • Swap gates are denoted as follows:

    Swap gate

    Controlled gates, meaning gates that describe controlled-unitary operations, are denoted by a filled-in circle (indicating the control) connected by a vertical line to whatever operation is being controlled. For instance, controlled-NOT gates, controlled-controlled-NOT (or Toffoli) gates, and controlled-swap (Fredkin) gates are denoted like this:

    Controlled gate

  • Arbitrary unitary operations on multiple qubits may be viewed as gates. They are depicted by rectangles labeled by the name of the unitary operation. For instance, here is a depiction of an (unspecified) unitary operation U as a gate, along with a controlled version of this gate:

    Arbitrary unitary gate together with controlled version

Inner products, orthonormality, and projections

To better prepare ourselves to explore the capabilities and limitations of quantum circuits, we now introduce some additional mathematical concepts — namely the inner product between vectors (and its connection to the Euclidean norm), the notions of orthogonality and orthonormality for sets of vectors, and projection matrices, which will allow us to introduce a handy generalization of standard basis measurements.

Inner products

Recall from Lesson 1 that when we use the Dirac notation to refer to an arbitrary column vector as a ket, such as

\vert \psi \rangle = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n \end{pmatrix},

the corresponding bra vector is the conjugate transpose of this vector:

\langle \psi \vert = \bigl(\vert \psi \rangle \bigr)^{\dagger} = \begin{pmatrix} \overline{\alpha_1} & \overline{\alpha_2} & \cdots & \overline{\alpha_n} \end{pmatrix}. \tag{1}

Alternatively, if we have some classical state set Σ\Sigma in mind, and we express a column vector as a ket, such as

\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle,

then the corresponding row (or bra) vector is the conjugate transpose

\langle \psi \vert = \sum_{a\in\Sigma} \overline{\alpha_a} \langle a \vert. \tag{2}

We also observed that the product of a bra vector and a ket vector, viewed as matrices either having a single row or a single column, results in a scalar. Specifically, if we have two (column) vectors

\vert \psi \rangle = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n \end{pmatrix} \quad\text{and}\quad \vert \phi \rangle = \begin{pmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_n \end{pmatrix},

so that the row vector \langle \psi \vert is as in equation (1), then

\langle \psi \vert \phi \rangle = \langle \psi \vert \vert \phi \rangle = \begin{pmatrix} \overline{\alpha_1} & \overline{\alpha_2} & \cdots & \overline{\alpha_n} \end{pmatrix} \begin{pmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_n \end{pmatrix} = \overline{\alpha_1} \beta_1 + \cdots + \overline{\alpha_n}\beta_n.

Alternatively, if we have two column vectors that we have written as

\vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \phi \rangle = \sum_{b\in\Sigma} \beta_b \vert b \rangle,

so that \langle \psi \vert is the row vector (2), we find that

\begin{aligned} \langle \psi \vert \phi \rangle & = \langle \psi \vert \vert \phi \rangle\\ & = \Biggl(\sum_{a\in\Sigma} \overline{\alpha_a} \langle a \vert\Biggr) \Biggl(\sum_{b\in\Sigma} \beta_b \vert b\rangle\Biggr)\\ & = \sum_{a\in\Sigma}\sum_{b\in\Sigma} \overline{\alpha_a} \beta_b \langle a \vert b \rangle\\ & = \sum_{a\in\Sigma} \overline{\alpha_a} \beta_a, \end{aligned}

where the last equality follows from the observation that \langle a \vert a \rangle = 1 and \langle a \vert b \rangle = 0 for classical states a and b satisfying a\neq b.

The value \langle \psi \vert \phi \rangle is called the inner product between the vectors \vert \psi\rangle and \vert \phi \rangle. Inner products are critically important in quantum information and computation. We would not get far in understanding quantum information at a mathematical level without this fundamental notion.
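In code, the formula \sum_a \overline{\alpha_a}\beta_a is exactly what NumPy's vdot computes, since vdot conjugates its first argument. A small sketch with made-up example vectors:

```python
import numpy as np

# Example unit vectors (made up for illustration)
psi = np.array([1 + 1j, 2, 1j]) / np.sqrt(7)
phi = np.array([1, 1j, -1]) / np.sqrt(3)

# np.vdot conjugates its first argument, matching <psi|phi>
inner = np.vdot(psi, phi)
same = np.conj(psi) @ phi
print(np.isclose(inner, same))  # True
```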

Let us now collect together some basic facts about inner products of vectors.

  1. Relationship to the Euclidean norm. The inner product of any vector

    \vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle

    with itself is

    \langle \psi \vert \psi \rangle = \sum_{a\in\Sigma} \overline{\alpha_a} \alpha_a = \sum_{a\in\Sigma} \vert\alpha_a\vert^2 = \bigl\| \vert \psi \rangle \bigr\|^2.

    Thus, the Euclidean norm of a vector may alternatively be expressed as

    \bigl\| \vert \psi \rangle \bigr\| = \sqrt{ \langle \psi \vert \psi \rangle }.

    Notice that the Euclidean norm of a vector must always be a nonnegative real number. Moreover, the only way the Euclidean norm of a vector can be equal to zero is if every one of the entries is equal to zero, which is to say that the vector is the zero vector.

    We can summarize these observations like this: for every vector \vert \psi \rangle we have

    \langle \psi \vert \psi \rangle \geq 0,

    with \langle \psi \vert \psi \rangle = 0 if and only if \vert \psi \rangle = 0. This property of the inner product is sometimes referred to as positive definiteness.

  2. Conjugate symmetry. For any two vectors

    \vert \psi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \phi \rangle = \sum_{b\in\Sigma} \beta_b \vert b \rangle,

    we have

    \langle \psi \vert \phi \rangle = \sum_{a\in\Sigma} \overline{\alpha_a} \beta_a \quad\text{and}\quad \langle \phi \vert \psi \rangle = \sum_{a\in\Sigma} \overline{\beta_a} \alpha_a,

    and therefore

    \overline{\langle \psi \vert \phi \rangle} = \langle \phi \vert \psi \rangle.
  3. Linearity in the second argument (and conjugate linearity in the first). Let us suppose that \vert \psi \rangle, \vert \phi_1 \rangle, and \vert \phi_2 \rangle are vectors and \alpha_1 and \alpha_2 are complex numbers. If we define a new vector

    \vert \phi\rangle = \alpha_1 \vert \phi_1\rangle + \alpha_2 \vert \phi_2\rangle,

    then

    \langle \psi \vert \phi \rangle = \langle \psi \vert \bigl( \alpha_1\vert \phi_1 \rangle + \alpha_2\vert \phi_2 \rangle\bigr) = \alpha_1 \langle \psi \vert \phi_1 \rangle + \alpha_2 \langle \psi \vert \phi_2 \rangle.

    That is to say, the inner product is linear in the second argument. This can be verified either through the formulas above or simply by noting that matrix multiplication is linear in each argument (and specifically in the second argument).

    Combining this fact with conjugate symmetry reveals that the inner product is conjugate linear in the first argument. That is, if \vert \psi_1 \rangle, \vert \psi_2 \rangle, and \vert \phi \rangle are vectors and \alpha_1 and \alpha_2 are complex numbers, and we define

    \vert \psi \rangle = \alpha_1 \vert \psi_1\rangle + \alpha_2 \vert \psi_2 \rangle,

    then

    \langle \psi \vert \phi \rangle = \bigl( \overline{\alpha_1} \langle \psi_1 \vert + \overline{\alpha_2} \langle \psi_2 \vert \bigr) \vert\phi\rangle = \overline{\alpha_1} \langle \psi_1 \vert \phi \rangle + \overline{\alpha_2} \langle \psi_2 \vert \phi \rangle.
  4. The Cauchy–Schwarz inequality. For every choice of vectors \vert \phi \rangle and \vert \psi \rangle having the same number of entries, we have

    \bigl\vert \langle \psi \vert \phi \rangle\bigr\vert \leq \bigl\| \vert\psi \rangle \bigr\| \bigl\| \vert \phi \rangle \bigr\|.

    This is an incredibly handy inequality that gets used quite extensively in quantum information (and in many other fields of study).
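The four properties above can be sanity-checked numerically on randomly chosen complex vectors:

```python
import numpy as np

rng = np.random.default_rng(7)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)

# Positive definiteness: <psi|psi> = ||psi||^2 >= 0
assert np.isclose(np.vdot(psi, psi).real, np.linalg.norm(psi) ** 2)

# Conjugate symmetry: conj(<psi|phi>) = <phi|psi>
assert np.isclose(np.conj(np.vdot(psi, phi)), np.vdot(phi, psi))

# Linearity in the second argument
a1, a2 = 0.3 - 1j, 2 + 0.5j
combo = a1 * phi + a2 * psi
assert np.isclose(np.vdot(psi, combo),
                  a1 * np.vdot(psi, phi) + a2 * np.vdot(psi, psi))

# Cauchy-Schwarz: |<psi|phi>| <= ||psi|| ||phi||
assert abs(np.vdot(psi, phi)) <= np.linalg.norm(psi) * np.linalg.norm(phi)

print("all four properties hold")
```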

Orthogonal and orthonormal sets

Two vectors \vert \phi \rangle and \vert \psi \rangle are said to be orthogonal if their inner product is zero:

\langle \psi \vert \phi \rangle = 0.

Geometrically, we can think about orthogonal vectors as vectors at right angles to each other.

A set of vectors \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} is called an orthogonal set if every vector in the set is orthogonal to every other vector in the set. That is, this set is orthogonal if

\langle \psi_j \vert \psi_k\rangle = 0

for all choices of j,k\in\{1,\ldots,m\} for which j\neq k.

A set of vectors \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} is called an orthonormal set if it is an orthogonal set and, in addition, every vector in the set is a unit vector. Alternatively, this set is an orthonormal set if we have

\langle \psi_j \vert \psi_k\rangle = \begin{cases} 1 & j = k\\ 0 & j\neq k \end{cases} \tag{3}

for all choices of j,k\in\{1,\ldots,m\}.

Finally, a set \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} is an orthonormal basis if, in addition to being an orthonormal set, it forms a basis. This is equivalent to \{ \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle\} being an orthonormal set and m being equal to the dimension of the space from which \vert \psi_1\rangle,\ldots,\vert\psi_m\rangle are drawn.

For example, for any classical state set \Sigma, the set of all standard basis vectors

\bigl\{ \vert a \rangle \,:\, a\in\Sigma\bigr\}

is an orthonormal basis. The set \{\vert+\rangle,\vert-\rangle\} is an orthonormal basis for the 2-dimensional space corresponding to a single qubit, and the Bell basis \{\vert\phi^+\rangle, \vert\phi^-\rangle, \vert\psi^+\rangle, \vert\psi^-\rangle\} is an orthonormal basis for the 4-dimensional space corresponding to two qubits.
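As a check, the condition (3) for the Bell basis amounts to its Gram matrix of inner products being the identity, which we can compute directly:

```python
import numpy as np

s = 1 / np.sqrt(2)
bell = np.array([
    [s, 0, 0, s],   # |phi+>
    [s, 0, 0, -s],  # |phi->
    [0, s, s, 0],   # |psi+>
    [0, s, -s, 0],  # |psi->
])

# Gram matrix of pairwise inner products; identity iff the set is orthonormal
gram = np.conj(bell) @ bell.T
print(np.allclose(gram, np.eye(4)))  # True
```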

Extending orthonormal sets to orthonormal bases

Suppose that \vert\psi_1\rangle,\ldots,\vert\psi_m\rangle are vectors that live in an n-dimensional space, and assume moreover that \{\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle\} is an orthonormal set. Orthonormal sets are always linearly independent sets, so these vectors necessarily span a subspace of dimension m. From this we immediately conclude that m\leq n, because the dimension of the subspace spanned by these vectors cannot be larger than the dimension of the entire space from which they're drawn.

If it is the case that m<n, then it is always possible to choose an additional n-m vectors \vert \psi_{m+1}\rangle,\ldots,\vert\psi_n\rangle so that \{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\} forms an orthonormal basis. A procedure known as the Gram–Schmidt orthogonalization process can be used to construct these vectors.

Orthonormal sets and unitary matrices

Orthonormal sets of vectors are closely connected with unitary matrices. One way to express this connection is to say that the following three statements are logically equivalent (meaning that they are all true or all false) for any choice of a square matrix U:

  1. The matrix U is unitary (i.e., U^{\dagger} U = \mathbb{I} = U U^{\dagger}).
  2. The rows of U form an orthonormal set.
  3. The columns of U form an orthonormal set.

This equivalence is actually pretty straightforward when we think about how matrix multiplication and the conjugate transpose work. Suppose, for instance, that we have a 3\times 3 matrix like this:

U = \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\ \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix}

The conjugate transpose of U looks like this:

U^{\dagger} = \begin{pmatrix} \overline{\alpha_{1,1}} & \overline{\alpha_{2,1}} & \overline{\alpha_{3,1}} \\ \overline{\alpha_{1,2}} & \overline{\alpha_{2,2}} & \overline{\alpha_{3,2}} \\ \overline{\alpha_{1,3}} & \overline{\alpha_{2,3}} & \overline{\alpha_{3,3}} \end{pmatrix}

Multiplying the two matrices, with the conjugate transpose on the left-hand side, gives us this matrix:

\begin{aligned} &\begin{pmatrix} \overline{\alpha_{1,1}} & \overline{\alpha_{2,1}} & \overline{\alpha_{3,1}} \\ \overline{\alpha_{1,2}} & \overline{\alpha_{2,2}} & \overline{\alpha_{3,2}} \\ \overline{\alpha_{1,3}} & \overline{\alpha_{2,3}} & \overline{\alpha_{3,3}} \end{pmatrix} \begin{pmatrix} \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\ \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix}\\[2mm] \quad &= \begin{pmatrix} \overline{\alpha_{1,1}}\alpha_{1,1} + \overline{\alpha_{2,1}}\alpha_{2,1} + \overline{\alpha_{3,1}}\alpha_{3,1} & \overline{\alpha_{1,1}}\alpha_{1,2} + \overline{\alpha_{2,1}}\alpha_{2,2} + \overline{\alpha_{3,1}}\alpha_{3,2} & \overline{\alpha_{1,1}}\alpha_{1,3} + \overline{\alpha_{2,1}}\alpha_{2,3} + \overline{\alpha_{3,1}}\alpha_{3,3} \\[2mm] \overline{\alpha_{1,2}}\alpha_{1,1} + \overline{\alpha_{2,2}}\alpha_{2,1} + \overline{\alpha_{3,2}}\alpha_{3,1} & \overline{\alpha_{1,2}}\alpha_{1,2} + \overline{\alpha_{2,2}}\alpha_{2,2} + \overline{\alpha_{3,2}}\alpha_{3,2} & \overline{\alpha_{1,2}}\alpha_{1,3} + \overline{\alpha_{2,2}}\alpha_{2,3} + \overline{\alpha_{3,2}}\alpha_{3,3} \\[2mm] \overline{\alpha_{1,3}}\alpha_{1,1} + \overline{\alpha_{2,3}}\alpha_{2,1} + \overline{\alpha_{3,3}}\alpha_{3,1} & \overline{\alpha_{1,3}}\alpha_{1,2} + \overline{\alpha_{2,3}}\alpha_{2,2} + \overline{\alpha_{3,3}}\alpha_{3,2} & \overline{\alpha_{1,3}}\alpha_{1,3} + \overline{\alpha_{2,3}}\alpha_{2,3} + \overline{\alpha_{3,3}}\alpha_{3,3} \end{pmatrix} \end{aligned}

If we form three vectors from the columns of U,

\vert \psi_1\rangle = \begin{pmatrix} \alpha_{1,1}\\ \alpha_{2,1}\\ \alpha_{3,1} \end{pmatrix}, \quad \vert \psi_2\rangle = \begin{pmatrix} \alpha_{1,2}\\ \alpha_{2,2}\\ \alpha_{3,2} \end{pmatrix}, \quad \vert \psi_3\rangle = \begin{pmatrix} \alpha_{1,3}\\ \alpha_{2,3}\\ \alpha_{3,3} \end{pmatrix},

then we can alternatively express the product above as follows:

U^{\dagger} U = \begin{pmatrix} \langle \psi_1\vert \psi_1 \rangle & \langle \psi_1\vert \psi_2 \rangle & \langle \psi_1\vert \psi_3 \rangle \\ \langle \psi_2\vert \psi_1 \rangle & \langle \psi_2\vert \psi_2 \rangle & \langle \psi_2\vert \psi_3 \rangle \\ \langle \psi_3\vert \psi_1 \rangle & \langle \psi_3\vert \psi_2 \rangle & \langle \psi_3\vert \psi_3 \rangle \end{pmatrix}

Referring to equation (3), we now see that the condition that this matrix is equal to the identity matrix is equivalent to the orthonormality of the set \{\vert\psi_1\rangle,\vert\psi_2\rangle,\vert\psi_3\rangle\}.

This argument generalizes to unitary matrices of any size. The fact that the rows of a matrix form an orthonormal basis if and only if the matrix is unitary then follows from the fact that a matrix is unitary if and only if its transpose is unitary.

Given the equivalence described above, together with the fact that every orthonormal set can be extended to form an orthonormal basis, we conclude the following useful fact: given any orthonormal set of vectors \{\vert\psi_1\rangle,\ldots,\vert\psi_m\rangle\} drawn from an n-dimensional space, there exists a unitary matrix U whose first m columns are the vectors \vert\psi_1\rangle,\ldots,\vert\psi_m\rangle. Pictorially, we can always find a unitary matrix having this form:

U = \left( \begin{array}{ccccccc} \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt}\\ \vert\psi_1\rangle & \vert\psi_2\rangle & \cdots & \vert\psi_m\rangle & \vert\psi_{m+1}\rangle & \cdots & \vert\psi_n\rangle\\ \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} & \rule{0.4pt}{10pt} & & \rule{0.4pt}{10pt} \end{array} \right).

Here, the last n-m columns are filled in with any choice of vectors \vert\psi_{m+1}\rangle,\ldots,\vert\psi_n\rangle that make \{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\} an orthonormal basis.
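This completion can be carried out in code by the Gram–Schmidt process mentioned earlier. The sketch below (the helper `complete_to_unitary` is my own, not a library function) extends the orthonormal pair \{\vert\phi^+\rangle, \vert\phi^-\rangle\} to a full unitary:

```python
import numpy as np

def complete_to_unitary(vectors):
    """Extend an orthonormal set (given as columns) to a unitary matrix
    via Gram-Schmidt, using standard basis vectors as candidates."""
    n, m = vectors.shape
    cols = [vectors[:, k] for k in range(m)]
    for e in np.eye(n):
        v = e.astype(complex)
        for c in cols:
            v = v - np.vdot(c, v) * c  # remove the component along c
        norm = np.linalg.norm(v)
        if norm > 1e-10:  # skip candidates already in the span
            cols.append(v / norm)
    return np.column_stack(cols[:n])

s = 1 / np.sqrt(2)
pair = np.column_stack([[s, 0, 0, s], [s, 0, 0, -s]]).astype(complex)
U = complete_to_unitary(pair)

print(np.allclose(U[:, :2], pair))             # first columns preserved
print(np.allclose(U.conj().T @ U, np.eye(4)))  # U is unitary
```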

Projections and projective measurements

Projection matrices

A square matrix \Pi is called a projection if it satisfies two properties:

  1. \Pi = \Pi^{\dagger}.
  2. \Pi^2 = \Pi.

Matrices that satisfy the first condition — that they are equal to their own conjugate transpose — are called Hermitian matrices, and matrices that satisfy the second condition — that squaring them leaves them unchanged — are called idempotent matrices.

As a word of caution, the word projection is sometimes used to refer to any matrix that satisfies just the second condition but not necessarily the first, and when this is done the term orthogonal projection is typically used to refer to matrices satisfying both properties. In this series, however, we will use the terms projection and projection matrix to mean matrices satisfying both conditions.

An example of a projection is the matrix

\Pi = \vert \psi \rangle \langle \psi \vert \tag{4}

for any unit vector ψ.\vert \psi\rangle. We can see that this matrix is Hermitian as follows:

\Pi^{\dagger} = \bigl( \vert \psi \rangle \langle \psi \vert \bigr)^{\dagger} = \bigl( \langle \psi \vert \bigr)^{\dagger}\bigl( \vert \psi \rangle \bigr)^{\dagger} = \vert \psi \rangle \langle \psi \vert = \Pi.

Here, to obtain the second equality, we have used the formula

(A B)^{\dagger} = B^{\dagger} A^{\dagger},

which is always true (for any two matrices AA and BB for which the product ABAB makes sense).

To see that the matrix Π\Pi in (4)(4) is idempotent, we can use the assumption that ψ\vert\psi\rangle is a unit vector, so that it satisfies ψψ=1.\langle \psi \vert \psi\rangle = 1. Thus, we have

\Pi^2 = \bigl( \vert\psi\rangle\langle \psi\vert \bigr)^2 = \vert\psi\rangle\langle \psi\vert\psi\rangle\langle\psi\vert = \vert\psi\rangle\langle\psi\vert = \Pi.

More generally, if {ψ1,,ψm}\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\} is any orthonormal set of vectors, then the matrix

\Pi = \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert \tag{5}

is a projection. Specifically, we have

\begin{aligned} \Pi^{\dagger} &= \biggl(\sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\biggr)^{\dagger} \\ &= \sum_{k = 1}^m \bigl(\vert\psi_k\rangle\langle\psi_k\vert\bigr)^{\dagger} \\ &= \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\\ &= \Pi, \end{aligned}

and

\begin{aligned} \Pi^2 & = \biggl( \sum_{j = 1}^m \vert \psi_j\rangle \langle \psi_j \vert\biggr)\biggl(\sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\biggr) \\ & = \sum_{j = 1}^m\sum_{k = 1}^m \vert \psi_j\rangle \langle \psi_j \vert \psi_k\rangle \langle \psi_k \vert \\ & = \sum_{k = 1}^m \vert \psi_k\rangle \langle \psi_k \vert\\ & = \Pi, \end{aligned}

where the orthonormality of {ψ1,,ψm}\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\} is used just for the second-to-last equality.

In fact, this exhausts all of the possibilities: every projection Π\Pi can be written in the form (5)(5) for some choice of an orthonormal set {ψ1,,ψm}.\{\vert \psi_1\rangle,\ldots,\vert \psi_m\rangle\}. (The zero matrix Π=0,\Pi=0, which is a projection, is a special case: to fit it into the general form (5) we have to allow the possibility that the sum is empty, resulting in the zero matrix.)
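The two defining properties are easy to check numerically. Here is a small sketch using NumPy, with an arbitrarily chosen orthonormal pair plugged into the form of equation (5):

```python
import numpy as np

# Two orthonormal vectors (an arbitrary choice for illustration).
psi1 = np.array([1, 0, 0], dtype=complex)
psi2 = np.array([0, 1, 1], dtype=complex) / np.sqrt(2)

# The projection onto the span of {psi1, psi2}, as in eq. (5).
Pi = np.outer(psi1, psi1.conj()) + np.outer(psi2, psi2.conj())

assert np.allclose(Pi, Pi.conj().T)   # Pi = Pi^dagger (Hermitian)
assert np.allclose(Pi @ Pi, Pi)       # Pi^2 = Pi (idempotent)
```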

Projective measurements

As has already been mentioned, the notion of a measurement of a quantum system is more general than just standard basis measurements. Projective measurements are measurements that are described by a collection of projections whose sum is equal to the identity matrix. In symbols, a collection {Π0,,Πm1}\{\Pi_0,\ldots,\Pi_{m-1}\} of projection matrices describes a projective measurement if

\Pi_0 + \cdots + \Pi_{m-1} = \mathbb{I}.

When such a measurement is performed on a system X\mathsf{X} while it is in some state ψ,\vert\psi\rangle, two things happen:

  1. For each k\in\{0,\ldots,m-1\}, the outcome of the measurement is k with probability equal to

    \operatorname{Pr}\bigl(\text{outcome is $k$}\bigr) = \bigl\| \Pi_k \vert \psi \rangle \bigr\|^2.
  2. For whichever outcome k the measurement produces, the state of \mathsf{X} becomes

    \frac{\Pi_k \vert\psi\rangle}{\bigl\|\Pi_k \vert\psi\rangle\bigr\|}.

We can also choose outcomes other than {0,,m1}\{0,\ldots,m-1\} for projective measurements if we wish. More generally, for any finite and nonempty set Σ,\Sigma, if we have a collection of projection matrices

\{\Pi_a:a\in\Sigma\}

that satisfies the condition

\sum_{a\in\Sigma} \Pi_a = \mathbb{I},

then this collection describes a projective measurement whose possible outcomes coincide with the set Σ,\Sigma, where the rules are the same as before:

  1. For each a\in\Sigma, the outcome of the measurement is a with probability equal to

    \operatorname{Pr}\bigl(\text{outcome is $a$}\bigr) = \bigl\| \Pi_a \vert \psi \rangle \bigr\|^2.
  2. For whichever outcome a the measurement produces, the state of \mathsf{X} becomes

    \frac{\Pi_a \vert\psi\rangle}{\bigl\|\Pi_a \vert\psi\rangle\bigr\|}.

For example, standard basis measurements are equivalent to projective measurements, where Σ\Sigma is the set of classical states of whatever system X\mathsf{X} we're talking about and our set of projection matrices is {aa:aΣ}.\{\vert a\rangle\langle a\vert:a\in\Sigma\}.

Another example of a projective measurement, this time on two qubits (X,Y),(\mathsf{X},\mathsf{Y}), is given by the set {Π0,Π1},\{\Pi_0,\Pi_1\}, where

\Pi_0 = \vert \phi^+\rangle\langle \phi^+ \vert + \vert \phi^-\rangle\langle \phi^- \vert + \vert \psi^+\rangle\langle \psi^+ \vert \quad\text{and}\quad \Pi_1 = \vert\psi^-\rangle\langle\psi^-\vert.
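This two-outcome measurement can be simulated directly. The following sketch (our own illustration, not code from the lesson) builds the two projections out of the Bell states and computes the outcome probabilities for the state |01⟩:

```python
import numpy as np

def ket(bits):
    # Computational basis vector for a bit string, e.g. ket("01") = |01>.
    v = np.zeros(2 ** len(bits))
    v[int(bits, 2)] = 1.0
    return v

# The four Bell states.
phi_plus  = (ket("00") + ket("11")) / np.sqrt(2)
phi_minus = (ket("00") - ket("11")) / np.sqrt(2)
psi_plus  = (ket("01") + ket("10")) / np.sqrt(2)
psi_minus = (ket("01") - ket("10")) / np.sqrt(2)

# The two projections described above.
Pi0 = sum(np.outer(v, v) for v in (phi_plus, phi_minus, psi_plus))
Pi1 = np.outer(psi_minus, psi_minus)

# Their sum is the identity, so {Pi0, Pi1} is a projective measurement.
assert np.allclose(Pi0 + Pi1, np.eye(4))

# Measuring the state |01> yields each outcome with probability 1/2.
state = ket("01")
p0 = np.linalg.norm(Pi0 @ state) ** 2
p1 = np.linalg.norm(Pi1 @ state) ** 2
assert np.isclose(p0, 0.5) and np.isclose(p1, 0.5)
```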

If we have multiple systems that are jointly in some quantum state and a projective measurement is performed on just one of the systems, the action is similar to what we had for standard basis measurements — and in fact we can now describe this action in much simpler terms than we could before. To be precise, let us suppose that we have two systems (X,Y)(\mathsf{X},\mathsf{Y}) in a quantum state ψ,\vert\psi\rangle, and a projective measurement described by a collection {Πa:aΣ}\{\Pi_a : a\in\Sigma\} is performed on the system X,\mathsf{X}, while nothing is done to Y.\mathsf{Y}. Doing this is then equivalent to performing the projective measurement described by the collection

\bigl\{ \Pi_a \otimes \mathbb{I} \,:\, a\in\Sigma\bigr\}

on the joint system (X,Y).(\mathsf{X},\mathsf{Y}). Each measurement outcome aa results with probability

\bigl\| (\Pi_a \otimes \mathbb{I})\vert \psi\rangle \bigr\|^2,

and conditioned on the result aa appearing, the state of the joint system (X,Y)(\mathsf{X},\mathsf{Y}) becomes

\frac{(\Pi_a \otimes \mathbb{I})\vert \psi\rangle}{\bigl\| (\Pi_a \otimes \mathbb{I})\vert \psi\rangle \bigr\|}.

Implementing projective measurements using standard basis measurements

Arbitrary projective measurements can be implemented using unitary operations, standard basis measurements, and an extra workspace system, as will now be explained.

Let us suppose that \mathsf{X} is a system and \{\Pi_0,\ldots,\Pi_{m-1}\} is a projective measurement on \mathsf{X}. We can easily generalize this discussion to projective measurements having different sets of outcomes, but in the interest of convenience and simplicity we will assume the set of possible outcomes for our measurement is \{0,\ldots,m-1\}. Let us note explicitly that m is not necessarily equal to the number of classical states of \mathsf{X} — we'll let n be the number of classical states of \mathsf{X}, which means that each matrix \Pi_k is an n\times n projection matrix. Because we assume that \{\Pi_0,\ldots,\Pi_{m-1}\} represents a projective measurement, it is necessarily the case that

\sum_{k = 0}^{m-1} \Pi_k = \mathbb{I}_n.

Our goal is to perform a process that has the same effect as performing this projective measurement on X,\mathsf{X}, but to do this using only unitary operations and standard basis measurements.

We will make use of an extra workspace system Y\mathsf{Y} to do this, and specifically we take the classical state set of Y\mathsf{Y} to be {0,,m1},\{0,\ldots,m-1\}, which is the same as the set of outcomes of the projective measurement. The idea is that we will perform a standard basis measurement on Y,\mathsf{Y}, and interpret the outcome of this measurement as being equivalent to the outcome of the projective measurement on X.\mathsf{X}. We will need to assume that Y\mathsf{Y} is initialized to some fixed state, which we'll choose to be 0.\vert 0\rangle. (Any other choice of fixed quantum state vector could be made to work, but choosing 0\vert 0\rangle makes the explanation to follow much simpler.)

Of course, in order for a standard basis measurement of Y\mathsf{Y} to tell us anything about X,\mathsf{X}, we will need to allow X\mathsf{X} and Y\mathsf{Y} to interact somehow before measuring Y,\mathsf{Y}, by performing a unitary operation on the system (Y,X).(\mathsf{Y},\mathsf{X}). First consider this matrix:

M = \sum_{k = 0}^{m-1} \vert k \rangle \langle 0 \vert \otimes \Pi_k.

Expressed explicitly as a block matrix, this matrix looks like this:

M = \begin{pmatrix} \Pi_0 & 0 & \cdots & 0\\[1mm] \Pi_1 & 0 & \cdots & 0\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \Pi_{m-1} & 0 & \cdots & 0 \end{pmatrix}.

(Each 00 in this matrix represents an n×nn\times n matrix filled entirely with zeros.)

Now, MM is certainly not a unitary matrix (unless m=1,m=1, in which case Π0=I,\Pi_0 = \mathbb{I}, giving M=IM = \mathbb{I} in this trivial case) because unitary matrices cannot have any columns (or rows) that are entirely 0;0; unitary matrices have columns that form orthonormal bases, and the all-zero vector is not a unit vector. However, it is the case that the first nn columns of MM are orthonormal, and we get this from the assumption that {Π0,,Πm1}\{\Pi_0,\ldots,\Pi_{m-1}\} is a measurement. To verify this claim, notice that for each j{0,,n1},j\in\{0,\ldots,n-1\}, column number jj of MM is this vector:

\vert \psi_j\rangle = M \vert 0, j\rangle = \sum_{k = 0}^{m-1} \vert k \rangle \otimes \Pi_k \vert j\rangle.

Taking the inner product of column ii with column jj (still assuming we're talking about the first nn columns, so i,j{0,,n1}i,j\in\{0,\ldots,n-1\}) gives

\begin{aligned} \langle \psi_i \vert \psi_j \rangle & = \biggl(\sum_{k = 0}^{m-1} \vert k \rangle \otimes \Pi_k \vert i\rangle\biggr)^{\dagger} \biggl(\sum_{l = 0}^{m-1} \vert l \rangle \otimes \Pi_l \vert j\rangle\biggr) \\ & = \sum_{k = 0}^{m-1} \sum_{l = 0}^{m-1} \langle k \vert l \rangle \langle i \vert \Pi_k \Pi_l \vert j\rangle\\ & = \sum_{k = 0}^{m-1} \langle i \vert \Pi_k \Pi_k \vert j\rangle\\ & = \sum_{k = 0}^{m-1} \langle i \vert \Pi_k \vert j\rangle\\ & = \langle i \vert \mathbb{I} \vert j \rangle\\ & = \begin{cases} 1 & i = j\\ 0 & i\neq j, \end{cases} \end{aligned}

which is what we needed to show.

Thus, because the first nn columns of the matrix MM are orthonormal, we can replace all of the remaining zero entries by some different choice of complex number entries so that the entire matrix is unitary:

U = \begin{pmatrix} \Pi_0 & \fbox{?} & \cdots & \fbox{?}\\[1mm] \Pi_1 & \fbox{?} & \cdots & \fbox{?}\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \Pi_{m-1} & \fbox{?} & \cdots & \fbox{?} \end{pmatrix}

(If we are given the matrices Π0,,Πm1,\Pi_0,\ldots,\Pi_{m-1}, we can compute suitable matrices to fill in for the blocks marked ?\fbox{?} in the equation — using the Gram–Schmidt process — but it will not matter specifically what these matrices are for the sake of this discussion.)

Finally we can describe the measurement process: we first perform UU on the joint system (Y,X)(\mathsf{Y},\mathsf{X}) and then measure Y\mathsf{Y} with respect to a standard basis measurement. For an arbitrary state ϕ\vert \phi \rangle of X,\mathsf{X}, we obtain the state

U \bigl( \vert 0\rangle \vert \phi\rangle\bigr) = M \bigl( \vert 0\rangle \vert \phi\rangle\bigr) = \sum_{k = 0}^{m-1} \vert k\rangle \otimes \Pi_k \vert\phi\rangle,

where the first equality follows from the fact that U and M agree on their first n columns. When we then perform a standard basis measurement of \mathsf{Y}, we obtain each outcome k with probability

\bigl\| \Pi_k \vert \phi\rangle \bigr\|^2,

in which case the state of (Y,X)(\mathsf{Y},\mathsf{X}) becomes

\vert k\rangle \otimes \frac{\Pi_k \vert \phi\rangle}{\bigl\| \Pi_k \vert \phi\rangle \bigr\|}.

Thus, Y\mathsf{Y} stores a copy of the measurement outcome and X\mathsf{X} changes precisely as it would had the projective measurement described by {Π0,,Πm1}\{\Pi_0,\ldots,\Pi_{m-1}\} been performed directly on X.\mathsf{X}.
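The whole construction can be carried out numerically. The sketch below (our own illustration, using the |+⟩/|−⟩ measurement on a single qubit as the example, so n = 2 and m = 2) builds M, completes its first n columns to a unitary U with a QR decomposition, and checks the resulting outcome probabilities:

```python
import numpy as np

np.random.seed(0)

# Hypothetical example: the projective measurement {|+><+|, |-><-|}.
plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)
projs = [np.outer(plus, plus), np.outer(minus, minus)]
n, m = 2, 2

# M = sum_k |k><0| (x) Pi_k : the projections stacked in the first n columns.
M = np.zeros((m * n, m * n))
for k, Pi in enumerate(projs):
    M[k * n:(k + 1) * n, :n] = Pi

# Complete the first n (orthonormal) columns of M to a unitary U via QR.
Q, _ = np.linalg.qr(
    np.column_stack([M[:, :n], np.random.randn(m * n, m * n - n)]))
Q[:, :n] = M[:, :n]      # QR may flip signs; restore the original columns
U = Q
assert np.allclose(U.T @ U, np.eye(m * n))

# Apply U to |0>|phi> with phi = |0>; measuring the top system then yields
# each outcome k with probability ||Pi_k |phi>||^2 = 1/2.
phi = np.array([1.0, 0.0])
out = U @ np.kron(np.array([1.0, 0.0]), phi)
probs = [np.linalg.norm(out[k * n:(k + 1) * n]) ** 2 for k in range(m)]
assert np.allclose(probs, [0.5, 0.5])
```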

Limitations on quantum information

Despite sharing a common underlying mathematical structure, quantum and classical information have some key differences. As we continue on in this series, we will see many examples of tasks that quantum information allows, but classical information does not.

Before doing this, however, we should note some important limitations on quantum information. Understanding the things quantum information can't do helps us identify the things it can do.

Irrelevance of global phases

The first limitation we'll cover — which is really more of a slight degeneracy in the way that quantum states are represented by quantum state vectors, as opposed to an actual limitation — concerns the notion of a global phase.

What we mean by a global phase is this. Suppose that ψ\vert \psi \rangle and ϕ\vert \phi \rangle are unit vectors representing quantum states of some system, and assume moreover that there exists a complex number α\alpha on the unit circle (meaning that α=1,\vert \alpha \vert = 1, or alternatively α=eiθ\alpha = e^{i\theta} for some real number θ\theta) such that

\vert \phi \rangle = \alpha \vert \psi \rangle.

The vectors ψ\vert \psi \rangle and ϕ\vert \phi \rangle are then said to differ by a global phase. We also sometimes refer to α\alpha as a global phase, although this is context-dependent: any number on the unit circle can be thought of as a global phase when multiplied to a unit vector.

Now consider what happens when a system is in one of two quantum states that differ by a global phase, ψ\vert\psi\rangle and ϕ,\vert\phi\rangle, and the system undergoes a standard basis measurement. In the first case, in which the system is in the state ψ,\vert\psi\rangle, the probability of measuring any classical state aa is

\bigl\vert \langle a \vert \psi \rangle \bigr\vert^2.

In the second case, in which the system is in the state ϕ,\vert\phi\rangle, the probability of measuring any classical state aa is

\bigl\vert \langle a \vert \phi \rangle \bigr\vert^2 = \bigl\vert \alpha \langle a \vert \psi \rangle \bigr\vert^2 = \vert \alpha \vert^2 \bigl\vert \langle a \vert \psi \rangle \bigr\vert^2 = \bigl\vert \langle a \vert \psi \rangle \bigr\vert^2,

because α=1.\vert\alpha\vert = 1. That is, the probability of an outcome appearing is the same for both states.

Let's consider what happens when we apply an arbitrary unitary operation UU to both states. In the first case, in which the initial state is ψ,\vert \psi \rangle, the state becomes

U \vert \psi \rangle,

and in the second case, in which the initial state is ϕ,\vert \phi\rangle, it becomes

U \vert \phi \rangle = \alpha U \vert \psi \rangle.

That is, the two resulting states still differ by the same global phase α.\alpha.

Consequently, two quantum states \vert\psi\rangle and \vert\phi\rangle that differ by a global phase are completely indistinguishable: no matter what operation, or sequence of operations, we apply to the two states, they will always differ by a global phase, and a standard basis measurement will produce outcomes with precisely the same probabilities for both. For this reason, two quantum state vectors that differ by a global phase are considered to be equivalent, and are effectively viewed as being the same state.

For example, the quantum states

\vert - \rangle = \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{1}{\sqrt{2}} \vert 1 \rangle \quad\text{and}\quad -\vert - \rangle = -\frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{1}{\sqrt{2}} \vert 1 \rangle

differ by a global phase (which is 1-1 in this example), and are therefore considered to be the same state.

On the other hand, the quantum states

\vert + \rangle = \frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{1}{\sqrt{2}} \vert 1 \rangle \quad\text{and}\quad \vert - \rangle = \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{1}{\sqrt{2}} \vert 1 \rangle

do not differ by a global phase. Although the only difference between the two states is that a plus sign turns into a minus sign, this is not a global phase difference but a relative phase difference, because it does not affect every vector entry, only a proper subset of the entries. This is consistent with what we have already observed in Lesson 1, which is that the states \vert + \rangle and \vert - \rangle can be discriminated perfectly: performing a Hadamard operation and then measuring yields outcome probabilities as follows:

\begin{aligned} \bigl\vert \langle 0 \vert H \vert + \rangle \bigr\vert^2 = 1 & \hspace{1cm} \bigl\vert \langle 0 \vert H \vert - \rangle \bigr\vert^2 = 0 \\[1mm] \bigl\vert \langle 1 \vert H \vert + \rangle \bigr\vert^2 = 0 & \hspace{1cm} \bigl\vert \langle 1 \vert H \vert - \rangle \bigr\vert^2 = 1. \end{aligned}
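Both observations are easy to verify numerically. The sketch below (our own illustration) checks that a global phase leaves measurement statistics unchanged after a Hadamard, while the relative phase between |+⟩ and |−⟩ does not:

```python
import numpy as np

psi = np.array([1.0, 1.0j]) / np.sqrt(2)
phi = np.exp(0.73j) * psi      # differs from psi by a global phase

# Apply the same unitary (a Hadamard here) to both states and compare
# the standard basis measurement probabilities: they are identical.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
assert np.allclose(np.abs(H @ psi) ** 2, np.abs(H @ phi) ** 2)

# A relative phase, by contrast, changes the statistics: H perfectly
# discriminates |+> and |->.
plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)
assert np.allclose(np.abs(H @ plus) ** 2,  [1.0, 0.0])
assert np.allclose(np.abs(H @ minus) ** 2, [0.0, 1.0])
```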

As an aside, here we find another advantage of the general description of quantum information based on density matrices over the simplified description based on quantum state vectors. In the general description of quantum information, the degeneracy in which two quantum state vectors can differ by a global phase, and hence effectively represent the same quantum state, disappears. That is, two density matrices that differ necessarily represent two distinct quantum states that can be discriminated in a statistical sense.

No-cloning theorem

The no-cloning theorem shows that it is impossible to create a perfect copy of an unknown quantum state.

Theorem (No-cloning theorem). Let X\mathsf{X} and Y\mathsf{Y} be systems sharing the same classical state set Σ\Sigma having at least two elements. There does not exist a quantum state ϕ\vert \phi\rangle of Y\mathsf{Y} and a unitary operation UU on the pair (X,Y)(\mathsf{X},\mathsf{Y}) such that U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every state ψ\vert \psi \rangle of X.\mathsf{X}.

That is, there is no way to initialize the system Y\mathsf{Y} (to any state ϕ\vert\phi\rangle whatsoever) and perform a unitary operation UU on the joint system (X,Y)(\mathsf{X},\mathsf{Y}) so that the effect is for the state ψ\vert\psi\rangle of X\mathsf{X} to be cloned — resulting in (X,Y)(\mathsf{X},\mathsf{Y}) being in the state ψψ.\vert \psi \rangle \otimes \vert\psi\rangle.

The proof of this theorem is actually quite simple: it boils down to the observation that the mapping

\vert\psi\rangle \otimes \vert \phi\rangle\mapsto\vert\psi\rangle \otimes \vert \psi\rangle

is not linear in ψ.\vert\psi\rangle.

In particular, because Σ\Sigma has at least two elements, we may choose a,bΣa,b\in\Sigma with ab.a\neq b. If there did exist a quantum state ϕ\vert \phi\rangle of Y\mathsf{Y} and a unitary operation UU on the pair (X,Y)(\mathsf{X},\mathsf{Y}) for which U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every quantum state ψ\vert\psi\rangle of X,\mathsf{X}, then it would be the case that

U \bigl( \vert a \rangle \otimes \vert\phi\rangle\bigr) = \vert a \rangle \otimes \vert a\rangle \quad\text{and}\quad U \bigl( \vert b \rangle \otimes \vert\phi\rangle\bigr) = \vert b \rangle \otimes \vert b\rangle.

By linearity, meaning specifically the linearity of the tensor product in the first argument and the linearity of matrix-vector multiplication in the second (vector) argument, we must therefore have

U \biggl(\biggl( \frac{1}{\sqrt{2}}\vert a \rangle + \frac{1}{\sqrt{2}} \vert b\rangle \biggr) \otimes \vert\phi\rangle\biggr) = \frac{1}{\sqrt{2}} \vert a \rangle \otimes \vert a\rangle + \frac{1}{\sqrt{2}} \vert b \rangle \otimes \vert b\rangle.

However, the requirement that U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every quantum state ψ\vert\psi\rangle demands that

\begin{aligned} & U \biggl(\biggl( \frac{1}{\sqrt{2}}\vert a \rangle + \frac{1}{\sqrt{2}} \vert b\rangle \biggr) \otimes \vert\phi\rangle\biggr)\\ & \qquad = \biggl(\frac{1}{\sqrt{2}} \vert a \rangle + \frac{1}{\sqrt{2}} \vert b \rangle\biggr) \otimes \biggl(\frac{1}{\sqrt{2}} \vert a \rangle + \frac{1}{\sqrt{2}} \vert b \rangle\biggr)\\ & \qquad = \frac{1}{2} \vert a \rangle \otimes \vert a\rangle + \frac{1}{2} \vert a \rangle \otimes \vert b\rangle + \frac{1}{2} \vert b \rangle \otimes \vert a\rangle + \frac{1}{2} \vert b \rangle \otimes \vert b\rangle\\ & \qquad \neq \frac{1}{\sqrt{2}} \vert a \rangle \otimes \vert a\rangle + \frac{1}{\sqrt{2}} \vert b \rangle \otimes \vert b\rangle. \end{aligned}

Therefore there cannot exist a state ϕ\vert \phi\rangle and a unitary operation UU for which U(ψϕ)=ψψU \bigl( \vert \psi \rangle \otimes \vert\phi\rangle\bigr) = \vert \psi \rangle \otimes \vert\psi\rangle for every quantum state vector ψ.\vert \psi\rangle.

A few remarks concerning the no-cloning theorem are in order. The first one is that the statement of the no-cloning theorem above is absolute, in the sense that it states that perfect cloning is impossible — but it does not say anything about possibly cloning with limited accuracy, where we might succeed in producing an approximate clone (with respect to some way of measuring how similar two different quantum states might be). There are, in fact, statements of the no-cloning theorem that place limitations on approximate cloning, as well as methods to achieve approximate cloning (with limited accuracy), but we will delay this discussion to a later lesson when the pieces needed to explain approximate cloning are in place.

The second remark is that the no-cloning theorem is a statement about the impossibility of cloning an arbitrary state ψ.\vert\psi\rangle. We can easily create a clone of any standard basis state, for instance. For example, we can clone a qubit standard basis state using a controlled-NOT operation:

Classical copy

While there is no difficulty in creating a clone of a standard basis state, this does not contradict the no-cloning theorem — this approach of using a controlled-NOT gate would not succeed in creating a clone of the state +,\vert + \rangle, for instance.
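The following sketch (our own illustration) verifies both claims numerically: a controlled-NOT gate clones the standard basis states, but it maps |+⟩|0⟩ to the entangled state |φ⁺⟩ rather than to the product |+⟩|+⟩:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)

# Controlled-NOT with the first qubit as control.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

# Standard basis states are cloned ...
assert np.allclose(CNOT @ np.kron(ket0, ket0), np.kron(ket0, ket0))
assert np.allclose(CNOT @ np.kron(ket1, ket0), np.kron(ket1, ket1))

# ... but |+> is not: CNOT(|+>|0>) is the entangled state |phi+>,
# not the product state |+>|+>.
out = CNOT @ np.kron(plus, ket0)
phi_plus = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)
assert np.allclose(out, phi_plus)
assert not np.allclose(out, np.kron(plus, plus))
```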

One final remark about the no-cloning theorem is that it really isn't unique to quantum information: it's also impossible to clone an arbitrary probabilistic state using a classical (deterministic or probabilistic) process. This is pretty intuitive. Imagine someone hands you a system in some probabilistic state, but you're not sure what that probabilistic state is. For example, maybe they randomly generated a number between 1 and 10, but they didn't tell you how they generated that number. There's certainly no physical process through which you can obtain two independent copies of that same probabilistic state: all you have in your hands is a number between 1 and 10, and there just isn't enough information present for you to somehow reconstruct the probabilities for all of the other outcomes to appear. Mathematically speaking, a version of the no-cloning theorem for probabilistic states can be proved in exactly the same way as the regular no-cloning theorem (for quantum states). That is, cloning an arbitrary probabilistic state is a non-linear process, so it cannot possibly be represented by a stochastic matrix.

Non-orthogonal states cannot be perfectly discriminated

For the final limitation covered in this lesson, we'll show that if we have two quantum states ψ\vert\psi\rangle and ϕ\vert\phi\rangle that are not orthogonal, which means that ϕψ0,\langle \phi\vert\psi\rangle \neq 0, then it's impossible to discriminate them (or, in other words, to tell them apart) perfectly.

In fact, we'll show something logically equivalent: if we do have a way to discriminate two states perfectly, without any error, then they must be orthogonal.

We will restrict our attention to quantum circuits that consist of any number of unitary gates, followed by a single standard basis measurement of the top qubit. What we require of a quantum circuit, to say that it perfectly discriminates the states ψ\vert\psi\rangle and ϕ,\vert\phi\rangle, is that the measurement always yields the value 00 for one of the two states and always yields 11 for the other state. To be precise, we shall assume that we have a quantum circuit that operates as the following diagrams suggest:

Discriminate psi

The box labeled UU denotes the unitary operation representing the combined action of all of the unitary gates in our circuit, but not including the final measurement. There is no loss of generality in assuming that the measurement outputs 00 for ψ\vert\psi\rangle and 11 for ϕ;\vert\phi\rangle; the analysis would not differ fundamentally if these output values were reversed.

Notice that in addition to the qubits that initially store either ψ\vert\psi\rangle or ϕ,\vert\phi\rangle, the circuit is free to make use of any number of additional workspace qubits. These qubits are initially each set to the 0\vert 0\rangle state — so their combined state is denoted 00\vert 0\cdots 0\rangle in the figures — and these qubits can be used by the circuit in any way that might be beneficial. It is very common to make use of workspace qubits in quantum circuits like this, as we will see in the next unit.

Now, consider what happens when we run our circuit on the state ψ\vert\psi\rangle (along with the initialized workspace qubits). The resulting state, immediately prior to the measurement being performed, can be written as

U \bigl( \vert 0\cdots 0 \rangle \vert \psi \rangle\bigr) = \vert \gamma_0\rangle\vert 0 \rangle + \vert \gamma_1 \rangle\vert 1 \rangle

for two vectors γ0\vert \gamma_0\rangle and γ1\vert \gamma_1\rangle that correspond to all of the qubits except the top qubit. In general, for such a state the probabilities that a measurement of the top qubit yields the outcomes 00 and 11 are as follows:

\operatorname{Pr}(\text{outcome is $0$}) = \bigl\| \vert\gamma_0\rangle \bigr\|^2 \qquad\text{and}\qquad \operatorname{Pr}(\text{outcome is $1$}) = \bigl\| \vert\gamma_1\rangle \bigr\|^2.

Because our circuit always outputs 00 for the state ψ,\vert\psi\rangle, it must be that γ1=0,\vert\gamma_1\rangle = 0, and so

U \bigl( \vert 0\cdots 0\rangle\vert \psi \rangle \bigr) = \vert\gamma_0\rangle\vert 0 \rangle.

Multiplying both sides of this equation by UU^{\dagger} yields this equation:

\vert 0\cdots 0\rangle\vert \psi \rangle = U^{\dagger} \bigl( \vert \gamma_0\rangle\vert 0 \rangle \bigr). \tag{7}

Reasoning similarly for ϕ\vert\phi\rangle in place of ψ,\vert\psi\rangle, we conclude that

U \bigl( \vert 0\cdots 0\rangle\vert \phi \rangle \bigr) = \vert \delta_1\rangle\vert 1 \rangle

for some vector δ1,\vert\delta_1\rangle, and therefore

\vert 0\cdots 0\rangle\vert \phi \rangle = U^{\dagger} \bigl( \vert \delta_1\rangle\vert 1 \rangle\bigr). \tag{8}

Now let us take the inner product of the vectors represented by the equations (7)(7) and (8),(8), starting with the representations on the right-hand side of each equation. We have

\bigl(U^{\dagger} \bigl( \vert \gamma_0\rangle\vert 0 \rangle \bigr)\bigr)^{\dagger} = \bigl( \langle\gamma_0\vert\langle 0\vert \bigr)U,

so the inner product of the vector (7)(7) with the vector (8)(8) is

\bigl( \langle\gamma_0\vert\langle 0\vert \bigr)U U^{\dagger} \bigl( \vert \delta_1\rangle\vert 1 \rangle\bigr) = \bigl( \langle\gamma_0\vert\langle 0\vert \bigr) \bigl( \vert \delta_1\rangle\vert 1 \rangle\bigr) = \langle \gamma_0 \vert \delta_1\rangle \langle 0 \vert 1 \rangle = 0.

Here we have used the fact that UU=I,U U^{\dagger} = \mathbb{I}, as well as the fact that the inner product of tensor products is the product of the inner products:

\langle u \otimes v \vert w \otimes x\rangle = \langle u \vert w\rangle \langle v \vert x\rangle

for any choices of these vectors (assuming u\vert u\rangle and w\vert w\rangle have the same number of entries and v\vert v\rangle and x\vert x\rangle have the same number of entries, so that it makes sense to form the inner products uw\langle u\vert w\rangle and vx\langle v\vert x \rangle). Notice that the value of the inner product γ0δ1\langle \gamma_0 \vert \delta_1\rangle is irrelevant because it is multiplied by 01=0.\langle 0 \vert 1 \rangle = 0. This is fortunate because we really don't know much about these two vectors.

Finally, taking the inner product of the vectors (7)(7) and (8)(8) in terms of the left-hand side of the equations must result in the same zero value, and so

0 = \bigl( \vert 0\cdots 0\rangle \vert \psi\rangle\bigr)^{\dagger} \bigl(\vert 0\cdots 0\rangle\vert \phi\rangle\bigr) = \langle 0\cdots 0 \vert 0\cdots 0 \rangle \langle \psi \vert \phi \rangle = \langle \psi \vert \phi \rangle.

We have concluded what we wanted, which is that ψ\vert \psi\rangle and ϕ\vert\phi\rangle are orthogonal: ψϕ=0.\langle \psi \vert \phi \rangle = 0.

It is possible, by the way, to perfectly discriminate any two states that are orthogonal. Suppose that the two states to be discriminated are ϕ\vert \phi\rangle and ψ,\vert \psi\rangle, where ϕψ=0.\langle \phi\vert\psi\rangle = 0. We can then perfectly discriminate these states by performing the projective measurement described by these matrices, for instance:

\bigl\{ \vert\phi\rangle\langle\phi\vert,\,\mathbb{I} - \vert\phi\rangle\langle\phi\vert \bigr\}.

For the state ϕ,\vert\phi\rangle, the first outcome is always obtained:

\begin{aligned} & \bigl\| \vert\phi\rangle\langle\phi\vert \vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle\langle\phi\vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle \bigr\|^2 = 1,\\[1mm] & \bigl\| (\mathbb{I} - \vert\phi\rangle\langle\phi\vert) \vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle - \vert\phi\rangle\langle\phi\vert\phi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle - \vert\phi\rangle \bigr\|^2 = 0. \end{aligned}

And, for the state ψ,\vert\psi\rangle, the second outcome is always obtained:

\begin{aligned} & \bigl\| \vert\phi\rangle\langle\phi\vert \vert\psi\rangle \bigr\|^2 = \bigl\| \vert\phi\rangle\langle\phi\vert\psi\rangle \bigr\|^2 = \bigl\| 0 \bigr\|^2 = 0,\\[1mm] & \bigl\| (\mathbb{I} - \vert\phi\rangle\langle\phi\vert) \vert\psi\rangle \bigr\|^2 = \bigl\| \vert\psi\rangle - \vert\phi\rangle\langle\phi\vert\psi\rangle \bigr\|^2 = \bigl\| \vert\psi\rangle \bigr\|^2 = 1. \end{aligned}
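This projective measurement is easy to verify numerically. Here is a sketch for the orthogonal pair |+⟩ and |−⟩ (our own choice of example states):

```python
import numpy as np

plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)   # <plus|minus> = 0

Pi_first  = np.outer(plus, plus)             # |phi><phi| with phi = |+>
Pi_second = np.eye(2) - Pi_first

# For |+> the first outcome is certain; for |-> the second is.
assert np.isclose(np.linalg.norm(Pi_first  @ plus)  ** 2, 1.0)
assert np.isclose(np.linalg.norm(Pi_second @ plus)  ** 2, 0.0)
assert np.isclose(np.linalg.norm(Pi_first  @ minus) ** 2, 0.0)
assert np.isclose(np.linalg.norm(Pi_second @ minus) ** 2, 1.0)
```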
