Multiple systems

Introduction

The focus of this lesson is on the basics of quantum information when there are multiple systems being considered. Such situations arise naturally in the context of information processing, both classical and quantum. Large information-carrying systems are often most easily constructed using collections of smaller systems, such as bits or qubits.

A simple, yet critically important, idea to keep in mind going into this lesson is that we can always choose to view multiple systems together as if they form a single, compound system — to which the discussion in the previous lesson applies. Indeed, this idea very directly leads to a description of how quantum states, measurements, and operations work for multiple systems.

There is more to understanding multiple quantum systems, however, than simply recognizing that they may be viewed collectively as single systems. For instance, we may have multiple quantum systems that are collectively in a particular quantum state, and then choose to measure just one (or a proper subset) of the individual systems. In general, this will affect the state of the remaining systems, and it is important to understand exactly how when analyzing quantum algorithms and protocols. An understanding of the sorts of correlations among multiple systems, and particularly a type of correlation known as entanglement, is also important in quantum information and computation.

Classical information

As in the previous lesson, we will begin with a discussion of classical information. Once again, the probabilistic and quantum descriptions are mathematically similar, and recognizing how the mathematics works in the familiar setting of classical information is helpful in understanding why quantum information is described in the way that it is.

Classical states via the Cartesian product

We will start at a very basic level, with classical states of multiple systems. For simplicity, we will begin by discussing just two systems, and then generalize to more than two systems.

To be precise, let us suppose that $\mathsf{X}$ is a system whose classical state set is $\Sigma,$ and $\mathsf{Y}$ is a second system having classical state set $\Gamma.$ As in the previous lesson, because we have referred to these sets as classical state sets, our assumption is that $\Sigma$ and $\Gamma$ are both finite and nonempty. It could be that $\Sigma = \Gamma,$ but this is not necessarily so, and regardless it is helpful to use different names to refer to these sets in the interest of clarity.

Now imagine that the two systems, $\mathsf{X}$ and $\mathsf{Y},$ are placed side-by-side, with $\mathsf{X}$ on the left and $\mathsf{Y}$ on the right. If we so choose, we can view these two systems as if they form a single system, which we can denote by $(\mathsf{X},\mathsf{Y})$ or $\mathsf{XY}$ depending on our preference.

A natural question to ask about this compound system $(\mathsf{X},\mathsf{Y})$ is, "What are its classical states?"

The answer is that the set of classical states of $(\mathsf{X},\mathsf{Y})$ is the Cartesian product of $\Sigma$ and $\Gamma,$ which is the set defined as

$$
\Sigma\times\Gamma = \bigl\{(a,b)\,:\,a\in\Sigma\;\text{and}\;b\in\Gamma\bigr\}.
$$

In simple terms, the Cartesian product is precisely the mathematical notion that captures the idea of viewing an element of one set and an element of a second set together, as if they form a single element of a single set.

In the case at hand, to say that $(\mathsf{X},\mathsf{Y})$ is in the classical state $(a,b)\in\Sigma\times\Gamma$ means that $\mathsf{X}$ is in the classical state $a\in\Sigma$ and $\mathsf{Y}$ is in the classical state $b\in\Gamma;$ and if the classical state of $\mathsf{X}$ is $a\in\Sigma$ and the classical state of $\mathsf{Y}$ is $b\in\Gamma,$ then the classical state of the joint system $(\mathsf{X},\mathsf{Y})$ is $(a,b).$

For more than two systems, the situation generalizes in a natural way. If we suppose that $\mathsf{X}_1,\ldots,\mathsf{X}_n$ are systems having classical state sets $\Sigma_1,\ldots,\Sigma_n,$ respectively, for any positive integer $n,$ the classical state set of the $n$-tuple $(\mathsf{X}_1,\ldots,\mathsf{X}_n),$ viewed as a single joint system, is the Cartesian product

$$
\Sigma_1\times\cdots\times\Sigma_n = \bigl\{(a_1,\ldots,a_n)\,:\, a_1\in\Sigma_1,\:\ldots,\:a_n\in\Sigma_n\bigr\}.
$$

Note that we're free to use whatever names we wish for systems, and we're free to order them as we choose. In particular, if we have $n$ systems like above, we can instead choose to name them $\mathsf{X}_{n-1},\ldots,\mathsf{X}_0$ and order them in this way, meaning as $(\mathsf{X}_{n-1},\ldots,\mathsf{X}_0).$ Mimicking the same pattern for naming the associated classical states and classical state sets, we might then refer to a classical state $(a_{n-1},\ldots,a_0) \in \Sigma_{n-1}\times \cdots \times \Sigma_0$ of this compound system. Indeed, this is the standard ordering convention used by Qiskit for naming qubits, and we'll come back to this in the next lesson as we turn our focus to quantum circuits.

Representing states as strings

It is often convenient to write a classical state $(a_1,\ldots,a_n)$ as a string $a_1\cdots a_n$ for the sake of brevity, particularly in the (very typical) situation that the classical state sets $\Sigma_1,\ldots,\Sigma_n$ are associated with sets of symbols or characters.

Indeed, the notion of a string, which is a fundamentally important concept in computer science, is formalized in mathematical terms through Cartesian products. The term alphabet is commonly used to refer to sets of symbols used to form strings, but the mathematical definition of an alphabet is precisely the same as the definition of a classical state set: it is a finite and nonempty set.

For example, suppose that $\mathsf{X}_1,\ldots,\mathsf{X}_{10}$ are bits, so that the classical state sets of these systems are all the same:

$$
\Sigma_1 = \Sigma_2 = \cdots = \Sigma_{10} = \{0,1\}
$$

(The set $\{0,1\}$ is commonly referred to as the binary alphabet.) There are then $2^{10} = 1024$ classical states of the joint system $(\mathsf{X}_1,\ldots,\mathsf{X}_{10}),$ which are the elements of the set

$$
\Sigma_1\times\Sigma_2\times\cdots\times\Sigma_{10} = \{0,1\}^{10}.
$$

Written as strings, these classical states look like this:

$$
\begin{array}{c}
0000000000\\
0000000001\\
0000000010\\
0000000011\\
0000000100\\
\vdots\\[1mm]
1111111111
\end{array}
$$

For the classical state $0001010000,$ for instance, we see that $\mathsf{X}_4$ and $\mathsf{X}_6$ are in the state $1,$ while all other systems are in the state $0.$
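As a small illustration of the string view, the following Python sketch enumerates $\{0,1\}^{10}$ with the standard library's `itertools.product`. The variable names (`bit`, `states`, `ones`) are chosen here for illustration and are not part of the lesson.

```python
from itertools import product

# Classical state set of a single bit (the binary alphabet).
bit = ["0", "1"]

# All classical states of ten bits, each written as a string of length 10.
states = ["".join(symbols) for symbols in product(bit, repeat=10)]

print(len(states))   # number of classical states of the joint system
print(states[:5])    # the first few strings in the listing
print(states[-1])    # the last string in the listing

# Systems are named X_1, ..., X_10 from left to right,
# so the k-th symbol of a string is the state of X_k.
s = "0001010000"
ones = [k for k, symbol in enumerate(s, start=1) if symbol == "1"]
print(ones)          # indices of the systems that are in state 1
```

Because `itertools.product` varies the rightmost position fastest, the strings come out in the same order as the listing above.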

Probabilistic states

Recall from the previous lesson that a probabilistic state associates a probability with each classical state of a system. Thus, a probabilistic state of multiple systems — viewed collectively as if they form a single system — associates a probability with each element of the Cartesian product of the classical state sets of the individual systems.

For example, suppose that $\mathsf{X}$ and $\mathsf{Y}$ are both bits, so that their corresponding classical state sets are $\Sigma = \{0,1\}$ and $\Gamma = \{0,1\},$ respectively. Here is a probabilistic state of the pair $(\mathsf{X},\mathsf{Y}):$

$$
\begin{aligned}
\operatorname{Pr}\bigl( (\mathsf{X},\mathsf{Y}) = (0,0)\bigr) & = 1/2 \\[2mm]
\operatorname{Pr}\bigl( (\mathsf{X},\mathsf{Y}) = (0,1)\bigr) & = 0\\[2mm]
\operatorname{Pr}\bigl( (\mathsf{X},\mathsf{Y}) = (1,0)\bigr) & = 0\\[2mm]
\operatorname{Pr}\bigl( (\mathsf{X},\mathsf{Y}) = (1,1)\bigr) & = 1/2
\end{aligned}
$$

This probabilistic state is one in which both $\mathsf{X}$ and $\mathsf{Y}$ are random bits (each is $0$ with probability $1/2$ and $1$ with probability $1/2$), but the classical states of the two bits always agree. This is an example of a correlation between these systems.

Ordering Cartesian product state sets

Probabilistic states of systems are represented by probability vectors, which are column vectors having indices that have been placed in correspondence with the underlying classical state set of the system being considered.

The same situation arises for multiple systems. To represent a probabilistic state of multiple systems as a probability vector, one must decide on an ordering of the elements of the Cartesian product of their classical state sets. Assuming the individual classical state sets $\Sigma$ and $\Gamma$ of systems $\mathsf{X}$ and $\mathsf{Y}$ are already ordered, there is a simple convention for doing this: alphabetical ordering. More precisely, the entries in each $n$-tuple (or, equivalently, the symbols in each string) are viewed as being ordered by significance that decreases from left to right.

For example, according to this convention, the Cartesian product $\{1,2,3\}\times\{0,1\}$ is ordered like this:

$$
(1,0),\; (1,1),\; (2,0),\; (2,1),\; (3,0),\; (3,1).
$$

When $n$-tuples are written as strings and ordered in this way, we observe familiar patterns, such as $\{0,1\}\times\{0,1\}$ being ordered as $00, 01, 10, 11,$ and the set $\{0,1\}^{10}$ being ordered as was suggested above. We also see $\{0, 1, \dots, 9\} \times \{0, 1, \dots, 9\}$ ordered as the numbers $0$ through $99.$ This is not a coincidence: today's decimal number system uses the same alphabetical ordering. Here, of course, "alphabetical" has a broader meaning that may include a collection of numeric symbols.
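The ordering convention can be sketched in a few lines of Python. The helper `position` is a hypothetical name introduced here; it computes where a pair lands in the alphabetical ordering, assuming the individual state sets are given as ordered lists.

```python
from itertools import product

sigma = [1, 2, 3]   # ordered classical state set of X
gamma = [0, 1]      # ordered classical state set of Y

# itertools.product varies the rightmost entry fastest, which matches the
# convention that significance decreases from left to right.
ordered = list(product(sigma, gamma))
print(ordered)

def position(a, b):
    """0-based index of the pair (a, b) in the alphabetical ordering."""
    return sigma.index(a) * len(gamma) + gamma.index(b)

# Each pair sits at the index given by the formula above.
print([position(a, b) for (a, b) in ordered])
```

The same index formula is what places the entry for the pair $(a,b)$ in a probability vector of the joint system.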

Returning to the example of two bits from above, the probabilistic state is represented by the following probability vector (where the entries are labeled explicitly for the sake of clarity).

$$
\begin{pmatrix}
\frac{1}{2}\\[1mm]
0\\[1mm]
0\\[1mm]
\frac{1}{2}
\end{pmatrix}
\begin{array}{l}
\leftarrow \text{probability associated with state 00}\\[1mm]
\leftarrow \text{probability associated with state 01}\\[1mm]
\leftarrow \text{probability associated with state 10}\\[1mm]
\leftarrow \text{probability associated with state 11}
\end{array}
\tag{1}
$$

Independence of two systems

A special type of probabilistic state of two systems is one in which the systems are independent. Intuitively speaking, two systems are independent if learning the classical state of either system has no effect on the probabilities associated with the other. That is, learning what classical state one of the systems is in provides no information at all about the classical state of the other.

To define this notion precisely, let us suppose once again that $\mathsf{X}$ and $\mathsf{Y}$ are systems having classical state sets $\Sigma$ and $\Gamma,$ respectively. With respect to a given probabilistic state of these systems, they are said to be independent if it is the case that

$$
\operatorname{Pr}((\mathsf{X},\mathsf{Y}) = (a,b)) = \operatorname{Pr}(\mathsf{X} = a) \operatorname{Pr}(\mathsf{Y} = b) \tag{2}
$$

for every choice of $a\in\Sigma$ and $b\in\Gamma.$

To express this condition in terms of probability vectors, assume that the given probabilistic state of $(\mathsf{X},\mathsf{Y})$ is described by a probability vector, written in the Dirac notation as

$$
\sum_{(a,b) \in \Sigma\times\Gamma} p_{ab} \vert a b\rangle.
$$

The condition $(2)$ for independence is then equivalent to the existence of two probability vectors

$$
\vert \phi \rangle = \sum_{a\in\Sigma} q_a \vert a \rangle \quad\text{and}\quad \vert \psi \rangle = \sum_{b\in\Gamma} r_b \vert b \rangle, \tag{3}
$$

representing the probabilities associated with the classical states of $\mathsf{X}$ and $\mathsf{Y},$ respectively, such that

$$
p_{ab} = q_a r_b \tag{4}
$$

for all $a\in\Sigma$ and $b\in\Gamma.$

For example, the probabilistic state of a pair of bits $(\mathsf{X},\mathsf{Y})$ represented by the vector

$$
\frac{1}{6} \vert 00 \rangle + \frac{1}{12} \vert 01 \rangle + \frac{1}{2} \vert 10 \rangle + \frac{1}{4} \vert 11 \rangle
$$

is one in which $\mathsf{X}$ and $\mathsf{Y}$ are independent. Specifically, the condition required for independence is true for the probability vectors

$$
\vert \phi \rangle = \frac{1}{4} \vert 0 \rangle + \frac{3}{4} \vert 1 \rangle \quad\text{and}\quad \vert \psi \rangle = \frac{2}{3} \vert 0 \rangle + \frac{1}{3} \vert 1 \rangle.
$$

For example, to match the $00$ entry, we need $\frac{1}{6} = \frac{1}{4} \times \frac{2}{3},$ and indeed this is the case. Other entries can be verified in a similar manner.
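The remaining entries can also be checked numerically. The sketch below assumes NumPy and uses the fact that, if $p_{ab} = q_a r_b$ holds for probability vectors, then summing over $b$ forces $q_a$ to be the marginal probability of $\mathsf{X} = a$ (and likewise for $r_b$), so it suffices to compare the joint probabilities with the product of the marginals. The names `joint`, `table`, `phi`, and `psi` are illustrative.

```python
import numpy as np

# Joint probability vector, entries ordered as 00, 01, 10, 11.
joint = np.array([1/6, 1/12, 1/2, 1/4])

# View the vector as a table: rows indexed by X, columns indexed by Y.
table = joint.reshape(2, 2)

# Marginal probabilities of X and Y.
phi = table.sum(axis=1)   # approximately [1/4, 3/4]
psi = table.sum(axis=0)   # approximately [2/3, 1/3]

# Independence holds exactly when p_ab = q_a * r_b for every a and b.
independent = np.allclose(np.outer(phi, psi), table)
print(phi, psi, independent)
```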

On the other hand, the probabilistic state $(1),$ which we may write as

$$
\frac{1}{2} \vert 00 \rangle + \frac{1}{2} \vert 11 \rangle, \tag{5}
$$

does not represent independence between the systems $\mathsf{X}$ and $\mathsf{Y}.$ A simple way to argue this is as follows.

Suppose that there did exist probability vectors $\vert \phi\rangle$ and $\vert \psi \rangle,$ as in equation $(3)$ above, for which the condition $(4)$ is satisfied for every choice of $a$ and $b.$ It would then necessarily be that

$$
q_0 r_1 = \operatorname{Pr}\bigl((\mathsf{X},\mathsf{Y}) = (0,1)\bigr) = 0.
$$

This implies that either $q_0 = 0$ or $r_1 = 0,$ because if both were nonzero, the product $q_0 r_1$ would also not be zero. This leads to the conclusion that either $q_0 r_0 = 0$ (in case $q_0 = 0$) or $q_1 r_1 = 0$ (in case $r_1 = 0$). We see, however, that neither of those equalities can be true because we must have $q_0 r_0 = 1/2$ and $q_1 r_1 = 1/2.$ Hence, there do not exist vectors $\vert\phi\rangle$ and $\vert\psi\rangle$ satisfying the property required for independence.

Having defined independence between two systems, we can now define correlation precisely as a lack of independence. For example, because the two bits in the probabilistic state represented by the vector $(5)$ are not independent, they are, by definition, correlated.
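The lack of independence in state $(5)$ can also be seen numerically. This sketch, which assumes NumPy, is a different check from the argument given above: it compares the joint probabilities with the product of the marginal probabilities (the only candidates for $q_a$ and $r_b$), and also reports the rank of the probability table, which would be $1$ for a product state.

```python
import numpy as np

# State (5): probabilities of 00, 01, 10, 11.
joint = np.array([1/2, 0, 0, 1/2])
table = joint.reshape(2, 2)        # rows indexed by X, columns by Y

marginal_x = table.sum(axis=1)     # each bit on its own is uniformly random
marginal_y = table.sum(axis=0)
product_of_marginals = np.outer(marginal_x, marginal_y)

# For independent systems these two tables would coincide.
correlated = not np.allclose(product_of_marginals, table)
rank = np.linalg.matrix_rank(table)

print(product_of_marginals)
print(correlated, rank)
```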

Tensor products of vectors

The condition of independence just described can be expressed more succinctly through the notion of a tensor product. Although this is a very general notion that can be defined quite abstractly and applied to a variety of mathematical structures, in the case at hand it can be defined in simple, concrete terms. Given two vectors

$$
\vert \phi \rangle = \sum_{a\in\Sigma} \alpha_a \vert a \rangle \quad\text{and}\quad \vert \psi \rangle = \sum_{b\in\Gamma} \beta_b \vert b \rangle,
$$

the tensor product $\vert \phi \rangle \otimes \vert \psi \rangle$ is a new vector over the joint state set $\Sigma \times \Gamma,$ defined as

$$
\vert \phi \rangle \otimes \vert \psi \rangle = \sum_{(a,b)\in\Sigma\times\Gamma} \alpha_a \beta_b \vert ab\rangle.
$$

Equivalently, the vector $\vert \pi \rangle = \vert \phi \rangle \otimes \vert \psi \rangle$ is defined by the equation

$$
\langle ab \vert \pi \rangle = \langle a \vert \phi \rangle \langle b \vert \psi \rangle
$$

being true for every $a\in\Sigma$ and $b\in\Gamma.$

We can now recast the condition for independence as requiring the probability vector $\vert \pi \rangle$ of the joint system $(\mathsf{X}, \mathsf{Y})$ to be representable as a tensor product

$$
\vert \pi \rangle = \vert \phi \rangle \otimes \vert \psi \rangle
$$

of probability vectors $\vert \phi \rangle$ and $\vert \psi \rangle$ on each of the subsystems $\mathsf{X}$ and $\mathsf{Y}.$ In this situation it is said that $\vert \pi \rangle$ is a product state or product vector.

We often omit the symbol $\otimes$ when taking the tensor product of kets, such as writing $\vert \phi \rangle \vert \psi \rangle$ rather than $\vert \phi \rangle \otimes \vert \psi \rangle.$ This convention captures the idea that the tensor product is, in this context, the most natural or default way to take the product of two vectors. Although it is less common, the notation $\vert \phi\otimes\psi\rangle$ is also sometimes used.

When we use the alphabetical convention for ordering elements of Cartesian products, we obtain the following specification for the tensor product of two column vectors.

$$
\begin{pmatrix}
\alpha_1\\
\vdots\\
\alpha_m
\end{pmatrix}
\otimes
\begin{pmatrix}
\beta_1\\
\vdots\\
\beta_k
\end{pmatrix}
=
\begin{pmatrix}
\alpha_1 \beta_1\\
\vdots\\
\alpha_1 \beta_k\\
\alpha_2 \beta_1\\
\vdots\\
\alpha_2 \beta_k\\
\vdots\\
\alpha_m \beta_1\\
\vdots\\
\alpha_m \beta_k
\end{pmatrix}
$$
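For column vectors ordered alphabetically, this specification coincides with the Kronecker product, which NumPy provides as `np.kron`. The following sketch uses the two probability vectors from the independence example; the variable names are illustrative.

```python
import numpy as np

phi = np.array([1/4, 3/4])   # probability vector for X
psi = np.array([2/3, 1/3])   # probability vector for Y

# Tensor product of the two vectors, entries ordered as 00, 01, 10, 11.
pi = np.kron(phi, psi)
print(pi)

# Entry-wise definition: <ab|pi> = <a|phi> <b|psi>, with ab at index 2*a + b.
entrywise_ok = all(
    np.isclose(pi[2 * a + b], phi[a] * psi[b])
    for a in range(2)
    for b in range(2)
)
print(entrywise_ok)
```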

As an important aside, we observe the following expression for tensor products of standard basis vectors:

$$
\vert a \rangle \otimes \vert b \rangle = \vert ab \rangle.
$$

Alternatively, writing $(a,b)$ as an ordered pair rather than a string, we could write

$$
\vert a \rangle \otimes \vert b \rangle = \vert (a,b) \rangle,
$$

but it is more common to write

$$
\vert a \rangle \otimes \vert b \rangle = \vert a,b \rangle
$$

following a practice in mathematics of removing parentheses that do not add clarity or remove ambiguity.

The tensor product of two vectors has the important property that it is bilinear, which means that it is linear in each of the two arguments separately, assuming that the other argument is fixed. This property can be expressed through these equations:

1. Linearity in the first argument:

$$
\begin{aligned}
\bigl(\vert\phi_1\rangle + \vert\phi_2\rangle\bigr)\otimes \vert\psi\rangle & = \vert\phi_1\rangle \otimes \vert\psi\rangle + \vert\phi_2\rangle \otimes \vert\psi\rangle \\[1mm]
\bigl(\alpha \vert \phi \rangle\bigr) \otimes \vert \psi \rangle & = \alpha \bigl(\vert \phi \rangle \otimes \vert \psi \rangle \bigr)
\end{aligned}
$$

2. Linearity in the second argument:

$$
\begin{aligned}
\vert \phi \rangle \otimes \bigl(\vert \psi_1 \rangle + \vert \psi_2 \rangle \bigr) & = \vert \phi \rangle \otimes \vert \psi_1 \rangle + \vert \phi \rangle \otimes \vert \psi_2 \rangle\\[1mm]
\vert \phi \rangle \otimes \bigl(\alpha \vert \psi \rangle \bigr) & = \alpha \bigl(\vert\phi\rangle\otimes\vert\psi\rangle\bigr)
\end{aligned}
$$

Considering the second equation in each of these pairs of equations, we see that scalars "float freely" within tensor products:

$$
\bigl(\alpha \vert \phi \rangle\bigr) \otimes \vert \psi \rangle = \vert \phi \rangle \otimes \bigl(\alpha \vert \psi \rangle \bigr) = \alpha \bigl(\vert \phi \rangle \otimes \vert \psi \rangle \bigr).
$$

There is therefore no ambiguity in simply writing $\alpha\vert\phi\rangle\otimes\vert\psi\rangle,$ or alternatively $\alpha\vert\phi\rangle\vert\psi \rangle$ or $\alpha\vert\phi\otimes\psi\rangle,$ to refer to this vector.
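Bilinearity is easy to spot-check numerically. The sketch below assumes NumPy, represents the tensor product with `np.kron`, and uses arbitrary randomly generated vectors; it is a sanity check on a few instances, not a proof.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
phi1, phi2 = rng.random(3), rng.random(3)   # arbitrary vectors for the first argument
psi1, psi2 = rng.random(2), rng.random(2)   # arbitrary vectors for the second argument
alpha = 2.5                                 # arbitrary scalar

# Additivity in the first and in the second argument.
additive_first = np.allclose(
    np.kron(phi1 + phi2, psi1), np.kron(phi1, psi1) + np.kron(phi2, psi1)
)
additive_second = np.allclose(
    np.kron(phi1, psi1 + psi2), np.kron(phi1, psi1) + np.kron(phi1, psi2)
)

# Scalars "float freely" within the tensor product.
scalar_floats = np.allclose(
    np.kron(alpha * phi1, psi1), np.kron(phi1, alpha * psi1)
) and np.allclose(np.kron(alpha * phi1, psi1), alpha * np.kron(phi1, psi1))

print(additive_first, additive_second, scalar_floats)
```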

Independence and tensor products for three or more systems

The notions of independence and tensor products generalize straightforwardly to three or more systems. If $\mathsf{X}_1,\ldots,\mathsf{X}_n$ are systems having classical state sets $\Sigma_1,\ldots,\Sigma_n,$ respectively, then a probabilistic state of the combined system $(\mathsf{X}_1,\ldots,\mathsf{X}_n)$ is a product state if the associated probability vector takes the form

$$
\vert \psi \rangle = \vert \phi_1 \rangle \otimes \cdots \otimes \vert \phi_n \rangle
$$

for probability vectors $\vert \phi_1 \rangle,\ldots,\vert \phi_n\rangle$ describing probabilistic states of $\mathsf{X}_1,\ldots,\mathsf{X}_n.$

Here, the definition of the tensor product generalizes in a natural way: the vector

$$
\vert \psi \rangle = \vert \phi_1 \rangle \otimes \cdots \otimes \vert \phi_n \rangle
$$

is defined by the equation

$$
\langle a_1 \cdots a_n \vert \psi \rangle = \langle a_1 \vert \phi_1 \rangle \cdots \langle a_n \vert \phi_n \rangle
$$

being true for every $a_1\in\Sigma_1, \ldots, a_n\in\Sigma_n.$ A different, but equivalent, way to define the tensor product of three or more vectors is recursively in terms of tensor products of two vectors:

$$
\vert \phi_1 \rangle \otimes \cdots \otimes \vert \phi_n \rangle = \bigl(\vert \phi_1 \rangle \otimes \cdots \otimes \vert \phi_{n-1} \rangle\bigr) \otimes \vert \phi_n \rangle,
$$

assuming $n\geq 3.$

Similar to the tensor product of just two vectors, the tensor product of three or more vectors is linear in each of the arguments individually, assuming that all other arguments are fixed. In this case, we say that the tensor product of three or more vectors is multilinear.

As we did in the case of two systems, we could say that the systems $\mathsf{X}_1,\ldots,\mathsf{X}_n$ are independent when they are in a product state, but the term mutually independent is more precise. There happen to be other notions of independence for three or more systems, such as pairwise independence, that we will not be concerned with at this time.

Generalizing the observation earlier concerning tensor products of standard basis vectors, for any positive integer $n$ and any classical states $a_1,\ldots,a_n$ we have

$$
\vert a_1 \rangle \otimes \cdots \otimes \vert a_n \rangle = \vert a_1 \cdots a_n \rangle = \vert a_1,\ldots,a_n \rangle.
$$
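The recursive definition translates directly into a left fold over `np.kron`. The sketch below assumes NumPy and bits as the individual systems; `basis` and `tensor` are hypothetical helper names introduced for illustration.

```python
from functools import reduce

import numpy as np

def basis(a, dim=2):
    """Standard basis vector |a> of a system with `dim` classical states."""
    v = np.zeros(dim)
    v[a] = 1
    return v

def tensor(*vectors):
    """Tensor product of several vectors, built up from left to right."""
    return reduce(np.kron, vectors)

# |1> tensor |0> tensor |1> should be the standard basis vector |101>.
v = tensor(basis(1), basis(0), basis(1))

# With alphabetical ordering, the string 101 sits at the index it
# represents when read as a binary number.
index = int("101", 2)
print(v, index)
```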

Measurements of probabilistic states

Now let us move on to measurements of probabilistic states of multiple systems. By choosing to view multiple systems together as single systems, we immediately obtain a specification of how measurements must work for multiple systems — provided that all systems are measured.

For example, if the probabilistic state of two bits $(\mathsf{X},\mathsf{Y})$ is described by the probability vector

$$
\frac{1}{2} \vert 00 \rangle + \frac{1}{2} \vert 11 \rangle,
$$

then the outcome $00$ (meaning $0$ for the measurement of $\mathsf{X}$ and $0$ for the measurement of $\mathsf{Y}$) is obtained with probability $1/2$ and the outcome $11$ is also obtained with probability $1/2.$ In each case we update the probability vector description of our knowledge accordingly, so that the probabilistic state becomes $|00\rangle$ or $|11\rangle,$ respectively.

Partial measurements

Suppose, however, that we choose not to measure every system, but instead we just measure some proper subset of the systems. This will result in a measurement outcome for each system that gets measured, and will also (in general) affect our knowledge of the remaining systems.

Let us focus on the case of two systems, one of which is measured. The more general situation — in which some proper subset of three or more systems is measured — effectively reduces to the case of two systems when we view the systems that are measured collectively as if they form one system and the systems that are not measured as if they form a second system.

To be precise, let us suppose (as usual) that $\mathsf{X}$ is a system having classical state set $\Sigma,$ that $\mathsf{Y}$ is a system having classical state set $\Gamma,$ and the two systems together are in some probabilistic state. We will consider what happens when we just measure $\mathsf{X}$ and do nothing to $\mathsf{Y}.$ The situation where just $\mathsf{Y}$ is measured and nothing happens to $\mathsf{X}$ is handled symmetrically.

First, we know that the probability to observe a particular classical state $a\in\Sigma$ when just $\mathsf{X}$ is measured must be consistent with the probabilities we would obtain under the assumption that $\mathsf{Y}$ was also measured. That is, we must have

$$
\operatorname{Pr}(\mathsf{X} = a) = \sum_{b\in\Gamma} \operatorname{Pr}\bigl( (\mathsf{X},\mathsf{Y}) = (a,b) \bigr).
$$

This is the formula for the so-called reduced (or marginal) probabilistic state of $\mathsf{X}$ alone.

This formula makes perfect sense at an intuitive level; something very strange would need to happen for it to be wrong. It would mean that the probabilities for $\mathsf{X}$ measurements are influenced simply by whether or not $\mathsf{Y}$ is also measured, irrespective of the outcome on $\mathsf{Y}.$ If $\mathsf{Y}$ happened to be in a distant location, say, another galaxy, this would allow for faster-than-light signaling, which we reject based on our understanding of physics. Another way to understand this comes from an interpretation of probability as reflecting a degree of belief about the state of the system. Since a measurement on $\mathsf{Y}$ is taken to simply reveal a preexisting state, a different observer looking at $\mathsf{X},$ unaware of the $\mathsf{Y}$ measurement, should not have their probabilities changed.

Given the assumption that only $\mathsf{X}$ is measured and $\mathsf{Y}$ is not, there may in general still exist uncertainty over the classical state of $\mathsf{Y}.$ For this reason, rather than updating our description of the probabilistic state of $(\mathsf{X},\mathsf{Y})$ to $\vert ab\rangle$ for some selection of $a\in\Sigma$ and $b\in\Gamma,$ we must update our description so that this uncertainty about $\mathsf{Y}$ is properly reflected.

The following conditional probability formula reflects this uncertainty.

$$
\operatorname{Pr}(\mathsf{Y} = b \,|\, \mathsf{X} = a) = \frac{ \operatorname{Pr}\bigl((\mathsf{X},\mathsf{Y}) = (a,b)\bigr) }{ \operatorname{Pr}(\mathsf{X} = a) }
$$

Here, the expression $\operatorname{Pr}(\mathsf{Y} = b \,|\, \mathsf{X} = a)$ denotes the probability that $\mathsf{Y} = b$ conditioned on (or given that) $\mathsf{X} = a.$

It should be noted that the expression above is only defined if $\operatorname{Pr}(\mathsf{X}=a)$ is nonzero, for if

$$
\operatorname{Pr}(\mathsf{X}=a) = 0,
$$

then we obtain the indeterminate form $\frac{0}{0}.$ This is not a problem, though, because if the probability associated with $a$ is zero, then we'll never observe $a$ as an outcome of a measurement of $\mathsf{X},$ so we don't need to be concerned with this possibility.

To express these formulas in terms of probability vectors, consider a probability vector $\vert \psi \rangle$ describing the joint state of $(\mathsf{X},\mathsf{Y}).$

$$
\vert\psi\rangle = \sum_{(a,b)\in\Sigma\times\Gamma} p_{ab} \vert ab\rangle
$$

Measuring $\mathsf{X}$ alone yields each possible outcome with probabilities

$$
\operatorname{Pr}(\mathsf{X} = a) = \sum_{b\in\Gamma} p_{ab}.
$$

Thus, the vector representing the probabilistic state of $\mathsf{X}$ alone (i.e., the reduced probabilistic state of $\mathsf{X}$) is given by

$$
\sum_{a\in\Sigma} \biggl(\sum_{c\in\Gamma} p_{ac}\biggr) \vert a\rangle.
$$

Having obtained a particular outcome $a\in\Sigma$ of the measurement of $\mathsf{X},$ the probabilistic state of $\mathsf{Y}$ is updated according to the formula for conditional probabilities, so that it is represented by this probability vector:

$$
\vert \pi_a \rangle = \frac{\sum_{b\in\Gamma}p_{ab}\vert b\rangle}{\sum_{c\in\Gamma} p_{ac}}.
$$

In the event that the measurement of $\mathsf{X}$ resulted in the classical state $a,$ we therefore update our description of the probabilistic state of the joint system $(\mathsf{X},\mathsf{Y})$ to $\vert a\rangle \otimes \vert\pi_a\rangle.$

One way to think about this definition of $\vert\pi_a\rangle$ is to see it as a normalization of the vector $\sum_{b\in\Gamma} p_{ab} \vert b\rangle,$ where we divide by the sum of the entries in this vector to obtain a probability vector. This normalization effectively accounts for a conditioning on the event that the measurement of $\mathsf{X}$ has resulted in the outcome $a.$

For a specific example, suppose that the classical state set of $\mathsf{X}$ is $\Sigma = \{0,1\},$ the classical state set of $\mathsf{Y}$ is $\Gamma = \{1,2,3\},$ and the probabilistic state of $(\mathsf{X},\mathsf{Y})$ is

$$
\vert \psi \rangle = \frac{1}{2} \vert 0,1 \rangle + \frac{1}{12} \vert 0,3 \rangle + \frac{1}{12} \vert 1,1 \rangle + \frac{1}{6} \vert 1,2 \rangle + \frac{1}{6} \vert 1,3 \rangle.
$$

Our goal will be to determine the probabilities of the two possible outcomes ($0$ and $1$), and to calculate what the resulting probabilistic state of $\mathsf{Y}$ is for the two outcomes, assuming the system $\mathsf{X}$ is measured.

Using the bilinearity of the tensor product, and specifically the fact that it is linear in the second argument, we may rewrite the vector $\vert \psi \rangle$ as follows:

$$
\vert \psi \rangle = \vert 0\rangle \otimes \biggl( \frac{1}{2} \vert 1 \rangle + \frac{1}{12} \vert 3 \rangle\biggr) + \vert 1\rangle \otimes \biggl( \frac{1}{12} \vert 1 \rangle + \frac{1}{6} \vert 2\rangle + \frac{1}{6} \vert 3 \rangle\biggr).
$$

We have isolated the distinct standard basis vectors for the system being measured, collecting together all of the terms for the second system. A moment's thought reveals that this is always possible, regardless of what vector we started with.

Having reorganized as such, the measurement outcomes become easy to analyze. The probabilities of the two outcomes are given by

$$
\begin{aligned}
\operatorname{Pr}(\mathsf{X} = 0) & = \frac{1}{2} + \frac{1}{12} = \frac{7}{12}\\[2mm]
\operatorname{Pr}(\mathsf{X} = 1) & = \frac{1}{12} + \frac{1}{6} + \frac{1}{6} = \frac{5}{12}.
\end{aligned}
$$

Note that these probabilities sum to one as expected, a useful check on our calculations.

Moreover, the probabilistic state of $\mathsf{Y},$ conditioned on each possible outcome, can also be quickly inferred by normalizing the vectors in parentheses (by dividing by the associated probability just calculated), so that these vectors become probability vectors. That is, conditioned on $\mathsf{X}$ being $0,$ the probabilistic state of $\mathsf{Y}$ becomes

$$
\frac{\frac{1}{2} \vert 1 \rangle + \frac{1}{12} \vert 3 \rangle}{\frac{7}{12}} = \frac{6}{7} \vert 1 \rangle + \frac{1}{7} \vert 3 \rangle,
$$

and conditioned on the measurement of $\mathsf{X}$ being $1,$ the probabilistic state of $\mathsf{Y}$ becomes

$$
\frac{\frac{1}{12} \vert 1 \rangle + \frac{1}{6} \vert 2\rangle + \frac{1}{6} \vert 3 \rangle}{\frac{5}{12}} = \frac{1}{5} \vert 1 \rangle + \frac{2}{5} \vert 2 \rangle + \frac{2}{5} \vert 3 \rangle.
$$
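The same calculation can be sketched in Python with NumPy by storing the joint probabilities as a table with one row per classical state of $\mathsf{X}$ and one column per classical state of $\mathsf{Y}$. The names `table`, `marginal_x`, and `conditional_y` are illustrative choices.

```python
import numpy as np

sigma = [0, 1]       # classical states of X (rows)
gamma = [1, 2, 3]    # classical states of Y (columns)

# table[i, j] is the probability of (X, Y) = (sigma[i], gamma[j]).
table = np.array([
    [1/2,  0,   1/12],
    [1/12, 1/6, 1/6],
])

# Reduced (marginal) probabilistic state of X: sum over the states of Y.
marginal_x = table.sum(axis=1)

# Conditional state of Y for each outcome a that has nonzero probability:
# normalize the corresponding row so that its entries sum to one.
conditional_y = {
    a: table[i] / marginal_x[i]
    for i, a in enumerate(sigma)
    if marginal_x[i] > 0
}

print(marginal_x)
print(conditional_y[0])
print(conditional_y[1])
```

Skipping outcomes with zero probability mirrors the remark above that the conditional probability is undefined, and never needed, in that case.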

Operations on probabilistic states

To conclude this discussion of classical information for multiple systems, we will consider operations on multiple systems in probabilistic states. Following the same idea as we did for probabilistic states and measurements, we can view multiple systems collectively as forming single, compound systems and look to the previous lesson to see how this works.

Returning to the typical set-up where we have two systems X\mathsf{X} and Y,\mathsf{Y}, let us consider classical operations on the compound system (X,Y).(\mathsf{X},\mathsf{Y}). Based on the previous lesson and the discussion above, we conclude that any such operation is represented by a stochastic matrix whose rows and columns are indexed by the Cartesian product Σ×Γ.\Sigma\times\Gamma.

For example, suppose that X\mathsf{X} and Y\mathsf{Y} are bits, and consider an operation with the following description.

If X=1,\mathsf{X} = 1, then perform a NOT operation on Y.\mathsf{Y}.
Otherwise do nothing.

This is a deterministic operation known as a controlled-NOT operation, where X\mathsf{X} is the control bit that determines whether or not a NOT operation should be applied to the target bit Y.\mathsf{Y}. Here is the matrix representation of this operation:

(1000010000010010).\begin{pmatrix} 1 & 0 & 0 & 0\\[2mm] 0 & 1 & 0 & 0\\[2mm] 0 & 0 & 0 & 1\\[2mm] 0 & 0 & 1 & 0 \end{pmatrix}.

Its action on standard basis states is as follows.

$$\begin{aligned} \vert 00 \rangle & \mapsto \vert 00 \rangle\\ \vert 01 \rangle & \mapsto \vert 01 \rangle\\ \vert 10 \rangle & \mapsto \vert 11 \rangle\\ \vert 11 \rangle & \mapsto \vert 10 \rangle \end{aligned}$$

If we were to exchange the roles of $\mathsf{X}$ and $\mathsf{Y}$, taking $\mathsf{Y}$ to be the control bit and $\mathsf{X}$ to be the target bit, then the matrix representation of the operation would become

$$\begin{pmatrix} 1 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 1\\[2mm] 0 & 0 & 1 & 0\\[2mm] 0 & 1 & 0 & 0 \end{pmatrix}$$

and its action on standard basis states would be like this:

$$\begin{aligned} \vert 00 \rangle & \mapsto \vert 00 \rangle\\ \vert 01 \rangle & \mapsto \vert 11 \rangle\\ \vert 10 \rangle & \mapsto \vert 10 \rangle\\ \vert 11 \rangle & \mapsto \vert 01 \rangle \end{aligned}$$
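Both controlled-NOT matrices can be rebuilt from their action on standard basis states. This is a small sketch, assuming the basis ordering 00, 01, 10, 11 with $\mathsf{X}$ as the left bit; the helper names `cnot_map` and `permutation_matrix` are hypothetical.

```python
import numpy as np

# Standard basis ordering for the pair (X, Y): X is the left bit.
basis = ["00", "01", "10", "11"]

def cnot_map(control, target):
    """Map each two-bit string to its image under a controlled-NOT."""
    mapping = {}
    for s in basis:
        bits = [int(c) for c in s]
        if bits[control] == 1:
            bits[target] ^= 1  # apply NOT to the target bit
        mapping[s] = "".join(str(b) for b in bits)
    return mapping

def permutation_matrix(mapping):
    """Matrix with a 1 in row mapping[s] and column s, for each basis string s."""
    m = np.zeros((4, 4), dtype=int)
    for src, dst in mapping.items():
        m[basis.index(dst), basis.index(src)] = 1
    return m

cnot_x_controls = permutation_matrix(cnot_map(control=0, target=1))
cnot_y_controls = permutation_matrix(cnot_map(control=1, target=0))
print(cnot_x_controls)
print(cnot_y_controls)
```

Columns correspond to input states and rows to output states, which is the same convention used for stochastic matrices in the text.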

Another example is the operation having this description:

Perform one of the following two operations, each with probability $1/2$:

  1. Set $\mathsf{Y}$ to be equal to $\mathsf{X}$.
  2. Set $\mathsf{X}$ to be equal to $\mathsf{Y}$.

The matrix representation of this operation is as follows:

$$\begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{2} & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & \frac{1}{2} & \frac{1}{2} & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 1 & 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 & 0 & 1 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 1 & 0 & 1 \end{pmatrix}.$$

The action of this operation on standard basis vectors is as follows:

$$\begin{aligned} \vert 00 \rangle & \mapsto \vert 00 \rangle\\[1mm] \vert 01 \rangle & \mapsto \frac{1}{2} \vert 00 \rangle + \frac{1}{2}\vert 11\rangle\\[3mm] \vert 10 \rangle & \mapsto \frac{1}{2} \vert 00 \rangle + \frac{1}{2}\vert 11\rangle\\[2mm] \vert 11 \rangle & \mapsto \vert 11 \rangle \end{aligned}$$
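The averaging of the two deterministic operations can be reproduced numerically. A minimal sketch, with variable names chosen only for illustration:

```python
import numpy as np

# Columns index the input state and rows the output state, in the order
# 00, 01, 10, 11 for the pair (X, Y).
set_y_to_x = np.array([[1, 1, 0, 0],
                       [0, 0, 0, 0],
                       [0, 0, 0, 0],
                       [0, 0, 1, 1]])
set_x_to_y = np.array([[1, 0, 1, 0],
                       [0, 0, 0, 0],
                       [0, 0, 0, 0],
                       [0, 1, 0, 1]])

# Each operation is chosen with probability 1/2.
mixed = 0.5 * set_y_to_x + 0.5 * set_x_to_y

def is_stochastic(m):
    """Nonnegative entries and every column summing to 1."""
    return bool(np.all(m >= 0) and np.allclose(m.sum(axis=0), 1))

ket_01 = np.array([0, 1, 0, 0])
print(mixed)
print(is_stochastic(mixed), mixed @ ket_01)
```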

In these examples, we are simply viewing two systems together as a single system and proceeding as in the previous lesson.

The same thing can be done for any number of systems. For example, imagine that we have three bits, and we increment the three bits modulo $8$, meaning that we think about the three bits as encoding a number between $0$ and $7$ using binary notation, add $1$, and then take the remainder after dividing by $8$. We can write this operation like this:

$$\begin{aligned} & \vert 001 \rangle \langle 000 \vert + \vert 010 \rangle \langle 001 \vert + \vert 011 \rangle \langle 010 \vert + \vert 100 \rangle \langle 011 \vert\\[1mm] & \quad + \vert 101 \rangle \langle 100 \vert + \vert 110 \rangle \langle 101 \vert + \vert 111 \rangle \langle 110 \vert + \vert 000 \rangle \langle 111 \vert. \end{aligned}$$

We could also write it like this:

$$\sum_{k = 0}^{7} \vert (k+1) \bmod 8 \rangle \langle k \vert,$$

assuming we have agreed that a number $j\in\{0,1,\ldots,7\}$ inside of a ket refers to that number's three-bit binary encoding. A third option is to express this operation as a matrix.

$$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$
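The sum-of-outer-products form translates directly into code. The sketch below builds the increment matrix one term at a time; the helper `ket` is an illustrative name.

```python
import numpy as np

n = 8  # number of classical states of three bits

def ket(j):
    """Standard basis vector for the number j in {0, ..., 7}."""
    v = np.zeros(n, dtype=int)
    v[j] = 1
    return v

# Sum over k of |(k+1) mod 8><k|, built from outer products.
increment = sum(np.outer(ket((k + 1) % n), ket(k)) for k in range(n))

print(increment)
print(increment @ ket(7))  # wraps around to the basis vector for 0
```

Applying the matrix eight times returns every state to where it started, which gives a quick sanity check on the construction.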

Independent operations

Now suppose that we have multiple systems and we independently perform separate operations on the systems.

For example, taking our usual set-up of two systems $\mathsf{X}$ and $\mathsf{Y}$ having classical state sets $\Sigma$ and $\Gamma$, respectively, let us suppose that we perform one operation on $\mathsf{X}$ and, completely independently, another operation on $\mathsf{Y}$. As we know from the previous lesson, these operations are represented by stochastic matrices. To be precise, let us say that the operation on $\mathsf{X}$ is represented by the matrix $M$ and the operation on $\mathsf{Y}$ is represented by the matrix $N$. Thus, the rows and columns of $M$ have indices that are placed in correspondence with the elements of $\Sigma$ and, likewise, the rows and columns of $N$ correspond to the elements of $\Gamma$.

A natural question to ask is this: if we view $\mathsf{X}$ and $\mathsf{Y}$ together as a single, compound system $(\mathsf{X},\mathsf{Y})$, what is the matrix that represents the combined action of the two operations on this compound system? To answer this question, we must first introduce the tensor product of matrices, which is similar to the tensor product of vectors and is defined analogously.

Tensor products of matrices

The tensor product $M\otimes N$ of the matrices

$$M = \sum_{a,b\in\Sigma} \alpha_{ab} \vert a\rangle \langle b\vert$$

and

$$N = \sum_{c,d\in\Gamma} \beta_{cd} \vert c\rangle \langle d\vert$$

is the matrix

$$M \otimes N = \sum_{a,b\in\Sigma} \sum_{c,d\in\Gamma} \alpha_{ab} \beta_{cd} \vert ac \rangle \langle bd \vert.$$

Equivalently, $M\otimes N$ is defined by the equation

$$\langle ac \vert M \otimes N \vert bd\rangle = \langle a \vert M \vert b\rangle \langle c \vert N \vert d\rangle$$

being true for every selection of $a,b\in\Sigma$ and $c,d\in\Gamma$.

An alternative, but equivalent, way to describe $M\otimes N$ is that it is the unique matrix that satisfies the equation

$$(M \otimes N) \bigl( \vert \phi \rangle \otimes \vert \psi \rangle \bigr) = \bigl(M \vert\phi\rangle\bigr) \otimes \bigl(N \vert\psi\rangle\bigr)$$

for every possible choice of vectors $\vert\phi\rangle$ and $\vert\psi\rangle$. Here we are assuming that the indices of $\vert\phi\rangle$ correspond to the elements of $\Sigma$ and the indices of $\vert\psi\rangle$ correspond to $\Gamma$.

Following the convention described previously for ordering the elements of Cartesian products, we can also write the tensor product of two matrices explicitly as follows:

$$\begin{gathered} \begin{pmatrix} \alpha_{11} & \cdots & \alpha_{1m} \\ \vdots & \ddots & \vdots \\ \alpha_{m1} & \cdots & \alpha_{mm} \end{pmatrix} \otimes \begin{pmatrix} \beta_{11} & \cdots & \beta_{1k} \\ \vdots & \ddots & \vdots\\ \beta_{k1} & \cdots & \beta_{kk} \end{pmatrix} \hspace{6cm}\\[8mm] \hspace{1cm} = \begin{pmatrix} \alpha_{11}\beta_{11} & \cdots & \alpha_{11}\beta_{1k} & & \alpha_{1m}\beta_{11} & \cdots & \alpha_{1m}\beta_{1k} \\ \vdots & \ddots & \vdots & \hspace{2mm}\cdots\hspace{2mm} & \vdots & \ddots & \vdots \\ \alpha_{11}\beta_{k1} & \cdots & \alpha_{11}\beta_{kk} & & \alpha_{1m}\beta_{k1} & \cdots & \alpha_{1m}\beta_{kk} \\[2mm] & \vdots & & \ddots & & \vdots & \\[2mm] \alpha_{m1}\beta_{11} & \cdots & \alpha_{m1}\beta_{1k} & & \alpha_{mm}\beta_{11} & \cdots & \alpha_{mm}\beta_{1k} \\ \vdots & \ddots & \vdots & \hspace{2mm}\cdots\hspace{2mm} & \vdots & \ddots & \vdots \\ \alpha_{m1}\beta_{k1} & \cdots & \alpha_{m1}\beta_{kk} & & \alpha_{mm}\beta_{k1} & \cdots & \alpha_{mm}\beta_{kk} \end{pmatrix} \end{gathered}$$

Tensor products of three or more matrices are defined in an analogous way. If $M_1, \ldots, M_n$ are matrices whose indices correspond to classical state sets $\Sigma_1,\ldots,\Sigma_n$, then the tensor product $M_1\otimes\cdots\otimes M_n$ is defined by the condition that

$$\langle a_1\cdots a_n \vert M_1\otimes\cdots\otimes M_n \vert b_1\cdots b_n\rangle = \langle a_1 \vert M_1 \vert b_1 \rangle \cdots\langle a_n \vert M_n \vert b_n \rangle$$

for every choice of classical states $a_1,b_1\in\Sigma_1,\ldots,a_n,b_n\in\Sigma_n$.

Alternatively, we could also define the tensor product of three or more matrices recursively, in terms of tensor products of two matrices, similar to what we observed for vectors.

The tensor product of matrices is sometimes said to be multiplicative because the equation

$$(M_1\otimes\cdots\otimes M_n)(N_1\otimes\cdots\otimes N_n) = (M_1 N_1)\otimes\cdots\otimes (M_n N_n)$$

is always true, for any choice of matrices $M_1,\ldots,M_n$ and $N_1,\ldots,N_n$, provided that the products $M_1 N_1, \ldots, M_n N_n$ make sense.
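The three characterizations of the tensor product of matrices can be spot-checked numerically. The sketch below uses NumPy's `np.kron`, whose block layout agrees with the ordering convention for Cartesian products used in this lesson; the matrix sizes, the random entries, and the seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

# Hypothetical sizes: |Sigma| = 2 and |Gamma| = 3.
M, M2 = rng.random((2, 2)), rng.random((2, 2))
N, N2 = rng.random((3, 3)), rng.random((3, 3))
phi, psi = rng.random(2), rng.random(3)

MN = np.kron(M, N)

# Entry rule: <ac| M (x) N |bd> = <a|M|b> <c|N|d>, where the pair (a, c)
# sits at index 3*a + c under the ordering convention.
entries_ok = all(
    np.isclose(MN[3 * a + c, 3 * b + d], M[a, b] * N[c, d])
    for a in range(2) for b in range(2) for c in range(3) for d in range(3)
)

# Action on product vectors: (M (x) N)(phi (x) psi) = (M phi) (x) (N psi).
action_ok = np.allclose(MN @ np.kron(phi, psi), np.kron(M @ phi, N @ psi))

# Multiplicativity: (M (x) N)(M2 (x) N2) = (M M2) (x) (N N2).
multiplicative_ok = np.allclose(MN @ np.kron(M2, N2), np.kron(M @ M2, N @ N2))

print(entries_ok, action_ok, multiplicative_ok)
```

A numerical check on random matrices is of course not a proof, but it is a convenient way to catch an index-ordering mistake.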

Independent operations (continued)

To summarize the above discussion, we found that if $M$ is a probabilistic operation on $\mathsf{X}$, $N$ is a probabilistic operation on $\mathsf{Y}$, and the two operations are performed independently, then the resulting operation on the compound system $(\mathsf{X},\mathsf{Y})$ is the tensor product $M\otimes N$.

What we see, both here and for probabilistic states, is that tensor products represent independence: if we have two systems $\mathsf{X}$ and $\mathsf{Y}$ that are independently in the probabilistic states $\vert\phi\rangle$ and $\vert\pi\rangle$, then the compound system $(\mathsf{X},\mathsf{Y})$ is in the probabilistic state $\vert\phi\rangle\otimes\vert\pi\rangle$; and if we apply probabilistic operations $M$ and $N$ to the two systems independently, then the resulting action on the compound system $(\mathsf{X},\mathsf{Y})$ is described by the operation $M\otimes N$.

Let us take a look at an example, which recalls a probabilistic operation on a single bit from the previous lesson: if the classical state of the bit is $0$, it is left alone; and if the classical state of the bit is $1$, it is flipped to $0$ with probability $1/2$. As we observed, this operation is represented by the matrix

$$\begin{pmatrix} 1 & \frac{1}{2}\\[1mm] 0 & \frac{1}{2} \end{pmatrix}.$$

If this operation is performed on a bit $\mathsf{X}$, and a NOT operation is (independently) performed on a second bit $\mathsf{Y}$, then the joint operation on the compound system $(\mathsf{X},\mathsf{Y})$ has the matrix representation

$$\begin{pmatrix} 1 & \frac{1}{2}\\[1mm] 0 & \frac{1}{2} \end{pmatrix} \otimes \begin{pmatrix} 0 & 1\\[1mm] 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & \frac{1}{2} \\[1mm] 1 & 0 & \frac{1}{2} & 0 \\[1mm] 0 & 0 & 0 & \frac{1}{2} \\[1mm] 0 & 0 & \frac{1}{2} & 0 \end{pmatrix}.$$

By inspection, we see that this is a stochastic matrix.

This will always be the case: the tensor product of two or more stochastic matrices is always stochastic.

A common situation that we encounter is one in which one operation is performed on one system and nothing is done to another. In such a case, exactly the same prescription is followed, noting that doing nothing is represented by the identity matrix. For example, resetting the bit $\mathsf{X}$ to the $0$ state and doing nothing to $\mathsf{Y}$ yields the probabilistic (and in fact deterministic) operation on $(\mathsf{X},\mathsf{Y})$ represented by the matrix

$$\begin{pmatrix} 1 & 1\\[1mm] 0 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0\\[1mm] 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & 0 \\[1mm] 0 & 1 & 0 & 1 \\[1mm] 0 & 0 & 0 & 0 \\[1mm] 0 & 0 & 0 & 0 \end{pmatrix}.$$
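Both joint operations in this subsection can be recomputed with `np.kron`. This is a short sketch with illustrative variable names; the check on column sums reflects the claim that tensor products of stochastic matrices are stochastic.

```python
import numpy as np

half_flip = np.array([[1, 0.5],
                      [0, 0.5]])   # leave 0 alone; send 1 to 0 with probability 1/2
not_op = np.array([[0, 1],
                   [1, 0]])
reset = np.array([[1, 1],
                  [0, 0]])         # reset a bit to 0
identity = np.eye(2)               # do nothing

flip_x_not_y = np.kron(half_flip, not_op)  # operation on X first, then on Y
reset_x_only = np.kron(reset, identity)

def is_stochastic(m):
    """Nonnegative entries and every column summing to 1."""
    return bool(np.all(m >= 0) and np.allclose(m.sum(axis=0), 1))

print(flip_x_not_y)
print(reset_x_only)
print(is_stochastic(flip_x_not_y), is_stochastic(reset_x_only))
```

The order of the arguments to `np.kron` matters: the matrix acting on $\mathsf{X}$ comes first because $\mathsf{X}$ is the left system in $(\mathsf{X},\mathsf{Y})$.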

Quantum information

We are now prepared to move on to quantum information in the setting of multiple systems. Much like in the previous lesson on single systems, the mathematical description of quantum information for multiple systems is quite similar to the probabilistic case and makes use of similar concepts and techniques.

Quantum states

Multiple systems can be viewed collectively as single, compound systems. We have already observed this in the probabilistic setting, and the quantum setting is analogous.

That is, quantum states of multiple systems are represented by column vectors having complex number entries and Euclidean norm equal to $1$, just like quantum states of single systems. In the multiple system case, the indices of these vectors are placed in correspondence with the Cartesian product of the classical state sets associated with each of the individual systems (because that is the classical state set of the compound system).

For instance, if $\mathsf{X}$ and $\mathsf{Y}$ are qubits, then the classical state set of the pair of qubits $(\mathsf{X},\mathsf{Y})$, viewed collectively as a single system, is the Cartesian product $\{0,1\}\times\{0,1\}$. By representing pairs of binary values as binary strings of length two, we associate this Cartesian product set with the set $\{00,01,10,11\}$. The following vectors are therefore all examples of quantum state vectors of the pair $(\mathsf{X},\mathsf{Y})$:

$$\frac{1}{\sqrt{2}} \vert 00 \rangle - \frac{1}{\sqrt{6}} \vert 01\rangle + \frac{i}{\sqrt{6}} \vert 10\rangle + \frac{1}{\sqrt{6}} \vert 11\rangle, \quad \frac{3}{5} \vert 00\rangle - \frac{4}{5} \vert 11\rangle, \quad \text{and} \quad \vert 01 \rangle.$$

There are variations on how quantum state vectors of multiple systems are expressed, and we can choose whichever variation suits our preferences. Here are some examples, which are for the first quantum state vector above.

  1. We may use the fact that $\vert ab\rangle = \vert a\rangle \vert b\rangle$ (for any classical states $a$ and $b$) to instead write

    $$\frac{1}{\sqrt{2}} \vert 0\rangle\vert 0 \rangle - \frac{1}{\sqrt{6}} \vert 0\rangle\vert 1\rangle + \frac{i}{\sqrt{6}} \vert 1\rangle\vert 0\rangle + \frac{1}{\sqrt{6}} \vert 1\rangle\vert 1\rangle.$$
  2. We may choose to write the tensor product symbol explicitly like this:

    $$\frac{1}{\sqrt{2}} \vert 0\rangle\otimes\vert 0 \rangle - \frac{1}{\sqrt{6}} \vert 0\rangle\otimes\vert 1\rangle + \frac{i}{\sqrt{6}} \vert 1\rangle\otimes\vert 0\rangle + \frac{1}{\sqrt{6}} \vert 1\rangle\otimes\vert 1\rangle.$$
  3. We may subscript the kets to indicate how they correspond to the systems being considered, like this:

    $$\frac{1}{\sqrt{2}} \vert 0\rangle_{\mathsf{X}}\vert 0 \rangle_{\mathsf{Y}} - \frac{1}{\sqrt{6}} \vert 0\rangle_{\mathsf{X}}\vert 1\rangle_{\mathsf{Y}} + \frac{i}{\sqrt{6}} \vert 1\rangle_{\mathsf{X}}\vert 0\rangle_{\mathsf{Y}} + \frac{1}{\sqrt{6}} \vert 1\rangle_{\mathsf{X}}\vert 1\rangle_{\mathsf{Y}}.$$

Of course, we may also write quantum state vectors explicitly as column vectors:

$$\begin{pmatrix} \frac{1}{\sqrt{2}}\\[2mm] - \frac{1}{\sqrt{6}}\\[2mm] \frac{i}{\sqrt{6}}\\[2mm] \frac{1}{\sqrt{6}} \end{pmatrix}.$$

Depending upon the context in which it appears, one of these variations may be preferred — but they are all equivalent in the sense that they describe the same vector.

Tensor products of quantum state vectors

Similar to what we have for probability vectors, tensor products of quantum state vectors are also quantum state vectors — and again they represent independence among systems.

In greater detail, and beginning with the case of two systems, suppose that $\vert \phi \rangle$ is a quantum state vector of a system $\mathsf{X}$ and $\vert \psi \rangle$ is a quantum state vector of a system $\mathsf{Y}$. The tensor product $\vert \phi \rangle \otimes \vert \psi \rangle$, which may alternatively be written as $\vert \phi \rangle \vert \psi \rangle$ or as $\vert \phi \otimes \psi \rangle$, is then a quantum state vector of the joint system $(\mathsf{X},\mathsf{Y})$. We refer to a state of this form as being a product state.

Intuitively speaking, when a pair of systems $(\mathsf{X},\mathsf{Y})$ is in a product state $\vert \phi \rangle \otimes \vert \psi \rangle$, we may interpret this as meaning that $\mathsf{X}$ is in the quantum state $\vert \phi \rangle$, $\mathsf{Y}$ is in the quantum state $\vert \psi \rangle$, and the states of the two systems have nothing to do with one another.

The fact that the tensor product vector $\vert \phi \rangle \otimes \vert \psi \rangle$ is indeed a quantum state vector is consistent with the Euclidean norm being multiplicative with respect to tensor products:

$$\begin{aligned} \bigl\| \vert \phi \rangle \otimes \vert \psi \rangle \bigr\| & = \sqrt{ \sum_{(a,b)\in\Sigma\times\Gamma} \bigl\vert\langle ab \vert \phi\otimes\psi \rangle \bigr\vert^2 }\\[1mm] & = \sqrt{ \sum_{a\in\Sigma} \sum_{b\in\Gamma} \bigl\vert\langle a \vert \phi \rangle \langle b \vert \psi \rangle \bigr\vert^2 }\\[1mm] & = \sqrt{ \biggl(\sum_{a\in\Sigma} \bigl\vert \langle a \vert \phi \rangle \bigr\vert^2 \biggr) \biggl(\sum_{b\in\Gamma} \bigl\vert \langle b \vert \psi \rangle \bigr\vert^2 \biggr) }\\[1mm] & = \bigl\| \vert \phi \rangle \bigr\| \bigl\| \vert \psi \rangle \bigr\|. \end{aligned}$$

Thus, because $\vert \phi \rangle$ and $\vert \psi \rangle$ are quantum state vectors, we have $\|\vert \phi \rangle\| = 1$ and $\|\vert \psi \rangle\| = 1$, and therefore $\|\vert \phi \rangle \otimes \vert \psi \rangle\| = 1$, so $\vert \phi \rangle \otimes \vert \psi \rangle$ is also a quantum state vector.

This discussion may be generalized to more than two systems. If $\vert \psi_1 \rangle,\ldots,\vert \psi_n \rangle$ are quantum state vectors of systems $\mathsf{X}_1,\ldots,\mathsf{X}_n$, then $\vert \psi_1 \rangle\otimes\cdots\otimes \vert \psi_n \rangle$ is a quantum state vector representing a product state of the joint system $(\mathsf{X}_1,\ldots,\mathsf{X}_n)$. Again, we know that this is a quantum state vector because

$$\bigl\| \vert \psi_1 \rangle\otimes\cdots\otimes \vert \psi_n \rangle \bigr\| = \bigl\|\vert \psi_1 \rangle\bigr\| \cdots \bigl\|\vert \psi_n \rangle \bigr\| = 1^n = 1.$$
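As a numerical illustration of this norm argument, the sketch below forms a product state of three systems and confirms that it is again a unit vector. The system sizes, the seed, and the helper `random_state` are assumptions made for the example.

```python
import numpy as np

def random_state(dim, rng):
    """Random complex unit vector of the given dimension."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(seed=1)                 # arbitrary seed
states = [random_state(d, rng) for d in (2, 3, 4)]  # hypothetical system sizes

# Tensor the states together from left to right.
product = states[0]
for s in states[1:]:
    product = np.kron(product, s)

print(product.shape, np.linalg.norm(product))
```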

Entangled states

Not all quantum state vectors of multiple systems are product states. For example, the quantum state vector

$$\frac{1}{\sqrt{2}} \vert 00\rangle + \frac{1}{\sqrt{2}} \vert 11\rangle \tag{6}$$

of two qubits is not a product state. To see why, we may follow exactly the same argument that we used to prove that the probabilistic state represented by the vector $(5)$ is not a product state.

That is, if $(6)$ were a product state, there would exist quantum state vectors $\vert\phi\rangle$ and $\vert\psi\rangle$ for which

$$\vert\phi\rangle\otimes\vert\psi\rangle = \frac{1}{\sqrt{2}} \vert 00\rangle + \frac{1}{\sqrt{2}} \vert 11\rangle.$$

But then it would necessarily be the case that

$$\langle 0 \vert \phi\rangle \langle 1 \vert \psi\rangle = \langle 01 \vert \phi\otimes\psi\rangle = 0$$

implying that $\langle 0 \vert \phi\rangle = 0$ or $\langle 1 \vert \psi\rangle = 0$ (or both). That contradicts the fact that

$$\langle 0 \vert \phi\rangle \langle 0 \vert \psi\rangle = \langle 00 \vert \phi\otimes\psi\rangle = \frac{1}{\sqrt{2}}$$

and

$$\langle 1 \vert \phi\rangle \langle 1 \vert \psi\rangle = \langle 11 \vert \phi\otimes\psi\rangle = \frac{1}{\sqrt{2}}$$

are both nonzero.

Notice that the specific value $1/\sqrt{2}$ is not important to this argument. What is important is that this value is nonzero. Thus, for instance, the quantum state

$$\frac{3}{5} \vert 00\rangle + \frac{4}{5} \vert 11\rangle$$

is also not a product state, by the same argument.

It follows that the quantum state vector $(6)$ represents a correlation between two systems, and specifically we say that the systems are entangled.

Entanglement is a quintessential feature of quantum information that will be discussed in much greater detail in later lessons. Entanglement can be complicated, particularly for the sorts of noisy quantum states that can be described in the general, density matrix formulation of quantum information that was mentioned in Lesson 1 — but for quantum state vectors in the simplified formulation that we are focusing on in this unit, entanglement is equivalent to correlation. That is, any quantum state vector that is not a product vector represents an entangled state.

In contrast, the quantum state vector

$$\frac{1}{2} \vert 00\rangle + \frac{i}{2} \vert 01\rangle - \frac{1}{2} \vert 10\rangle - \frac{i}{2} \vert 11\rangle$$

is an example of a product state:

$$\frac{1}{2} \vert 00\rangle + \frac{i}{2} \vert 01\rangle - \frac{1}{2} \vert 10\rangle - \frac{i}{2} \vert 11\rangle = \biggl( \frac{1}{\sqrt{2}}\vert 0\rangle - \frac{1}{\sqrt{2}}\vert 1\rangle \biggr) \otimes \biggl( \frac{1}{\sqrt{2}}\vert 0\rangle + \frac{i}{\sqrt{2}}\vert 1\rangle \biggr).$$

Hence, this state is not entangled.
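For two qubits there is a convenient numerical test for being a product state. It is not stated in this lesson, so treat it as an outside fact from linear algebra: a vector with amplitudes $\alpha_{00},\alpha_{01},\alpha_{10},\alpha_{11}$ is a product vector exactly when $\alpha_{00}\alpha_{11} - \alpha_{01}\alpha_{10} = 0$. The sketch below applies this determinant test to the three states discussed above; the function name and tolerance are illustrative.

```python
import numpy as np

def is_product_state(vec, tol=1e-12):
    """Determinant test for two-qubit vectors (assumes a length-4 input)."""
    a = np.asarray(vec, dtype=complex).reshape(2, 2)  # a[x, y] = amplitude of |xy>
    return bool(abs(a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]) < tol)

state_6 = np.array([1, 0, 0, 1]) / np.sqrt(2)       # the vector labeled (6)
three_four = np.array([3, 0, 0, 4]) / 5
product_example = np.array([1, 1j, -1, -1j]) / 2

for name, v in [("(6)", state_6), ("3/5, 4/5", three_four), ("product", product_example)]:
    print(name, is_product_state(v))
```

The test works because the reshaped $2\times 2$ array of a product vector is an outer product of two vectors, and a nonzero $2\times 2$ matrix is an outer product exactly when its determinant vanishes.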

Bell states

We will now take a look at some important examples of multiple-qubit quantum states, beginning with the Bell states. These are the following four two-qubit states:

$$\begin{aligned} \vert \phi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 00 \rangle + \frac{1}{\sqrt{2}} \vert 11 \rangle \\[1mm] \vert \phi^- \rangle & = \frac{1}{\sqrt{2}} \vert 00 \rangle - \frac{1}{\sqrt{2}} \vert 11 \rangle \\[1mm] \vert \psi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 01 \rangle + \frac{1}{\sqrt{2}} \vert 10 \rangle \\[1mm] \vert \psi^- \rangle & = \frac{1}{\sqrt{2}} \vert 01 \rangle - \frac{1}{\sqrt{2}} \vert 10 \rangle \end{aligned}$$

The Bell states are so named in honor of John Bell.

Notice that the same argument that establishes that $\vert\phi^+\rangle$ is not a product state reveals that none of the other Bell states is a product state either. All four of the Bell states represent entanglement between two qubits.

The collection of all four Bell states

$$\bigl\{\vert \phi^+ \rangle, \vert \phi^- \rangle, \vert \psi^+ \rangle, \vert \psi^- \rangle\bigr\}$$

is known as the Bell basis; any quantum state vector of two qubits, or indeed any complex vector at all having entries corresponding to the four classical states of two bits, can be expressed as a linear combination of the four Bell states. For example,

$$\vert 0 0 \rangle = \frac{1}{\sqrt{2}} \vert \phi^+\rangle + \frac{1}{\sqrt{2}} \vert \phi^-\rangle.$$
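The claim that the Bell states form a basis can be verified by checking that they are orthonormal, after which the coefficients of any vector are given by inner products. A minimal sketch, with the four states stored as the rows of an array (an arbitrary layout choice):

```python
import numpy as np

s = 1 / np.sqrt(2)
# Rows: phi+, phi-, psi+, psi-, written in the standard basis 00, 01, 10, 11.
bell_basis = np.array([
    [s, 0, 0,  s],
    [s, 0, 0, -s],
    [0, s,  s, 0],
    [0, s, -s, 0],
])

# Orthonormality: the matrix of pairwise inner products is the identity.
gram = bell_basis.conj() @ bell_basis.T

ket_00 = np.array([1, 0, 0, 0])
coefficients = bell_basis.conj() @ ket_00  # inner product of each Bell state with |00>
reconstructed = coefficients @ bell_basis  # linear combination of the Bell states

print(np.round(gram, 12))
print(coefficients, reconstructed)
```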

GHZ and W states

Next we will consider two interesting examples of states of three qubits.

The first example, which represents a quantum state of three qubits $(\mathsf{X},\mathsf{Y},\mathsf{Z})$, is the GHZ state (so named in honor of Daniel Greenberger, Michael Horne, and Anton Zeilinger, who first studied some of its properties):

$$\frac{1}{\sqrt{2}} \vert 000\rangle + \frac{1}{\sqrt{2}} \vert 111\rangle.$$

The second example is the so-called W state:

$$\frac{1}{\sqrt{3}} \vert 001\rangle + \frac{1}{\sqrt{3}} \vert 010\rangle + \frac{1}{\sqrt{3}} \vert 100\rangle.$$

Neither of these states is a product state, meaning that neither can be written as a tensor product of three single-qubit quantum state vectors.

We will examine both of these states further when we discuss partial measurements of quantum states of multiple systems.

Additional examples

The examples of quantum states of multiple systems we have seen so far are states of two or three qubits, but we can also have quantum states of multiple systems having different classical state sets.

For example, here is a quantum state of three systems, $\mathsf{X}$, $\mathsf{Y}$, and $\mathsf{Z}$, where the classical state set of $\mathsf{X}$ is the binary alphabet (so $\mathsf{X}$ is a qubit) and the classical state set of $\mathsf{Y}$ and $\mathsf{Z}$ is $\{\clubsuit,\diamondsuit,\heartsuit,\spadesuit\}$:

$$\frac{1}{2} \vert 0 \rangle \vert \heartsuit\rangle \vert \heartsuit \rangle + \frac{1}{2} \vert 1 \rangle \vert \spadesuit\rangle \vert \heartsuit \rangle - \frac{1}{\sqrt{2}} \vert 0 \rangle \vert \heartsuit\rangle \vert \diamondsuit \rangle.$$

And, here is an example of a quantum state of three systems $(\mathsf{X}, \mathsf{Y}, \mathsf{Z})$, where $\mathsf{X}$, $\mathsf{Y}$, and $\mathsf{Z}$ all share the same classical state set $\{0,1,2\}$:

$$\frac{ \vert 012 \rangle - \vert 021 \rangle + \vert 120 \rangle - \vert 102 \rangle + \vert 201 \rangle - \vert 210 \rangle }{\sqrt{6}}.$$

Systems having the classical state set $\{0,1,2\}$ are often called trits or, assuming we consider the possibility that they are in quantum states, qutrits. The term qudit refers to a system having classical state set $\{0,\ldots,d-1\}$ for an arbitrary choice of $d$.

Measurements of quantum states

Standard basis measurements of quantum states of single systems were discussed in the previous lesson: if a system having classical state set $\Sigma$ is in a quantum state represented by the vector $\vert \psi \rangle$, and that system is measured (with respect to a standard basis measurement), then each classical state $a\in\Sigma$ appears with probability $\vert \langle a \vert \psi \rangle\vert^2$.

This tells us what happens when we have a quantum state of multiple systems and choose to measure the entire compound system (which is equivalent to measuring all of the systems). To state this precisely, let us suppose that $\mathsf{X}_1,\ldots,\mathsf{X}_n$ are systems having classical state sets $\Sigma_1,\ldots,\Sigma_n$, respectively. We may then view $(\mathsf{X}_1,\ldots,\mathsf{X}_n)$ collectively as a single system whose classical state set is the Cartesian product $\Sigma_1\times\cdots\times\Sigma_n$. If a quantum state of this system is represented by the quantum state vector $\vert\psi\rangle$, and all of the systems are measured, then each possible outcome $(a_1,\ldots,a_n)\in\Sigma_1\times\cdots\times\Sigma_n$ appears with probability $\vert\langle a_1\cdots a_n\vert \psi\rangle\vert^2$.

For example, if systems $\mathsf{X}$ and $\mathsf{Y}$ are jointly in the quantum state

$$\frac{3}{5} \vert 0\rangle \vert \heartsuit \rangle - \frac{4i}{5} \vert 1\rangle \vert \spadesuit \rangle,$$

then measuring both systems with respect to a standard basis measurement yields the outcome $(0,\heartsuit)$ with probability $9/25$ and the outcome $(1,\spadesuit)$ with probability $16/25$.
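This example can be reproduced by indexing a vector with the Cartesian product of the two classical state sets. In the sketch below the suits are spelled out as strings, which is only a labeling choice made for the example.

```python
import numpy as np
from itertools import product

sigma = [0, 1]                                       # classical states of X
gamma = ["clubs", "diamonds", "hearts", "spades"]    # classical states of Y
labels = list(product(sigma, gamma))                 # ordering of the compound system

# The state (3/5)|0>|hearts> - (4i/5)|1>|spades> as a vector of 8 amplitudes.
psi = np.zeros(len(labels), dtype=complex)
psi[labels.index((0, "hearts"))] = 3 / 5
psi[labels.index((1, "spades"))] = -4j / 5

# Each outcome appears with probability equal to the squared absolute value
# of the corresponding amplitude.
probabilities = {label: abs(amp) ** 2 for label, amp in zip(labels, psi)}

for label, p in probabilities.items():
    if p > 0:
        print(label, p)
```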

Partial measurements for two systems

Now let us consider the situation in which we have multiple systems in some quantum state, and we measure a proper subset of the systems. As before, we will begin with two systems $\mathsf{X}$ and $\mathsf{Y}$ having classical state sets $\Sigma$ and $\Gamma$, respectively.

In general, a quantum state vector of $(\mathsf{X},\mathsf{Y})$ takes the form

$$\vert \psi \rangle = \sum_{(a,b)\in\Sigma\times\Gamma} \alpha_{ab} \vert ab\rangle,$$

where $\{\alpha_{ab} : (a,b)\in\Sigma\times\Gamma\}$ is a collection of complex numbers satisfying

$$\sum_{(a,b)\in\Sigma\times\Gamma} \vert \alpha_{ab} \vert^2 = 1$$

(which is equivalent to $\vert \psi \rangle$ being a unit vector).

We already know, from the discussion above, that if both $\mathsf{X}$ and $\mathsf{Y}$ were measured, then each possible outcome $(a,b)\in\Sigma\times\Gamma$ would appear with probability

$$\bigl\vert \langle ab \vert \psi \rangle \bigr\vert^2 = \vert\alpha_{ab}\vert^2.$$

Supposing that just the first system $\mathsf{X}$ is measured, the probability for each outcome $a\in\Sigma$ to appear must therefore be equal to

$$\sum_{b\in\Gamma} \bigl\vert \langle ab \vert \psi \rangle \bigr\vert^2 = \sum_{b\in\Gamma} \vert\alpha_{ab}\vert^2.$$

This is consistent with what we already saw in the probabilistic setting, and is once again consistent with our understanding of physics. That is, the probability for each particular outcome to appear when $\mathsf{X}$ is measured cannot possibly depend on whether or not $\mathsf{Y}$ was also measured, as that would otherwise allow for faster-than-light communication.

Having obtained a particular outcome $a\in\Sigma$ of this measurement of $\mathsf{X}$, we expect that the quantum state of $\mathsf{X}$ changes so that it is equal to $\vert a\rangle$, like we had for single systems. But what happens to the quantum state of $\mathsf{Y}$?

To answer this question, let us describe the joint quantum state of $(\mathsf{X},\mathsf{Y})$ under the assumption that $\mathsf{X}$ was measured (with respect to a standard basis measurement) and the result was the classical state $a$.

First we express the vector $\vert\psi\rangle$ as

$$\vert\psi\rangle = \sum_{a\in\Sigma} \vert a \rangle \otimes \vert \phi_a \rangle,$$

where

$$\vert \phi_a \rangle = \sum_{b\in\Gamma} \alpha_{ab} \vert b\rangle$$

for each $a\in\Sigma$. Notice that the probability that the standard basis measurement of $\mathsf{X}$ results in each outcome $a$ may be written as follows:

$$\sum_{b\in\Gamma} \vert\alpha_{ab}\vert^2 = \bigl\| \vert \phi_a \rangle \bigr\|^2.$$

Now, as a result of the standard basis measurement of $\mathsf{X}$ resulting in the outcome $a$, we have that the quantum state of the pair $(\mathsf{X},\mathsf{Y})$ together becomes

$$\vert a \rangle \otimes \frac{\vert \phi_a \rangle}{\|\vert \phi_a \rangle\|}.$$

That is, the state "collapses" like in the single-system case, but only as far as is required for the state to be consistent with the measurement of $\mathsf{X}$ having produced the outcome $a$.

Informally speaking, $\vert a \rangle \otimes \vert \phi_a\rangle$ represents the component of $\vert \psi\rangle$ that is consistent with a measurement of $\mathsf{X}$ resulting in the outcome $a$. We normalize this vector by dividing it by its Euclidean norm, which is equal to $\|\vert\phi_a\rangle\|$, to yield a valid quantum state vector having Euclidean norm equal to $1$. This normalization step is analogous to what we did in the probabilistic setting when we divided vectors by the sum of their entries to obtain a probability vector.

As an example, let us consider the state of two qubits $(\mathsf{X},\mathsf{Y})$ from the beginning of the section:

$$\vert \psi \rangle = \frac{1}{\sqrt{2}} \vert 00 \rangle - \frac{1}{\sqrt{6}} \vert 01 \rangle + \frac{i}{\sqrt{6}} \vert 10 \rangle + \frac{1}{\sqrt{6}} \vert 11 \rangle.$$

To understand what happens when the first system $\mathsf{X}$ is measured, we begin by writing

$$\vert \psi \rangle = \vert 0 \rangle \otimes \biggl( \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{1}{\sqrt{6}} \vert 1 \rangle \biggr) + \vert 1 \rangle \otimes \biggl( \frac{i}{\sqrt{6}} \vert 0 \rangle + \frac{1}{\sqrt{6}} \vert 1 \rangle \biggr).$$

We now see, based on the description above, that the probability for the measurement to result in the outcome $0$ is

$$\biggl\|\frac{1}{\sqrt{2}} \vert 0 \rangle -\frac{1}{\sqrt{6}} \vert 1 \rangle\biggr\|^2 = \frac{1}{2} + \frac{1}{6} = \frac{2}{3}$$

in which case the state of $(\mathsf{X},\mathsf{Y})$ becomes

$$\vert 0\rangle \otimes \frac{\frac{1}{\sqrt{2}} \vert 0 \rangle -\frac{1}{\sqrt{6}} \vert 1 \rangle}{\sqrt{\frac{2}{3}}} = \vert 0\rangle \otimes \Biggl( \sqrt{\frac{3}{4}} \vert 0 \rangle -\frac{1}{2} \vert 1\rangle\Biggr);$$

and the probability for the measurement to result in the outcome $1$ is

$$\biggl\|\frac{i}{\sqrt{6}} \vert 0 \rangle + \frac{1}{\sqrt{6}} \vert 1 \rangle\biggr\|^2 = \frac{1}{6} + \frac{1}{6} = \frac{1}{3},$$

in which case the state of $(\mathsf{X},\mathsf{Y})$ becomes

$$\vert 1\rangle \otimes \frac{\frac{i}{\sqrt{6}} \vert 0 \rangle +\frac{1}{\sqrt{6}} \vert 1 \rangle}{\sqrt{\frac{1}{3}}} = \vert 1\rangle \otimes \Biggl( \frac{i}{\sqrt{2}} \vert 0 \rangle +\frac{1}{\sqrt{2}} \vert 1\rangle\Biggr).$$

The same technique, used in a symmetric way, describes what happens if the second system $\mathsf{Y}$ is measured rather than the first. We rewrite the vector $\vert \psi \rangle$ as

$$\vert \psi \rangle = \biggl( \frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{i}{\sqrt{6}} \vert 1 \rangle \biggr) \otimes \vert 0\rangle + \biggl( -\frac{1}{\sqrt{6}} \vert 0 \rangle +\frac{1}{\sqrt{6}} \vert 1\rangle \biggr) \otimes \vert 1\rangle.$$

The probability that the measurement of $\mathsf{Y}$ yields the outcome $0$ is

$$\biggl\| \frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{i}{\sqrt{6}} \vert 1 \rangle \biggr\|^2 = \frac{1}{2} + \frac{1}{6} = \frac{2}{3},$$

in which case the state of $(\mathsf{X},\mathsf{Y})$ becomes

$$\frac{\frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{i}{\sqrt{6}} \vert 1 \rangle}{\sqrt{\frac{2}{3}}} \otimes \vert 0 \rangle = \biggl(\sqrt{\frac{3}{4}} \vert 0 \rangle + \frac{i}{2} \vert 1 \rangle\biggr) \otimes\vert 0 \rangle;$$

and the probability that the measurement outcome is $1$ is

$$\biggl\| -\frac{1}{\sqrt{6}} \vert 0 \rangle +\frac{1}{\sqrt{6}} \vert 1\rangle \biggr\|^2 = \frac{1}{6} + \frac{1}{6} = \frac{1}{3},$$

in which case the state of $(\mathsf{X},\mathsf{Y})$ becomes

$$\frac{ -\frac{1}{\sqrt{6}} \vert 0 \rangle +\frac{1}{\sqrt{6}} \vert 1\rangle }{\frac{1}{\sqrt{3}}} \otimes \vert 1\rangle = \biggl(-\frac{1}{\sqrt{2}} \vert 0\rangle + \frac{1}{\sqrt{2}} \vert 1\rangle\biggr) \otimes \vert 1\rangle.$$
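The two calculations above can be reproduced numerically. What follows is a minimal sketch in plain Python, assuming the state is stored as a dictionary from classical states $(a,b)$ to amplitudes; the helper name `measure_one` is illustrative and not part of any library.

```python
from math import isclose, sqrt

# Amplitudes of |psi>, keyed by the classical state (a, b) of (X, Y).
psi = {
    (0, 0): 1 / sqrt(2),
    (0, 1): -1 / sqrt(6),
    (1, 0): 1j / sqrt(6),
    (1, 1): 1 / sqrt(6),
}

def measure_one(state, position):
    """Return {outcome: (probability, post-measurement state)} for a standard
    basis measurement of the system at `position` (0 for X, 1 for Y)."""
    results = {}
    for outcome in sorted({key[position] for key in state}):
        # Keep only the terms consistent with this outcome.
        branch = {k: v for k, v in state.items() if k[position] == outcome}
        # The probability is the squared Euclidean norm of the branch.
        prob = sum(abs(v) ** 2 for v in branch.values())
        if prob > 0:
            # Renormalize by dividing by the square root of the probability.
            results[outcome] = (prob, {k: v / sqrt(prob) for k, v in branch.items()})
    return results

x_results = measure_one(psi, 0)  # measure X only
y_results = measure_one(psi, 1)  # measure Y only

for name, res in (("X", x_results), ("Y", y_results)):
    for outcome, (prob, post) in res.items():
        print(name, outcome, round(prob, 4), post)
```

Keeping the full pair $(a,b)$ as the dictionary key means the post-measurement state is automatically a state of $(\mathsf{X},\mathsf{Y})$, as in the formulas above.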

Remark on reduced quantum states

This example shows a limitation of the simplified description of quantum information: it offers us no way to describe the reduced (or marginal) quantum state of just one of two systems (or a proper subset of any number of systems) like we did in the probabilistic case.

Specifically, we said that for a probabilistic state of two systems $(\mathsf{X},\mathsf{Y})$ described by a probability vector

$$\vert \psi \rangle = \sum_{(a,b)\in\Sigma\times\Gamma} p_{ab} \vert ab\rangle,$$

the reduced (or marginal) state of $\mathsf{X}$ alone is described by the probability vector

$$\sum_{(a,b)\in\Sigma\times\Gamma} p_{ab} \vert a\rangle.$$

For quantum state vectors, there is no analog: for a quantum state vector

$$\vert \phi \rangle = \sum_{(a,b)\in\Sigma\times\Gamma} \alpha_{ab} \vert ab\rangle,$$

the vector

$$\sum_{(a,b)\in\Sigma\times\Gamma} \alpha_{ab} \vert a\rangle$$

is not a quantum state vector in general, and does not properly represent the concept of a reduced or marginal state. It could be, in fact, that this vector is the zero vector.
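As a concrete illustration of the zero-vector case, consider the state $\frac{1}{\sqrt{2}}\vert 00\rangle - \frac{1}{\sqrt{2}}\vert 01\rangle$ (an example chosen here for illustration, not taken from the lesson). The short sketch below, in plain Python, sums amplitudes the way one would sum probabilities and shows that they cancel.

```python
from math import isclose, sqrt

# A valid two-qubit quantum state vector: (|00> - |01>) / sqrt(2).
phi = {("0", "0"): 1 / sqrt(2), ("0", "1"): -1 / sqrt(2)}

# Naive "marginal": sum the amplitudes alpha_ab over b for each a.
naive = {}
for (a, b), amp in phi.items():
    naive[a] = naive.get(a, 0) + amp

# By contrast, the outcome probabilities for measuring X sum |alpha_ab|^2.
probs = {}
for (a, b), amp in phi.items():
    probs[a] = probs.get(a, 0) + abs(amp) ** 2

print(naive)  # the amplitudes for a = "0" cancel
print(probs)  # yet measuring X gives outcome "0" with certainty
```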

So, what we must do instead is turn to the general description of quantum information. As we will describe in Unit 3, the general description of quantum information provides us with a meaningful way to define reduced quantum states that is analogous to the probabilistic setting.

Partial measurements for three or more systems

Partial measurements for three or more systems, where some proper subset of the systems are measured, can be reduced to the case of two systems by dividing the systems into two collections: those that are measured and those that are not.

Here is a specific example that illustrates how this can be done. It demonstrates how subscripting kets by the names of the systems they represent can be useful — in this case because it gives us a simple way to describe permutations of the systems.

For the example, we have a quantum state of 5 systems $\mathsf{X}_1,\ldots,\mathsf{X}_5,$ all sharing the same classical state set $\{\clubsuit,\diamondsuit,\heartsuit,\spadesuit\}:$

$$\begin{gathered}
\sqrt{\frac{1}{7}} \vert\heartsuit\rangle \vert\clubsuit\rangle \vert\diamondsuit\rangle \vert\spadesuit\rangle \vert\spadesuit\rangle + \sqrt{\frac{2}{7}} \vert\diamondsuit\rangle \vert\clubsuit\rangle \vert\diamondsuit\rangle \vert\spadesuit\rangle \vert\clubsuit\rangle + \sqrt{\frac{1}{7}} \vert\spadesuit\rangle \vert\spadesuit\rangle \vert\clubsuit\rangle \vert\diamondsuit\rangle \vert\clubsuit\rangle \\
-i \sqrt{\frac{2}{7}} \vert\heartsuit\rangle \vert\clubsuit\rangle \vert\diamondsuit\rangle \vert\heartsuit\rangle \vert\heartsuit\rangle - \sqrt{\frac{1}{7}} \vert\spadesuit\rangle \vert\heartsuit\rangle \vert\clubsuit\rangle \vert\spadesuit\rangle \vert\clubsuit\rangle.
\end{gathered}$$

We will consider the situation in which the first and third systems are measured, and the remaining systems are left alone. Conceptually speaking, there is no fundamental difference between this situation and one in which one of two systems is measured — but unfortunately, because the measured systems are interspersed with the unmeasured systems, we face a hurdle in writing down the expressions needed to perform these calculations. A way to proceed is, as mentioned above, to subscript the kets to indicate which systems they refer to. This gives us the freedom to change their ordering, as we will now describe.

First, the quantum state vector above can alternatively be written as

$$\begin{gathered}
\sqrt{\frac{1}{7}} \vert\heartsuit\rangle_1 \vert\clubsuit\rangle_2 \vert\diamondsuit\rangle_3 \vert\spadesuit\rangle_4 \vert\spadesuit\rangle_5 + \sqrt{\frac{2}{7}} \vert\diamondsuit\rangle_1 \vert\clubsuit\rangle_2 \vert\diamondsuit\rangle_3 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5\\
+ \sqrt{\frac{1}{7}} \vert\spadesuit\rangle_1 \vert\spadesuit\rangle_2 \vert\clubsuit\rangle_3 \vert\diamondsuit\rangle_4 \vert\clubsuit\rangle_5 -i \sqrt{\frac{2}{7}} \vert\heartsuit\rangle_1 \vert\clubsuit\rangle_2 \vert\diamondsuit\rangle_3 \vert\heartsuit\rangle_4 \vert\heartsuit\rangle_5\\
- \sqrt{\frac{1}{7}} \vert\spadesuit\rangle_1 \vert\heartsuit\rangle_2 \vert\clubsuit\rangle_3 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5.
\end{gathered}$$

Nothing here has changed except that each ket now has a subscript indicating which system it corresponds to. Here we have used the subscripts $1,\ldots,5,$ but the names of the systems themselves could also be used (in a situation where we have system names such as $\mathsf{X},$ $\mathsf{Y},$ and $\mathsf{Z},$ for instance).

We can then re-order the kets and collect terms as follows:

$$\begin{aligned}
& \sqrt{\frac{1}{7}} \vert\heartsuit\rangle_1 \vert\diamondsuit\rangle_3 \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\spadesuit\rangle_5 + \sqrt{\frac{2}{7}} \vert\diamondsuit\rangle_1 \vert\diamondsuit\rangle_3 \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5\\
& \quad + \sqrt{\frac{1}{7}} \vert\spadesuit\rangle_1 \vert\clubsuit\rangle_3 \vert\spadesuit\rangle_2 \vert\diamondsuit\rangle_4 \vert\clubsuit\rangle_5 -i \sqrt{\frac{2}{7}} \vert\heartsuit\rangle_1 \vert\diamondsuit\rangle_3 \vert\clubsuit\rangle_2 \vert\heartsuit\rangle_4 \vert\heartsuit\rangle_5\\
& \quad -\sqrt{\frac{1}{7}} \vert\spadesuit\rangle_1 \vert\clubsuit\rangle_3 \vert\heartsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5\\[2mm]
& \hspace{1.5cm} = \vert\heartsuit\rangle_1 \vert\diamondsuit\rangle_3 \biggl( \sqrt{\frac{1}{7}} \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\spadesuit\rangle_5 -i \sqrt{\frac{2}{7}} \vert\clubsuit\rangle_2 \vert\heartsuit\rangle_4 \vert\heartsuit\rangle_5 \biggr)\\
& \hspace{1.5cm} \quad + \vert\diamondsuit\rangle_1 \vert\diamondsuit\rangle_3 \biggl( \sqrt{\frac{2}{7}} \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5 \biggr)\\
& \hspace{1.5cm} \quad + \vert\spadesuit\rangle_1 \vert\clubsuit\rangle_3 \biggl( \sqrt{\frac{1}{7}} \vert\spadesuit\rangle_2 \vert\diamondsuit\rangle_4 \vert\clubsuit\rangle_5 - \sqrt{\frac{1}{7}} \vert\heartsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5\biggr).
\end{aligned}$$

(The tensor products are still implicit, even when parentheses are used, as in this example.)

We now see that if the systems $\mathsf{X}_1$ and $\mathsf{X}_3$ are measured, the (nonzero) probabilities of the different outcomes are as follows:

  • The measurement outcome $(\heartsuit,\diamondsuit)$ occurs with probability
$$\biggl\| \sqrt{\frac{1}{7}} \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\spadesuit\rangle_5 -i \sqrt{\frac{2}{7}} \vert\clubsuit\rangle_2 \vert\heartsuit\rangle_4 \vert\heartsuit\rangle_5 \biggr\|^2 = \frac{1}{7} + \frac{2}{7} = \frac{3}{7}.$$
  • The measurement outcome $(\diamondsuit,\diamondsuit)$ occurs with probability
$$\biggl\| \sqrt{\frac{2}{7}} \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5 \biggr\|^2 = \frac{2}{7}.$$
  • The measurement outcome $(\spadesuit,\clubsuit)$ occurs with probability
$$\biggl\| \sqrt{\frac{1}{7}} \vert\spadesuit\rangle_2 \vert\diamondsuit\rangle_4 \vert\clubsuit\rangle_5 - \sqrt{\frac{1}{7}} \vert\heartsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\clubsuit\rangle_5 \biggr\|^2 = \frac{1}{7} + \frac{1}{7} = \frac{2}{7}.$$

If the measurement outcome is $(\heartsuit,\diamondsuit),$ for instance, we have that the state of $(\mathsf{X}_1,\ldots,\mathsf{X}_5)$ becomes

$$\begin{aligned}
& \vert \heartsuit\rangle_1 \vert \diamondsuit \rangle_3 \otimes \frac{ \sqrt{\frac{1}{7}} \vert\clubsuit\rangle_2 \vert\spadesuit\rangle_4 \vert\spadesuit\rangle_5 - i \sqrt{\frac{2}{7}} \vert\clubsuit\rangle_2 \vert\heartsuit\rangle_4 \vert\heartsuit\rangle_5} {\sqrt{\frac{3}{7}}}\\
& \qquad = \sqrt{\frac{1}{3}} \vert \heartsuit\rangle_1 \vert\clubsuit\rangle_2 \vert \diamondsuit \rangle_3\vert\spadesuit\rangle_4 \vert\spadesuit\rangle_5 -i \sqrt{\frac{2}{3}} \vert \heartsuit\rangle_1 \vert\clubsuit\rangle_2 \vert \diamondsuit \rangle_3\vert\heartsuit\rangle_4 \vert\heartsuit\rangle_5.
\end{aligned}$$

For other measurement outcomes the state can be determined in a similar way.
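The same bookkeeping can be done in code without re-ordering anything, because each term can be keyed by the classical states of all five systems. The sketch below is plain Python under that assumption; the suit names and the variable `measured` are illustrative choices.

```python
from math import isclose, sqrt

C, D, H, S = "club", "diamond", "heart", "spade"

# Amplitudes keyed by the classical states of (X1, X2, X3, X4, X5).
state = {
    (H, C, D, S, S): sqrt(1 / 7),
    (D, C, D, S, C): sqrt(2 / 7),
    (S, S, C, D, C): sqrt(1 / 7),
    (H, C, D, H, H): -1j * sqrt(2 / 7),
    (S, H, C, S, C): -sqrt(1 / 7),
}

measured = (0, 2)  # zero-based positions of X1 and X3

# Group the terms by the classical states of the measured systems.
branches = {}
for key, amp in state.items():
    outcome = tuple(key[i] for i in measured)
    branches.setdefault(outcome, {})[key] = amp

# The probability of each outcome is the squared norm of its branch.
probs = {o: sum(abs(a) ** 2 for a in br.values()) for o, br in branches.items()}

# Post-measurement state for the outcome (heart, diamond), renormalized.
post = {k: a / sqrt(probs[(H, D)]) for k, a in branches[(H, D)].items()}

for outcome, p in probs.items():
    print(outcome, round(7 * p, 6), "/ 7")
print(post)
```

Using the full five-tuple as the key plays the same role as the subscripts on the kets: the position within the tuple records which system each symbol belongs to.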

Now, it must be understood that the tensor product is not commutative: if $\vert \phi\rangle$ and $\vert \pi \rangle$ are vectors, then, in general, $\vert \phi\rangle\otimes\vert \pi \rangle$ is different from $\vert \pi\rangle\otimes\vert \phi \rangle,$ and likewise for tensor products of three or more vectors. For instance, $\vert\heartsuit\rangle \vert\clubsuit\rangle \vert\diamondsuit\rangle \vert\spadesuit\rangle \vert\spadesuit\rangle$ is a different vector from $\vert\heartsuit\rangle \vert\diamondsuit\rangle \vert\clubsuit\rangle \vert\spadesuit\rangle \vert\spadesuit\rangle.$ The technique just described of re-ordering kets should not be interpreted as suggesting otherwise. Rather, for the sake of performing calculations and expressing the results, we are simply making a decision that it is more convenient to collect the systems $\mathsf{X}_1,\ldots,\mathsf{X}_5$ together as $(\mathsf{X}_1,\mathsf{X}_3,\mathsf{X}_2,\mathsf{X}_4,\mathsf{X}_5)$ rather than $(\mathsf{X}_1,\mathsf{X}_2,\mathsf{X}_3,\mathsf{X}_4,\mathsf{X}_5).$ The subscripts on the kets serve to keep this all straight.

Analogously, in the closely related but simpler setting of Cartesian products and ordered pairs, if $a$ and $b$ are different classical states, then $(a,b)$ and $(b,a)$ are also different. Nevertheless, saying that the classical state of two bits $(\mathsf{X},\mathsf{Y})$ is $(1,0)$ is equivalent to saying that the classical state of $(\mathsf{Y},\mathsf{X})$ is $(0,1);$ when every system has its own unique name, it doesn't really matter what order we choose to list them, so long as the ordering is made clear.

Finally, here are two examples involving the GHZ and W states, as promised earlier. First let us consider the GHZ state

$$\frac{1}{\sqrt{2}} \vert 000\rangle + \frac{1}{\sqrt{2}} \vert 111\rangle.$$

If just the first system is measured, we obtain the outcome $0$ with probability $1/2,$ in which case the state of the three qubits becomes $\vert 000\rangle;$ and we also obtain the outcome $1$ with probability $1/2,$ in which case the state of the three qubits becomes $\vert 111\rangle.$

Next let us consider a W state, which can be written like this:

$$\begin{aligned}
& \frac{1}{\sqrt{3}} \vert 001\rangle + \frac{1}{\sqrt{3}} \vert 010\rangle + \frac{1}{\sqrt{3}} \vert 100\rangle \\
& \qquad = \vert 0 \rangle \biggl( \frac{1}{\sqrt{3}} \vert 01\rangle + \frac{1}{\sqrt{3}} \vert 10\rangle\biggr) + \vert 1 \rangle \biggl(\frac{1}{\sqrt{3}}\vert 00\rangle\biggr).
\end{aligned}$$

The probability that a measurement of the first qubit results in the outcome $0$ is therefore equal to

$$\biggl\| \frac{1}{\sqrt{3}} \vert 01\rangle + \frac{1}{\sqrt{3}} \vert 10\rangle \biggr\|^2 = \frac{2}{3},$$

and conditioned upon the measurement producing this outcome, the quantum state of the three qubits becomes

$$\vert 0\rangle\otimes \frac{ \frac{1}{\sqrt{3}} \vert 01\rangle + \frac{1}{\sqrt{3}} \vert 10\rangle }{ \sqrt{\frac{2}{3}} } = \vert 0\rangle \biggl(\frac{1}{\sqrt{2}} \vert 01\rangle + \frac{1}{\sqrt{2}} \vert 10\rangle \biggr) = \vert 0\rangle\vert \psi^+\rangle.$$

The probability that the measurement outcome is $1$ is $1/3,$ in which case the state of the three qubits becomes $\vert 100\rangle.$

Unitary operations

In previous sections of this lesson, we used the Cartesian product to treat individual systems as a larger, single system. Following the same line of thought, we can represent operations on multiple systems as unitary matrices acting on the state vector of this larger system.

In principle, any unitary matrix whose rows and columns correspond to the classical states of whatever system we're thinking about represents a valid quantum operation — and this holds true for compound systems whose classical state sets happen to be Cartesian products of the classical state sets of the individual systems.

Focusing on two systems, if $\mathsf{X}$ is a system having classical state set $\Sigma$ and $\mathsf{Y}$ is a system having classical state set $\Gamma,$ then the classical state set of the joint system $(\mathsf{X},\mathsf{Y})$ is $\Sigma\times\Gamma,$ and therefore the operations that can be performed on this joint system are represented by unitary matrices whose rows and columns are placed in correspondence with the set $\Sigma\times\Gamma.$ The ordering of the rows and columns of these matrices is the same as the ordering used for quantum state vectors of the system $(\mathsf{X},\mathsf{Y}).$

For example, let us suppose that $\Sigma = \{1,2,3\}$ and $\Gamma = \{0,1\},$ and recall that the standard convention for ordering the elements of the Cartesian product $\{1,2,3\}\times\{0,1\}$ is $(1,0),$ $(1,1),$ $(2,0),$ $(2,1),$ $(3,0),$ $(3,1).$ Here is an example of a unitary matrix representing an operation on $(\mathsf{X},\mathsf{Y}):$

$$U = \begin{pmatrix}
\frac{1}{2} & \frac{1}{2} & \frac{1}{2} & 0 & 0 & \frac{1}{2} \\[2mm]
\frac{1}{2} & \frac{i}{2} & -\frac{1}{2} & 0 & 0 & -\frac{i}{2} \\[2mm]
\frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & 0 & 0 & -\frac{1}{2} \\[2mm]
0 & 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0\\[2mm]
\frac{1}{2} & -\frac{i}{2} & -\frac{1}{2} & 0 & 0 & \frac{i}{2} \\[2mm]
0 & 0 & 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0
\end{pmatrix}.$$

This particular unitary operation isn't important; it's just an example. To check that $U$ is unitary, it suffices to compute $U^{\dagger} U$ and verify that it equals the identity matrix $\mathbb{I}.$

The action of $U$ on the standard basis vector $\vert 11 \rangle,$ for instance, is

$$U \vert 11\rangle = \frac{1}{2} \vert 10 \rangle + \frac{i}{2} \vert 11 \rangle - \frac{1}{2} \vert 20 \rangle - \frac{i}{2} \vert 30\rangle,$$

which we can see by examining the second column of $U,$ considering our ordering of the set $\{1,2,3\}\times\{0,1\}.$
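Both claims, that $U$ is unitary and that $U\vert 11\rangle$ is its second column, can be checked mechanically. The following is a small sketch in plain Python using nested lists; the helper names `dagger` and `matmul` are illustrative.

```python
from math import sqrt

h, r = 0.5, 1 / sqrt(2)

# Rows and columns are ordered (1,0), (1,1), (2,0), (2,1), (3,0), (3,1).
labels = [(1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
U = [
    [h,  h,       h,  0,  0,  h],
    [h,  h * 1j, -h,  0,  0, -h * 1j],
    [h, -h,       h,  0,  0, -h],
    [0,  0,       0,  r,  r,  0],
    [h, -h * 1j, -h,  0,  0,  h * 1j],
    [0,  0,       0, -r,  r,  0],
]
n = len(U)

def dagger(M):
    """Conjugate transpose of a square matrix given as nested lists."""
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two n x n matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# U is unitary when U^dagger U equals the identity (up to rounding error).
product = matmul(dagger(U), U)
is_unitary = all(
    abs(product[i][j] - (1 if i == j else 0)) < 1e-12
    for i in range(n) for j in range(n)
)

# U|11> is the column of U indexed by the classical state (1, 1).
col = labels.index((1, 1))
image = {labels[i]: U[i][col] for i in range(n) if U[i][col] != 0}
print(is_unitary, image)
```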

As with any matrix, it's possible to express $U$ in the Dirac notation, using 20 terms for the 20 nonzero entries of $U.$ However, if we did write down all of these terms rather than writing a $6\times 6$ matrix, we might miss certain patterns that are evident from the matrix expression. Simply put, the Dirac notation is not always the best choice for how to represent matrices.

Unitary operations on three or more systems work in a similar way, with the unitary matrices having rows and columns corresponding to the Cartesian product of the classical state sets of the systems.

We have already seen an example in this lesson: the three-qubit operation

$$\sum_{k = 0}^{7} \vert (k+1) \bmod 8 \rangle \langle k \vert$$

from before, where $\vert j \rangle$ means the three-bit binary encoding of the number $j,$ is unitary. Operations that are both unitary and represent deterministic operations are called reversible operations. The conjugate transpose of this matrix can be written like this:

$$\sum_{k = 0}^{7} \vert k \rangle \langle (k+1) \bmod 8 \vert = \sum_{k = 0}^{7} \vert (k-1) \bmod 8 \rangle \langle k \vert.$$

This matrix represents the reverse, or in mathematical terms the inverse, of the original operation — which is what we expect from the conjugate transpose of a unitary matrix.
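A brief sketch in plain Python makes this concrete. It builds the increment matrix entry by entry, transposes it, and confirms that the two compose to the identity; the names `inc`, `dec`, and `image_of` are illustrative.

```python
# Entry [row][col] of sum_k |(k+1) mod 8><k| is 1 exactly when row == (col + 1) % 8.
N = 8
inc = [[1 if row == (col + 1) % N else 0 for col in range(N)] for row in range(N)]

# For a real matrix the conjugate transpose is just the transpose.
dec = [[inc[col][row] for col in range(N)] for row in range(N)]

def matmul(A, B):
    """Product of two N x N matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]

def image_of(M, k):
    """Column k of a permutation matrix has a single 1; return its row index."""
    return next(row for row in range(N) if M[row][k] == 1)

# Where each |k> is sent, written as three-bit strings.
print([format(image_of(inc, k), "03b") for k in range(N)])
print([format(image_of(dec, k), "03b") for k in range(N)])
print(matmul(dec, inc) == identity)
```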

We will see other examples of unitary operations on multiple systems as the lesson continues.

Unitary operations performed independently on individual systems

When unitary operations are performed independently on a collection of individual systems, the combined action of these independent operations is described by the tensor product of the unitary matrices that represent them. That is, if $\mathsf{X}_1,\ldots,\mathsf{X}_n$ are quantum systems, $U_1,\ldots, U_n$ are unitary matrices representing operations on these systems, and the operations are performed independently on the systems, the combined action on $(\mathsf{X}_1,\ldots,\mathsf{X}_n)$ is represented by the matrix $U_1\otimes\cdots\otimes U_n.$ Once again, we find that the probabilistic and quantum settings are analogous in this regard.

One would naturally expect, from reading the previous paragraph, that the tensor product of any collection of unitary matrices is unitary. Indeed this is true, and we can verify it as follows.

Notice first that the conjugate transpose operation satisfies

$$(M_1 \otimes \cdots \otimes M_n)^{\dagger} = M_1^{\dagger} \otimes \cdots \otimes M_n^{\dagger}$$

for any collection of matrices $M_1,\ldots,M_n.$ This can be checked by going back to the definitions of the tensor product and of the conjugate transpose, and checking that the corresponding entries of the two sides of the equation agree. This means that

$$(U_1 \otimes \cdots \otimes U_n)^{\dagger} (U_1\otimes\cdots\otimes U_n) = (U_1^{\dagger} \otimes \cdots \otimes U_n^{\dagger}) (U_1\otimes\cdots\otimes U_n).$$

Because the tensor product of matrices is multiplicative, we find that

$$(U_1^{\dagger} \otimes \cdots \otimes U_n^{\dagger}) (U_1\otimes\cdots\otimes U_n) = (U_1^{\dagger} U_1) \otimes \cdots \otimes (U_n^{\dagger} U_n) = \mathbb{I}_1 \otimes \cdots \otimes \mathbb{I}_n.$$

Here we have written $\mathbb{I}_1,\ldots,\mathbb{I}_n$ to refer to the matrices representing the identity operation on the systems $\mathsf{X}_1,\ldots,\mathsf{X}_n,$ which is to say that these are identity matrices whose sizes agree with the number of classical states of $\mathsf{X}_1,\ldots,\mathsf{X}_n.$

Finally, the tensor product $\mathbb{I}_1 \otimes \cdots \otimes \mathbb{I}_n$ is equal to the identity matrix, where we have a number of rows and columns that agrees with the product of the number of rows and columns of the matrices $\mathbb{I}_1,\ldots,\mathbb{I}_n.$ We may view this larger identity matrix as representing the identity operation on the joint system $(\mathsf{X}_1,\ldots,\mathsf{X}_n).$

In summary, we have the following sequence of equalities:

$$\begin{aligned}
& (U_1 \otimes \cdots \otimes U_n)^{\dagger} (U_1\otimes\cdots\otimes U_n) \\
& \quad = (U_1^{\dagger} \otimes \cdots \otimes U_n^{\dagger}) (U_1\otimes\cdots\otimes U_n) \\
& \quad = (U_1^{\dagger} U_1) \otimes \cdots \otimes (U_n^{\dagger} U_n)\\
& \quad = \mathbb{I}_{1} \otimes \cdots \otimes \mathbb{I}_{n}\\
& \quad = \mathbb{I}.
\end{aligned}$$

We therefore conclude that $U_1 \otimes \cdots \otimes U_n$ is unitary.
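The argument can be spot-checked numerically for one particular choice of matrices. The sketch below, in plain Python, uses a Hadamard, a phase gate, and a Pauli $X$ purely as illustrative inputs; it is a sanity check on an instance, not a substitute for the proof.

```python
from math import sqrt

def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def dagger(M):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[complex(M[j][i]).conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

r = 1 / sqrt(2)
H = [[r, r], [r, -r]]      # Hadamard
S = [[1, 0], [0, 1j]]      # phase gate (illustrative choice)
X = [[0, 1], [1, 0]]       # Pauli X

big = kron(kron(H, S), X)  # H (x) S (x) X, an 8 x 8 matrix

# The conjugate transpose distributes over the tensor product ...
dagger_distributes = close(dagger(big), kron(kron(dagger(H), dagger(S)), dagger(X)))
# ... and the tensor product of unitary matrices is unitary.
is_unitary = close(matmul(dagger(big), big), identity(8))
print(dagger_distributes, is_unitary)
```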

An important situation that often arises is one in which a unitary operation is applied to just one system (or a proper subset of systems) within a larger joint system. For instance, suppose that $\mathsf{X}$ and $\mathsf{Y}$ are systems that we can view together as forming a single, compound system $(\mathsf{X},\mathsf{Y}),$ and we perform an operation just on the system $\mathsf{X}.$ To be precise, let us suppose that $U$ is a unitary matrix representing an operation on $\mathsf{X},$ so that its rows and columns have been placed in correspondence with the classical states of $\mathsf{X}.$

To say that we perform the operation represented by $U$ just on the system $\mathsf{X}$ implies that we do nothing to $\mathsf{Y},$ meaning that we independently perform $U$ on $\mathsf{X}$ and the identity operation on $\mathsf{Y}.$ That is, "doing nothing" to $\mathsf{Y}$ is equivalent to performing the identity operation on $\mathsf{Y},$ which is represented by the identity matrix $\mathbb{I}_\mathsf{Y}.$ (Here, by the way, the subscript $\mathsf{Y}$ tells us that $\mathbb{I}_\mathsf{Y}$ refers to the identity matrix having a number of rows and columns in agreement with the classical state set of $\mathsf{Y}.$) The operation on $(\mathsf{X},\mathsf{Y})$ that is obtained when we perform $U$ on $\mathsf{X}$ and do nothing to $\mathsf{Y}$ is therefore represented by the unitary matrix

$$U \otimes \mathbb{I}_{\mathsf{Y}}.$$

For example, if $\mathsf{X}$ and $\mathsf{Y}$ are qubits, performing a Hadamard operation on $\mathsf{X}$ (and doing nothing to $\mathsf{Y}$) is equivalent to performing the operation

$$H \otimes \mathbb{I}_{\mathsf{Y}} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix} \otimes \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} & 0\\[2mm] 0 & \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}\\[2mm] \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} & 0\\[2mm] 0 & \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{pmatrix}$$

on the joint system $(\mathsf{X},\mathsf{Y}).$

Along similar lines, we may consider that an operation represented by a unitary matrix $U$ is applied to $\mathsf{Y}$ and nothing is done to $\mathsf{X},$ in which case the resulting operation on $(\mathsf{X},\mathsf{Y})$ is represented by the unitary matrix

$$\mathbb{I}_{\mathsf{X}} \otimes U.$$

For example, if we again consider the situation in which both $\mathsf{X}$ and $\mathsf{Y}$ are qubits and $U$ is a Hadamard operation, the resulting operation on $(\mathsf{X},\mathsf{Y})$ is represented by the matrix

$$\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0\\[2mm] \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[2mm] 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}.$$
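The difference between $H \otimes \mathbb{I}_{\mathsf{Y}}$ and $\mathbb{I}_{\mathsf{X}} \otimes H$ is easy to see by applying both to $\vert 00\rangle.$ Here is a minimal plain-Python sketch; the helpers `kron` and `apply` are illustrative names.

```python
from math import isclose, sqrt

def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

r = 1 / sqrt(2)
H = [[r, r], [r, -r]]
I2 = [[1, 0], [0, 1]]

H_on_X = kron(H, I2)   # Hadamard on X, nothing on Y
H_on_Y = kron(I2, H)   # nothing on X, Hadamard on Y

ket_00 = [1, 0, 0, 0]  # entries ordered 00, 01, 10, 11

out_x = apply(H_on_X, ket_00)  # weight on |00> and |10>: only X is put in superposition
out_y = apply(H_on_Y, ket_00)  # weight on |00> and |01>: only Y is put in superposition
print(out_x)
print(out_y)
```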

Not every unitary operation on a collection of systems $\mathsf{X}_1,\ldots,\mathsf{X}_n$ can be written as a tensor product of unitary operations $U_1\otimes\cdots\otimes U_n,$ just as not every quantum state vector of these systems is a product state. For example, neither the swap operation nor the controlled-NOT operation on two qubits, which are described below, can be expressed as a tensor product of unitary operations.

The swap operation

To conclude the lesson, let's take a look at two classes of examples of unitary operations on multiple systems, beginning with the swap operation.

Suppose that $\mathsf{X}$ and $\mathsf{Y}$ are systems that share the same classical state set $\Sigma.$ The swap operation on the pair $(\mathsf{X},\mathsf{Y})$ is the operation that exchanges the contents of the two systems, but otherwise leaves the systems alone (so that $\mathsf{X}$ remains on the left and $\mathsf{Y}$ remains on the right).

We will denote this operation as $\operatorname{SWAP}.$ It operates like this for every choice of classical states $a,b\in\Sigma:$

$$\operatorname{SWAP} \vert a \rangle \vert b \rangle = \vert b \rangle \vert a \rangle.$$

One way to write the matrix associated with this operation using the Dirac notation is as follows:

$$\operatorname{SWAP} = \sum_{c,d\in\Sigma} \vert c \rangle \langle d \vert \otimes \vert d \rangle \langle c \vert.$$

It may not be immediately clear that this matrix represents $\operatorname{SWAP},$ but we can check that it satisfies the condition $\operatorname{SWAP} \vert a \rangle \vert b \rangle = \vert b \rangle \vert a \rangle$ for every choice of classical states $a,b\in\Sigma.$

As a simple example, when $\mathsf{X}$ and $\mathsf{Y}$ are qubits, we find that

$$\operatorname{SWAP} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
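The Dirac-notation formula translates directly into a construction of the matrix. The sketch below is plain Python and assumes the classical states are numbered $0,\ldots,n-1$ in the standard ordering; the function name `swap_matrix` is illustrative.

```python
def swap_matrix(n):
    """Build SWAP = sum over c, d of |c><d| (x) |d><c| for a pair of systems
    that each have n classical states, numbered 0, ..., n-1."""
    M = [[0] * (n * n) for _ in range(n * n)]
    for c in range(n):
        for d in range(n):
            # |c><d| (x) |d><c| has a single 1, at row (c, d) and column (d, c).
            M[c * n + d][d * n + c] += 1
    return M

qubit_swap = swap_matrix(2)
for row in qubit_swap:
    print(row)

# Check SWAP|a>|b> = |b>|a> for a pair of three-state systems:
# column (a, b) must have its 1 in row (b, a).
n = 3
qutrit_swap = swap_matrix(n)
swaps_ok = all(
    qutrit_swap[b * n + a][a * n + b] == 1
    for a in range(n) for b in range(n)
)
print(swaps_ok)
```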

Controlled-unitary operations

Now let us suppose that $\mathsf{Q}$ is a qubit and $\mathsf{R}$ is an arbitrary system, having whatever classical state set we wish.

For every unitary operation $U$ acting on the system $\mathsf{R},$ a controlled-$U$ operation is a unitary operation on the pair $(\mathsf{Q},\mathsf{R})$ defined as follows:

$$CU = \vert 0\rangle \langle 0\vert \otimes \mathbb{I}_{\mathsf{R}} + \vert 1\rangle \langle 1\vert \otimes U.$$

For example, if $\mathsf{R}$ is also a qubit and we think about the Pauli $X$ operation on $\mathsf{R},$ then a controlled-$X$ operation is given by

$$CX = \vert 0\rangle \langle 0\vert \otimes \mathbb{I}_{\mathsf{R}} + \vert 1\rangle \langle 1\vert \otimes X = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

We already encountered this operation in the context of classical information and probabilistic operations earlier in the lesson.

If instead we consider the Pauli $Z$ operation on $\mathsf{R}$ in place of the $X$ operation, we obtain this operation:

$$CZ = \vert 0\rangle \langle 0\vert \otimes \mathbb{I}_{\mathsf{R}} + \vert 1\rangle \langle 1\vert \otimes Z = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

If instead we take $\mathsf{R}$ to be two qubits, and we take $U$ to be the swap operation between these two qubits, we obtain this operation:

$$\operatorname{CSWAP} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.$$

This operation is also known as a Fredkin operation (or, more commonly, a Fredkin gate), named for Edward Fredkin. Its action on standard basis states can be described as follows:

$$\begin{aligned}
\operatorname{CSWAP} \vert 0 b c \rangle & = \vert 0 b c \rangle \\[1mm]
\operatorname{CSWAP} \vert 1 b c \rangle & = \vert 1 c b \rangle
\end{aligned}$$

Finally, the controlled-controlled-NOT operation, which we may denote as $CCX,$ is called a Toffoli operation (or Toffoli gate), named for Tommaso Toffoli. Its matrix representation looks like this:

$$CCX = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$

We may alternatively express it using the Dirac notation as follows:

$$CCX = \bigl( \vert 00 \rangle \langle 00 \vert + \vert 01 \rangle \langle 01 \vert + \vert 10 \rangle \langle 10 \vert \bigr) \otimes \mathbb{I} + \vert 11 \rangle \langle 11 \vert \otimes X.$$
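All of the controlled operations in this section follow from the single formula $CU = \vert 0\rangle\langle 0\vert \otimes \mathbb{I}_{\mathsf{R}} + \vert 1\rangle\langle 1\vert \otimes U.$ The sketch below, in plain Python with integer matrices, builds them that way; the function name `controlled` is illustrative, and obtaining $CCX$ by controlling $CX$ is one convenient route that agrees with the Dirac expression above.

```python
def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def controlled(U):
    """CU = |0><0| (x) I + |1><1| (x) U, with the control qubit on the left."""
    P0 = [[1, 0], [0, 0]]  # |0><0|
    P1 = [[0, 0], [0, 1]]  # |1><1|
    return add(kron(P0, identity(len(U))), kron(P1, U))

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

CX = controlled(X)
CZ = controlled(Z)
CSWAP = controlled(SWAP)  # Fredkin
CCX = controlled(CX)      # Toffoli: a control added on top of CX

for name, M in (("CX", CX), ("CZ", CZ)):
    print(name, M)
```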

Qiskit examples

In the previous lesson, we learned about Qiskit's Statevector and Operator classes, and used them to simulate quantum systems. In this section, we'll use them to explore the behavior of multiple systems. We'll start by importing these classes, as well as the square root function from NumPy.

Tensor products

The Statevector class has a tensor method which returns the tensor product of itself and another Statevector.

For example, below we create two state vectors representing $|0\rangle$ and $|1\rangle,$ and use the tensor method to create a new vector, $|0\rangle \otimes |1\rangle.$

Output:

$$|01\rangle$$

In another example below, we create state vectors representing the $|{+}\rangle$ and $\tfrac{1}{\sqrt{2}}(|0\rangle + i|1\rangle)$ states, and combine them to create a new state vector. We'll assign this new vector to the variable psi.

Output:

$$\frac{1}{2} |00\rangle+\frac{i}{2} |01\rangle+\frac{1}{2} |10\rangle+\frac{i}{2} |11\rangle$$

The Operator class also has a tensor method. In the example below, we create the $X$ and $I$ gates and display their tensor product.

Output:

Operator([[0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
          [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j],
          [1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
          [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]],
         input_dims=(2, 2), output_dims=(2, 2))

We can then treat these compound states and operations as we did single systems in the previous lesson. For example, in the cell below we calculate

$$(I\otimes X)|\psi\rangle$$

for the state psi we defined above. (The ^ operator tensors matrices together.)

Output:

$$\frac{i}{2} |00\rangle+\frac{1}{2} |01\rangle+\frac{i}{2} |10\rangle+\frac{1}{2} |11\rangle$$

Below, we create a $CX$ operator and calculate $CX \vert\psi\rangle.$

Output:

$$\frac{1}{2} |00\rangle+\frac{i}{2} |01\rangle+\frac{i}{2} |10\rangle+\frac{1}{2} |11\rangle$$
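The amplitudes reported in this section can be cross-checked without Qiskit. The following plain-Python sketch rebuilds psi as $|{+}\rangle \otimes \tfrac{1}{\sqrt{2}}(|0\rangle + i|1\rangle)$ and applies $I\otimes X$ and the $CX$ matrix from earlier in the lesson. It assumes that the entries are ordered 00, 01, 10, 11 and that the control of $CX$ is the left-hand qubit, which is what the displayed outputs imply.

```python
from math import sqrt

def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

plus = [1 / sqrt(2), 1 / sqrt(2)]       # |+>
i_state = [1 / sqrt(2), 1j / sqrt(2)]   # (|0> + i|1>) / sqrt(2)

# Tensor product of the two vectors; entries ordered 00, 01, 10, 11.
psi = [a * b for a in plus for b in i_state]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

after_ix = apply(kron(I2, X), psi)  # X acts on the right-hand qubit only
after_cx = apply(CX, psi)           # exchanges the |10> and |11> amplitudes

labels = ["00", "01", "10", "11"]
for name, vec in (("psi", psi), ("(I x X) psi", after_ix), ("CX psi", after_cx)):
    print(name, {lab: complex(round(complex(z).real, 3), round(complex(z).imag, 3))
                 for lab, z in zip(labels, vec)})
```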

Partial measurements

On the previous page, we used the measure method to simulate a measurement of the quantum state vector. This method returns two items: the simulated measurement result, and the new Statevector given this measurement.

By default, measure measures all qubits in the state vector, but we can provide a list of integers to only measure the qubits at those indices. To demonstrate, the cell below creates the state

$$W = \tfrac{1}{\sqrt{3}}(|001\rangle + |010\rangle + |100\rangle).$$

(Note that Qiskit is primarily designed for use with qubit-based quantum computers. As such, Statevector will try to interpret any vector with $2^n$ elements as a system of $n$ qubits. You can override this by passing a dims argument to the constructor. For example, dims=(4,2) would tell Qiskit the system has one four-level system, and one two-level system (qubit).)

Output:

$$\frac{\sqrt{3}}{3} |001\rangle+\frac{\sqrt{3}}{3} |010\rangle+\frac{\sqrt{3}}{3} |100\rangle$$

The cell below simulates a measurement on the rightmost qubit (which has index 0). The other two qubits are not measured.

Output:

Measured: 0
State after measurement:

$$\frac{\sqrt{2}}{2} |010\rangle+\frac{\sqrt{2}}{2} |100\rangle$$

Try running the cell a few times to see different results. Notice that measuring a 1 means that we know both the other qubits are $|0\rangle,$ but measuring a 0 means the remaining two qubits are in the state $\tfrac{1}{\sqrt{2}}(|01\rangle + |10\rangle).$
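Both possible results can also be enumerated exactly rather than sampled. The sketch below is plain Python, not Qiskit; it assumes, as stated above, that index 0 refers to the rightmost character of each label, and the helper name `measure_qubit` is illustrative.

```python
from math import isclose, sqrt

# The W state, keyed by three-character labels.
w = {"001": 1 / sqrt(3), "010": 1 / sqrt(3), "100": 1 / sqrt(3)}

def measure_qubit(state, index):
    """Outcome probabilities and post-measurement states for measuring the
    qubit at `index`, counting from the right (index 0 is the rightmost)."""
    results = {}
    for outcome in "01":
        # Keep the terms whose label shows `outcome` at the measured position.
        branch = {k: a for k, a in state.items() if k[-1 - index] == outcome}
        prob = sum(abs(a) ** 2 for a in branch.values())
        if prob > 0:
            results[outcome] = (prob, {k: a / sqrt(prob) for k, a in branch.items()})
    return results

results = measure_qubit(w, 0)
for outcome, (prob, post) in results.items():
    print("Measured:", outcome, "with probability", round(prob, 4))
    print("State after measurement:", post)
```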
