
Density matrices


Introduction

In the general formulation of quantum information, quantum states are not represented by vectors like we have in the simplified formulation, but instead are represented by a special class of matrices called density matrices.

At first glance it may seem peculiar that quantum states are represented by matrices, which more typically represent actions or operations as opposed to states. For example, unitary matrices describe quantum operations in the simplified formulation of quantum information and stochastic matrices describe probabilistic operations in the context of classical information. In contrast, although density matrices are indeed matrices, they represent states — not actions or operations to which we typically associate an intuitive meaning.

Nevertheless, the fact that density matrices can (like all matrices) be associated with linear mappings is a critically important aspect of them. For example, the eigenvalues of density matrices describe the randomness or uncertainty inherent to the states they represent.

Before we proceed to the definition of density matrices, here are a few key points that motivate their use.

  • Density matrices can represent a broader class of quantum states than quantum state vectors. This includes states that arise in practical settings, such as states of quantum systems that have been subjected to noise, as well as random choices of quantum states.

  • Density matrices allow us to describe states of isolated parts of systems, such as the state of one system that happens to be entangled with another system that we wish to ignore. This isn't easily done in the simplified formulation of quantum information.

  • Classical (probabilistic) states can also be represented by density matrices, specifically ones that are diagonal. This is important because it allows quantum and classical information to be described together within a single mathematical framework, with classical information essentially being a special case of quantum information.


Basics

We'll begin by describing what density matrices are in mathematical terms, and then we'll take a look at some examples. After that, we'll discuss a few basic aspects of how density matrices work and how they relate to quantum state vectors in the simplified formulation of quantum information.

Definition

Suppose that we have a quantum system named X,\mathsf{X}, and let Σ\Sigma be the (finite and nonempty) classical state set of this system. Here we're mirroring the naming conventions used in the Basics of quantum information course, which we'll continue to do when the opportunity arises.

In the general formulation of quantum information, a quantum state of the system X\mathsf{X} is described by a density matrix ρ\rho whose entries are complex numbers and whose indices (for both its rows and columns) have been placed in correspondence with the classical state set Σ.\Sigma. The lowercase Greek letter ρ\rho is a conventional first choice for the name of a density matrix and σ\sigma and ξ\xi are also common choices.

Here are a few examples of density matrices that describe states of qubits:

\begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\[2mm] \frac{1}{2} & \frac{1}{2} \end{pmatrix}, \quad \begin{pmatrix} \frac{3}{4} & \frac{i}{8}\\[2mm] -\frac{i}{8} & \frac{1}{4} \end{pmatrix}, \quad\text{and}\quad \begin{pmatrix} \frac{1}{2} & 0\\[2mm] 0 & \frac{1}{2} \end{pmatrix}.

To say that \rho is a density matrix means that these two conditions, which will be explained momentarily, are both satisfied:

  1. Unit trace: \operatorname{Tr}(\rho) = 1.
  2. Positive semidefiniteness: \rho \geq 0.

The first condition refers to the trace of a matrix. This is a function that is defined for all square matrices as the sum of the diagonal entries:

\operatorname{Tr} \begin{pmatrix} \alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1}\\[1.5mm] \alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1}\\[1.5mm] \vdots & \vdots & \ddots & \vdots\\[1.5mm] \alpha_{n-1,0} & \alpha_{n-1,1} & \cdots & \alpha_{n-1,n-1} \end{pmatrix} = \alpha_{0,0} + \alpha_{1,1} + \cdots + \alpha_{n-1,n-1}.

The trace is a linear function: for any two square matrices AA and BB of the same size and any two complex numbers α\alpha and β,\beta, the following equation is always true.

\operatorname{Tr}(\alpha A + \beta B) = \alpha \operatorname{Tr}(A) + \beta\operatorname{Tr}(B)

The trace is an extremely important function and there's a lot more that can be said about it, but we'll wait until the need arises to say more.

The second condition refers to the property of a matrix being positive semidefinite, which is a truly fundamental concept in quantum information theory (and in many other subjects). A matrix PP is positive semidefinite if there exists a matrix MM such that

P = M^{\dagger} M.

Here we can either demand that MM is a square matrix of the same size as PP or allow it to be non-square — we obtain the same class of matrices either way.

There are several alternative (but equivalent) ways to define this condition, including these:

  • A matrix PP is positive semidefinite if and only if PP is Hermitian (i.e., equal to its own conjugate transpose) and all of its eigenvalues are nonnegative real numbers. Checking that a matrix is Hermitian and all of its eigenvalues are nonnegative is a simple computational way to verify that it's positive semidefinite.

  • A matrix PP is positive semidefinite if and only if ψPψ0\langle \psi \vert P \vert \psi \rangle \geq 0 for every complex vector ψ\vert\psi\rangle having the same indices as P.P.

An intuitive way to think about positive semidefinite matrices is that they're like matrix analogues of nonnegative real numbers. That is, positive semidefinite matrices are to complex square matrices as nonnegative real numbers are to complex numbers. For example, a complex number α\alpha is a nonnegative real number if and only if

\alpha = \overline{\beta} \beta

for some complex number β,\beta, which matches the definition of positive semidefiniteness when we replace matrices with scalars. While matrices are more complicated objects than scalars in general, this is nevertheless a helpful way to think about positive semidefinite matrices. This also explains why the notation P0P\geq 0 is used to mean that PP is positive semidefinite. (Notice in particular that the notation P0P\geq 0 does not mean that each entry of PP is nonnegative in this context. There are positive semidefinite matrices having negative entries as well as matrices whose entries are all positive that are not positive semidefinite.)

At this point the definition of density matrices may seem rather arbitrary and abstract, as we have not yet associated any meaning with these matrices or their entries. The way density matrices work and can be interpreted will be clarified as the lesson continues, but for now it may be helpful to think about the entries of density matrices in the following (rather informal) way.

  • The diagonal entries of a density matrix give us the probabilities for each classical state to appear if we perform a standard basis measurement — so we can think about these entries as describing the "weight" associated with each classical state.

  • The off-diagonal entries of a density matrix describe the degree to which the two classical states corresponding to that entry (meaning the one corresponding to the row and the one corresponding to the column) are in quantum superposition, as well as the relative phase between them.

It is certainly not obvious a priori that quantum states should be represented by density matrices. Indeed, there is a sense in which the choice to represent quantum states by density matrices leads naturally to the entire mathematical description of quantum information. Everything else about quantum information actually follows pretty logically from this one choice!

Random examples

We'll see several examples of density matrices throughout the lesson, including ones that represent states encountered earlier in the series. To begin, let's take a look at some randomly generated examples. We'll start with some random examples of positive semidefinite matrices, and from these examples we can obtain examples of density matrices by simply normalizing, which in this context means dividing by the trace.
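The first cell simply loads what we need below. A minimal version of it, assuming NumPy together with Qiskit's array_to_latex function (and IPython's display for notebook output), could look like this.

```python
import numpy as np

from IPython.display import display
from qiskit.visualization import array_to_latex

print("Imports loaded.")
```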


Output:

Imports loaded.

The code cell that follows randomly generates a positive semidefinite matrix by first generating an n\times n matrix M whose entries have real and imaginary parts chosen independently and uniformly from the set \{-9,\ldots,9\} and then outputting the positive semidefinite matrix P = M^{\dagger} M. Through this method we'll only obtain matrices whose entries have integer real and imaginary parts that aren't too large, which will make the examples more readable. Be aware, though, that not every positive semidefinite matrix has this property. Changing the dimension n and running the cells multiple times may help to develop a sense for what positive semidefinite matrices look like.
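Here is a sketch of such a cell; the use of np.random.randint and the choice to display the result with array_to_latex are implementation assumptions, but the construction P = M†M is exactly as described above.

```python
n = 3  # the dimension; try changing this and rerunning

# M has real and imaginary parts chosen independently and uniformly from {-9,...,9}.
M = np.random.randint(-9, 10, size=(n, n)) + 1j * np.random.randint(-9, 10, size=(n, n))

# P = M^dagger M is positive semidefinite by construction.
P = M.conj().T @ M

display(array_to_latex(P))
```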


Output:

\begin{bmatrix} 201 & -76 + 31 i & 54 + 36 i \\ -76 - 31 i & 55 & -6 - 17 i \\ 54 - 36 i & -6 + 17 i & 43 \\ \end{bmatrix}

(Here we're using Qiskit's array_to_latex function to obtain a more human-readable output format for the matrix. Substituting display(P) for display(array_to_latex(P)) shows the result using the standard matrix representation in Python.)

The fact that each randomly generated positive semidefinite matrix is Hermitian can be checked by inspection. We can also compute the eigenvalues to see that they're always nonnegative real numbers.
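For instance, a cell along the following lines (using NumPy's eigvalsh, which is appropriate for Hermitian matrices) lists the eigenvalues of the matrix P generated above.

```python
# Eigenvalues of the Hermitian matrix P from the previous cell, sorted from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(P))[::-1]

display(array_to_latex(eigenvalues.reshape(1, -1)))
```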


Output:

\begin{bmatrix} 258.3556572641 & 31.4884836828 & 9.1558590531 \\ \end{bmatrix}

Random examples generated in this way are naturally limited in what they can tell us, but we can observe some features that are true in general for positive semidefinite matrices. In particular, the diagonal entries are always nonnegative real numbers, and the off-diagonal entries are never "too large" in comparison to the two corresponding diagonal entries (meaning the diagonal entries in the same row and the same column as the chosen off-diagonal entry).

As was already suggested, to generate a random density matrix we can use the same procedure to generate a random positive semidefinite matrix and then divide this matrix by its trace. The following code cell does this. (Note that the cell will throw a warning if by chance P is the all-zero matrix, which is possible but unlikely; this can only happen when M is the all-zero matrix.)
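A sketch of this cell follows; as before, the particular random generation is one possible choice rather than the only one.

```python
n = 3

M = np.random.randint(-9, 10, size=(n, n)) + 1j * np.random.randint(-9, 10, size=(n, n))
P = M.conj().T @ M

# Dividing by the trace normalizes P into a density matrix.  (If P happened to
# be the all-zero matrix, this division would produce a warning.)
rho = P / np.trace(P)

display(array_to_latex(rho))
```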


Output:

\begin{bmatrix} 0.4947735192 & 0.012195122 - 0.0139372822 i & -0.0034843206 - 0.0087108014 i \\ 0.012195122 + 0.0139372822 i & 0.2090592334 & -0.0609756098 + 0.2212543554 i \\ -0.0034843206 + 0.0087108014 i & -0.0609756098 - 0.2212543554 i & 0.2961672474 \\ \end{bmatrix}

(The array_to_latex function does its best to provide a symbolic representation of the matrix, but it's not exact and will occasionally produce unusual expressions that happen to closely approximate the actual results.)

Notice that the diagonal entries are always nonnegative and sum to 1,1, so they form a probability vector. This probability vector specifies the probabilities for obtaining each possible classical state from a standard basis measurement, as was already suggested.

We can also compute the eigenvalues of these randomly generated density matrices. Although the eigenvalues are usually different from the diagonal entries, they also form a probability vector. This is a consequence of the following basic fact from matrix theory.

Theorem. The trace of a square matrix is equal to the sum of its eigenvalues, with each eigenvalue being included in the sum a number of times equal to its multiplicity.
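A cell along these lines computes the eigenvalues of the density matrix rho from the previous cell.

```python
# The eigenvalues of a density matrix form a probability vector.
eigenvalues = np.sort(np.linalg.eigvalsh(rho))[::-1]

display(array_to_latex(eigenvalues.reshape(1, -1)))
```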


Output:

\begin{bmatrix} 0.4977478518 & 0.4840907784 & 0.0181613698 \\ \end{bmatrix}

Qiskit also includes a DensityMatrix class that includes some useful methods for working with density matrices.
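For example, we can construct the third density matrix from the start of the lesson directly and draw it in LaTeX form; this is a sketch of how such a cell might look.

```python
from qiskit.quantum_info import DensityMatrix

rho = DensityMatrix([[3 / 4, 1j / 8], [-1j / 8, 1 / 4]])

display(rho.draw("latex"))
```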


Output:

\begin{bmatrix} \frac{3}{4} & \frac{i}{8} \\ - \frac{i}{8} & \frac{1}{4} \\ \end{bmatrix}

Connection to quantum state vectors

Recall that a quantum state vector \vert\psi\rangle describing a quantum state of \mathsf{X} is a column vector having Euclidean norm equal to 1 whose entries have been placed in correspondence with the classical state set \Sigma. The density matrix representation \rho of the same state is defined as follows.

\rho = \vert\psi\rangle\langle\psi\vert

To be clear, we're multiplying a column vector by a row vector, so the result is a square matrix whose rows and columns correspond to \Sigma. Matrices of this form, in addition to being density matrices, are always projections and have rank equal to 1.

For example, let us define two qubit state vectors as follows.

\begin{aligned} \vert \!+\!i \rangle & = \frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{i}{\sqrt{2}} \vert 1 \rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] \frac{i}{\sqrt{2}} \end{pmatrix} \\[5mm] \vert \!-\!i \rangle & = \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{i}{\sqrt{2}} \vert 1 \rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] -\frac{i}{\sqrt{2}} \end{pmatrix} \end{aligned}

The density matrices corresponding to these two vectors are as follows.

\begin{aligned} \vert \!+\! i\rangle\langle + i\vert & = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] \frac{i}{\sqrt{2}}\end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & - \frac{i}{\sqrt{2}}\end{pmatrix} = \begin{pmatrix} \frac{1}{2} & -\frac{i}{2}\\[2mm] \frac{i}{2} & \frac{1}{2} \end{pmatrix}\\[5mm] \vert \!-\! i\rangle\langle -i\vert & = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] -\frac{i}{\sqrt{2}}\end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{i}{\sqrt{2}}\end{pmatrix} = \begin{pmatrix} \frac{1}{2} & \frac{i}{2}\\[2mm] -\frac{i}{2} & \frac{1}{2} \end{pmatrix} \end{aligned}

Here these examples are listed together with a few other basic examples: \vert 0\rangle, \vert 1\rangle, \vert +\rangle, and \vert -\rangle. We'll see these six states again later in the lesson.

State vector and corresponding density matrix:

\vert 0\rangle = \begin{pmatrix} 1 \\[1mm] 0 \end{pmatrix} \qquad \vert 0\rangle\langle 0\vert = \begin{pmatrix} 1 & 0\\[1mm] 0 & 0 \end{pmatrix}

\vert 1\rangle = \begin{pmatrix} 0 \\[1mm] 1 \end{pmatrix} \qquad \vert 1\rangle\langle 1\vert = \begin{pmatrix} 0 & 0\\[1mm] 0 & 1 \end{pmatrix}

\vert +\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] \frac{1}{\sqrt{2}} \end{pmatrix} \qquad \vert +\rangle\langle +\vert = \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\[2mm] \frac{1}{2} & \frac{1}{2} \end{pmatrix}

\vert -\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] -\frac{1}{\sqrt{2}}\end{pmatrix} \qquad \vert -\rangle\langle - \vert = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2}\\[2mm] -\frac{1}{2} & \frac{1}{2} \end{pmatrix}

\vert \!+\! i\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] \frac{i}{\sqrt{2}} \end{pmatrix} \qquad \vert \!+\! i \rangle\langle +i \vert = \begin{pmatrix} \frac{1}{2} & -\frac{i}{2}\\[2mm] \frac{i}{2} & \frac{1}{2} \end{pmatrix}

\vert \!-\! i\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[2mm] -\frac{i}{\sqrt{2}}\end{pmatrix} \qquad \vert \!-\! i\rangle\langle -i \vert = \begin{pmatrix} \frac{1}{2} & \frac{i}{2}\\[2mm] -\frac{i}{2} & \frac{1}{2} \end{pmatrix}

For one more example, here's a state from the Single systems lesson, including both its state vector and density matrix representations.

\vert v\rangle = \frac{1 + 2 i}{3}\,\vert 0\rangle - \frac{2}{3}\,\vert 1\rangle \qquad \vert v\rangle\langle v\vert = \begin{pmatrix} \frac{5}{9} & \frac{-2 - 4 i}{9}\\[2mm] \frac{-2 + 4 i}{9} & \frac{4}{9} \end{pmatrix}

To check these density matrix representations, we can compute by hand or ask Qiskit to perform the conversion using the .to_operator method from the Statevector class. Here we also use the .from_label method to define the first six state vectors for convenience.
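Here is a sketch of such a cell. The labels "r" and "l" are Qiskit's names for the states |+i⟩ and |−i⟩, and the conversion to a NumPy array via .data before calling array_to_latex is an implementation choice.

```python
from qiskit.quantum_info import Statevector

# The first six states, by label ("r" is |+i> and "l" is |-i>), followed by |v>.
states = [Statevector.from_label(label) for label in ["0", "1", "+", "-", "r", "l"]]
states.append(Statevector([(1 + 2j) / 3, -2 / 3]))

for state in states:
    display(array_to_latex(state.to_operator().data))
```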


Output:

\begin{bmatrix} 1 & 0 \\ 0 & 0 \\ \end{bmatrix} \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & - \frac{1}{2} \\ - \frac{1}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & - \frac{i}{2} \\ \frac{i}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & \frac{i}{2} \\ - \frac{i}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{5}{9} & - \frac{2}{9} - \frac{4 i}{9} \\ - \frac{2}{9} + \frac{4 i}{9} & \frac{4}{9} \\ \end{bmatrix}

We can also use the .from_label method of the DensityMatrix class to obtain the first six density matrices directly.
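A sketch of that cell:

```python
from qiskit.quantum_info import DensityMatrix

for label in ["0", "1", "+", "-", "r", "l"]:
    display(DensityMatrix.from_label(label).draw("latex"))
```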


Output:

\begin{bmatrix} 1 & 0 \\ 0 & 0 \\ \end{bmatrix} \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & - \frac{1}{2} \\ - \frac{1}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & - \frac{i}{2} \\ \frac{i}{2} & \frac{1}{2} \\ \end{bmatrix} \quad \begin{bmatrix} \frac{1}{2} & \frac{i}{2} \\ - \frac{i}{2} & \frac{1}{2} \\ \end{bmatrix}

Density matrices that take the form ρ=ψψ\rho = \vert \psi \rangle \langle \psi \vert for some quantum state vector ψ\vert \psi \rangle are known as pure states. Not every density matrix can be written in this form; some states are not pure.

As density matrices, pure states always have one eigenvalue equal to 11 and all other eigenvalues equal to 0.0. This is consistent with the interpretation that the eigenvalues of a density matrix describe the randomness or uncertainty inherent to that state. A way to think about this is that there's no uncertainty for a pure state ρ=ψψ\rho = \vert \psi \rangle \langle \psi \vert — the state is definitely ψ.\vert \psi \rangle.

In general, for a quantum state vector

\vert\psi\rangle = \begin{pmatrix} \alpha_0\\ \alpha_1\\ \vdots\\ \alpha_{n-1} \end{pmatrix}

for a system with nn classical states, the density matrix representation of the same state is as follows.

\vert\psi\rangle\langle\psi\vert = \begin{pmatrix} \alpha_0 \overline{\alpha_0} & \alpha_0 \overline{\alpha_1} & \cdots & \alpha_0 \overline{\alpha_{n-1}}\\[1mm] \alpha_1 \overline{\alpha_0} & \alpha_1 \overline{\alpha_1} & \cdots & \alpha_1 \overline{\alpha_{n-1}}\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \alpha_{n-1} \overline{\alpha_0} & \alpha_{n-1} \overline{\alpha_1} & \cdots & \alpha_{n-1} \overline{\alpha_{n-1}} \end{pmatrix} = \begin{pmatrix} \vert\alpha_0\vert^2 & \alpha_0 \overline{\alpha_1} & \cdots & \alpha_0 \overline{\alpha_{n-1}}\\[1mm] \alpha_1 \overline{\alpha_0} & \vert\alpha_1\vert^2 & \cdots & \alpha_1 \overline{\alpha_{n-1}}\\[1mm] \vdots & \vdots & \ddots & \vdots\\[1mm] \alpha_{n-1} \overline{\alpha_0} & \alpha_{n-1} \overline{\alpha_1} & \cdots & \vert\alpha_{n-1}\vert^2 \end{pmatrix}

Thus, for the special case of pure states, we can verify that the diagonal entries of a density matrix describe the probabilities that a standard basis measurement would output each possible classical state.

A final remark about pure states is that density matrices eliminate the degeneracy concerning global phases found for quantum state vectors. Suppose we have two quantum state vectors that differ by a global phase: ψ\vert \psi \rangle and ϕ=eiθψ,\vert \phi \rangle = e^{i \theta} \vert \psi \rangle, for some real number θ.\theta. Because they differ by a global phase, these vectors represent exactly the same quantum state, despite the fact that the vectors may be different. The density matrices that we obtain from these two state vectors, on the other hand, are identical.

\vert \phi \rangle \langle \phi \vert = \bigl( e^{i\theta} \vert \psi \rangle \bigr) \bigl( e^{i\theta} \vert \psi \rangle \bigr)^{\dagger} = e^{i(\theta - \theta)} \vert \psi \rangle \langle \psi \vert = \vert \psi \rangle \langle \psi \vert

In general, density matrices provide a unique representation of quantum states: two quantum states are identical, generating exactly the same outcome statistics for every possible measurement that can be performed on them, if and only if their density matrix representations are equal. Using mathematical parlance, we can express this by saying that density matrices offer a faithful representation of quantum states.

Convex combinations of density matrices

A key aspect of density matrices is that probabilistic selections of quantum states are represented by convex combinations of their associated density matrices.

For example, if we have two density matrices, ρ\rho and σ,\sigma, representing quantum states of a system X,\mathsf{X}, and we prepare the system in the state ρ\rho with probability p[0,1]p\in[0,1] and σ\sigma with probability 1p,1 - p, then the resulting quantum state is represented by the density matrix

p \rho + (1 - p) \sigma.

More generally, if we have mm quantum states represented by density matrices ρ0,,ρm1,\rho_0,\ldots,\rho_{m-1}, and a system is prepared in the state ρk\rho_k with probability pkp_k for some probability vector (p0,,pm1),(p_0,\ldots,p_{m-1}), the resulting state is represented by the density matrix

\sum_{k = 0}^{m-1} p_k \rho_k.

This is a convex combination of the density matrices ρ0,,ρm1.\rho_0,\ldots,\rho_{m-1}.

If we suppose that we have mm quantum state vectors ψ0,,ψm1,\vert\psi_0\rangle,\ldots,\vert\psi_{m-1}\rangle, and we prepare a system in the state ψk\vert\psi_k\rangle with probability pkp_k for each k{0,,m1},k\in\{0,\ldots,m-1\}, the state we obtain is represented by the density matrix

\sum_{k = 0}^{m-1} p_k \vert\psi_k\rangle\langle\psi_k\vert.

For example, if a qubit is prepared in the state 0\vert 0\rangle with probability 1/21/2 and in the state +\vert + \rangle with probability 1/2,1/2, the density matrix representation of the state we obtain is given by

\frac{1}{2} \vert 0\rangle\langle 0 \vert + \frac{1}{2} \vert +\rangle\langle + \vert = \frac{1}{2} \begin{pmatrix} 1 & 0\\[1mm] 0 & 0 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\[2mm] \frac{1}{2} & \frac{1}{2} \end{pmatrix} = \begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\[2mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix}.
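As a quick numerical check of this calculation, we can average the two density matrices directly; the following is a small sketch reusing the classes and functions introduced earlier.

```python
from IPython.display import display
from qiskit.quantum_info import DensityMatrix
from qiskit.visualization import array_to_latex

rho_0 = DensityMatrix.from_label("0")
rho_plus = DensityMatrix.from_label("+")

# Equal-probability mixture of |0><0| and |+><+|.
mixture = 0.5 * rho_0.data + 0.5 * rho_plus.data

display(array_to_latex(mixture))  # [[3/4, 1/4], [1/4, 1/4]]
```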

In the simplified formulation of quantum information, averaging quantum state vectors like this doesn't work. For instance, the vector

\frac{1}{2} \vert 0\rangle + \frac{1}{2} \vert + \rangle = \frac{1}{2} \begin{pmatrix}1\\[1mm] 0\end{pmatrix} + \frac{1}{2} \begin{pmatrix}\frac{1}{\sqrt{2}}\\[2mm]\frac{1}{\sqrt{2}}\end{pmatrix} = \begin{pmatrix}\frac{2 + \sqrt{2}}{4}\\[2mm]\frac{\sqrt{2}}{4}\end{pmatrix}

is not a valid quantum state vector because its Euclidean norm is not equal to 1.1.

For a more extreme example showing that this doesn't work for quantum state vectors, fix any quantum state vector \vert\psi\rangle and take the state to be \vert\psi\rangle with probability 1/2 and -\vert\psi\rangle with probability 1/2. These states differ by a global phase, so they're actually the same state, but averaging gives the zero vector, which is not a valid quantum state vector.

The completely mixed state

Suppose we set the state of a qubit to be 0\vert 0\rangle or 1\vert 1\rangle randomly, each with probability 1/2.1/2. The density matrix representing the resulting state is as follows. (In this equation the symbol I\mathbb{I} denotes the 2×22\times 2 identity matrix.)

\frac{1}{2} \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert 1\rangle\langle 1\vert = \frac{1}{2} \begin{pmatrix} 1 & 0\\[1mm] 0 & 0 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 0 & 0\\[1mm] 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 0\\[1mm] 0 & \frac{1}{2} \end{pmatrix} = \frac{1}{2} \mathbb{I}

This is a special state known as the completely mixed state. It represents complete uncertainty about the state of a qubit, similar to a uniform random bit in the probabilistic setting.

Now suppose that we change the procedure: in place of the states 0\vert 0\rangle and 1\vert 1\rangle we'll use the states +\vert + \rangle and .\vert - \rangle. We can compute the density matrix that describes the resulting state in a similar way.

\frac{1}{2} \vert +\rangle\langle +\vert + \frac{1}{2} \vert -\rangle\langle -\vert = \frac{1}{2} \begin{pmatrix} \frac{1}{2} & \frac{1}{2}\\[2mm] \frac{1}{2} & \frac{1}{2} \end{pmatrix} + \frac{1}{2} \begin{pmatrix} \frac{1}{2} & -\frac{1}{2}\\[2mm] -\frac{1}{2} & \frac{1}{2} \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 0\\[2mm] 0 & \frac{1}{2} \end{pmatrix} = \frac{1}{2} \mathbb{I}

It's the same density matrix as before, even though we changed the states. We would again obtain the same result — the completely mixed state — by substituting any two orthogonal qubit state vectors for 0\vert 0\rangle and 1.\vert 1\rangle.

This is a feature not a bug! We do in fact obtain exactly the same state either way. That is, there's no way to distinguish the two procedures by measuring the qubit they produce, even in a statistical sense — so we've simply described the same state in two different ways.

We can verify that this makes sense by thinking about what we could hope to learn given a random selection of a state from one of the two possible state sets {0,1}\{\vert 0\rangle,\vert 1\rangle\} and {+,}.\{\vert +\rangle,\vert -\rangle\}. To keep things simple, let's suppose that we perform a unitary operation UU on our qubit and then measure in the standard basis.

In the first scenario, the state of the qubit is chosen uniformly from the set {0,1}.\{\vert 0\rangle,\vert 1\rangle\}. If the state is 0,\vert 0\rangle, we obtain the outcomes 00 and 11 with probabilities

\vert \langle 0 \vert U \vert 0 \rangle \vert^2 \quad\text{and}\quad \vert \langle 1 \vert U \vert 0 \rangle \vert^2

respectively. If the state is 1,\vert 1\rangle, we obtain the outcomes 00 and 11 with probabilities

\vert \langle 0 \vert U \vert 1 \rangle \vert^2 \quad\text{and}\quad \vert \langle 1 \vert U \vert 1 \rangle \vert^2.

Because the two possibilities each happen with probability 1/2,1/2, we obtain the outcome 00 with probability

\frac{1}{2}\vert \langle 0 \vert U \vert 0 \rangle \vert^2 + \frac{1}{2}\vert \langle 0 \vert U \vert 1 \rangle \vert^2

and the outcome 11 with probability

\frac{1}{2}\vert \langle 1 \vert U \vert 0 \rangle \vert^2 + \frac{1}{2}\vert \langle 1 \vert U \vert 1 \rangle \vert^2.

Both of these expressions are equal to 1/2.1/2. One way to argue this is to use a fact from linear algebra that can be seen as a generalization of the Pythagorean theorem.

Theorem. Suppose \{\vert\psi_1\rangle,\ldots,\vert\psi_n\rangle\} is an orthonormal basis of a (real or complex) vector space \mathcal{V}. For every vector \vert \phi\rangle \in \mathcal{V} we have \vert \langle \psi_1\vert\phi\rangle\vert^2 + \cdots + \vert \langle \psi_n \vert \phi \rangle\vert^2 = \| \vert\phi\rangle \|^2.

We can apply this theorem to determine the probabilities as follows. The probability to get 00 is

\begin{aligned} \frac{1}{2}\vert \langle 0 \vert U \vert 0 \rangle \vert^2 + \frac{1}{2}\vert \langle 0 \vert U \vert 1 \rangle \vert^2 & = \frac{1}{2} \Bigl( \vert \langle 0 \vert U \vert 0 \rangle \vert^2 + \vert \langle 0 \vert U \vert 1 \rangle \vert^2 \Bigr) \\[2mm] & = \frac{1}{2} \Bigl( \vert \langle 0 \vert U^{\dagger} \vert 0 \rangle \vert^2 + \vert \langle 1 \vert U^{\dagger} \vert 0 \rangle \vert^2 \Bigr)\\[2mm] & = \frac{1}{2} \bigl\| U^{\dagger} \vert 0 \rangle \bigr\|^2 \end{aligned}

and the probability to get 11 is

\begin{aligned} \frac{1}{2}\vert \langle 1 \vert U \vert 0 \rangle \vert^2 + \frac{1}{2}\vert \langle 1 \vert U \vert 1 \rangle \vert^2 & = \frac{1}{2} \Bigl( \vert \langle 1 \vert U \vert 0 \rangle \vert^2 + \vert \langle 1 \vert U \vert 1 \rangle \vert^2 \Bigr) \\[2mm] & = \frac{1}{2} \Bigl( \vert \langle 0 \vert U^{\dagger} \vert 1 \rangle \vert^2 + \vert \langle 1 \vert U^{\dagger} \vert 1 \rangle \vert^2 \Bigr)\\[2mm] & = \frac{1}{2} \bigl\| U^{\dagger} \vert 1 \rangle \bigr\|^2. \end{aligned}

Because UU is unitary we know that UU^{\dagger} is unitary as well, implying that both U0U^{\dagger} \vert 0 \rangle and U1U^{\dagger} \vert 1 \rangle are unit vectors. Both probabilities are therefore equal to 1/2.1/2. This means that no matter how we choose U,U, we're just going to get a uniform random bit from the measurement.

We can perform a similar verification for any other pair of orthonormal states in place of \vert 0\rangle and \vert 1\rangle. For example, because \{\vert + \rangle, \vert - \rangle\} is an orthonormal basis, the probability to obtain the measurement outcome 0 in the second procedure is

\frac{1}{2}\vert \langle 0 \vert U \vert + \rangle \vert^2 + \frac{1}{2}\vert \langle 0 \vert U \vert - \rangle \vert^2 = \frac{1}{2} \bigl\| U^{\dagger} \vert 0 \rangle \bigr\|^2 = \frac{1}{2}

and the probability to get 11 is

\frac{1}{2}\vert \langle 1 \vert U \vert + \rangle \vert^2 + \frac{1}{2}\vert \langle 1 \vert U \vert - \rangle \vert^2 = \frac{1}{2} \bigl\| U^{\dagger} \vert 1 \rangle \bigr\|^2 = \frac{1}{2}.

In particular, we obtain exactly the same output statistics as we did for the states 0\vert 0\rangle and 1.\vert 1\rangle.

Probabilistic states

Classical states can be represented by density matrices. In particular, for each classical state aa of a system X,\mathsf{X}, the density matrix

\rho = \vert a\rangle \langle a \vert

represents X\mathsf{X} being definitively in the classical state a.a. For qubits we have

\vert 0\rangle \langle 0 \vert = \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix} \quad\text{and}\quad \vert 1\rangle \langle 1 \vert = \begin{pmatrix}0 & 0 \\ 0 & 1\end{pmatrix},

and in general we have a single 11 on the diagonal in the position corresponding to the classical state we have in mind, with all other entries zero.

We can then take convex combinations of these density matrices to represent probabilistic states. Supposing for simplicity that our classical state set is {0,,n1},\{0,\ldots,n-1\}, if we have that X\mathsf{X} is in the state aa with probability pap_a for each a{0,,n1},a\in\{0,\ldots,n-1\}, then the density matrix we obtain is

\rho = \sum_{a = 0}^{n-1} p_a \vert a\rangle \langle a \vert = \begin{pmatrix} p_0 & 0 & \cdots & 0\\ 0 & p_1 & \ddots & \vdots\\ \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & p_{n-1} \end{pmatrix}.

Going in the other direction, any diagonal density matrix can naturally be identified with the probabilistic state we obtain by simply reading the probability vector off from the diagonal. To be clear, when a density matrix is diagonal, it's not necessarily the case that we're talking about a classical system, or that the system must have been prepared through the random selection of a classical state, but rather that the state could have been obtained through the random selection of a classical state.

The fact that probabilistic states are represented by diagonal density matrices is consistent with the intuition suggested at the start of the lesson that off-diagonal entries describe the degree to which the two classical states corresponding to the row and column of that entry are in quantum superposition. Here all of the off-diagonal entries are zero, so we just have classical randomness and nothing is in quantum superposition.

Density matrices and the spectral theorem

We've seen that if we take a convex combination of pure states,

\rho = \sum_{k = 0}^{m-1} p_k \vert \psi_k\rangle \langle \psi_k \vert,

we obtain a density matrix. Every density matrix ρ,\rho, in fact, can be expressed as a convex combination of pure states like this. That is, there will always exist a collection of unit vectors {ψ0,,ψm1}\{\vert\psi_0\rangle,\ldots,\vert\psi_{m-1}\rangle\} and a probability vector (p0,,pm1)(p_0,\ldots,p_{m-1}) for which the equation above is true.

We can, moreover, always choose the number mm so that it agrees with the number of classical states of the system being considered, and we can select the quantum state vectors to be orthogonal. The spectral theorem allows us to conclude this. (The statement of this theorem that follows refers to a normal matrix M.M. This is a matrix that satisfies MM=MM,M^{\dagger} M = M M^{\dagger}, or in words, commutes with its own conjugate transpose.)

Theorem (spectral theorem). Let M be a normal n\times n complex matrix. There exists an orthonormal basis of n-dimensional complex vectors \{\vert\psi_0\rangle,\ldots,\vert\psi_{n-1}\rangle \} along with complex numbers \lambda_0,\ldots,\lambda_{n-1} such that M = \lambda_0 \vert \psi_0\rangle\langle \psi_0\vert + \cdots + \lambda_{n-1} \vert \psi_{n-1}\rangle\langle \psi_{n-1}\vert.

We can apply this theorem to a given density matrix ρ\rho because density matrices are Hermitian and therefore normal, which allows us to write

\rho = \lambda_0 \vert \psi_0\rangle\langle \psi_0\vert + \cdots + \lambda_{n-1} \vert \psi_{n-1}\rangle\langle \psi_{n-1}\vert

for some orthonormal basis {ψ0,,ψn1}.\{\vert\psi_0\rangle,\ldots,\vert\psi_{n-1}\rangle\}. It remains to verify that (λ0,,λn1)(\lambda_0,\ldots,\lambda_{n-1}) is a probability vector, which we can then rename to (p0,,pn1)(p_0,\ldots,p_{n-1}) if we wish.

The numbers λ0,,λn1\lambda_0,\ldots,\lambda_{n-1} are the eigenvalues of ρ,\rho, and because ρ\rho is positive semidefinite these numbers must therefore be nonnegative real numbers. We can conclude that λ0++λn1=1\lambda_0 + \cdots + \lambda_{n-1} = 1 from the fact that ρ\rho has trace equal to 1.1. Going through the details will give us an opportunity to point out an important and useful property of the trace.

Theorem (cyclic property of the trace). For any two matrices A and B for which the product AB is a square matrix, the equality \operatorname{Tr}(AB) = \operatorname{Tr}(BA) holds.

Note that this theorem works even if A and B are not themselves square matrices: we may have that A is n\times m and B is m\times n for some choice of positive integers n and m, so that AB is an n\times n square matrix and BA is m\times m. So, if we let A be a column vector \vert\phi\rangle and let B be the row vector \langle \phi\vert, then we see that

\operatorname{Tr}\bigl(\vert\phi\rangle\langle\phi\vert\bigr) = \operatorname{Tr}\bigl(\langle\phi\vert\phi\rangle\bigr) = \langle\phi\vert\phi\rangle.

The second equality follows from the fact that ϕϕ\langle\phi\vert\phi\rangle is a scalar, which we can also think of as a 1×11\times 1 matrix whose trace is its single entry. Using this fact, we can conclude that λ0++λn1=1\lambda_0 + \cdots + \lambda_{n-1} = 1 by the linearity of the trace function.

\begin{gathered} 1 = \operatorname{Tr}(\rho) = \operatorname{Tr}\bigl(\lambda_0 \vert \psi_0\rangle\langle \psi_0\vert + \cdots + \lambda_{n-1} \vert \psi_{n-1}\rangle\langle \psi_{n-1}\vert\bigr)\\[2mm] = \lambda_0 \operatorname{Tr}\bigl(\vert \psi_0\rangle\langle \psi_0\vert\bigr) + \cdots + \lambda_{n-1} \operatorname{Tr}\bigl(\vert \psi_{n-1}\rangle\langle \psi_{n-1}\vert\bigr) = \lambda_0 + \cdots + \lambda_{n-1} \end{gathered}

Alternatively, we can use a fact that was mentioned previously, which is that the trace of a square matrix is equal to the sum of its eigenvalues, to reach the same conclusion.

We have therefore concluded that any given density matrix ρ\rho can be expressed as a convex combination of pure states. We also see that we can, moreover, take the pure states to be orthogonal. This means, in particular, that we never need the number nn to be larger than the size of the classical state set of X.\mathsf{X}.

It must be understood that there will in general be many different ways to write a density matrix as a convex combination of pure states, not just the ways that the spectral theorem provides. A previous example illustrates this.

\frac{1}{2} \vert 0\rangle\langle 0 \vert + \frac{1}{2} \vert +\rangle\langle + \vert = \begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\[2mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix}

This is not a spectral decomposition of this matrix because 0\vert 0\rangle and +\vert + \rangle are not orthogonal. Here's a spectral decomposition:

\begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\[2mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix} = \cos^2(\pi/8) \vert \psi_{\pi/8} \rangle \langle \psi_{\pi/8}\vert + \sin^2(\pi/8) \vert \psi_{5\pi/8} \rangle \langle \psi_{5\pi/8}\vert,

where ψθ=cos(θ)0+sin(θ)1.\vert \psi_{\theta} \rangle = \cos(\theta)\vert 0\rangle + \sin(\theta)\vert 1\rangle. The eigenvalues are numbers we've seen before.

\cos^2(\pi/8) = \frac{2+\sqrt{2}}{4} \approx 0.85 \quad\text{and}\quad \sin^2(\pi/8) = \frac{2-\sqrt{2}}{4} \approx 0.15

The eigenvectors can be written explicitly like this.

\begin{aligned} \vert\psi_{\pi/8}\rangle & = \frac{\sqrt{2 + \sqrt{2}}}{2}\vert 0\rangle + \frac{\sqrt{2 - \sqrt{2}}}{2}\vert 1\rangle \\[3mm] \vert\psi_{5\pi/8}\rangle & = -\frac{\sqrt{2 - \sqrt{2}}}{2}\vert 0\rangle + \frac{\sqrt{2 + \sqrt{2}}}{2}\vert 1\rangle \end{aligned}
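We can verify this decomposition numerically. The sketch below uses NumPy's eigh, which returns the eigenvalues in ascending order together with orthonormal eigenvectors (possibly differing from the vectors above by a sign).

```python
import numpy as np

rho = np.array([[3 / 4, 1 / 4], [1 / 4, 1 / 4]])

# Eigenvalues come back in ascending order: sin^2(pi/8), then cos^2(pi/8).
eigenvalues, eigenvectors = np.linalg.eigh(rho)

print(eigenvalues)
print(np.sin(np.pi / 8) ** 2, np.cos(np.pi / 8) ** 2)  # the same two numbers
```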

As another, more general example, suppose ϕ0,,ϕ99\vert \phi_0\rangle,\ldots,\vert \phi_{99} \rangle are quantum state vectors representing states of a qubit, chosen arbitrarily — so we're not assuming any particular relationships among these vectors. We could then consider the state we obtain by choosing one of these 100100 states uniformly at random:

\rho = \frac{1}{100} \sum_{k = 0}^{99} \vert \phi_k\rangle\langle \phi_k \vert.

Because we're talking about a qubit, the density matrix ρ\rho is 2×2,2\times 2, so by the spectral theorem we could alternatively write

\rho = p \vert\psi_0\rangle\langle\psi_0\vert + (1 - p) \vert\psi_1\rangle\langle\psi_1\vert

for some real number p[0,1]p\in[0,1] and an orthonormal basis {ψ0,ψ1}\{\vert\psi_0\rangle,\vert\psi_1\rangle\} — but naturally the existence of this expression doesn't prohibit us from writing ρ\rho as an average of 100 pure states if we choose to do that.

Bloch sphere

There's a useful geometric way to represent pure states of qubits known as the Bloch sphere. It's very convenient, but unfortunately it only works for qubits — once we have three or more classical states in our system, the analogous representation no longer corresponds to a spherical object.

Let's start by thinking about a quantum state vector of a qubit: α0+β1.\alpha \vert 0\rangle + \beta \vert 1\rangle. We can restrict our attention to vectors for which α\alpha is a nonnegative real number because every qubit state vector is equivalent up to a global phase to one for which α0.\alpha \geq 0. This allows us to write

\vert\psi\rangle = \cos\bigl(\theta/2\bigr) \vert 0\rangle + e^{i\phi} \sin\bigl(\theta/2\bigr) \vert 1\rangle

for two real numbers θ[0,π]\theta \in [0,\pi] and ϕ[0,2π).\phi\in[0,2\pi). Here we're allowing θ\theta to range from 00 to π\pi and dividing by 22 in the expression of the vector because this is a conventional way to parameterize vectors of this sort, and it will make things simpler a bit later on.

It isn't quite the case that the numbers θ\theta and ϕ\phi are uniquely determined by a given quantum state vector α0+β1,\alpha \vert 0\rangle + \beta \vert 1\rangle, but it is nearly so. In particular, if θ=0,\theta = 0, then we have ψ=0,\vert\psi\rangle = \vert 0\rangle, and it doesn't make any difference what value ϕ\phi takes, so it can be chosen arbitrarily. Similarly, if θ=π,\theta = \pi, then we have ψ=eiϕ1,\vert\psi\rangle = e^{i\phi}\vert 1\rangle, which is equivalent up to a global phase to 1,\vert 1\rangle, so once again ϕ\phi is irrelevant. If, however, neither α\alpha nor β\beta is 0,0, then there's a unique choice for the pair (θ,ϕ)(\theta,\phi) for which ψ\vert\psi\rangle is equivalent to α0+β1\alpha\vert 0\rangle + \beta\vert 1\rangle up to a global phase.

Now let's think about the density matrix representation of this state.

\vert\psi\rangle\langle\psi\vert = \begin{pmatrix} \cos^2(\theta/2) & e^{-i\phi}\cos(\theta/2)\sin(\theta/2)\\[2mm] e^{i\phi}\cos(\theta/2)\sin(\theta/2) & \sin^2(\theta/2) \end{pmatrix}

We can use some trigonometric identities,

\begin{gathered} \cos^2(\theta/2) = \frac{1 + \cos(\theta)}{2},\\[2mm] \sin^2(\theta/2) = \frac{1 - \cos(\theta)}{2},\\[2mm] \cos(\theta/2) \sin(\theta/2) = \frac{\sin(\theta)}{2}, \end{gathered}

as well as the formula eiϕ=cos(ϕ)+isin(ϕ),e^{i\phi} = \cos(\phi) + i\sin(\phi), to simplify the density matrix as follows.

\vert\psi\rangle\langle\psi\vert = \frac{1}{2} \begin{pmatrix} 1 + \cos(\theta) & (\cos(\phi) - i \sin(\phi)) \sin(\theta)\\[1mm] (\cos(\phi) + i \sin(\phi)) \sin(\theta) & 1 - \cos(\theta) \end{pmatrix}

This makes it easy to express this density matrix as a linear combination of the Pauli matrices:

\mathbb{I} = \begin{pmatrix} 1 & 0\\[1mm] 0 & 1 \end{pmatrix}, \quad \sigma_x = \begin{pmatrix} 0 & 1\\[1mm] 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i\\[1mm] i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0\\[1mm] 0 & -1 \end{pmatrix}.

Specifically, we conclude that

\vert\psi\rangle\langle\psi\vert = \frac{\mathbb{I} + \sin(\theta) \cos(\phi)\sigma_x + \sin(\theta)\sin(\phi) \sigma_y + \cos(\theta) \sigma_z}{2}.

Next let's take a look at the three coefficients of σx,\sigma_x, σy,\sigma_y, and σz\sigma_z in the numerator of this expression. They're all real numbers and we can collect them together to form a 33-dimensional vector.

\bigl(\sin(\theta) \cos(\phi), \sin(\theta)\sin(\phi), \cos(\theta)\bigr)

This vector is written (1,θ,ϕ)(1,\theta,\phi) in spherical coordinates: the first coordinate 11 represents the radius or radial distance, θ\theta represents the polar angle, and ϕ\phi represents the azimuthal angle. In words, the polar angle θ\theta is how far we rotate south from the north pole, from 00 to π=180,\pi = 180^{\circ}, while the azimuthal angle ϕ\phi is how far we rotate east from the prime meridian, from 00 to 2π=360,2\pi = 360^{\circ}, assuming that the prime meridian is defined to be the curve on the surface of the sphere from one pole to the other that passes through the positive xx-axis.

Illustration of a point on the unit 2-sphere in terms of its spherical coordinates.

We can describe every point on the sphere in this way, which is to say that the points we obtain when we range over all possible pure states of a qubit correspond precisely to a sphere in 3 real dimensions. (This sphere is typically called the unit 2-sphere because the surface of the sphere is two-dimensional.) When we associate points on the unit 2-sphere with pure states of qubits, we obtain the Bloch sphere representation of these states.

Six important states

  1. The standard basis \{\vert 0\rangle,\vert 1\rangle\}. Let's start with the state \vert 0\rangle. As a density matrix it can be written like this.

    \vert 0 \rangle \langle 0 \vert = \frac{\mathbb{I} + \sigma_z}{2}

    By collecting the coefficients of the Pauli matrices in the numerator, we see that the corresponding point on the unit 2-sphere using Cartesian coordinates is (0,0,1). In spherical coordinates this point is (1,0,\phi) where \phi can be any angle. This is consistent with the expression

    \vert 0\rangle = \cos(0) \vert 0\rangle + e^{i \phi} \sin(0) \vert 1\rangle,

    which also works for any \phi. Intuitively speaking, the polar angle \theta is zero, so we're at the north pole of the Bloch sphere, where the azimuthal angle is irrelevant. Along similar lines, a density matrix for the state \vert 1\rangle can be written like so.

    \vert 1 \rangle \langle 1 \vert = \frac{\mathbb{I} - \sigma_z}{2}

    This time the Cartesian coordinates are (0,0,-1). In spherical coordinates this point is (1,\pi,\phi) where \phi can be any angle. Intuitively speaking, the polar angle is all the way to \pi, so we're at the south pole where the azimuthal angle is again irrelevant.

  2. The basis \{\vert + \rangle, \vert - \rangle\}. This time we have these expressions.

    \begin{aligned} \vert + \rangle\langle + \vert & = \frac{\mathbb{I} + \sigma_x}{2}\\[2mm] \vert - \rangle\langle - \vert & = \frac{\mathbb{I} - \sigma_x}{2} \end{aligned}

    The corresponding points on the unit 2-sphere have Cartesian coordinates (1,0,0) and (-1,0,0), and spherical coordinates (1,\pi/2,0) and (1,\pi/2,\pi), respectively. In words, \vert +\rangle corresponds to the point where the positive x-axis intersects the unit 2-sphere and \vert -\rangle to the point where the negative x-axis intersects it. More intuitively, \vert +\rangle is on the equator of the Bloch sphere where it meets the prime meridian, and \vert - \rangle is on the equator at the opposite side of the sphere.

  3. The basis \{\vert \!+\! i\, \rangle, \vert \!-\! i\, \rangle\}. As we saw earlier in the lesson, these two states are defined like this:

    \begin{aligned} \vert \!+\! i\, \rangle & = \frac{1}{\sqrt{2}} \vert 0 \rangle + \frac{i}{\sqrt{2}} \vert 1 \rangle\\[2mm] \vert \!-\! i\, \rangle & = \frac{1}{\sqrt{2}} \vert 0 \rangle - \frac{i}{\sqrt{2}} \vert 1 \rangle. \end{aligned}

    This time we have these expressions.

    \begin{aligned} \vert \!+\! i\, \rangle\langle + i\, \vert & = \frac{\mathbb{I} + \sigma_y}{2}\\[2mm] \vert \!-\! i\, \rangle\langle - i\, \vert & = \frac{\mathbb{I} - \sigma_y}{2} \end{aligned}

    The corresponding points on the unit 2-sphere have Cartesian coordinates (0,1,0) and (0,-1,0), and spherical coordinates (1,\pi/2,\pi/2) and (1,\pi/2,3\pi/2), respectively. In words, \vert \!+\! i\,\rangle corresponds to the point where the positive y-axis intersects the unit 2-sphere and \vert \!-\! i\,\rangle to the point where the negative y-axis intersects it.

Illustration of six examples of pure states on the Bloch sphere

Here's another class of quantum state vectors that has appeared from time to time throughout this series, including previously in this lesson.

\vert \psi_{\alpha} \rangle = \cos(\alpha) \vert 0\rangle + \sin(\alpha) \vert 1\rangle \qquad \text{(for $\alpha \in [0,\pi)$)}

The density matrix representation of each of these states is as follows.

\vert \psi_{\alpha} \rangle \langle \psi_{\alpha} \vert = \begin{pmatrix} \cos^2(\alpha) & \cos(\alpha)\sin(\alpha)\\[2mm] \cos(\alpha)\sin(\alpha) & \sin^2(\alpha) \end{pmatrix} = \frac{\mathbb{I} + \sin(2\alpha) \sigma_x + \cos(2\alpha) \sigma_z}{2}

The following figure illustrates the corresponding points on the Bloch sphere for a few choices for α.\alpha.

Illustration of real-valued qubit state vectors on the Bloch sphere

Convex combinations of points

Similar to what we already discussed for density matrices, we can take convex combinations of points on the Bloch sphere to obtain representations of qubit density matrices. In general this results in points inside of the Bloch sphere, which represent density matrices of states that are not pure. Sometimes we refer to the Bloch ball when we wish to be explicit about the inclusion of points inside of the Bloch sphere as representations of qubit density matrices.

For example, we have seen that the density matrix 12I,\frac{1}{2}\mathbb{I}, which represents the completely mixed state of a qubit, can be written in these two alternative ways:

\frac{1}{2} \mathbb{I} = \frac{1}{2} \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert 1\rangle\langle 1\vert \quad\text{and}\quad \frac{1}{2} \mathbb{I} = \frac{1}{2} \vert +\rangle\langle +\vert + \frac{1}{2} \vert -\rangle\langle -\vert.

We also have

\frac{1}{2} \mathbb{I} = \frac{1}{2} \vert \!+\! i\,\rangle\langle + i \vert + \frac{1}{2} \vert \!-\! i \rangle\langle - i\vert,

and more generally we can use any two orthogonal qubit state vectors (which will always correspond to two antipodal points on the Bloch sphere). If we average the corresponding points on the Bloch sphere in a similar way we obtain the same point, which in this case is at the center of the sphere. This is consistent with the observation that

\frac{1}{2} \mathbb{I} = \frac{\mathbb{I} + 0 \cdot \sigma_x + 0 \cdot \sigma_y + 0 \cdot \sigma_z}{2},

giving us the Cartesian coordinates (0,0,0).(0,0,0).

A different example concerning convex combinations of Bloch sphere points is the one discussed in the previous subsection.

\frac{1}{2} \vert 0\rangle\langle 0 \vert + \frac{1}{2} \vert +\rangle\langle + \vert = \begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\[2mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix} = \cos^2(\pi/8) \vert \psi_{\pi/8} \rangle \langle \psi_{\pi/8}\vert + \sin^2(\pi/8) \vert \psi_{5\pi/8} \rangle \langle \psi_{5\pi/8}\vert

The following figure illustrates these two different ways of obtaining this density matrix as a convex combination of pure states.

Illustration of the average of the zero state and the plus state on the Bloch sphere

Plotting Bloch sphere points in Qiskit

Qiskit provides two functions for plotting points in the Bloch ball: plot_bloch_vector and plot_bloch_multivector.
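A minimal version of the imports cell might look like this (the Statevector and DensityMatrix imports are included for the examples that follow).

```python
from qiskit.quantum_info import DensityMatrix, Statevector
from qiskit.visualization import plot_bloch_multivector, plot_bloch_vector

print("Imports loaded.")
```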


Output:

Imports loaded.

The function plot_bloch_vector displays a Bloch ball point using either Cartesian or spherical coordinates.
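For example, the following sketch plots the point for the state \vert 0\rangle twice, once with Cartesian coordinates and once with spherical coordinates; the coord_type keyword argument selects between the two conventions.

```python
# The Bloch sphere point for |0>, given as Cartesian coordinates (x, y, z).
display(plot_bloch_vector([0, 0, 1]))

# The same point, given as spherical coordinates (r, theta, phi).
display(plot_bloch_vector([1, 0, 0], coord_type="spherical"))
```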


Output:

The center of the ball is indicated by the lack of an arrow (or an arrow of length zero).
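For instance, plotting the zero vector corresponds to the completely mixed state at the center of the Bloch ball.

```python
display(plot_bloch_vector([0, 0, 0]))
```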


Output:

The plot_bloch_multivector function takes a Statevector or DensityMatrix as input and outputs a Bloch sphere illustration (for each qubit in isolation).
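A sketch of such a cell, using the \vert +\rangle state as an example:

```python
display(plot_bloch_multivector(Statevector.from_label("+")))
```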


Output:

This can equivalently be done through the draw method for Statevector and DensityMatrix objects.
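For example (here the "bloch" output option is assumed to be available for both classes):

```python
display(Statevector.from_label("r").draw("bloch"))
display(DensityMatrix.from_label("-").draw("bloch"))
```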


Output:

Multiple systems and reduced states

Now we'll turn our attention to how density matrices work for multiple systems, including examples of different types of correlations they can express and how they can be used to describe the states of isolated parts of compound systems.

Multiple systems

Density matrices can represent states of multiple systems in an analogous way to state vectors in the simplified formulation of quantum information, following the same basic idea that multiple systems can be viewed as if they're single, compound systems. In mathematical terms, the rows and columns of density matrices representing states of multiple systems are placed in correspondence with the Cartesian product of the classical state sets of the individual systems.

For example, recall the state vector representations of the four Bell states.

\begin{aligned} \vert \phi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 00 \rangle + \frac{1}{\sqrt{2}} \vert 11 \rangle \\[2mm] \vert \phi^- \rangle & = \frac{1}{\sqrt{2}} \vert 00 \rangle - \frac{1}{\sqrt{2}} \vert 11 \rangle \\[2mm] \vert \psi^+ \rangle & = \frac{1}{\sqrt{2}} \vert 01 \rangle + \frac{1}{\sqrt{2}} \vert 10 \rangle \\[2mm] \vert \psi^- \rangle & = \frac{1}{\sqrt{2}} \vert 01 \rangle - \frac{1}{\sqrt{2}} \vert 10 \rangle \end{aligned}

The density matrix representations of these states are as follows.

\vert \phi^+ \rangle \langle \phi^+ \vert = \begin{pmatrix} \frac{1}{2} & 0 & 0 & \frac{1}{2}\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] \frac{1}{2} & 0 & 0 & \frac{1}{2} \end{pmatrix} \qquad \vert \phi^- \rangle \langle \phi^- \vert = \begin{pmatrix} \frac{1}{2} & 0 & 0 & -\frac{1}{2}\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] -\frac{1}{2} & 0 & 0 & \frac{1}{2} \end{pmatrix}

\vert \psi^+ \rangle \langle \psi^+ \vert = \begin{pmatrix} 0 & 0 & 0 & 0\\[2mm] 0 & \frac{1}{2} & \frac{1}{2} & 0\\[2mm] 0 & \frac{1}{2} & \frac{1}{2} & 0\\[2mm] 0 & 0 & 0 & 0 \end{pmatrix} \qquad \vert \psi^- \rangle \langle \psi^- \vert = \begin{pmatrix} 0 & 0 & 0 & 0\\[2mm] 0 & \frac{1}{2} & -\frac{1}{2} & 0\\[2mm] 0 & -\frac{1}{2} & \frac{1}{2} & 0\\[2mm] 0 & 0 & 0 & 0 \end{pmatrix}

Product states

Similar to what we had for state vectors, tensor products of density matrices represent independence between the states of multiple systems. For instance, if X\mathsf{X} is prepared in the state represented by the density matrix ρ\rho and Y\mathsf{Y} is independently prepared in the state represented by σ,\sigma, then the density matrix describing the state of (X,Y)(\mathsf{X},\mathsf{Y}) is the tensor product ρσ.\rho\otimes\sigma.

The same terminology is used here as in the simplified formulation of quantum information: states of this form are referred to as product states.

Correlated and entangled states

States that cannot be expressed as product states represent correlations between systems. There are, in fact, different types of correlations that can be represented by density matrices. Here are a few examples.

  1. Correlated classical states. For example, we can express the situation in which Alice and Bob share a random bit like this:

    \frac{1}{2} \vert 0 \rangle \langle 0 \vert \otimes \vert 0 \rangle \langle 0 \vert + \frac{1}{2} \vert 1 \rangle \langle 1 \vert \otimes \vert 1 \rangle \langle 1 \vert = \begin{pmatrix} \frac{1}{2} & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & \frac{1}{2} \end{pmatrix}
  2. Ensembles of quantum states. Suppose we have mm density matrices ρ0,,ρm1,\rho_0,\ldots,\rho_{m-1}, all representing states of a system X,\mathsf{X}, and we randomly choose one of these states according to a probability vector (p0,,pm1).(p_0,\ldots,p_{m-1}). Such a process is represented by an ensemble of states, which includes the specification of the density matrices ρ0,,ρm1\rho_0,\ldots,\rho_{m-1} as well as the probabilities (p0,,pm1).(p_0,\ldots,p_{m-1}). We can associate an ensemble of states with a single density matrix, describing both the random choice of kk and the corresponding density matrix ρk,\rho_k, like this:

    \sum_{k = 0}^{m-1} p_k \vert k\rangle \langle k \vert \otimes \rho_k

    To be clear, this is the state of a pair (Y,X)(\mathsf{Y},\mathsf{X}) where Y\mathsf{Y} represents the classical selection of kk — so we're assuming its classical state set is {0,,m1}.\{0,\ldots,m-1\}. States of this form are sometimes called classical-quantum states.

  3. Separable states. We can imagine situations in which we have a classical correlation among the quantum states of two systems like this.

    k=0m1pkρkσk\sum_{k = 0}^{m-1} p_k \rho_k \otimes \sigma_k

    In words, for each kk from 00 to m1,m-1, we have that with probability pkp_k the system on the left is in the state ρk\rho_k and the system on the right is in the state σk.\sigma_k. States like this are called separable states. This concept can also be extended to more than two systems.

  4. Entangled states. Not all states of pairs of systems are separable. In the general formulation of quantum information this is how entanglement is defined: states that are not separable are said to be entangled. This terminology is consistent with the terminology we used in Basics of quantum information. There we said that quantum state vectors that are not product states represent entangled states — and indeed, for any quantum state vector ψ\vert\psi\rangle that is not a product state, we find that the state represented by the density matrix ψψ\vert\psi\rangle\langle\psi\vert is not separable. Entanglement is much more complicated than this for states that are not pure.
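Here is a small sketch (ours) that builds the shared-random-bit state from the first item above directly from its definition and checks that it is a valid density matrix.

```python
import numpy as np
from qiskit.quantum_info import DensityMatrix

P0 = np.array([[1, 0], [0, 0]])   # |0><0|
P1 = np.array([[0, 0], [0, 1]])   # |1><1|

# 1/2 |0><0| ⊗ |0><0| + 1/2 |1><1| ⊗ |1><1|
shared_bit = 0.5 * np.kron(P0, P0) + 0.5 * np.kron(P1, P1)
print(shared_bit)

# Positive semidefinite with unit trace, so a valid density matrix
print(DensityMatrix(shared_bit).is_valid())   # True
```

The same pattern, with different projectors and probabilities, produces any classical-quantum or separable state of the forms shown in items 2 and 3.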

Reduced states and the partial trace

There's a simple but important thing we can do with density matrices in the context of multiple systems, which is to describe the states we obtain by ignoring some of the systems. When multiple systems are in a quantum state, and we discard or choose to ignore one or more of the systems, the state of the remaining systems is called the reduced state of those systems. Density matrix descriptions of reduced states are easily obtained through a mapping, known as the partial trace, from the density matrix describing the state of the whole.

Example: reduced states for an e-bit

Suppose that we have a pair of qubits (A,B)(\mathsf{A},\mathsf{B}) that are together in the state

ϕ+=1200+1211.\vert\phi^+\rangle = \frac{1}{\sqrt{2}} \vert 00 \rangle + \frac{1}{\sqrt{2}} \vert 11 \rangle.

We can imagine that Alice holds the qubit A\mathsf{A} and Bob holds B,\mathsf{B}, which is to say that together they share an e-bit. We'd like to have a density matrix description of Alice's qubit A\mathsf{A} in isolation, as if Bob decided to take his qubit and visit the stars, never to be seen again.

First let's think about what would happen if Bob decided somewhere on his journey to measure his qubit with respect to a standard basis measurement. If he did this, he would obtain the outcome 00 with probability

(IA0)ϕ+2=1202=12,\bigl\| \bigl( \mathbb{I}_{\mathsf{A}} \otimes \langle 0\vert \bigr) \vert \phi^+ \rangle \bigr\|^2 = \Bigl\| \frac{1}{\sqrt{2}} \vert 0 \rangle \Bigr\|^2 = \frac{1}{2},

in which case the state of Alice's qubit becomes 0;\vert 0\rangle; and he would obtain the outcome 11 with probability

(IA1)ϕ+2=1212=12,\bigl\| \bigl( \mathbb{I}_{\mathsf{A}} \otimes \langle 1\vert \bigr) \vert \phi^+ \rangle \bigr\|^2 = \Bigl\| \frac{1}{\sqrt{2}} \vert 1 \rangle \Bigr\|^2 = \frac{1}{2},

in which case the state of Alice's qubit becomes 1.\vert 1\rangle.

So, if we ignore Bob's measurement outcome and focus on Alice's qubit, we conclude that she obtains the state 0\vert 0\rangle with probability 1/21/2 and the state 1\vert 1\rangle with probability 1/2.1/2. This leads us to describe the state of Alice's qubit in isolation by the density matrix

1200+1211=12IA.\frac{1}{2} \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert 1\rangle\langle 1\vert = \frac{1}{2} \mathbb{I}_{\mathsf{A}}.

That is, Alice's qubit is in the completely mixed state. To be clear, this description of the state of Alice's qubit doesn't include Bob's measurement outcome; we're ignoring Bob altogether.

Now, it might seem like the density matrix description of Alice's qubit in isolation that we've just obtained relies on the assumption that Bob has measured his qubit, but this is not actually so. What we've done is to use the possibility that Bob measures his qubit to argue that the completely mixed state arises as the state of Alice's qubit, based on what we've already learned. Of course, nothing says that Bob must measure his qubit, but nothing says that he doesn't. And if he's light years away, then nothing he does or doesn't do can possibly influence the state of Alice's qubit viewed in isolation. That is to say, the description we've obtained for the state of Alice's qubit is the only description consistent with the impossibility of faster-than-light communication.

We can also consider the state of Bob's qubit B,\mathsf{B}, which happens to be the completely mixed state as well. Indeed, for all four Bell states we find that the reduced state of both Alice's qubit and Bob's qubit is the completely mixed state.
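Numerically (a sketch of ours), tracing out Bob's qubit from the e-bit with Qiskit's partial_trace indeed yields the completely mixed state; in Qiskit's right-to-left numbering, Bob's qubit B, the right-hand tensor factor, is qubit 0.

```python
import numpy as np
from qiskit.quantum_info import Statevector, DensityMatrix, partial_trace

phi_plus = Statevector(np.array([1, 0, 0, 1]) / np.sqrt(2))
rho_AB = DensityMatrix(phi_plus)

# Trace out Bob's qubit B (qubit 0 in Qiskit's right-to-left ordering)
rho_A = partial_trace(rho_AB, [0])
print(np.round(rho_A.data.real, 3))   # the completely mixed state I/2
```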

Reduced states for a general quantum state vector

Now let's generalize the example just discussed to two arbitrary systems A\mathsf{A} and B,\mathsf{B}, not necessarily qubits in the state ϕ+.\vert \phi^+\rangle. We'll assume the classical state sets of A\mathsf{A} and B\mathsf{B} are Σ\Sigma and Γ,\Gamma, respectively. A density matrix ρ\rho representing a state of the combined system (A,B)(\mathsf{A},\mathsf{B}) therefore has row and column indices corresponding to the Cartesian product Σ×Γ.\Sigma\times\Gamma.

Suppose that the state of (A,B)(\mathsf{A},\mathsf{B}) is described by the quantum state vector ψ,\vert\psi\rangle, so the density matrix describing this state is ρ=ψψ.\rho = \vert\psi\rangle\langle\psi\vert. We'll obtain a density matrix description of the state of A\mathsf{A} in isolation, which is conventionally denoted ρA.\rho_{\mathsf{A}}. (A superscript is also sometimes used rather than a subscript.)

The state vector ψ\vert\psi\rangle can be expressed in the form

ψ=bΓϕbb\vert\psi\rangle = \sum_{b\in\Gamma} \vert\phi_b\rangle \otimes \vert b\rangle

for a uniquely determined collection of vectors {ϕb:bΓ}.\{\vert\phi_b\rangle : b\in\Gamma\}. In particular, these vectors can be determined through a simple formula.

ϕb=(IAb)ψ\vert\phi_b\rangle = \bigl(\mathbb{I}_{\mathsf{A}} \otimes \langle b\vert\bigr)\vert\psi\rangle

Reasoning similarly to the e-bit example above, if we were to measure the system B\mathsf{B} with a standard basis measurement, we would obtain each outcome bΓb\in\Gamma with probability ϕb2,\|\vert\phi_b\rangle\|^2, in which case (provided this probability is nonzero) the state of A\mathsf{A} becomes

ϕbϕb.\frac{\vert \phi_b \rangle}{\|\vert\phi_b\rangle\|}.

As a density matrix, this state can be written as follows.

(ϕbϕb)(ϕbϕb)=ϕbϕbϕb2\biggl(\frac{\vert \phi_b \rangle}{\|\vert\phi_b\rangle\|}\biggr) \biggl(\frac{\vert \phi_b \rangle}{\|\vert\phi_b\rangle\|}\biggr)^{\dagger} = \frac{\vert \phi_b \rangle\langle\phi_b\vert}{\|\vert\phi_b\rangle\|^2}

Averaging the different states according to the probabilities of the respective outcomes, we arrive at the density matrix

ρA=bΓϕb2ϕbϕbϕb2=bΓϕbϕb=bΓ(IAb)ψψ(IAb)\rho_{\mathsf{A}} = \sum_{b\in\Gamma} \|\vert\phi_b\rangle\|^2 \frac{\vert \phi_b \rangle\langle\phi_b\vert}{\|\vert\phi_b\rangle\|^2} = \sum_{b\in\Gamma} \vert \phi_b \rangle\langle\phi_b\vert = \sum_{b\in\Gamma} \bigl(\mathbb{I}_{\mathsf{A}} \otimes \langle b\vert\bigr) \vert\psi\rangle\langle\psi\vert \bigl(\mathbb{I}_{\mathsf{A}} \otimes \vert b\rangle\bigr)

The partial trace

The formula

ρA=bΓ(IAb)ψψ(IAb)\rho_{\mathsf{A}} = \sum_{b\in\Gamma} \bigl(\mathbb{I}_{\mathsf{A}} \otimes \langle b\vert\bigr) \vert\psi\rangle\langle\psi\vert \bigl(\mathbb{I}_{\mathsf{A}} \otimes \vert b\rangle\bigr)

leads us to the description of the reduced state of A\mathsf{A} for any density matrix ρ\rho of the pair (A,B),(\mathsf{A},\mathsf{B}), not just a pure state.

ρA=bΓ(IAb)ρ(IAb)\rho_{\mathsf{A}} = \sum_{b\in\Gamma} \bigl( \mathbb{I}_{\mathsf{A}} \otimes \langle b \vert\bigr) \rho \bigl( \mathbb{I}_{\mathsf{A}} \otimes \vert b \rangle\bigr)

This formula gives the correct reduced state for an arbitrary density matrix ρ\rho, not only for a pure state: this follows by linearity together with the fact that every density matrix can be written as a convex combination of pure states.

The operation being performed on ρ\rho to obtain ρA\rho_{\mathsf{A}} in this equation is known as the partial trace, and to be more precise we say that the partial trace is performed on B,\mathsf{B}, or that B\mathsf{B} is traced out. This operation is denoted TrB,\operatorname{Tr}_{\mathsf{B}}, so we can write

TrB(ρ)=bΓ(IAb)ρ(IAb).\operatorname{Tr}_{\mathsf{B}} (\rho) = \sum_{b\in\Gamma} \bigl( \mathbb{I}_{\mathsf{A}} \otimes \langle b \vert\bigr) \rho \bigl( \mathbb{I}_{\mathsf{A}} \otimes \vert b \rangle\bigr).

We can also define the partial trace on A,\mathsf{A}, so it's the system A\mathsf{A} that gets traced out rather than B,\mathsf{B}, like this.

TrA(ρ)=aΣ(aIB)ρ(aIB)\operatorname{Tr}_{\mathsf{A}} (\rho) = \sum_{a\in\Sigma} \bigl(\langle a \vert\otimes\mathbb{I}_{\mathsf{B}}\bigr) \rho \bigl(\vert a \rangle\otimes\mathbb{I}_{\mathsf{B}}\bigr)

This gives us the density matrix description ρB\rho_{\mathsf{B}} of the state of B\mathsf{B} in isolation rather than A.\mathsf{A}.

To recapitulate, if (A,B)(\mathsf{A},\mathsf{B}) is any pair of systems and we have a density matrix ρ\rho describing a state of (A,B),(\mathsf{A},\mathsf{B}), the reduced states of the systems A\mathsf{A} and B\mathsf{B} are as follows.

ρA=TrB(ρ)=bΓ(IAb)ρ(IAb)ρB=TrA(ρ)=aΣ(aIB)ρ(aIB)\begin{aligned} \rho_{\mathsf{A}} & = \operatorname{Tr}_{\mathsf{B}}(\rho) = \sum_{b\in\Gamma} \bigl( \mathbb{I}_{\mathsf{A}} \otimes \langle b \vert\bigr) \rho \bigl( \mathbb{I}_{\mathsf{A}} \otimes \vert b \rangle\bigr)\\[2mm] \rho_{\mathsf{B}} & = \operatorname{Tr}_{\mathsf{A}}(\rho) = \sum_{a\in\Sigma} \bigl( \langle a \vert \otimes \mathbb{I}_{\mathsf{B}}\bigr) \rho \bigl( \vert a \rangle\otimes \mathbb{I}_{\mathsf{B}} \bigr) \end{aligned}

If ρ\rho is a density matrix, then ρA\rho_{\mathsf{A}} and ρB\rho_{\mathsf{B}} will also necessarily be density matrices.
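As a sanity check (our own sketch, not part of the lesson), the formula for the partial trace on B\mathsf{B} can be implemented directly as a sum over standard basis vectors and compared against Qiskit's partial_trace for a random two-qubit density matrix.

```python
import numpy as np
from qiskit.quantum_info import partial_trace, random_density_matrix

# A random state of a pair of qubits (A, B); dims (2, 2) marks the two subsystems
rho = random_density_matrix((2, 2), seed=7)

# Tr_B(rho) = sum_b (I_A ⊗ <b|) rho (I_A ⊗ |b>)
I_A = np.eye(2)
rho_A = np.zeros((2, 2), dtype=complex)
for b in range(2):
    ket_b = np.zeros((2, 1), dtype=complex)
    ket_b[b, 0] = 1
    rho_A += np.kron(I_A, ket_b.conj().T) @ rho.data @ np.kron(I_A, ket_b)

# Compare with Qiskit's partial_trace; qubit 0 is the right-hand system B
print(np.allclose(rho_A, partial_trace(rho, [0]).data))   # True
```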

Generalization to three or more systems

These notions can be generalized to any number of systems in place of two in a natural way. In general, we can put the names of whatever systems we choose in the subscript of a density matrix ρ\rho to describe the reduced state of just those systems. For example, if A,\mathsf{A}, B,\mathsf{B}, and C\mathsf{C} are systems and ρ\rho is a density matrix describing a state of (A,B,C),(\mathsf{A},\mathsf{B},\mathsf{C}), then we can define

ρAC=TrB(ρ)=bΓ(IAbIC)ρ(IAbIC)ρC=TrAB(ρ)=aΣbΓ(abIC)ρ(abIC)\begin{aligned} \rho_{\mathsf{AC}} & = \operatorname{Tr}_{\mathsf{B}}(\rho) = \sum_{b\in\Gamma} \bigl( \mathbb{I}_{\mathsf{A}} \otimes \langle b \vert \otimes \mathbb{I}_{\mathsf{C}} \bigr) \rho \bigl( \mathbb{I}_{\mathsf{A}} \otimes \vert b \rangle \otimes \mathbb{I}_{\mathsf{C}} \bigr) \\[2mm] \rho_{\mathsf{C}} & = \operatorname{Tr}_{\mathsf{AB}}(\rho) = \sum_{a\in\Sigma} \sum_{b\in\Gamma} \bigl( \langle a \vert \otimes \langle b \vert \otimes \mathbb{I}_{\mathsf{C}} \bigr) \rho \bigl( \vert a \rangle \otimes \vert b \rangle \otimes \mathbb{I}_{\mathsf{C}} \bigr) \end{aligned}

and similarly for other choices for the systems.
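In Qiskit, tracing out several systems at once just means passing more than one index to partial_trace; here is a minimal sketch for three qubits, with the leftmost system A\mathsf{A} as qubit 2, B\mathsf{B} as qubit 1, and C\mathsf{C} as qubit 0 in Qiskit's ordering.

```python
from qiskit.quantum_info import partial_trace, random_density_matrix

# A random state of three qubits (A, B, C)
rho = random_density_matrix((2, 2, 2), seed=11)

rho_AC = partial_trace(rho, [1])      # trace out B
rho_C = partial_trace(rho, [1, 2])    # trace out A and B
print(rho_AC.dims(), rho_C.dims())    # (2, 2) (2,)
```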

Alternative description of the partial trace

An alternative way to describe the partial trace mappings TrA\operatorname{Tr}_{\mathsf{A}} and TrB\operatorname{Tr}_{\mathsf{B}} is that they are the unique linear mappings that satisfy the formulas

TrA(MN)=Tr(M)NTrB(MN)=Tr(N)M.\begin{aligned} \operatorname{Tr}_{\mathsf{A}}(M \otimes N) & = \operatorname{Tr}(M) N \\[2mm] \operatorname{Tr}_{\mathsf{B}}(M \otimes N) & = \operatorname{Tr}(N) M. \end{aligned}

In these formulas, NN and MM are square matrices of the appropriate sizes: the rows and columns of MM correspond to the classical states of A\mathsf{A} and the rows and columns of NN correspond to the classical states of B.\mathsf{B}.
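This defining property is easy to verify numerically. Here is a quick NumPy sketch (ours) for matrices of different sizes, implementing the partial traces by reshaping the Kronecker product into a four-index tensor and contracting the appropriate pair of indices.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))   # indexed by states of A
N = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))   # indexed by states of B

# View M ⊗ N as a four-index tensor T[a, b, a', b'] and contract the right pair
T = np.kron(M, N).reshape(2, 3, 2, 3)
tr_B = np.einsum('abcb->ac', T)   # Tr_B(M ⊗ N)
tr_A = np.einsum('abad->bd', T)   # Tr_A(M ⊗ N)

print(np.allclose(tr_B, np.trace(N) * M))   # True
print(np.allclose(tr_A, np.trace(M) * N))   # True
```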

This characterization of the partial trace is not only fundamental from a mathematical viewpoint, but can also allow for quick calculations in some situations. For example, consider this state of a pair of qubits (A,B).(\mathsf{A},\mathsf{B}).

ρ=120000+1211++\rho = \frac{1}{2} \vert 0\rangle\langle 0\vert \otimes \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert 1\rangle\langle 1\vert \otimes \vert +\rangle\langle +\vert

To compute the reduced state ρA\rho_{\mathsf{A}} for instance, we can use linearity together with the fact that 00\vert 0\rangle\langle 0\vert and ++\vert +\rangle\langle +\vert have unit trace.

ρA=TrB(ρ)=12Tr(00)00+12Tr(++)11=1200+1211\rho_{\mathsf{A}} = \operatorname{Tr}_{\mathsf{B}}(\rho) = \frac{1}{2} \operatorname{Tr}\bigl(\vert 0\rangle\langle 0\vert\bigr)\, \vert 0\rangle\langle 0\vert + \frac{1}{2} \operatorname{Tr}\bigl(\vert +\rangle\langle +\vert\bigr) \vert 1\rangle\langle 1\vert = \frac{1}{2} \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert 1\rangle\langle 1\vert

The reduced state ρB\rho_{\mathsf{B}} can be computed similarly.

ρB=TrA(ρ)=12Tr(00)00+12Tr(11)++=1200+12++\rho_{\mathsf{B}} = \operatorname{Tr}_{\mathsf{A}}(\rho) = \frac{1}{2} \operatorname{Tr}\bigl(\vert 0\rangle\langle 0\vert\bigr)\, \vert 0\rangle\langle 0\vert + \frac{1}{2} \operatorname{Tr}\bigl(\vert 1\rangle\langle 1\vert\bigr) \vert +\rangle\langle +\vert = \frac{1}{2} \vert 0\rangle\langle 0\vert + \frac{1}{2} \vert +\rangle\langle +\vert

The partial trace for two qubits

The partial trace can also be described explicitly in terms of matrices. Here we'll do this just for two qubits, but this can also be generalized to larger systems. Assume that we have two qubits (A,B),(\mathsf{A},\mathsf{B}), so that any density matrix describing a state of these two qubits can be written as

ρ=(α00α01α02α03α10α11α12α13α20α21α22α23α30α31α32α33)\rho = \begin{pmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} & \alpha_{03}\\[2mm] \alpha_{10} & \alpha_{11} & \alpha_{12} & \alpha_{13}\\[2mm] \alpha_{20} & \alpha_{21} & \alpha_{22} & \alpha_{23}\\[2mm] \alpha_{30} & \alpha_{31} & \alpha_{32} & \alpha_{33} \end{pmatrix}

for some choice of complex numbers {αjk:0j,k3}.\{\alpha_{jk} : 0\leq j,k\leq 3\}.

The partial trace over the first system has the following formula.

TrA(α00α01α02α03α10α11α12α13α20α21α22α23α30α31α32α33)=(α00α01α10α11)+(α22α23α32α33)=(α00+α22α01+α23α10+α32α11+α33)\operatorname{Tr}_{\mathsf{A}} \begin{pmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} & \alpha_{03}\\[2mm] \alpha_{10} & \alpha_{11} & \alpha_{12} & \alpha_{13}\\[2mm] \alpha_{20} & \alpha_{21} & \alpha_{22} & \alpha_{23}\\[2mm] \alpha_{30} & \alpha_{31} & \alpha_{32} & \alpha_{33} \end{pmatrix} = \begin{pmatrix} \alpha_{00} & \alpha_{01} \\[2mm] \alpha_{10} & \alpha_{11} \end{pmatrix} + \begin{pmatrix} \alpha_{22} & \alpha_{23}\\[2mm] \alpha_{32} & \alpha_{33} \end{pmatrix} = \begin{pmatrix} \alpha_{00} + \alpha_{22} & \alpha_{01} + \alpha_{23}\\[2mm] \alpha_{10} + \alpha_{32} & \alpha_{11} + \alpha_{33} \end{pmatrix}

One way to think about this formula begins by viewing 4×44\times 4 matrices as 2×22\times 2 block matrices, where each block is 2×2.2\times 2. That is,

ρ=(M0,0M0,1M1,0M1,1)\rho = \begin{pmatrix} M_{0,0} & M_{0,1} \\[1mm] M_{1,0} & M_{1,1} \end{pmatrix}

for

M0,0=(α00α01α10α11),M0,1=(α02α03α12α13),M1,0=(α20α21α30α31),M1,1=(α22α23α32α33).M_{0,0} = \begin{pmatrix} \alpha_{00} & \alpha_{01} \\[2mm] \alpha_{10} & \alpha_{11} \end{pmatrix}, \quad M_{0,1} = \begin{pmatrix} \alpha_{02} & \alpha_{03} \\[2mm] \alpha_{12} & \alpha_{13} \end{pmatrix}, \quad M_{1,0} = \begin{pmatrix} \alpha_{20} & \alpha_{21} \\[2mm] \alpha_{30} & \alpha_{31} \end{pmatrix}, \quad M_{1,1} = \begin{pmatrix} \alpha_{22} & \alpha_{23} \\[2mm] \alpha_{32} & \alpha_{33} \end{pmatrix}.

We then have

TrA(M0,0M0,1M1,0M1,1)=M0,0+M1,1.\operatorname{Tr}_{\mathsf{A}}\begin{pmatrix} M_{0,0} & M_{0,1} \\[1mm] M_{1,0} & M_{1,1} \end{pmatrix} = M_{0,0} + M_{1,1}.

Here is the formula when the second system is traced out rather than the first.

TrB(α00α01α02α03α10α11α12α13α20α21α22α23α30α31α32α33)=(Tr(α00α01α10α11)Tr(α02α03α12α13)Tr(α20α21α30α31)Tr(α22α23α32α33))=(α00+α11α02+α13α20+α31α22+α33)\operatorname{Tr}_{\mathsf{B}} \begin{pmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} & \alpha_{03}\\[2mm] \alpha_{10} & \alpha_{11} & \alpha_{12} & \alpha_{13}\\[2mm] \alpha_{20} & \alpha_{21} & \alpha_{22} & \alpha_{23}\\[2mm] \alpha_{30} & \alpha_{31} & \alpha_{32} & \alpha_{33} \end{pmatrix} = \begin{pmatrix} \operatorname{Tr} \begin{pmatrix} \alpha_{00} & \alpha_{01}\\[1mm] \alpha_{10} & \alpha_{11} \end{pmatrix} & \operatorname{Tr} \begin{pmatrix} \alpha_{02} & \alpha_{03}\\[1mm] \alpha_{12} & \alpha_{13} \end{pmatrix} \\[4mm] \operatorname{Tr} \begin{pmatrix} \alpha_{20} & \alpha_{21}\\[1mm] \alpha_{30} & \alpha_{31} \end{pmatrix} & \operatorname{Tr} \begin{pmatrix} \alpha_{22} & \alpha_{23}\\[1mm] \alpha_{32} & \alpha_{33} \end{pmatrix} \end{pmatrix} = \begin{pmatrix} \alpha_{00} + \alpha_{11} & \alpha_{02} + \alpha_{13}\\[2mm] \alpha_{20} + \alpha_{31} & \alpha_{22} + \alpha_{33} \end{pmatrix}

In terms of block matrices of a form similar to before, we have this formula.

TrB(M0,0M0,1M1,0M1,1)=(Tr(M0,0)Tr(M0,1)Tr(M1,0)Tr(M1,1))\operatorname{Tr}_{\mathsf{B}} \begin{pmatrix} M_{0,0} & M_{0,1} \\[1mm] M_{1,0} & M_{1,1} \end{pmatrix} = \begin{pmatrix} \operatorname{Tr}(M_{0,0}) & \operatorname{Tr}(M_{0,1}) \\[1mm] \operatorname{Tr}(M_{1,0}) & \operatorname{Tr}(M_{1,1}) \end{pmatrix}

The block matrix descriptions of these functions can be extended to systems larger than qubits in a natural and direct way.
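Here is a direct NumPy sketch (with helper names of our choosing) of these two block formulas for two qubits, checked against Qiskit's partial_trace.

```python
import numpy as np
from qiskit.quantum_info import partial_trace, random_density_matrix

def trace_out_A(rho):
    """Tr_A for two qubits: add the two diagonal 2x2 blocks."""
    return rho[0:2, 0:2] + rho[2:4, 2:4]

def trace_out_B(rho):
    """Tr_B for two qubits: replace each 2x2 block by its trace."""
    return np.array([
        [np.trace(rho[0:2, 0:2]), np.trace(rho[0:2, 2:4])],
        [np.trace(rho[2:4, 0:2]), np.trace(rho[2:4, 2:4])],
    ])

rho = random_density_matrix((2, 2), seed=5)

# Qiskit's qubit 1 is the left-hand system A and qubit 0 is the right-hand system B
print(np.allclose(trace_out_A(rho.data), partial_trace(rho, [1]).data))   # True
print(np.allclose(trace_out_B(rho.data), partial_trace(rho, [0]).data))   # True
```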

To finish the lesson let's apply these formulas to the same state we considered above.

ρ=120000+1211++=(120000000001414001414).\rho = \frac{1}{2} \vert 0\rangle \langle 0 \vert \otimes \vert 0\rangle \langle 0 \vert + \frac{1}{2} \vert 1\rangle \langle 1 \vert \otimes \vert +\rangle \langle + \vert = \begin{pmatrix} \frac{1}{2} & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0 \\[2mm] 0 & 0 & \frac{1}{4} & \frac{1}{4}\\[2mm] 0 & 0 & \frac{1}{4} & \frac{1}{4} \end{pmatrix}.

The reduced state of the first system A\mathsf{A} is

TrB(120000000001414001414)=(Tr(12000)Tr(0000)Tr(0000)Tr(14141414))=(120012)\operatorname{Tr}_{\mathsf{B}} \begin{pmatrix} \frac{1}{2} & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & \frac{1}{4} & \frac{1}{4}\\[2mm] 0 & 0 & \frac{1}{4} & \frac{1}{4} \end{pmatrix} = \begin{pmatrix} \operatorname{Tr} \begin{pmatrix} \frac{1}{2} & 0\\[1mm] 0 & 0 \end{pmatrix} & \operatorname{Tr} \begin{pmatrix} 0 & 0\\[1mm] 0 & 0 \end{pmatrix} \\[4mm] \operatorname{Tr} \begin{pmatrix} 0 & 0\\[1mm] 0 & 0 \end{pmatrix} & \operatorname{Tr} \begin{pmatrix} \frac{1}{4} & \frac{1}{4}\\[1mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix} \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 0\\[2mm] 0 & \frac{1}{2} \end{pmatrix}

and the reduced state of the second system B\mathsf{B} is

TrA(120000000001414001414)=(12000)+(14141414)=(34141414).\operatorname{Tr}_{\mathsf{A}} \begin{pmatrix} \frac{1}{2} & 0 & 0 & 0\\[2mm] 0 & 0 & 0 & 0\\[2mm] 0 & 0 & \frac{1}{4} & \frac{1}{4}\\[2mm] 0 & 0 & \frac{1}{4} & \frac{1}{4} \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 0\\[1mm] 0 & 0 \end{pmatrix} + \begin{pmatrix} \frac{1}{4} & \frac{1}{4}\\[1mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix} = \begin{pmatrix} \frac{3}{4} & \frac{1}{4}\\[2mm] \frac{1}{4} & \frac{1}{4} \end{pmatrix}.

Reduced states in Qiskit

The partial_trace function computes the partial trace over a collection of systems given a DensityMatrix argument.
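The lesson's code cells aren't reproduced here, but a first cell along these lines would import what's needed and print the confirmation shown below (a sketch; the original cell may differ in its exact imports).

```python
import numpy as np
from qiskit.quantum_info import DensityMatrix, Statevector, partial_trace
from qiskit.visualization import plot_bloch_multivector

print("Imports loaded.")
```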


Output:

Imports loaded.
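A second cell might construct the state ρ\rho from above and display it together with its two reduced states, producing the output shown below (again a sketch; the lesson's own cell may differ in detail).

```python
from IPython.display import display
import numpy as np
from qiskit.quantum_info import DensityMatrix, partial_trace

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])
plus = np.array([1, 1]) / np.sqrt(2)

# rho = 1/2 |0><0| ⊗ |0><0| + 1/2 |1><1| ⊗ |+><+|, with A as the left factor
rho = DensityMatrix(
    0.5 * np.kron(np.outer(ket0, ket0), np.outer(ket0, ket0))
    + 0.5 * np.kron(np.outer(ket1, ket1), np.outer(plus, plus))
)

# Qiskit numbers qubits from right to left: A is qubit 1 and B is qubit 0
rho_A = partial_trace(rho, [0])   # trace out B
rho_B = partial_trace(rho, [1])   # trace out A

display(rho.draw('latex'))
display(rho_A.draw('latex'))
display(rho_B.draw('latex'))
```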


Output:

[120000000001414001414] \begin{bmatrix} \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{4} & \frac{1}{4} \\ 0 & 0 & \frac{1}{4} & \frac{1}{4} \\ \end{bmatrix}

[120012] \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \\ \end{bmatrix}

[34141414] \begin{bmatrix} \frac{3}{4} & \frac{1}{4} \\ \frac{1}{4} & \frac{1}{4} \\ \end{bmatrix}

Note that partial_trace numbers the systems from right to left — so qubit 0 is the one on the right and qubit 1 is the one on the left.

The plot_bloch_multivector function plots these reduced states as points in the Bloch ball, using the same qubit-numbering convention.
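A final cell might simply pass the state to plot_bloch_multivector (a sketch, reusing the rho constructed in the previous cell); each qubit's reduced state then appears as a vector in its own Bloch ball.

```python
from qiskit.visualization import plot_bloch_multivector

# Plot the Bloch vector of each qubit's reduced state; qubit 0 is the
# right-hand system B and qubit 1 is the left-hand system A.
plot_bloch_multivector(rho)
```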


Output: (Bloch sphere plots of the reduced state of each of the two qubits)
