
Utility-Scale QAOA

Lesson overview:

So far in this course, we hope we've given you a solid foundation in the framework and tools needed to solve utility-scale problems on a quantum computer. Now, we're finally going to see these tools in action.

In this lesson, we'll get our hands dirty with a utility-scale example of the Max-Cut problem, which is a famous problem in graph theory involving how to best partition a graph into two. We'll start with a simple, five-node graph to build our intuition for how a quantum computer can help us solve the problem, then apply this to a utility-scale version of the problem.

This lesson will serve as a broad overview of the approach we take to solving this problem. This won't be a code walk-through. Accompanying this lesson, though, there is a tutorial with actual code that you can run to solve the Max-Cut problem on a quantum computer.

The Problem

As a reminder, not all computational problems are suitable for quantum computing. “Easy problems” won't gain any advantages from this technology because classical computers are perfectly good at solving them already.

The three use-cases we're most optimistic about exploring are:

  1. simulating nature
  2. processing data with complex structure
  3. optimization

Today, we'll be focusing on the third use-case, optimization. In an optimization problem, we're generally looking for the largest or smallest possible value for a given function. The difficulty of finding these extrema with classical methods can increase exponentially as the problem size grows.

The optimization problem of interest today is called Max-Cut, which we're going to solve using the Quantum Approximate Optimization Algorithm (QAOA).

What is Max-Cut?

We start with a graph, which consists of a collection of vertices (or nodes), some of which are connected by edges. In the problem, we're asked to divide the nodes of the graph into two subsets by “cutting” the edges that connect them. We want to find the partition that maximizes the number of edges that are cut in this way – hence the name, “Max-Cut.”

Illustration of a max-cut problem

For instance, the figure above shows a five-node graph with a Max-Cut solution on the right. It cuts through five edges, which is the best one can do with this graph.

Because a five-node graph is so small, it isn't too hard to work out the Max-Cut in your head or by trying a few cuts on a piece of paper. But as you can imagine, the problem becomes more and more difficult as the number of vertices grows, in part because the number of possible cuts to consider grows exponentially in the number of nodes. And at a certain point, this becomes hard even for supercomputers using any known classical algorithm.

We'd like a way to solve the Max-Cut problem for these larger, more complicated graphs, because the problem has many practical applications, including fraud detection in finance, graph clustering, network design, and social media analysis. Max-Cut is often encountered as a subproblem within a particular approach to a larger problem, so it's much more common than we might naively think.

The Solution

Now, we're going to walk through the approach we use to solve the Max-Cut problem on a quantum computer. We'll do this with a simple, five-node graph. You can follow along using the Python notebook tutorial. After that simple example, the tutorial will walk you through a utility-scale example of the problem.

The first step is to create our graph by defining the number of nodes and the edges that connect them. You can do this with a package called rustworkx, as demonstrated in the tutorial. The result will be a graph that looks like this:

Output of Rustworkx Max-Cut graph
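For reference, here is a minimal sketch of how such a graph might be built with rustworkx; the edge list below is illustrative rather than the tutorial's exact one.

```python
# A minimal sketch of building a five-node graph with rustworkx.
# The edge list is illustrative; the tutorial defines the exact graph it uses.
import rustworkx as rx
from rustworkx.visualization import mpl_draw

n = 5
graph = rx.PyGraph()
graph.add_nodes_from(range(n))
# Each tuple is (node_a, node_b, weight)
edge_list = [(0, 1, 1.0), (0, 2, 1.0), (0, 4, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
graph.add_edges_from(edge_list)

mpl_draw(graph, with_labels=True)
```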

We'll use the Qiskit patterns framework to find the Max-Cut solutions for this graph on our quantum computer.

1. Map

We need to map the problem onto our quantum computer. To do this, let's first note that maximizing the number of cuts in a graph can be mathematically written as:

$$\max_{x\in\{0,1\}^n} \sum_{(i,j)} \left(x_i + x_j - 2x_ix_j\right)$$

where $i$ and $j$ are nodes in the graph, and $x_i$ and $x_j$ are either 0 or 1, depending on which side of the partition each node is on (one group is labelled “0” and one is labelled “1”). When $x_i$ and $x_j$ are on the same side of the partition, the expression in the sum is equal to zero. When they are on opposite sides, so there is a cut between them, the expression is equal to one. So, maximizing the number of cuts will maximize the sum.

We can also flip this around, and search for the minimum by multiplying each of the values by negative one.

$$\min_{x\in\{0,1\}^n} \sum_{(i,j)} \left(2x_ix_j - x_i - x_j\right)$$
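To make the objective concrete, here is a small, purely classical sketch that brute-forces the cut value over all $2^5$ assignments, using the same illustrative edge list as above:

```python
# Brute-force check of the Max-Cut objective for a small, illustrative graph.
from itertools import product

edges = [(0, 1), (0, 2), (0, 4), (1, 2), (2, 3), (3, 4)]

def cut_value(x, edges):
    # x_i + x_j - 2*x_i*x_j equals 1 exactly when edge (i, j) is cut
    return sum(x[i] + x[j] - 2 * x[i] * x[j] for i, j in edges)

best = max(product([0, 1], repeat=5), key=lambda x: cut_value(x, edges))
print(best, cut_value(best, edges))  # one optimal assignment and its cut size
```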

Now, we're ready to map. It can be kind of daunting to think about how to go from a graph like the one we just drew to a quantum circuit. But we'll take it one step at a time.

Remember, we're going to try to solve Max-Cut using QAOA. In the QAOA methodology, we ultimately want to have an operator (or in other words a Hamiltonian) that will be used to represent the cost function of our hybrid algorithm, as well as a parametrized circuit (the ansatz) that we use to represent possible solutions to the problem.

QUBO

We can sample from these candidate solutions and then evaluate them using the cost function. To do this, we take advantage of a series of mathematical reformulations, including the Quadratic Unconstrained Binary Optimization notation — or QUBO for short — which is a useful way to encode combinatorial optimization problems. In QUBO, we want to find:

$$\min_{x\in\{0,1\}^n} x^TQx$$

where $Q$ is an $n \times n$ matrix of real numbers and $n$ is the number of nodes in our graph (here, five).
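For Max-Cut, one way to build such a $Q$ (sketched here for the illustrative edge list, not as the tutorial's exact code) is to place a 1 at $Q_{ij}$ for every edge and minus each node's degree on the diagonal, so that $x^TQx$ equals minus the number of cut edges:

```python
# Illustrative construction of the Max-Cut QUBO matrix Q for the assumed edge list.
import numpy as np

edges = [(0, 1), (0, 2), (0, 4), (1, 2), (2, 3), (3, 4)]
n = 5

Q = np.zeros((n, n))
for i, j in edges:
    Q[i, j] = 1      # off-diagonal entries give the 2*x_i*x_j term (via symmetry)
    Q[j, i] = 1
    Q[i, i] -= 1     # diagonal entries give the -x_i - x_j terms (since x_i**2 == x_i)
    Q[j, j] -= 1

x = np.array([1, 0, 0, 1, 0])   # the partition {0, 3} vs. {1, 2, 4}
print(x @ Q @ x)                # -5.0: minus the number of cut edges
```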

To apply QAOA, we need to formulate our problem as a Hamiltonian, which is a function or matrix that represents the total energy of a system. Specifically, we want to create a cost function Hamiltonian with the property that its ground state corresponds to the minimum value of the function. So, to solve our optimization problem, we'll try to prepare the ground state of $H$ on a quantum computer. Then, sampling from this state will yield the solution to $\min f(x)$ with high probability.

Mapping to a cost function Hamiltonian

As it turns out, we are in luck, because the QUBO problem is very closely related to, and actually computationally equivalent to, one of the most famous and ubiquitous Hamiltonians in physics: the Ising Hamiltonian.

In order to write the QUBO problem as the Ising Hamiltonian, all we actually need to do is a simple change of variables:

$x_i = \frac{1-z_i}{2}$.

We won't walk through all the steps here, but they are explained in the attached notebook. In the end, the minimization of the QUBO expression is the same as the minimization of this expression:

$$\min_{x\in\{0,1\}^n} x^TQx \Longleftrightarrow \min_{z\in\{-1,1\}^n} z^TQz + b^Tz$$

Rewriting again slightly, we have our cost function Hamiltonian, where the minimum of the expression corresponds to the ground state, $Z_i$ is the Pauli $Z$ operator acting on qubit $i$, and $b_i$ is a real scalar coefficient:

$$H_C=\sum_{ij}Q_{ij}Z_iZ_j + \sum_i b_i Z_i$$

Now that we have our Hamiltonian, we need to rewrite it in terms of two-local Pauli $Z$ operators that we can easily convert to two-qubit gates in our quantum circuit. We'll end up with six objects, or Pauli strings, one for each of the six edges in the graph. Each of the five characters in a string represents an operation on a node: the identity if the node is not an endpoint of that particular edge, and the Pauli $Z$ operator if it is. Note that Qiskit indexes the qubits in these strings from right to left. For example, an edge between nodes 0 and 1 is encoded as IIIZZ, and an edge between 2 and 4 is encoded as ZIZII.
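In code, the strings and their coefficients can be collected into a SparsePauliOp. The sketch below assumes the illustrative edge list used earlier; your strings will reflect whatever edges your graph actually has.

```python
# Build the cost Hamiltonian as a SparsePauliOp: one ZZ Pauli string per edge.
from qiskit.quantum_info import SparsePauliOp

edges = [(0, 1), (0, 2), (0, 4), (1, 2), (2, 3), (3, 4)]
num_nodes = 5

pauli_list = []
for i, j in edges:
    # Qiskit orders qubits right to left, so qubit 0 is the rightmost character
    label = ["I"] * num_nodes
    label[num_nodes - 1 - i] = "Z"
    label[num_nodes - 1 - j] = "Z"
    pauli_list.append(("".join(label), 1.0))

cost_hamiltonian = SparsePauliOp.from_list(pauli_list)
print(cost_hamiltonian)  # e.g. the edge (0, 1) appears as the string IIIZZ
```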

Construct the quantum circuit

With our Hamiltonian written in terms of Pauli operators, we're ready to construct our quantum circuit, which allows us to sample good solutions using a quantum computer:

Circuit diagram with QAOA layers

The QAOA algorithm takes inspiration from the Adiabatic Theorem, which states that if we start in the ground state of a time-dependent Hamiltonian, and the Hamiltonian evolves slowly enough over enough time, the final state will be the ground state of the final Hamiltonian. QAOA can be thought of as the discrete, Trotterized version of this Quantum Adiabatic Algorithm (QAA), where each Trotter step represents a layer of the QAOA circuit. So instead of evolving continuously from one Hamiltonian to the other, in each layer we switch back and forth between our cost function Hamiltonian and a so-called “mixer” Hamiltonian, which we'll cover later in this lesson.

The advantage of QAOA is that it's faster than the quantum adiabatic algorithm, but it returns approximate solutions rather than the optimal. In the limit where the number of layers goes to infinity, QAOA converges to the QAA case, but of course this is very computationally expensive.

To create our quantum circuit, we'll apply alternating operators, parameterized by $\gamma$ and $\beta$, which represent the discretization of the time evolution.

So, the three main parts of the QAOA circuit are:

  1. the initial trial state, in grey, which is the ground state of the mixer, created by applying a Hadamard gate to every qubit
  2. the cost function evolution, which we've discussed previously, in dark purple
  3. the evolution under the mixer Hamiltonian, which we have not yet covered, in light purple.

Our starting Hamiltonian is called the mixer because its ground state is the superposition of all possible bitstrings of interest, hence enforcing a mixture of all possible solutions at the start.

The mixer Hamiltonian is simply the sum of Pauli-X operations on each node of the graph. Qiskit allows you to use a different, custom mixer operator if you wish, but we are going to use the standard one here. So again, you can see that with Qiskit, a lot of the work is done for us, making the mixer Hamiltonian and the starting state trivial to set up. The only work we had to do was to find the cost function.

Each iteration of these operators is called a layer. These layers can be seen as a discretization of the time evolution of the system, as previously described. The alternating pattern comes from the Trotter decomposition and approximates the exponential of a sum of non-commuting matrices. In general, the more layers or steps we include, the closer we get to continuous time evolution, as in QAA, so in theory, the more accurate the result will be. But for this example, we'll begin by just sampling with one layer. Remember, both the cost function Hamiltonian and the mixer are parameterized, so we still need to come up with optimal values for $\gamma$ and $\beta$.
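In Qiskit, the cost Hamiltonian is essentially all you need to supply: the QAOAAnsatz class assembles the initial state, the cost layer, and the default mixer for you. A minimal sketch, assuming the `cost_hamiltonian` built earlier and a single layer:

```python
# A one-layer QAOA ansatz built from the cost Hamiltonian; the default mixer
# (a sum of X operations on every qubit) and the Hadamard initial state are
# supplied automatically. reps=1 gives one gamma and one beta parameter.
from qiskit.circuit.library import QAOAAnsatz

ansatz = QAOAAnsatz(cost_operator=cost_hamiltonian, reps=1)
ansatz.draw("mpl")
```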

2. Optimize

While the circuit we just created looks pretty simple and is useful for building an intuitive understanding, remember, the quantum chip doesn't understand what the QAOA gate is. We need to turn it into a series of single- and two-qubit “native” gates, that is, gates the hardware can perform directly on its qubits. Circuits written this way are said to be expressed in the backend's Instruction Set Architecture (ISA).

The Qiskit library offers a series of transpilation passes that cater to a wide range of circuit transformations. We want to make sure that the circuit is optimized for our purpose.

Recall from our previous lesson that the transpilation process involves several steps:

  • Initial mapping of the qubits in the circuit (i.e., decision variables) to physical qubits on the device.
  • Unrolling of the instructions in the quantum circuit to the hardware-native instructions that the backend understands.
  • Routing of any qubits in the circuit that interact to physical qubits that are adjacent to one another.

And as always, more details about this can be found on the documentation site.

Before transpiling, though, we need to choose which backend we'll run our circuit on, since the transpiler optimizes differently for different processors. This is yet another reason it's important to use an automated transpiler — you wouldn't want to go through the time-consuming process of optimizing your circuit by hand, only to realize you actually want to run your circuit on a different processor with different properties.

Pass your backend of choice to the transpiler and specify your optimization level. In the tutorial, you'll select level 3, which is the highest and most thorough level.
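As a rough sketch of this step (the backend selection is illustrative, and the service call assumes you have saved IBM Quantum credentials):

```python
# Transpile the ansatz to the backend's native gate set (ISA) at optimization level 3.
from qiskit_ibm_runtime import QiskitRuntimeService
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False)

pm = generate_preset_pass_manager(optimization_level=3, backend=backend)
isa_circuit = pm.run(ansatz)
print(isa_circuit.depth())
```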

And with that, we have a transpiled circuit that is ready to be executed on hardware!

3. Execute

So far, we have transpiled the circuit while leaving the parameters $\gamma$ and $\beta$ unspecified, but we can't actually run the circuit without assigning them values. In the QAOA workflow, the optimal parameters are found in an iterative optimization loop, where we run a series of circuit evaluations and then use a classical optimizer to find the optimal $\beta$ and $\gamma$. However, we need to start somewhere, so we make an initial guess of $\gamma=\pi/2$ and $\beta=\pi$.

Execution Modes

Now, we're almost ready to run the circuit — I promise! But first, it is important to note that you can send your job in a variety of different ways, which are called execution modes.

Job mode: A single primitive request of the estimator or the sampler is made without a context manager. Circuits and inputs are packaged as primitive unified blocs (PUBs) and submitted as an execution task on the quantum computer.

Batch mode: A multi-job manager for efficiently running an experiment that is composed of a bundle of independent jobs. Use batch mode to submit multiple primitive jobs simultaneously.

Session mode: A dedicated window for running a multi-job workload. This allows users to experiment with variational algorithms in a more predictable way, and even run multiple experiments simultaneously, taking advantage of parallelism in the stack. Use sessions for iterative workloads or experiments that require dedicated access. See Run jobs in a session for examples.

For a QAOA experiment, a session would be a good choice to proceed with if you have access to it, since we need to sample our circuit many times with different parameter values to find the optimum.
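A minimal sketch of opening a session and creating the primitives used in the rest of the workflow, assuming the `backend` chosen during transpilation; if you don't have session access, the same primitive calls work in job or batch mode:

```python
# Open a session on the chosen backend and create the primitives for the QAOA loop.
from qiskit_ibm_runtime import Session, EstimatorV2 as Estimator, SamplerV2 as Sampler

session = Session(backend=backend)
estimator = Estimator(mode=session)   # evaluates the cost Hamiltonian during optimization
sampler = Sampler(mode=session)       # samples bitstrings from the final, optimized circuit
```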

Back to the optimization problem. We need to find better values of $\gamma$ and $\beta$ than our rough first guesses. We will do this by plugging our cost function and these initial guesses into the SciPy optimizer COBYLA.
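A sketch of that loop, assuming the `isa_circuit`, `cost_hamiltonian`, and `estimator` from the earlier snippets; the parameter ordering below is an assumption, so check `ansatz.parameters` to confirm which entry is $\beta$ and which is $\gamma$:

```python
# Classical optimization of gamma and beta with SciPy's COBYLA.
import numpy as np
from scipy.optimize import minimize

# Map the observable onto the physical qubits chosen by the transpiler
isa_hamiltonian = cost_hamiltonian.apply_layout(isa_circuit.layout)

def cost_func(params):
    pub = (isa_circuit, isa_hamiltonian, params)
    result = estimator.run([pub]).result()
    return float(result[0].data.evs)   # expectation value of the cost Hamiltonian

initial_params = [np.pi, np.pi / 2]    # assumed ordering [beta, gamma]
res = minimize(cost_func, initial_params, method="COBYLA")
print(res.x, res.fun)
```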

COBYLA optimization graph

Here you can see the value of the cost function over the iterations. It starts off a little wonky and goes up and down, but then settles to a low value. We will use the parameter values that SciPy found that correspond to the lowest evaluation of the cost function.

Now that we were able to reduce our cost function by finding better values of our parameters, we will run our circuit using the new values we found for gamma and beta. I listed the specific values that I am using here, but remember, when you try your hand at this or even just rerun the same tutorial notebook, these values might change slightly. Now we will run our optimized circuit with these values and find our candidate solution to our Max-Cut problem.
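A sketch of this final sampling step, reusing the names defined in the earlier snippets:

```python
# Sample the circuit at the optimized parameters and read off the most likely bitstring.
measured_circuit = ansatz.copy()
measured_circuit.measure_all()
isa_measured = pm.run(measured_circuit)

job = sampler.run([(isa_measured, res.x)], shots=10_000)
counts = job.result()[0].data.meas.get_counts()
best_bitstring = max(counts, key=counts.get)
print(best_bitstring)

session.close()   # free the session once the workload is done
```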

In the post-processing stage, we'll analyze the data and display these results to see if our quantum algorithm found the correct solutions.

4. Post-process

Now let's plot a histogram of the data to look at the final solution:

Max-Cut solution histogram

The bitstrings represent how each of the nodes was partitioned into two groups (labelled “0” and “1”) by the cut. There should be four solutions that all give the maximum number of edges cut. These four are shown in purple. You can see right away that four solutions are much more probable than any of the others. The highest, and thus most probable, bitstring solution is 0,1,0,1,1. (Remember: the order of the qubits is reversed in the plot bitstrings!)

From this plot, we can take the most likely bitstring and represent it as a partitioned graph, with the cut going through five edges:

Max-Cut solution

So, this is indeed a Max-Cut solution. But it's not the only one! Because of the symmetry of this graph, there are multiple correct solutions. Instead of having nodes 0 and 3 inside the cut, we could include nodes 2 and 4. You can see that all I had to do was rotate my cut to include these new points; the number of edges cut remains five. There turn out to be four Max-Cut solutions in all, since each of the two solutions we noted also has an “opposite” partner, where the purple nodes are grey and the grey nodes are purple, so the cut remains the same, but each node effectively swaps to the opposite side of the partition.

Let's look again at the histogram and the four most likely solutions for a minute. Ideally, they would be the four true Max-Cut solutions. The problem is that the algorithm didn't identify the fourth and final solution as one of the top four most probable answers; it was the fifth most likely. The fourth bitstring that the algorithm identified is incorrect: if you were to draw it, you would see that the corresponding partition only has four cuts.

But remember: this is an approximate algorithm. It is not infallible, and it is not correct 100% of the time. You do have to employ some of your own knowledge and understanding to sanity-check the solutions.

This error can arise from several places:

  1. It could be the approximate nature of the algorithm itself and the small number of layers I employed.
  2. It could be a finite sampling error, which could be reduced by increasing the number of shots in my experiment.
  3. It could also be a readout error, since the fourth real solution is only off by one bit.

This kind of error analysis is what it takes to become a practitioner of quantum computing. You need to understand the performance of the hardware and how this can contribute to certain types of errors and how to correct for them.

However, let's not forget that there were 32 possible bitstrings, and that the four real solutions were all among the top five candidates. And we only used two layers to find this. In general, if we wanted to increase our chances of finding the best Max-Cut every time, we could increase the layer depth. There are some subtleties to this, but that's for a later lesson.

Now at utility-scale:

Now that you've had a taste of the process of solving a small Max-Cut problem on a quantum computer, I challenge you to do it at scale. Follow along in the linked tutorial to see how many cuts you can get in a 125-node graph.
