Quantum algorithmic foundations
Download the slides for this lesson.
Introduction
Quantum algorithms provide provable advantages over classical algorithms in the query model — but what about more standard models of computation, where problem inputs are given explicitly rather than in the form of an oracle or black box? This turns out to be a much more difficult question to answer, and to address it we must first establish a solid foundation upon which to base our investigation. That's the purpose of this lesson.
We'll begin by discussing computational cost, for both classical and quantum computations, and how it can be measured. This is a general notion that can be applied to a wide range of computational problems — but to keep things simple we'll mainly examine it through the lens of computational number theory, which addresses problems including basic arithmetic, computing greatest common divisors, and integer factorization, which are computational tasks that are likely to be familiar to most readers. While computational number theory is a narrow application domain, these problems serve well to illustrate the basic issues — and as a bonus they happen to be highly relevant to the lesson following this one.
Our focus will be on algorithms, as opposed to the ever-improving hardware on which they're run — so we'll be more concerned with how the cost of running an algorithm scales as the specific problem instances it is run on grow in size, rather than how many seconds, minutes, or hours some particular computation requires. We focus on this aspect of computational cost in recognition of the fact that algorithms have fundamental importance, and will naturally be deployed against larger and larger problem instances using faster and more reliable hardware as technology develops.
Finally, we turn to a critically important task, which is running classical computations on quantum computers. The reason this task is important is not because we hope to replace classical computers with quantum computers (which seems extremely unlikely to happen any time soon, if ever), but rather because it opens up many interesting possibilities for quantum algorithms. Specifically, classical computations running on quantum computers become available as subroutines — effectively leveraging decades of research and development on classical algorithms in pursuit of quantum computational advantages.
Two examples: factoring and GCDs
The classical computers that exist today are incredibly fast, and their speed seems to be ever increasing. For this reason, some might be inclined to believe that computers are so fast that no computational problem is beyond their reach.
This belief is false. Some computational problems are so inherently difficult that, although there exist algorithms to solve them, no computer on the planet Earth today is fast enough to run these algorithms to completion on even moderately sized inputs within the lifetime of a human — or even within the lifetime of the Earth itself.
To explain further, let's introduce the integer factorization problem.
Integer factorization
Input: an integer $N \geq 2$
Output: the prime factorization of $N$
By the prime factorization of $N$ we mean a list of the prime factors of $N$ and the powers to which they must be raised to obtain $N$ by multiplication. For example, the prime factors of $12$ are $2$ and $3$, and to obtain $12$ we must take the product of $2$ to the power $2$ and $3$ to the power $1$: that is, $12 = 2^2 \cdot 3^1$.
Up to the ordering of the prime factors, there is only one prime factorization for each integer $N \geq 2$, which is a fact known as the fundamental theorem of arithmetic.
The `factorint` function from the SymPy package for Python computes the prime factorization of a given integer. For example, here's the factorization of $12$.
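A minimal code cell that produces the output shown below (assuming SymPy is installed):

```python
from sympy import factorint

# The prime factorization of 12, returned as a dictionary mapping
# each prime factor to the power to which it is raised.
print(factorint(12))
```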
Output:
{2: 2, 3: 1}
Factoring small numbers like $12$ is easy, but when the number to be factored gets larger, the problem becomes more difficult. For example, running `factorint` on a number with a few dozen digits already takes noticeably longer, as the following output shows.
Output:
{3: 2, 74519450661011221: 1, 5073729280707932631243580787: 1}
For even larger values of $N$, things become impossibly difficult, at least as far as we know. For example, the RSA Factoring Challenge, which was run by RSA Laboratories from 1991 to 2007, offered a cash prize of $100,000 to factor the following number, known as RSA1024, which has 309 decimal digits (and 1024 bits when written in binary). The prize for this number was never collected and its prime factors remain unknown.
Output:
135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563
We need not bother running `factorint` on RSA1024; it is a virtual certainty that the computation would not finish within our lifetimes.
The fastest known algorithm for factoring large integers is known as the number field sieve. As an example of this algorithm's use, the RSA challenge number RSA250, which has 250 decimal digits (and 829 bits in its binary representation), was factored using the number field sieve in 2020. The computation required thousands of CPU core-years, distributed across tens of thousands of machines around the world. The output below confirms that the product of the two prime factors found by that computation is indeed equal to RSA250.
Output:
True
The security of the RSA public-key cryptosystem is based on the computational difficulty of integer factoring; an efficient algorithm for integer factoring would break it.
Next let's consider a related but very different problem, which is computing the greatest common divisor (or GCD) of two integers.
Greatest common divisor (GCD)
Input: nonnegative integers $N$ and $M$, at least one of which is positive
Output: the greatest common divisor of $N$ and $M$
The greatest common divisor of two numbers is the largest integer that evenly divides both of them.
This problem is easy for computers — it has roughly the same computational cost as multiplying the two input numbers together. The following code cell runs the `gcd` function from Python's built-in `math` module on two enormous inputs and returns the answer essentially instantly.
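As an illustration, here is a hypothetical pair of inputs (not necessarily the ones originally used): assuming `RSA1024` holds the 309-digit challenge number printed above, the GCD of two distinct multiples of it is RSA1024 itself, which is the value shown below.

```python
from math import gcd

# Hypothetical inputs: 3 * RSA1024 and 5 * RSA1024 share no common factor
# beyond RSA1024 itself (since gcd(3, 5) = 1), so the result is RSA1024.
# RSA1024 is assumed to have been assigned the challenge number shown earlier.
print(gcd(3 * RSA1024, 5 * RSA1024))
```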
Output:
135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563
We can push it even further and compute GCDs for numbers with thousands of digits in no time.
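Here's a sketch of what such a computation might look like (the specific numbers are made up for illustration): two integers with a bit more than 3,000 digits each, constructed to share only the factor 113.

```python
from math import gcd

# Two ~3000-digit numbers whose only common factor is 113: the cofactors
# 10**3000 + 1 and 10**3000 + 3 are both odd and differ by 2, so they are
# coprime. The GCD is computed essentially instantly.
a = 113 * (10**3000 + 1)
b = 113 * (10**3000 + 3)
print(gcd(a, b))  # prints 113
```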
Output:
113
This is possible because we have very efficient algorithms for computing GCDs, the most well-known of which is Euclid's algorithm, discovered over 2,000 years ago.
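For reference, here's a minimal Python rendering of Euclid's algorithm (a sketch, not the lesson's own code):

```python
def euclid_gcd(N, M):
    """Euclid's algorithm: repeatedly replace (N, M) by (M, N mod M);
    when the second entry reaches 0, the first entry is the GCD."""
    while M > 0:
        N, M = M, N % M
    return N

print(euclid_gcd(48, 36))  # prints 12
```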
Could there be a fast algorithm for integer factorization that we just haven't discovered yet, allowing large numbers like RSA1024 to be factored in the blink of an eye? The answer is that nothing rules it out. Although we might expect that an efficient algorithm for factoring as simple and elegant as Euclid's algorithm for computing GCDs would have been discovered by now, there is nothing that rules out the existence of a very fast classical algorithm for integer factoring, beyond the fact that we've failed to find one thus far. One could be discovered tomorrow — but don't hold your breath. Generations of mathematicians and computer scientists have searched, and factoring numbers like RSA1024 remains beyond our reach.
Measuring computational cost
Next we'll discuss a mathematical framework through which computational cost can be measured. It should be understood that this discussion is narrowly focused on the needs of this series — the analysis of algorithms and computational complexity are entire subjects unto themselves, and have much more to say about these notions.
As a starting point, consider the following figure (which first appeared in Lesson 5), which represents a very high-level abstraction of a computation.
The computation itself could be modeled or described in a variety of ways, such as by a computer program written in Python, a Turing machine, a Boolean circuit, or a quantum circuit. Our focus will be on circuits (both Boolean and quantum).
Encodings and input length
Let's begin with the input and output of a computational problem, which we'll assume are binary strings. Different symbols could be used, but we'll keep things simple for the purposes of this discussion by restricting our attention to binary string inputs and outputs.
Through binary strings, we can encode a variety of interesting objects that the problems we're interested in solving might concern, such as numbers, vectors, matrices, and graphs, as well as lists of these and other objects. For example, to encode nonnegative integers, we can use binary notation. The following table lists the binary encoding of the first nine nonnegative integers, along with the length (meaning the total number of bits) of each encoding.
number | binary encoding | length |
---|---|---|
0 | 0 | 1 |
1 | 1 | 1 |
2 | 10 | 2 |
3 | 11 | 2 |
4 | 100 | 3 |
5 | 101 | 3 |
6 | 110 | 3 |
7 | 111 | 3 |
8 | 1000 | 4 |
We can easily extend this encoding to handle both positive and negative integers by appending a sign bit to the representations if we choose. Sometimes it's also convenient to allow binary representations of nonnegative integers to have leading zeros, which don't change the value being encoded but can allow representations to fill up a string or word of a fixed size.
Using binary notation to represent nonnegative integers is both common and efficient, but if we wanted to we could choose a different way to represent nonnegative integers using binary strings, such as the ones suggested in the following table. The specifics of these alternatives are not important to this discussion — the point is only to clarify that we do have choices for the encodings we use. (In this table, the symbol $\varepsilon$ represents the empty string, which has no symbols in it and length equal to zero. Naturally, to avoid an obvious source of confusion, we use a special symbol such as $\varepsilon$ to represent the empty string rather than literally writing nothing.)
number | unary encoding | lexicographic encoding |
---|---|---|
0 | $\varepsilon$ | $\varepsilon$ |
1 | 0 | 0 |
2 | 00 | 1 |
3 | 000 | 00 |
4 | 0000 | 01 |
5 | 00000 | 10 |
6 | 000000 | 11 |
7 | 0000000 | 000 |
8 | 00000000 | 001 |
Other types of inputs, such as vectors and matrices, or more complicated objects like descriptions of molecules, can also be encoded as binary strings. Just like we have for nonnegative integers, a variety of different encoding schemes can be selected or invented. For whatever scheme we come up with to encode inputs to a given problem, we interpret the length of an input string as representing the size of the problem instance being solved.
For example, the number of bits required to express a nonnegative integer $N$ in binary notation, which is sometimes denoted $\operatorname{lg}(N)$, is given by the following formula.

$$\operatorname{lg}(N) = \begin{cases} 1 & N = 0\\ 1 + \lfloor \log_2(N) \rfloor & N \geq 1 \end{cases}$$
Assuming that we use binary notation to encode the input to the integer factoring problem, the input length for the number $N$ is therefore $\operatorname{lg}(N)$. Note, in particular, that the length (or size) of the input is not $N$ itself — when $N$ is large we don't need nearly that many bits to express $N$ in binary notation.
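For instance, here is a quick check in Python (the particular value of $N$ is arbitrary):

```python
# The input length lg(N) is the number of bits in the binary encoding of N.
N = 2**100 + 7               # an arbitrary example value
print(N.bit_length())        # 101 bits, even though N itself is enormous
print(len(bin(N)) - 2)       # the same count, obtained by writing N in binary
```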
From a strictly formal viewpoint, whenever we consider a computational problem or task, it should be understood that a specific scheme has been selected for encoding whatever objects are given as input or produced as output. This allows computations that solve interesting problems to be viewed abstractly as transformations from binary string inputs to binary string outputs — and naturally the details of how objects are encoded as binary strings must be important to computations at some level.
Usually, though, we don't worry all that much about these details when we're analyzing computational cost, so that we can avoid getting into details of secondary importance. The basic reasoning is that we expect the computational cost of converting back and forth between "reasonable" encoding schemes to be insignificant compared with the cost of solving the actual problem. In those situations in which this is not the case, the details can (and should) be clarified.
For example, a very simple computation with low cost converts between the binary representation of a nonnegative integer and its lexicographic encoding (which we have not explained in detail, but it can be inferred from the table). For this reason, the computational cost of integer factoring wouldn't differ significantly if we decided to switch from using one of these encodings to the other for the input $N$. On the other hand, encoding nonnegative integers in unary notation incurs an exponential blow-up in the total number of symbols required, and we would not consider it to be a "reasonable" encoding scheme.
Elementary operations
Now let's consider the computation itself, which is represented by the blue rectangle in the figure above. The way that we'll measure computational cost is to count the number of elementary operations that each computation requires. Intuitively speaking, an elementary operation is one involving a small, fixed number of bits or qubits that can be performed quickly and easily — such as computing the AND of two bits. (In contrast, running the `factorint` function on a large number is not an elementary operation.)
Formally speaking, there are different choices for what constitutes an elementary operation depending on the computational model being used. Our focus will be on circuit models, and specifically quantum and Boolean circuits.
Universal gate sets
For circuit-based models of computation, it is typical that each gate is viewed as an elementary operation. This leads to the question of precisely which gates we allow.
Focusing for the moment on quantum circuits, we've seen several gates thus far in the series, including $X$, $Y$, $Z$, Hadamard, $S$, and $T$ gates, swap gates, controlled versions of gates (including controlled-NOT, Toffoli, and Fredkin gates), as well as gates that represent standard basis measurements. In the context of the CHSH game we also saw a few additional rotation gates.
We also discussed query gates (within the query model), and saw that any unitary operation, acting on any number of qubits, can be viewed as being a gate if we so choose. We'll disregard both of these options for the sake of this discussion. We won't be working in the query model (although the implementation of query gates using elementary operations is discussed later in the lesson), and viewing arbitrary unitary gates, potentially acting on millions of qubits, as being elementary operations does not lead to meaningful or realistic notions of computational cost.
Sticking with quantum gates that operate on small numbers of qubits, one approach to deciding which ones are to be considered elementary is to tease out a precise criterion — but this is not the standard approach or the one we'll take. Rather, we simply make a choice.
Here's one standard choice, which we shall adopt as the default gate set for quantum circuits:
- Single-qubit unitary gates from this list: $X$, $Y$, $Z$, $H$, $S$, $S^{\dagger}$, $T$, and $T^{\dagger}$
- Controlled-NOT gates
- Single-qubit standard basis measurements
A common alternative is to view Toffoli, Hadamard, and $S$ gates as being elementary, in addition to standard basis measurements. (Sometimes all single-qubit gates are viewed as being elementary, although this does lead to an unrealistically powerful model when noise and inaccuracies are not properly accounted for.)
The unitary gates in our default collection form what's called a universal gate set. This means that we can approximate any unitary operation, on any number of qubits, and to any degree of accuracy we wish, with circuits composed of these gates alone. To be clear, the definition of universality places no requirements on the cost of such approximations, meaning the number of gates from our set that we need. Indeed, a fairly simple argument based on the mathematical notion of measure reveals that most unitary operations must have extremely high cost. Proving the universality of quantum gate sets is not a simple matter and won't be covered in this series.
For Boolean circuits, we'll take AND, OR, NOT, and FANOUT gates to be the ones representing elementary operations. We don't actually need both AND gates and OR gates — we can use De Morgan's laws to convert from either one to the other by placing NOT gates on all three input and output wires — but nevertheless it's both typical and convenient to allow both AND and OR gates. AND, OR, NOT, and FANOUT gates form a universal set for deterministic computations, meaning that any function from any fixed number of input bits to any fixed number of output bits can be implemented with these gates.
The principle of deferred measurement
Standard measurement gates can appear within quantum circuits — but sometimes it's convenient to delay them until the end, so that we can view quantum computations as consisting of a unitary part (representing the computation itself), followed by a simple read-out phase where qubits are measured and the results are output.
This can always be done, provided that we're willing to add an additional qubit for each standard basis measurement. In the following figure, the circuit on the right illustrates how this can be done for the gate on the left.
Specifically, the classical bit in the circuit on the left is replaced by a qubit on the right (initialized to the $|0\rangle$ state), and the standard basis measurement is replaced by a controlled-NOT gate, followed by a standard basis measurement on the bottom qubit. The point is that the standard basis measurement in the right-hand circuit can be pushed all the way to the end of the circuit. If the classical bit in the circuit on the left is later used as a control bit, we can use the bottom qubit in the circuit on the right as a control instead, and the overall effect will be the same. (We are assuming that the classical bit in the circuit on the left doesn't get overwritten after the measurement takes place by another measurement — but if it did we could always just use a new classical bit rather than overwriting one that was used for a previous measurement.)
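As a small illustration of this substitution, here's a Qiskit sketch (the particular gates chosen are arbitrary placeholders, not the lesson's figure):

```python
from qiskit import QuantumCircuit

# Defer a mid-circuit measurement: copy the would-be outcome onto a fresh
# qubit with a CNOT, let later gates use that qubit as a (quantum) control
# instead of a classical bit, and measure only at the very end.
circuit = QuantumCircuit(3, 1)
circuit.h(0)           # some computation on the original qubit
circuit.cx(0, 1)       # replaces the standard basis measurement on qubit 0
circuit.cx(1, 2)       # a later gate controlled by the "measurement outcome"
circuit.measure(1, 0)  # the measurement itself is deferred to the end
print(circuit.draw())
```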
Circuit size and depth
Circuit size
The total number of gates in a circuit is referred to as that circuit's size. Thus, presuming that the gates in our circuits represent elementary operations, we have that a circuit's size represents the number of elementary operations it requires — or, in other words, its computational cost. We write $\operatorname{size}(C)$ to refer to the size of a given circuit $C$.
For example, consider the following Boolean circuit (from Lesson 3) for computing the XOR of two bits.
The size of this circuit is 7, because there are 7 gates in total. (FANOUT operations are not always counted as being gates, but for the purposes of this lesson we will count them as gates.)
Time, cost, and circuit depth
In the previous lesson we discussed the importance of time as a resource, or limiting constraint, on computations. The examples above, such as the task of factoring RSA1024, reinforce this viewpoint. The `factorint` function cannot factor RSA1024 within any reasonable time frame not because it's poorly implemented, but because the algorithm it employs requires an astronomically large number of elementary operations.
The notion of computational cost, as the number of elementary operations required to perform a computation, is intended to be an abstract proxy for the time required to implement a computation. Each elementary operation requires a certain amount of time to perform, and the more of them a computation needs to perform the longer it's going to take, at least in general. In the interest of simplicity, we'll continue to make this association between computational cost and the time required to run algorithms as we continue on in the series.
But notice that the size of a circuit doesn't necessarily correspond directly to how long it takes to run. In our Boolean circuit for computing the XOR of two bits, for instance, the two FANOUT gates could be performed simultaneously, as could the two NOT gates, as well as the two AND gates.
A different way to measure efficiency for circuits, which takes this possibility of parallelization into account, is by their depth. This is the minimum number of layers of gates needed within the circuit, where the gates within each layer operate on different wires. Equivalently, the depth of a circuit is the maximum number of gates encountered on any path from an input wire to an output wire. For the circuit above, for instance, the depth is 4.
Circuit depth is one way to formalize the running time of parallel computations. It's an advanced topic, and there exist very sophisticated circuit constructions known to minimize the depth required for certain computations. There are also some fascinating unanswered questions concerning circuit depth. (For example, much remains unknown about the circuit depth required to compute GCDs.) We won't have too much more to say about circuit depth in this series, aside from including a few interesting facts concerning circuit depth as we go along, but it should be clearly acknowledged that parallelization is a potential source of computational advantages.
Assigning costs to different gates
One final note concerning circuit size and computational cost is that it is possible to assign different costs to different gates, rather than viewing every gate as contributing equally to cost.
For example, as was already mentioned, FANOUT gates are often viewed as being free for Boolean circuits — which is to say that we could choose that FANOUT gates have zero cost. As another example, when we're working in the query model and we count the number of queries that a circuit makes to an input function (in the form of a black box), we're effectively assigning unit cost to query gates and zero cost to other gates, such as Hadamard gates. A final example is that we sometimes weigh the costs of different gates depending on how difficult they are to implement, depending on the particular hardware implementation we have in mind.
While all of these options are sensible in different contexts, for this lesson we'll keep things simple and stick with circuit size as a representation of computational cost.
Cost as a function of input length
We're primarily interested in how computational cost scales as inputs become larger and larger. This leads us to represent the costs of algorithms as functions of the input length.
Families of circuits
Inputs to a given computational problem can vary in length, potentially becoming arbitrarily large. Every circuit, on the other hand, has a fixed number of gates and wires. For this reason, when we think about algorithms in terms of circuits, we generally need infinitely large families of circuits to represent algorithms. By a family of circuits, we mean a sequence of circuits that grow in size, allowing larger and larger inputs to be accommodated.
For example, imagine that we have a classical algorithm for integer factorization, such as the one used by `factorint`. A single circuit has a fixed number of input wires, so it can only handle inputs up to some fixed length; to represent the algorithm as a whole, we therefore need a family of circuits, one for each possible input length.
This leads us to represent the computational cost of an algorithm by a function $t$, defined so that $t(n)$ is the number of gates in the circuit that implements the algorithm for $n$-bit inputs. In more formal terms, an algorithm in the Boolean circuit model is described by a sequence of circuits $C_1, C_2, C_3, \ldots$, where $C_n$ solves whatever problem we're talking about for $n$-bit inputs (or, more generally, for inputs whose size is parameterized in some way by $n$), and the function $t$ representing the cost of this algorithm is defined as $t(n) = \operatorname{size}(C_n)$.
For quantum circuits the situation is similar, where larger and larger circuits are needed to accommodate longer and longer input strings.
Example: integer addition
To explain this further, let's take a moment to consider a much simpler problem than integer factoring, or even computing GCDs, which is the problem of addition.
Integer addition
Input: integers $N$ and $M$
Output: $N + M$
How might we design Boolean circuits for solving this problem?
Just to keep things simple, let's restrict our attention to the case where $N$ and $M$ are both nonnegative integers represented by $n$-bit registers using binary notation. We'll allow for any number of leading zeros in these encodings, so that $0 \leq N, M \leq 2^n - 1$. The output will be an $(n+1)$-bit binary string representing the sum $N + M$, which is the maximum number of bits we need to express the result.
We begin with an algorithm — the standard algorithm for addition of binary representations — which is the base-$2$ analogue of the way addition is taught in elementary/primary schools around the world. This algorithm can be implemented with Boolean circuits as follows.
Starting from the least significant bits, we can compute their XOR to determine the least significant bit of the sum. Then we can compute the carry bit, which is the AND of the two least significant bits of $N$ and $M$. Sometimes these two operations together are known as a half adder. Using the XOR circuit we've now seen a few times, together with an AND gate and two FANOUT gates, we can build a half adder with 10 gates. If for some reason we changed our minds and decided to include XOR gates in our set of elementary operations, we would need 1 AND gate, 1 XOR gate, and 2 FANOUT gates to build a half adder.
Moving on to the more significant bits, we can use a similar procedure, but this time including the carry bit from each previous position in our calculation. By cascading 2 half adders and taking the OR of the carry bits they produce, we can create what's known as a full adder. (Fortunately both carry bits can't be $1$, so we don't need to worry about that case.) This requires 21 gates in total: 2 AND gates, 2 XOR gates (each requiring 7 gates to implement), one OR gate, and 4 FANOUT gates.
Finally, by cascading a half adder (for the least significant bits) with $n-1$ full adders, we obtain a Boolean circuit for nonnegative integer addition having $10 + 21(n-1) = 21n - 11$ gates. Had we decided to include XOR gates in our set of elementary operations, we'd need $2n-1$ AND gates, $2n-1$ XOR gates, $n-1$ OR gates, and $4n-2$ FANOUT gates, for a total of $9n-5$ gates. If in addition we decide not to count FANOUT gates, it's $5n-3$ gates.
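To sanity-check this construction, here is a small Python sketch (not from the lesson) that mirrors the cascade of one half adder followed by full adders, using only bitwise AND, OR, and XOR on individual bits:

```python
def ripple_carry_add(a_bits, b_bits):
    """Add two n-bit numbers given as lists of bits (least significant first),
    mirroring the half adder / full adder cascade. Returns n + 1 sum bits."""
    sum_bits = []
    carry = 0
    for a, b in zip(a_bits, b_bits):
        s = a ^ b ^ carry                    # sum bit: two cascaded half adders
        carry = (a & b) | ((a ^ b) & carry)  # OR of the two half adders' carries
        sum_bits.append(s)
    sum_bits.append(carry)                   # final carry is the extra output bit
    return sum_bits

# Example: 11 + 6 = 17, with 4-bit inputs written least significant bit first.
print(ripple_carry_add([1, 1, 0, 1], [0, 1, 1, 0]))  # [1, 0, 0, 0, 1], i.e. 10001
```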
For example, the following circuit computes the sum of two 4-bit integers.
Asymptotic notation
On the one hand, it is good to know precisely how many gates are needed to perform various computations, like in the example of integer addition above. These details are important for actually building the circuits.
On the other hand, if we perform analyses at this level of detail for all the computations we're interested in, including ones for tasks that are much more complicated than addition, we'll very quickly be buried in details. To keep things manageable, and to intentionally suppress details of secondary importance, we typically use Big-O notation when analyzing algorithms. Through this notation we can express the order at which functions grow.
Formally speaking, if we have two functions $g(n)$ and $h(n)$, we write that $g(n) = O(h(n))$ if there exists a positive real number $c > 0$ and a positive integer $n_0$ such that

$$g(n) \leq c\cdot h(n)$$

for all $n \geq n_0$. Typically $h$ is chosen to be as simple an expression as possible, so that the notation can be used to reveal the limiting behavior of a function in simple terms. For example, $4n^3 - 3n^2 + 21 = O(n^3)$.
This notation can be extended to functions having multiple arguments in a fairly straightforward way. For instance, if we have two functions $g(n,m)$ and $h(n,m)$ defined on positive integers $n$ and $m$, we write that $g(n,m) = O(h(n,m))$ if there exists a positive real number $c > 0$ and a positive integer $k_0$ such that

$$g(n,m) \leq c\cdot h(n,m)$$

whenever $n + m \geq k_0$.
Connecting this notation to the example of nonnegative integer addition, we conclude that there exists a family of Boolean circuits $C_1, C_2, C_3, \ldots$, where $C_n$ adds two $n$-bit nonnegative integers together, such that $\operatorname{size}(C_n) = O(n)$. This reveals the most essential feature of how the cost of addition scales with the input size: it scales linearly. Notice also that it doesn't depend on the specific detail of whether we consider XOR gates to have unit cost or cost $7$. In general, using Big-O notation allows us to make statements about computational costs that aren't sensitive to such low-level details.
More examples
Here are a few more examples of problems from computational number theory, beginning with multiplication.
Integer multiplication
Input: integers $N$ and $M$
Output: the product $NM$
Creating Boolean circuits for this problem is more difficult than creating circuits for addition — but by thinking about the standard multiplication algorithm, we can come up with circuits having size $O(n^2)$ for this problem (assuming $N$ and $M$ are both represented by $n$-bit binary representations). More generally, if $N$ has $n$ bits and $M$ has $m$ bits, there are Boolean circuits of size $O(nm)$ for multiplying $N$ and $M$.
There are, in fact, other ways to multiply that scale better. For instance, the Schönhage-Strassen multiplication algorithm can be used to create Boolean circuits for multiplying two $n$-bit integers having size $O(n \log n \log\log n)$. The intricacy of this method causes a lot of overhead, however, making it practical only for numbers having tens of thousands of bits or more.
Another basic problem is division, which we interpret to mean computing both the quotient and remainder given an integer divisor and dividend.
Integer division
Input: integers $N$ and $M$ with $M \neq 0$
Output: integers $q$ and $r$ satisfying $N = qM + r$ and $0 \leq r < |M|$
The cost of integer division is similar to multiplication: if $N$ has $n$ bits and $M$ has $m$ bits, there are Boolean circuits of size $O(nm)$ for solving this problem. Like multiplication, there are asymptotically superior methods.
We can now compare known algorithms for computing GCDs with those for addition and multiplication. Euclid's algorithm for computing the GCD of an $n$-bit number and an $m$-bit number requires Boolean circuits of size $O(nm)$, similar to the standard algorithms for multiplication and division. Also similar to multiplication and division, there are asymptotically faster GCD algorithms — including ones requiring $O(n (\log n)^2 \log\log n)$ elementary operations to compute the GCD of two $n$-bit numbers.
A somewhat more expensive computation that arises in number theory is modular exponentiation.
Integer modular exponentiation
Input: integers $N$, $K$, and $M$ with $K \geq 0$ and $M \geq 1$
Output: $N^K \bmod M$
By $N^K \bmod M$ we mean the remainder when $N^K$ is divided by $M$ — meaning the unique integer $r$ satisfying $0 \leq r \leq M - 1$ and $N^K = qM + r$ for some integer $q$.
If $N$ has $n$ bits, $K$ has $k$ bits, and $M$ has $m$ bits, this problem can be solved by Boolean circuits having size $O(nm + km^2)$ — which is not at all obvious at first glance. The solution is not to first compute $N^K$ and then take the remainder — that would require exponentially many bits just to store the number $N^K$. Rather, we can use the power algorithm (known alternatively as the binary method and repeated squaring), where we use the binary representation of $K$ to perform the entire computation modulo $M$. Assuming $N$, $K$, and $M$ are all $n$-bit numbers, we obtain an $O(n^3)$ algorithm — or a cubic-time algorithm. (Once again, there are asymptotically faster algorithms.)
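A short Python sketch of the repeated-squaring idea (Python's built-in three-argument `pow(N, K, M)` computes the same thing far more efficiently):

```python
def mod_exp(N, K, M):
    """Compute (N ** K) % M by repeated squaring, processing the bits of K
    and reducing modulo M at every step so intermediate numbers stay small."""
    result = 1 % M                       # handles the edge case M = 1
    base = N % M
    while K > 0:
        if K & 1:                        # current bit of K is 1
            result = (result * base) % M
        base = (base * base) % M         # square the base for the next bit
        K >>= 1
    return result

print(mod_exp(3, 100, 1000))   # agrees with the built-in below
print(pow(3, 100, 1000))
```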
Here's a code cell that performs a modular exponentiation with fairly large input numbers.
Output:
5420808908959774060288987536719770456217318912094898225713892936909049029205875268391016773496273016272919329382669533127141405381618319955871147811133071683113976741106560564348611362839209748910974112799894385464900617664468329271717655067495341858822704829649579551724483155264208314048742469387847020407987537873027379036515942355206425576104942394452390040081069041858525217985044712018028991394635458870225358476099232040714610377672598679720015462702833934580459151626243131178465271884695260294198503010478207039551370657556801980546328939683389782893101937977023160027020824612118350981172299997784862364914426784395705304625946981620876565903630321426713564601636615003526450559450988257054265576673783996389684225946503950767764911765912500478971120407712043832726706539999202935732936820400534970855338294653369081634048804115320256182211145544094816107629124112742249226319353536843143250675559791743355892993051078488285292927178023438402583054839832109503781471596500072959107707020065782555861325253891174468479828455070701800024212593144971400832140786063620665224339027746048978496589265602430286475229240500730395030244167373760773225724090156004049352594635867027935465962414764456373585290956999305794964617561548270565216080421238370652518997421499178722058617256564053290758153841574743341020230723665105568383923335524981262395314837292291351213579853233800664543959697570668968932545105391200894345122403901126855808339292892144618034647640594129638891809347433516954054255735374114229687467253661509404329081038467536976485314005457794642793240728907716748113582623047896927299525364803461784339899349011331842366119834985254767022797817541982842136038797346045524543446837609691961446346898034481960162369906547207851740703325595273724313509866798139791862738941412435088490790473992542655256187359052648995226579813780101763700632784177318998593960123977376065188642849714402071821332656099246595192273548124579308954369379166791375504366177963367167357734204250134523901012053728414505662083719483277149998297345600850207968594611610502058119754052458059616153460321784141539990056380808571117172677389938997159425776949065037133138197247161115127659585109470076784212846689843449874294528288829117435818727532583190741742686601090580534050289592016761135679603364904072305569448382741870366244397396582565453282641402195407023981530691751496898222633942198230270907098913253995651978611545066193950188718245653197837791596426329560941347753939650429611447245577078846931722372616367550006355620360276455568080512099035179175416756143233950934214736621950039902506644221443998435341460452476466171419972558395644689971465747907674896790822097660307109062931957694684033978412952567599593488687529011526359106169809555979667401866723277931201510177630302133847932158278990312682334549377652520626935835739547712983314320798573894229076437190704856364396327795151484278956526648597260084828697308427169704251546673832583609705456575281533722178029103192256765343504645112826902073960223407989624481672668703972408414665894403347181679160774217536153364422436840721557202572348021932511877708639899866042064925150997870115610130091923308716372491505600234638328560252998561255435872867079194898819838898311502430190295512215594677576275248220954675704199712880196887650126401810574270118295619216693001247689245264372564423355830421087279204264834276290790089692636984198176930764597788529620410821188522782016777910796700483449826
It's a little bit slower than computing GCDs, but it's still fast.
A different example, outside of the domain of computational number theory, arose in the previous lesson. For the post-processing step of Simon's algorithm, we need to compute the null space modulo $2$ of an $n \times m$ matrix of binary values (so here the input length is $nm$ bits in total). We can do this using Gaussian elimination with $O(nm\cdot\min\{n,m\})$ elementary operations, which is $O(n^3)$ elementary operations if $m = O(n)$. Even for a $1000 \times 1000$ binary matrix, which is a million bits of input, the computation time is on the order of seconds.
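The following is a rough sketch of such a computation using the `galois` package (a random matrix stands in here for whatever matrix Simon's algorithm actually produces):

```python
import numpy as np
import galois

# Null space modulo 2 of a random 999 x 1000 binary matrix, computed by
# Gaussian elimination over GF(2); this takes on the order of seconds.
GF2 = galois.GF(2)
matrix = GF2(np.random.randint(0, 2, size=(999, 1000)))
print(matrix.null_space())
```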
Output:
GF([[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1,
0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0,
0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1,
0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1,
1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1,
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0,
0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1,
1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1,
1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1,
1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1,
1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,
1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0,
1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1,
1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1,
0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1,
1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1,
0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1,
0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0,
0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0,
1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1,
0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0,
1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0,
0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0,
1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1,
0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0,
1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1,
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0,
0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1,
1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1,
0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0,
1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0,
0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1,
0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0,
0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1,
0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1,
0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1,
1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0,
1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1]], order=2)
Cost of integer factorization
In contrast to the algorithms just discussed, known algorithms for integer factorization are much more expensive — as we might expect from the examples in the first section of the lesson.
If we consider a simple trial division algorithm that searches through the list $2, 3, 4, \ldots, \lfloor\sqrt{N}\rfloor$ for prime factors of an input number $N$, then we have $O(\sqrt{N})$ iterations — which means $O(2^{n/2})$ iterations when $N$ is an $n$-bit number. Each iteration requires a trial division, which means $O(n^2)$ elementary operations for each iteration (using a standard algorithm for integer division). So we end up with circuits of size $O(n^2 2^{n/2})$, which is exponential in the input size $n$.
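Here is a minimal Python sketch of trial division (for illustration only — it is hopeless for large inputs, for exactly the reason just described):

```python
def trial_division(N):
    """Factor N by trial division: O(sqrt(N)) candidate divisors, each tried
    with a division of n-bit numbers, so roughly n^2 * 2^(n/2) bit operations."""
    factors = {}
    d = 2
    while d * d <= N:
        while N % d == 0:
            factors[d] = factors.get(d, 0) + 1
            N //= d
        d += 1
    if N > 1:                      # whatever remains is a prime factor
        factors[N] = factors.get(N, 0) + 1
    return factors

print(trial_division(12))  # {2: 2, 3: 1}
```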
There are algorithms for integer factorization having better scaling. The number field sieve mentioned earlier, for instance, which is an algorithm that makes use of randomness, is generally believed (but not rigorously proven) to require

$$2^{O\left(n^{1/3} (\log n)^{2/3}\right)}$$

elementary operations to factor $n$-bit integers with high probability. While it is quite significant that $n$ is raised to the power $1/3$, the fact that $n^{1/3}$ appears in the exponent is still a problem that causes poor scaling — and explains in part why RSA1024 remains outside of its domain of applicability.
Polynomial versus exponential cost
As we have seen, classical algorithms for integer addition, multiplication, division, and computing greatest common divisors allow us to solve these problems in the blink of an eye for inputs with thousands of bits. Addition has linear cost, while the other three problems have quadratic cost (or subquadratic cost using asymptotically fast algorithms). Modular exponentiation is more expensive but can still be done pretty efficiently, with cubic cost (or sub-cubic cost using asymptotically fast algorithms). Computing the null space modulo $2$ of an $n \times m$ binary matrix has cost $O(nm\cdot\min\{n,m\})$ (bearing in mind that there are $nm$ input bits for this problem).
All of these algorithms are examples having polynomial cost, meaning that they have cost $O(n^c)$ for some choice of a fixed constant $c > 0$, where $n$ is the input length. As a rough, first-order approximation, algorithms having polynomial cost are abstractly viewed as representing efficient algorithms.
In contrast, known classical algorithms for integer factoring have exponential cost. Sometimes the cost of the number field sieve is described as sub-exponential, but in complexity theory it is more typical to reserve this term for algorithms whose cost is

$$O\left(2^{n^{\varepsilon}}\right)$$

for every $\varepsilon > 0$.
The so-called NP-complete problems are a class of problems not known to (and widely conjectured not to) have polynomial-cost algorithms. A circuit-based formulation of the exponential-time hypothesis posits something even stronger, which is that no NP-complete problem can have a sub-exponential cost algorithm.
The association of polynomial-cost algorithms with efficient algorithms must be understood as being a loose abstraction. Of course, if an algorithm's cost scales as $n^{100}$ or $n^{1000}$ for inputs of size $n$, then it's a stretch to describe that algorithm as efficient.
However, even an algorithm having cost that scales as $n^{100}$ must be doing something clever to avoid having exponential cost, which is generally what we expect of algorithms based in some way on "brute force" or "exhaustive search." Even the sophisticated refinements of the number field sieve, for instance, fail to avoid this exponential scaling in cost. Polynomial-cost algorithms, on the other hand, manage to take advantage of the problem structure in some way that avoids an exponential scaling.
In practice, the identification of a polynomial cost algorithm for a problem is just a first step toward actual efficiency. Through algorithmic refinements, polynomial-cost algorithms with large exponents can sometimes be improved dramatically, lowering the cost to a more "reasonable" polynomial scaling. Sometimes things become easier when they're known to be possible — so the identification of a polynomial-cost algorithm for a problem can also have the effect of inspiring new, more efficient algorithms.
As we consider advantages of quantum computing over classical computing, our eyes are generally turned first toward exponential advantages, or at least super-polynomial advantages — ideally finding polynomial-cost quantum algorithms for problems not known to have polynomial-cost classical algorithms. Theoretical advantages on this order have the greatest chances to lead to actual practical advantages — but identifying such advantages is an extremely difficult challenge. Only a few examples are currently known, but the search continues.
Polynomial (but not super-polynomial) advantages in computational cost of quantum over classical are also interesting and should not be dismissed — but given the current gap between quantum and classical computing technology, they do seem rather less compelling at the present time. One day, though, they could become significant. Grover's algorithm, which is covered in Lesson 8, provides an example of a quadratic advantage of quantum over classical for so-called unstructured searching, which could potentially be broadly applicable.
A hidden cost of circuit computation
There is one final issue that is worth mentioning, although we will not concern ourselves with it further in this series. There's a "hidden" computational cost when we're working with circuits, and it concerns the specifications of the circuits themselves. As inputs get longer, we need larger and larger circuits — but somehow we're going to need to get our hands on the descriptions of these circuits if we're going to implement them.
For all of the examples we've discussed, or will discuss later in the series, there's some underlying algorithm from which the circuits are derived. Usually the circuits in a family follow some basic pattern that's easy to extrapolate to larger and larger inputs — such as cascading full adders to create Boolean circuits for addition or performing layers of Hadamard gates and other gates in some simple to describe pattern.
But what happens if there are prohibitive computational costs associated with the patterns of the circuits themselves? For instance, the description of each member in a circuit family could, in principle, be determined by some extremely difficult-to-compute function of $n$.
The answer is that this is a problem, and we must place additional restrictions on families of circuits beyond having polynomial cost in order for them to truly represent efficient algorithms. The property of uniformity for circuits does this by stipulating that it must (in various precise formulations) be computationally easy to obtain the description of each circuit in a family. All of the circuit families we'll discuss do indeed have this property, so we won't concern ourselves with it further — but this is an important issue to be aware of when studying circuit models of computation from a formal viewpoint.
Classical computations on quantum computers
We'll now turn our attention to implementing classical algorithms on quantum computers. We'll see that any computation that can be performed with a classical Boolean circuit can also be performed by a quantum circuit with asymptotically the same computational cost. Moreover, this can be done in a "clean" manner, which is an important requirement for using these computations as subroutines inside of larger quantum computations.
Simulating Boolean circuits with quantum circuits
Boolean circuits are composed of AND, OR, NOT, and FANOUT gates. To simulate a Boolean circuit with a quantum circuit, we'll begin by showing how each of these four gates can be simulated by quantum gates. Once that's done, converting a given Boolean circuit to a quantum circuit is a simple matter of simulating one gate at a time. We'll only need NOT gates, controlled-NOT gates, and Toffoli gates to do this, which are all deterministic operations in addition to being unitary.
Toffoli gates
Toffoli gates can alternatively be described as controlled-controlled-NOT gates, whose action on standard basis states is as the following figure suggests.
The matrix representation of this gate is as follows.
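With the two control qubits listed first and the target qubit last in the kets (so the basis is ordered $|000\rangle, |001\rangle, \ldots, |111\rangle$; other qubit-ordering conventions permute the rows and columns), the Toffoli gate swaps $|110\rangle$ and $|111\rangle$ and leaves the remaining standard basis states fixed, so its matrix works out to the following.

$$
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
$$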
Another way to think about Toffoli gates is that they're essentially query gates for the AND function, in the sense that they follow exactly the same pattern we saw in the previous lesson for the unitary query gate implementation of arbitrary functions with binary string inputs and outputs.
Toffoli gates are not included in the default gate set discussed earlier in the lesson, but it is possible to construct a Toffoli gate from $H$, $T$, $T^{\dagger}$, and CNOT gates, as the following Qiskit code cell illustrates.
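A sketch of one way to obtain such a decomposition (this may differ from the lesson's own cell) is to ask Qiskit's transpiler to rewrite a Toffoli (CCX) gate over that basis:

```python
from qiskit import QuantumCircuit, transpile

# Build a circuit containing a single Toffoli gate and rewrite it using
# only H, T, T-dagger, and CNOT gates.
circuit = QuantumCircuit(3)
circuit.ccx(0, 1, 2)
decomposed = transpile(circuit, basis_gates=["h", "t", "tdg", "cx"])
print(decomposed.draw())
```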
Output:
Simulating Boolean gates with Toffoli, controlled-NOT, and NOT gates
A single Toffoli gate, used in conjunction with a few NOT gates, can implement AND and OR gates, as the following diagrams suggest.
FANOUT gates can easily be implemented using controlled-NOT gates like this:
In all three cases, the bits that the AND, OR, and FANOUT gates act upon come in from the left as inputs, and in addition we require one workspace qubit initialized to the zero state. These workspace qubits appear within the dotted rectangles to illustrate that they're new, and one of the costs of implementing Boolean gates and circuits through the method being described is the inclusion of these new qubits.
For the AND and OR gates, in addition to the output bits we also have two additional qubits left over. For example, inside the dotted rectangle in the diagram for an AND gate, the top two qubits are left in the states $|a\rangle$ and $|b\rangle$. These bits appear inside the dotted rectangles to illustrate that they're no longer needed — they aren't output by the Boolean logic gates we're simulating, and we can simply ignore them (for now at least).
The remaining Boolean gate, the NOT gate, is also a gate in our default set of quantum gates, so we don't need to worry about simulating it in a special way.
Gate by gate simulation of Boolean circuits
Now suppose that we have an ordinary Boolean circuit named $C$, composed of AND, OR, NOT, and FANOUT gates, and having $n$ input bits and $m$ output bits. We'll let $t$ be the total number of gates in $C$, and we'll give the name $f$ to the function that $C$ computes — so this function takes the form $f : \Sigma^n \rightarrow \Sigma^m$, where $\Sigma = \{0,1\}$.
Now consider what happens when we go one at a time through the AND, OR, and FANOUT gates of $C$, replacing each one by the simulation described above, including the addition of a workspace qubit. We'll arrange the qubits of the resulting circuit so that the input bits of $C$ are on the top and the workspace qubits are on the bottom. As a result of performing these substitutions, we obtain a new quantum circuit $R$ that operates as this figure suggests:
Here, $k$ is the number of workspace qubits that need to be added, and $g$ is a function of the form $g : \Sigma^n \rightarrow \Sigma^k$ that describes the states of the leftover qubits after $R$ is executed.
In the figure, the qubits corresponding to the output $f(x)$ are on the top and the remaining, leftover qubits storing $g(x)$ are on the bottom. We can force this to happen if we wish by rearranging the qubits using swap gates. A swap gate can be implemented with three controlled-NOT gates like this:
As we'll see in the next section, it isn't essential to rearrange the output qubits like this, but it's easy enough to do it if we choose.
The function $g$ that describes the classical states of the leftover qubits is determined somehow by the circuit $C$, but we don't really need to worry all that much about it, because we really don't care specifically what state these qubits are in when the computation finishes. The letter $g$ comes after $f$, so it is a reasonable name for this function on that account, but there's a better reason to pick this name: $g$ is short for garbage — or stuff left over from the simulation that's cluttering up these qubits.
Cleaning up the garbage
If our only interest is in computing whatever function is computed by a given Boolean circuit with a quantum circuit, we don't need to proceed any further than the gate-by-gate simulation just described — and we can accept that in addition to the answer we'll have a bunch of garbage left over.
This, however, won't be good enough if we want to perform classical computations as subroutines within larger quantum computations, because those garbage qubits will cause problems. As we saw in the previous lesson, the phenomenon of interference is critically important to quantum algorithms, and garbage qubits can ruin the interference patterns needed to make quantum algorithms work.
Fortunately, it's not too difficult to clean up the garbage, so to speak. The key is to use the fact that, because $R$ is a quantum circuit, we can run it in reverse — by simply replacing each gate with its inverse and applying them in the reverse order — thereby obtaining a quantum circuit for the operation $R^{\dagger}$. (Toffoli gates, CNOT gates, and NOT gates are their own inverses, so in this case running $R$ in reverse is just a matter of applying the gates in the reverse order — but more generally any quantum circuit can be reversed as just described.)
So, what we can do is to add $m$ more qubits (recalling that $m$ is the number of output bits of the function $f$), use CNOT gates to copy the output of $R$ onto these qubits, and then reverse $R$ to clean up the garbage. The following figure illustrates the resulting circuit and describes its action on standard basis states.
If we put a box around the entire circuit and call it $Q$, it looks like this:
If $C$ has $t$ gates, then $Q$ has $O(t + m)$ gates.
If we disregard the additional workspace qubits, what we have is a circuit that functions exactly like a query gate for the function $f$. If we simply want to compute the function $f$ on some string $x$, we can set $y = 0^m$, and the resulting value $f(x)$ will appear on the bottom $m$ qubits — or we can feed in a different state to the bottom qubits if we wish (perhaps to make use of a phase kickback, like in Deutsch's algorithm or the Deutsch-Jozsa algorithm). This means that for any query algorithm, if we have a Boolean circuit that computes the input function, we can replace each query gate with a unitary implementation as just described and the query algorithm will function correctly.
Note that the workspace qubits are needed to make this process work — but they are, in fact, returned to their initial states once the combined circuit is executed, so they could be used again as workspace qubits for other purposes. There are strategies that reduce the number of workspace qubits required, but we won't discuss those strategies here.
Implementing invertible functions
The construction just described allows us to simulate any Boolean circuit with a quantum circuit in a garbage-free manner, following this basic pattern: if $C$ is a Boolean circuit implementing a function $f : \Sigma^n \rightarrow \Sigma^m$, then we obtain a quantum circuit $Q$ that operates as the next equation describes on standard basis states.

$$Q \bigl( |y\rangle |x\rangle |0^k\rangle \bigr) = |y \oplus f(x)\rangle |x\rangle |0^k\rangle$$
The number $k$ indicates how many workspace qubits are required, and depends on how many gates $C$ has and which ones they are.
This will be enough for the purposes of this series, but it is possible to take this one step further when the function $f$ itself is invertible. To be precise, suppose that the function $f$ takes the form $f : \Sigma^n \rightarrow \Sigma^n$, and suppose furthermore that there exists a function $f^{-1}$ such that $f^{-1}(f(x)) = x$ for every $x \in \Sigma^n$. (The function $f^{-1}$ is necessarily unique when it exists.)
This means that the operation that transforms $|x\rangle$ into $|f(x)\rangle$ for every $x \in \Sigma^n$ is unitary — so we might hope to build a quantum circuit that implements the unitary operation $U$ defined by

$$U |x\rangle = |f(x)\rangle$$

for every $x \in \Sigma^n$. To be clear, the fact that this is a unitary operation relies on $f$ being invertible — it's not unitary when $f$ isn't invertible. Disregarding the workspace qubits, $U$ is different from the operation that the circuit $Q$ implements, because we're not keeping a copy of the input around and XORing $f(x)$ to an arbitrary string $y$; we're replacing $x$ by $f(x)$.
So the question is: when $f$ is invertible, can we do this? The answer is yes — provided that, in addition to having a Boolean circuit that computes $f$, we also have one that computes $f^{-1}$. (So this won't be a shortcut to inverting functions computationally if we don't already know how to do that.) The following diagram illustrates how it can be done by composing two quantum circuits, obtained individually for the functions $f$ and $f^{-1}$ through the method described above.