Learning Home Catalog Composer Lab
Learning
Home Catalog Composer Lab Return to course
Learning
Practical introduction to quantum-safe cryptography
Introduction to this course Cryptographic hash functions Symmetric key cryptography Asymmetric key cryptography Quantum-safe cryptography What's next? Works cited Exam

Asymmetric key cryptography

Introduction

In this lesson we will look at asymmetric key cryptography which forms the basis of many secure network interactions today.

By the end of the lesson we will have covered:

  • What asymmetric key cryptography is
  • Usage of asymmetric key cryptography, including key exchange and digital signatures
  • Security of asymmetric key cryptography in general
  • Further details on RSA, DSA & Elliptic Curves algorithms and security
  • Some Python code examples showing how the algorithms work in practice
  • Threats to these algorithms from both classical and quantum computers

Introduction to asymmetric key cryptography

As we learned in the last lesson, symmetric key cryptography is very fast and efficient for protecting information, but it has a few limitations:

  1. As the number of parties wishing to exchange secure information increases, the numbers of keys required grows combinatorially. It provides no mechanism to securely distribute these keys between senders and receivers.
  2. There is no provision for . Any party is able to decrypt, or encrypt, messages with no way to guarantee a message was received or where it originated

The solution to both these problems is provided by asymmetric key cryptography (AKC), also known as (PKC), which therefore forms a cornerstone of modern digital security.

Asymmetric key cryptography (AKC) involves the use of a pair of keys – one public, one private. The public and private keys are cryptographically linked and typically generated at the same time as a key pair using a specialized mathematical algorithm. The public key, as suggested by its name, is then meant to be freely distributed, while the private key is kept secret by the party generating the key pair. Security of communications employing asymmetric key pairs is assured as long as the private key remains confidential.

Figure 1. Asymmetric Key Encryption

Figure 1. Asymmetric Key Encryption

AKC offers several useful functions, such as:

  1. Encryption and decryption to enable confidentiality of communications.
  2. Digital signatures to ensure authenticity, integrity, and non-repudiation.
  3. Secure key exchange to facilitate subsequent use of symmetric cryptosystems.

In modern applications, AKC is primarily used for digital signatures and secure key exchange. In this lesson, we introduce these two key functions, and then we discuss several variations of cryptographic protocols for these functions.

Key exchange with asymmetric key cryptography

One of the fundamental problems in cryptography is securely . For example, if two parties want to use symmetric encryption, both parties need the same key to encrypt and decrypt messages. But how do they securely exchange the key? Asymmetric key cryptography addresses this through interactive and non-interactive key exchange mechanisms.

Interactive key exchange

An interactive key exchange protocol refers to a method where two parties collaborate to create a shared secret key over an insecure communication channel. This shared secret key can then be used for symmetric encryption and decryption tasks.

The most well known among such protocols is the (DH), which was devised specifically to facilitate key exchange. In this protocol, each party generates a pair of keys (public and private) and broadcasts their public key. Then each party uses their own private key and the other party's public key to generate a shared secret key. DH employs the principles of modular arithmetic to ensure both parties end up with the same shared secret even though each party has access only to the other party's public key.

Modern cryptosystems based on (ECC) extend this concept with the (ECDH) key exchange. ECDH works similarly to DH but utilizes the properties of elliptic curves, resulting in a more secure and efficient system.

Figure 2. Key Exchange Protocol

Figure 2. Key Exchange Protocol

Non-interactive key exchange

Unlike key exchange protocols like DH and ECDH, which are interactive, requiring back-and-forth communication to decide the symmetric key, AKC also provides for non-interactive ways to establish a shared secret key. In such schemes, one party generates a pair of keys (public and private) and shares the public key with the other party. This second party then generates a random symmetric key, encrypts it with the received public key, and sends it back to the first party. The first party uses their private key to decrypt the received message, thereby obtaining the shared symmetric key. This scheme is non-interactive in that the symmetric key is determined by one party and simply communicated securely to the other party in encrypted form.

An important consideration in non-interactive key exchange has to do with the length differential in bits between the symmetric key the parties wish to exchange and recommended message sizes in AKC. Typically, modern symmetric keys are in the range of 128-256 bits long, while asymmetric key cryptosystems such as RSA work with message sizes around 1024-4096 bits long. Therefore, when using AKC to transmit a symmetric key, for security it must nevertheless be encoded into a longer 1024-4096 bit message. This can be achieved via two approaches:

  • Padding-based key exchange: In this approach, the shorter (128-256 bit) symmetric key is generated first and then an agreed-upon reversible padding scheme such as is used to embed it into a longer (1024-4096 bit) message. This longer message is encrypted using AKC and broadcasted as . The recipient first decrypts the ciphertext and then removes the padding to extract the shorter symmetric key.
  • Key mechanisms (KEMs): With key exchange, a random, long (1024-4096 bit) plain text message is first generated, from which a shorter (128-256 bit) symmetric key can be extracted using an agreed-upon (KDF). The longer plain text is encrypted using AKC and broadcasted to the recipient as ciphertext. The recipient decodes the ciphertext using their private key and then uses the KDF to extract the shorter (128-256 bit) symmetric key. Popular cryptosystems such as , with their ability to directly encrypt data, can be used to implement KEMs.

Figure 3. Key Encapsulation Mechanism

Figure 3. Key Encapsulation Mechanism

Digital signatures with asymmetric key cryptography

are another powerful application of asymmetric key cryptography. They provide authentication, integrity, and nonrepudiation enabled by the fact that within AKC, entities possess unique private keys. The basic idea underlying signature protocols is that senders of secure messages will additionally digitally sign the messages using their unique private key. The receiver will then verify the digital signature using the public key of the sender. Within AKC, digital signatures can be implemented by using algorithms specifically designed for that purpose or by using generic cryptosystems.

Figure 4. Digital signatures with asymmetric key cryptography

Figure 4. Digital signatures with asymmetric key cryptography

Dedicated digital signature algorithms

Currently, the US Federal Information Processing Standard (FIPS) for digital signatures is a dedicated scheme simply titled the (DSA). Somewhat similarly to the Diffie-Hellman protocol, the DSA uses the algebraic properties of the and for signature generation and verification.

The (ECDSA) is an ECC variant of DSA, providing the same functionality but with significantly shorter keys. This results in improved efficiency, making it a popular choice for systems with .

Both DSA and ECDSA will be illustrated in more detail later.

Digital signature schemes using generic cryptosystems

In addition to dedicated algorithms, digital signatures can also be generated using generic asymmetric cryptosystems, such as .

RSA, which will be discussed in detail in a later section, also exploits modular multiplicative inverses and as fundamental operations but combines them in a different sequence than DSA. In RSA, the signer typically creates a hash of the message and then encrypts the hash with their private key, creating the digital signature. Any party can then verify this signature by decrypting it with the signer's public key and comparing it with the hashed message.

Applications of asymmetric key cryptography

is in modern digital technology applications. The basic functionalities of AKC described above form the building blocks of many higher-level application protocols, including:

  1. Internet communication: Secure communication over the internet, such as HTTPS, relies heavily on asymmetric key cryptography. The Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), use asymmetric key cryptography during the initial handshake process to establish a symmetric key, which is then used for the rest of the communication session.

  2. Authentication: Asymmetric key cryptography is used to create digital signatures, allowing an entity to authenticate a digital document or message as coming from a specific sender. This is used in many scenarios, from verifying software updates to legally binding digital contracts.

  3. Email encryption: Email encryption protocols such as PGP (Pretty Good Privacy) and its open-source alternative GPG (GNU Privacy Guard) use asymmetric key cryptography to ensure that only the intended recipient can read the email content.

  4. Secure shell (SSH): SSH is a protocol for secure remote login and other secure network services over an insecure network. It uses asymmetric key cryptography to authenticate the server to the client and, optionally, the client to the server.

  5. VPN (virtual private network): Asymmetric key encryption is used to establish secure connections in VPNs, ensuring secure communication over public networks.

  6. Blockchain and cryptocurrencies: Blockchain technologies, including Bitcoin and Ethereum, use asymmetric key cryptography. For example, the ownership of Bitcoin is established through digital signatures using asymmetric key cryptography.

  7. Certificate authorities: Asymmetric key cryptography is used by certificate authorities (CAs) to issue and sign digital certificates, which are used in TLS communication, code signing, email encryption, and more. A digital certificate binds a public key to a specific entity (for example, a person or server).

Figure 5. Issuing and signing digital certificates using asymmetric key cryptography

Figure 5. Issuing and signing digital certificates using asymmetric key cryptography

Security of asymmetric key cryptography

Several cryptographic notions come together to enable secure asymmetric key cryptography, including:

Private key secrecy: The most basic security requirement of AKC is that the private key remains secret. However, since the private key must be mathematically linked to the public key, private key secrecy also requires that it be computationally infeasible to derive the private key from knowledge of the public key. Key generation schemes within AKC rely on computationally hard mathematical problems to facilitate this requirement.

Trapdoor functionality In AKC, the encryption and decryption operations involve different complementary keys from a single-key pair. Ciphertext generated by encryption using one of the keys (for example the public key) must be inscrutable to third parties while being easily decryptable by the holder of the complementary key (in this case the private key). In other words, encryption should resemble a such that third parties cannot invert the operation and recover the plain text but the private key provides a secret trapdoor that enables easy inversion. Popular AKC algorithms use modular exponentiation to set up trapdoor one-way function behavior.

Randomness: The key generation process should also exploit randomness to ensure that keys are unpredictable, as any pattern or predictability in key generation could be exploited by an attacker. Randomness is also used for padding during encryption to generate semantically secure ciphertexts and within digital signature schemes to produce unique signatures even when the same message is signed multiple times. For this reason, the use of strong random number generators is an important part of AKC.

Large key size: As in the case of SKC, large key sizes ensure protection against brute force attacks. However, since large key sizes also increase the computational cost of the encryption and decryption process, an optimal solution needs to balance security and efficiency considerations. The following table shows typical key sizes for various asymmetric key cryptography protocols and applications:

ProtocolTypical Key Sizes (in bits)Application
RSA1024 (deprecated), 2048, 3072, 4096Encryption, digital signatures
DSA1024 (deprecated), 2048, 3072Digital signatures
DH2048, 3072, 4096Key exchange
ECDH224, 256, 384, 521Key exchange
ECDSA224, 256, 384, 521Digital signatures

Public key infrastructure: In AKC, the private keys must be kept secret by the owners while the public keys are shared. There needs to be a secure mechanism to manage and distribute these public keys between users. Public key infrastructure (PKI) provides a way to do this using digital certificates. A digital certificate provides proof of identity of the owner of the public key and is issued by a trusted authority like a certificate authority(which is part of a PKI). Hence, PKI plays an integral part in the security of modern application using AKC by enabling large-scale key management (by creating, managing, distributing and revoking digital certificates).

Security risks to asymmetric key cryptography

As outlined in the above table, modern asymmetric key algorithms such as RSA typically employ much larger key sizes than commonly used symmetric key counterparts such as AES-128. Even the ECC-based protocols (ECDH and ECDSA) that have smaller key sizes employ a minimum of at least 224 bits for keys. This in turn implies that a brute force attack involving an exhaustive search of the private key space to identify the correct key is computationally intractable in the foreseeable future. This remains true even if quantum computers were to be deployed for this task. Therefore, attacks against AKC usually focus on exploiting other potential weaknesses of specific cryptosystems. Some well-documented attack modes target:

  1. Algorithmic weakness by using sophisticated mathematical and computational means to undermine the hardness assumptions used to formulate asymmetric key algorithms. For example, the security of RSA is predicated on the difficulty of factoring large prime numbers, and recent computational advances have . Therefore, 1024-bit RSA is currently deprecated. As will be discussed later, the primary risk posed by quantum computers to AKC also falls into this category.

  2. Imperfect randomness, which can lead to weakness in the key generation process. For example, if the random number generator used to create the keys is flawed and generates keys that are not truly random, an attacker might be able to predict the keys.

  3. Implementation flaws such as errors in the implementation of cryptographic algorithms that inadvertently reveal information about the keys. For example, incorrect padding can potentially reveal details about keys.

  4. Side channels that refer to information leakage from the physical system that performs the cryptography. Such leaks could be in the form of timing information, power consumption, or even sound, which can be exploited in a so-called . For example, analyzing how long cryptographic operations take to execute could reveal the number of '1's in a binary key. This seemingly innocent leakage significantly narrows the search space, unveiling potential solutions to problems that initially seemed insurmountable.

  5. Key exchange by intercepting keys while they are being exchanged, such as within a (MITM). The DH protocol is susceptible to MITM attacks if additional authentication steps are not incorporated.

  6. Key storage by aiming to steal keys from poorly secured storage. This includes physical attacks such as manipulation or theft of storage devices.

Securing asymmetric key cryptosystems against the variety of attacks that are possible is therefore a significant task involving mathematical, hardware, software, logistical, and legal considerations.

RSA

The algorithm is one of the first public key cryptosystems and is widely used for secure data transmission. It is a versatile cryptosystem in that it provides the necessary operations to enable encryption, decryption, digital signatures, and key exchange within a common framework.

In this section, we will illustrate asymmetric key cryptography (AKC) using RSA through simple examples.

We will use the standard scenario of two parties, Alice and Bob, who wish to communicate securely using AKC.

The RSA algorithm

The basic RSA algorithm involves four operations: key generation, key distribution, encryption, and decryption:

  1. Key generation:

    Public and private keys are generated based on mathematical principles around , where calculating them is easy, but the reverse is hard.

    We'll refer to these:

    • Public key: (e,n)(e, n)
    • Private key: (d,n)(d, n)

    Note that nn is common to both public key and private key, and is known as the modulus. We will need to use this later.

    Mathematical detail
    • Choose two distinct prime numbers, pp and qq.
      • chosen at random (for security).
      • They should be similar in magnitude, but differ in length by a few digits, to make factoring harder.
      • Prime numbers can be efficiently chosen using a .
    • Compute n=pqn = p*q.
      • nn is the modulus for both the public and private keys.
    • Compute the (n)=(p1)(q1)(n) = (p-1)*(q-1).
      • The totient is meant to be kept secret and typically discarded after key generation.
    • Choose an integer ee such that 1<e<1 < e < (n)(n) and (e,(e, (n))=1(n)) = 1.
      • i.e. ee and (n)(n) should be .
      • This number ee forms the public key and is typically chosen as a small number for computational efficiency.
      • The prime number 65537=216+165537 = 2^{16} + 1 is often used.
      • Compute dd to satisfy the de1(d*e ≡ 1 ((n))(n)).
        • That is, dd is the of ee modulo (n)(n).
        • This is more efficiently computed using the .
        • This number dd is the private key .
    • The public key consists of (e,n)(e, n), and the private key is (d,n)(d, n).
  1. Key distribution:

    • The public key (e,n)(e, n) is made public to those who may wish to send a message
    • The private key (d,n)(d, n) is kept secret.
  2. Encryption:

    • Alice wishes to send a message MM to Bob. In this case a simple integer

    • Alice uses Bob's public key (e,n)(e, n) to encrypt the message into ciphertext CC

    Mathematical detail
    • MM is an 0M<n0 ≤ M < n.
    • CMe(modn)C ≡ M^e (mod n), where CC is the ciphertext.
  3. Decryption:

    • Bob receives the ciphertext CC
    • Bob uses his private key (d,n)(d, n) to decrypt the message back into message MM
    Mathematical detail
    • MCd(modn)M ≡ C^d (mod n).

This is the basic outline of RSA. In practice, are applied to the plain text MM before encryption to ensure that equal plain texts result in different ciphertexts. This prevents a range of possible attacks against naive implementations of RSA and enables .

Illustration of RSA in Python

In the following code cells, we illustrate a simple example of the RSA algorithm using small integers and then demonstrate practical key distribution and digital signature applications using Python libraries implementing RSA.

Note: In this section we will show the math calculations in detail as part of the Python code

RSA key generation

Let's step through a simple instance of the RSA algorithm employing small prime numbers.

We will need to be able to compute the of two integers, as it will be needed to test whether two integers are .

We will explain one simple way to calculate this, but it is much more efficient with larger integers to use the math.gcd python function.

Authenticate to run code cells
Reset Copy to clipboard

Output:

gcd(50,10) = 10
gcd(99,1033) = 33
gcd(59,9) = 1

The first phase of the RSA workflow is key generation. This is initiated by choosing two prime numbers, which are meant to be kept secret by the entity generating the keys.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The secret prime numbers p and q are: 13 19

Next, the nn, which is simply the product of the two chosen primes, is computed. This modulus will be published as part of the public key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

modulus n (p*q)= 247

The Euler function is computed next, as it is needed for the operation used to determine the keys in RSA. phiphi is also kept secret and typically discarded after key generation.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The secret Euler's function (totient) [phi(n)]: 216

We are now ready to calculate the public and private keys. In RSA, each of these is specified by a tuple of two integers. The first entry in each tuple is a distinct integer, and the second entry is the modulus nn that is common to both keys.

The first entry in the public key can be any integer greater than 1 that is coprime to phiphi. Two integers are coprime if their is 1. So we use the math.gcd function find an integer ee coprime to phiphi.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Public Key (e): 5

The private key requires an integer dd, which is the of ee modulo phiphi; that is, it satisfies the de1(modφ(n))d*e\equiv 1 \pmod{\varphi(n)}. For this simple illustration where we are dealing with small integers, we can just loop over the positive integers to locate a suitable dd. In realistic settings, the computationally efficient is used for this purpose.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Private Key (d): 173

We now form the tuples (e,n),(d,n)(e, n), (d, n), which form the public and private keys respectively. The public key is then published, while the private key is kept secret.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The Public key is (5, 247) and Private Key is (173, 247)

Encryption and decryption in RSA use the . Further, the public and private keys are complementary in that either can be used to encrypt a message that the other can then decrypt.

Here we illustrate the case where the public key (e,n)(e,n) is used for encryption and the private key (d,n)(d, n) is used for decryption by defining a Python function for each operation.

We then encrypt and decrypt an integer message msgmsg.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Original Message: 9
Encrypted Message: 16
Decrypted Message: 9

While the small integers used above are useful to easily outline the core ideas in the RSA algorithm, in real applications RSA requires the use of very large integers. For example, 2048-bit RSA involves the use of a nn that is 2048 bits long, the decimal integer equivalents of which are around 10616^616. These truly enormous numbers are necessary for the practical security of RSA.

Symmetric key exchange with RSA

As discussed previously, enables two parties who wish to communicate to securely establish a shared secret, which can be used, for instance, as the secret key for subsequent symmetric encryption of bulk plain text.

Let us consider the following scenario. Alice and Bob want to use to encrypt and decrypt messages. However, before this process can be initialized, they need to agree on a common secret key. One option is for one party — say, Alice — to generate a secret key and then securely transmit it to Bob. To achieve this secure transfer, Alice decides to use RSA as the (KEM).

This involves the following steps:

  • First, Alice generates a random symmetric key, which she intends to share with Bob.
  • Then, Bob generates an asymmetric key pair and makes his public key available on a suitable channel.
  • Next, Alice uses Bob's public key to encrypt the symmetric key, thus encapsulating it in a ciphertext.
  • Then, Alice broadcasts the ciphertext over a reliable but not necessarily secure channel.
  • Finally, Bob receives the ciphertext and decrypts it using his private key. He now has access to the symmetric key generated by Alice.

Figure 1. Symmetric key exchange with RSA

Figure 1. Symmetric key exchange with RSA

Padding-based key-exchange example in Python

A practical workflow utilizing RSA for padding-based non-interactive key exchange is now illustrated using the cryptography Python library.

Import necessary modules from the cryptography Python library. If needed, this library can be installed using the command pip install cryptography.

Alice then generates a random secret key, which she intends to transmit to Bob.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Requirement already satisfied: cryptography in /home/travis/virtualenv/python3.11.0/lib/python3.11/site-packages (42.0.0)
Requirement already satisfied: cffi>=1.12 in /home/travis/virtualenv/python3.11.0/lib/python3.11/site-packages (from cryptography) (1.16.0)
Requirement already satisfied: pycparser in /home/travis/virtualenv/python3.11.0/lib/python3.11/site-packages (from cffi>=1.12->cryptography) (2.21)


Symmetric key generated by Alice: b'n7499EbS36iUOehWm_XTgiT_RgRhKPe8tbD6QEHps7U='

Using the rsa module from the cryptography library, Bob generates a key pair and then broadcasts his public key. Any one can intercept the public key and read off the public numbers (e,n)(e,n) that form the key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Public key broadcast by Bob: <cryptography.hazmat.bindings._rust.openssl.rsa.RSAPublicKey object at 0x7fa7a4570290>

Public numbers in Bobs' public key: <RSAPublicNumbers(e=65537, n=28318586403474141406763564385551842567768357338303384331554692142040905569740933730177623451298221852890131171760569011411834419533062644264362237798713112012903562636899238690956445516763597222426451385820917153594982717835682879126504394700949486697366215656026171908759444098766648941648259553434147485142815729100352503479814708778130638335262520147401525661707978920617274148896293533924735012550867833159183955058383484476603763220505596309250511798152283696550616275594731528859541566726709785752736960726169358661677802248535004836852848635989762250159175787401257757712790765363612944909468318337956683930129)>

At this point, we assume Alice has received the public key broadcast by Bob. Alice's symmetric_key above can now be encrypted using Bob's public key to produce the ciphertext. In a realistic setting, Alice will also use additional methods such as to ensure for her communications with Bob.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Ciphertext: b'G*\xcfe\xd0\xe00\x8a\xa82\x05\xaa\xad\xb7\xecS\x14\x98\xa7H\xa6dlPl*\x88\xec\x99)\t%\xee\x8f\xcb\xf8\xba\x11\x97\x17j\xc3\x9e\xc7^\x86\x92p0\xfc\x1e\xb8M\x94\x99\xf4J\xf1\x8e\xda\xf2\x9d\x11<\xc5\x80\xe1+\x1b\xd9N\x1e0t9\xb65\x08Dg\xa0\xfd|/A\x80g\x85?\xa1\xfdDu\xf7\x02!n\xb5\x9d\xa15\x8dW\xbc\xa2\xd8E\x1b\x16\xf9\x8f<wK@#\x83N\xf9\x07\xd6t\xccT\x87\xdda\x86\xf5gw\x8e\xae\x9c\xc3\xa3\xb8\xb3Q\xe2!|E\x91\xe7z\xe33\xd7\x00\x88:^\x9b\x7fK\xad\xb5\xf8\x97\x0e\xec\xae\xbb\xbb8\xe8\xd5\xd74\x84\xb5\x19\xf1\x010 \x15\xcc6\x91K\x04\xa0\x9f\x81\xaa\x9aN\x19\x1b\xf0\xa5\x98B\xe7\x1d\x15\xf0\xa2\x0b\xa4\xb4\x9c\x83\xe2q\x95:\xe8\x95}\xc3\xcb\x9a\x1b\xef\xa8f\xa7y/\xb1\x18\x98\x8e\x0b\xb1gnw.\xb0\x00<\xe3\x9cXH)\xc3#_S~JA\x14o\\\xe7LK=>v'

Alice then broadcasts the ciphertext over an open channel, confident that only Bob with the corresponding private key will be able to decrypt it. We assume Bob has received the ciphertext and can now decrypt it using his confidential private key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Decrypted key: b'n7499EbS36iUOehWm_XTgiT_RgRhKPe8tbD6QEHps7U='

At this point, Alice and Bob both have access to the secret symmetric key, which they can use for applications.

Simulating a Key Encapsulation Mechanism with RSA in Python

In the following workflow, we illustrate the use of RSA to simulate a Key Encapsulation Mechanism (KEM) whereby a sufficiently long random secret message is securely exchanged and subsequently converted into a shared-secret of the appropriate length using a KDF.

Once again Alice and Bob want to establish a shared secret non-interactively and Alice is the party that decides which secret to use.

We start by importing some necessary Python libraries.

Bob then generates his RSA key pair and broadcasts his public key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Bob's private and public keys created

Alice first generates a long random secret from which the shared secret will be eventually derived. In a pure , the long secret will be a random element from the algebraic structure underlying the cryptosystem. In the case of 2048-bit RSA, this would be a random integer modulo the 2048-bit RSA modulus. As such a pure KEM does not require additional padding but in this example we are only simulating a KEM using RSA and the cryptography library requires the use of padding when encrypting with RSA. So we will use a somewhat shorter long secret which nevertheless is much longer than a standard 256-bit AES key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Alice's secret created

Next Alice encrypts the long secret using Bob's public key and the encrypted secret is communicated to Bob.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Alice's secret encrypted

Bob decrypt's the encrypted secret received from Alice using his private key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Secrets match

Finally both Alice and Bob separately apply an agreed upon (KDF) on the long secret to derive the symmetric key.

Note that the process involves a hashing protocol and the use of a random which ensures uniqueness and unpredictability of the shared symmetric key in case the long secret is ever reused (not recommended). However the salt itself does not have to be secret and once it is randomly generated, say by Alice in this example, it can be broadcasted to Bob alongside the encrypted long secret.

We will assume therefore that both Alice and Bob have access to the same random salt.

Authenticate to run code cells
Reset Copy to clipboard

Output:

A symmetric key of length 256 bits was successfully derived by both Alice and Bob!

Digital signatures with RSA

We will now extend the above confidential communication scenario with Alice and Bob to one that also includes validation with the help of a

As before, Alice will confidentially send a message encapsulating a symmetric key to Bob, but she will also digitally sign the message so that Bob, upon receiving the message, can verify that it was Alice who originated it and that the contents of the message were not tampered with during transmission.

More generally, it is desirable to enable validation without compromising confidentiality whereby any interested party is able to verify the , , and establish non-repudiation with respect to a given communication, even if that party does not have access to the actual plain text message.

We will consider this general setting which then involves the following steps:

  • First, both Bob and Alice make their public keys available over an open channel.
  • Then, Alice encrypts the plain text using Bob's public key, creating a ciphertext.
  • Next, Alice creates a hash of the ciphertext with a hash function and further encrypts the resulting hashed ciphertext using her private key. This encrypted hash is the signature.
  • Then, Alice then transmits both the ciphertext and the signature over an open channel.
  • Then, Bob uses Alice's public key to decrypt the signature, revealing the hashed ciphertext.
  • Next, since Bob also has access to the ciphertext itself, he uses the same hash function used by Alice to recreate a second instance of the hashed ciphertext. If the latter matches the one obtained by decrypting Alice's signature, then the message is validated, even though the ciphertext itself has not yet been decrypted.
  • Finally, Bob, having validated the message, decrypts the ciphertext using his own private key.

Figure 2. Digital signatures with RSA

Figure 2. Digital signatures with RSA

This workflow for a digital signature is illustrated next.

We once again import some useful modules from the cryptography library.

As before, Alice intends to securely send a symmetric key to Bob, but she also wishes to digitally sign it. In this case, we need public keys for both Alice and Bob. Therefore, the first step is for both Alice and Bob to create their own key pair using RSA and broadcast their own public key to the world.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Private and Public keys generated for Bob and Alice.

In the next step, as before, Alice uses Bob's public key to encrypt the symmetric key and prepares the ciphertext.

Authenticate to run code cells
Reset Copy to clipboard

Output:

ciphertext of symmetric key:  b'b$\x1cQ\r.\x93\xc5=\xde\x1cg\xe4X\x8f\xf7\xecic\x06F\n\x8ao\xc7X\x9c\x12\xef{9r8\xb5A\x95Ma6\x90\xf4\xc2\xf8\xcd\xe7%\xa8\xcd+\x18\xcb\x9cC7\xd9_s\x07\xb4\xd8\x11O\x99\xa1Q\xdd/\xc8p\xbb\xb5R\xdc\xe4n$?\x84\xcb\x8a\x16\xc1 \x84\\\x9c\xa9\xc7\xdbOu/\x16t\xe8C\xce\x94 \x7f\x92\x80\x85\xefn7\x0e_\xb6Y\xbb\xdcQD8\xee\xecG9N\xd5G\x97@\x89\x9e^.\x04~\x8bC!@G\x88\x0f\xba\x1ciY\xea\xd3M\x04h\xfd\xb0\x06\xc3\x8b]f!J\x84\xc9\xb8t_\xf1o\x9cB:\\Yu0\xa1\x8b\xd3C<\x02\xc8$\xfa\x96\xf7g\xae\xbb\xb2\xd2-\xdbTk\xb3R\xe1\x90\x00E\xdd}2\xb5S$Cx8\x15>\xc6mu]\x8e\xd8z\xa0\xab\x06+\xc9T=\x87\xcf{\x8e\x8e\xe3\x80\xb2\xc4\x15\xdb\xed\x16,\x86\x12\xf5\xe9upb\xe2#z\xe6pX3\xc4\x12$\x0f\xd2\xee\xf4\x11'

At this stage, instead of just broadcasting the ciphertext, Alice intends to attach a digital signature to it so that she can prove to Bob that she was the sender of the message. This is done in two steps:

  1. Create a hash of the ciphertext using a hashing algorithm.
  2. Encrypt the hash using Alice's private key, which amounts to a signature.
Authenticate to run code cells
Reset Copy to clipboard

Output:

signature:  b'\x940\xa1\x0302\xda*\xc6\xf4\xbbC\x1d\x85U\'\xceL\x92\x18\xdc\xce\x81NK\xf8\xab~\xc4\xd0\x96\x14\x04\x8dL\x7f\x1aHr*\xde\xfdEt\xe9&YZ:\xd7\xcde\x1a\x8d\xc4\x8a^\xfd\x9c\x89\x8d\x1c\xdem\xb2 U1\xc3\x1ev\xe3\xf7\xb7\xb8\xc7Ya\xc9\xa5\x9b\xbf,\xad|\xc0\xf7\x98\xc4\xec>h\xa5.\x8c\x81L\xed\x81\xf3<r$1\xcctd\xa7xbZi\x05\xc7\xb6\x92z \x03"\xb62j\x0eE\x06\x0f\xdeO\x84\xf1\xd0m\x89\x9f\xaa\x8d\xae\xee\xe44?\x98\xdf\xbb\x9a\xd5\xe2%FWW\xdb\xd9=\xac&/q\x8d\x9bY\xcf60\x94\xad\xdbX\xdfq\x85\x0f\xb7\x08\x10.\xdc^0\xf3PG\xa9\xcf\x88\xc6\x05\x11\x1b\x89\xb8\x962\n\xb1\x8aC\xde{\x83l\x94\'\xd8\x98vD\xc5\xb0\xc7\x93\x7f\x11\n\x1dp\x7f\x94\xb6f\xb0\xbf\x97-o\xc1tQ*k\x9fSe\xc5i_4p\rL\xed\x85j@\xb6\x03\xb6\xf2\x13Bl\xda\xff\x15*'

Alice then broadcasts the ciphertext and the signature over a network so that Bob is able to intercept both of them.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Sending ciphertext and signature.....

On Bob's side, the first task is to verify the and of the ciphertext. To do this, Bob creates a hash of the received ciphertext using the same hashing algorithm used by Alice.

Authenticate to run code cells
Reset Copy to clipboard

Output:

hash to verify:  b'/Iy\x91\xf8\xc7\x9d\xb0\xe0,1{\x99\xb7EM\x96\xcb\xf4\xf7\xaf<A\xdc3\x10\xf1\xb2;T\x84\xef'

Then, Bob decrypts the received signature using Alice's public key. Because Alice used her private key to create the signature, Bob is able to decrypt it using Alice's public key. The decrypted signature is nothing but a hash of the ciphertext created on Alice's end. If the hash created by Bob matches the decrypted signature, then Bob has verified that the ciphertext he received has not been tampered with and that it was Alice who signed the ciphertext.

In the Python code below, these operations are combined into a useful utility function called verify provided by an object associated with Alice's public key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The signature is valid.

Having verified the and of the received ciphertext, Bob can then decrypt it using his private key because Alice created the ciphertext using Bob's public key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Decrypted message: n7499EbS36iUOehWm_XTgiT_RgRhKPe8tbD6QEHps7U=

In the above digital signature scenario, any party — not just Bob — can verify that Alice is the sender of the ciphertext because everyone can access Alice's public key, the ciphertext, and the digital signature. Furthermore, after sending the ciphertext and the signature, Alice cannot later deny having done so because the signature can be decrypted to a meaningful hash using only her public key. This establishes .

By enabling secure key distribution and supporting non-repudiation, public key cryptography establishes a strong foundation for secure digital communication.

Breaking RSA

The utility and security of the algorithm outlined above rests on two mathematical assumptions:

  1. Finding the dd given that access only to (e,n)(e, n) is computationally infeasible.

  2. In the RSA setting, the modular exponentiation operation behaves like a . When used for encryption, it yields ciphertext that is inscrutable, and without access to the private key, inverting the operation to recover the plain text from the ciphertext is not feasible. However, with access to the private key, which acts as a trapdoor, the ciphertext can be easily decrypted.

The most prominent attack on the algorithm aims to undermine assumption 1 by efficiently recovering the private key number dd via factorizing the nn. As will be illustrated below, it is easy to recover dd if one has access to either the pp and qq of nn or the totient (n)(n). Recall that pp, qq, and (n)(n) are kept secret during key generation and discarded after. A third party that recovers this information using a classical or quantum computer essentially uncovers the private key, breaking RSA. Thus, prime factorization is the key computational primitive necessary for breaking RSA.

Classical computing and RSA

Prime factorization of large integers is known to exhibit or scaling on classical computers. The best-known classical algorithm for factoring integers larger than 1010010^{100} is the .

Mathematical detail

Ln[13,(649)(13)]=e[((649)(13)+o(1))(log(n))13(log(log(n))23)]L_n[\frac{1}{3}, (\frac{64}{9})^{(\frac{1}{3})}] = e^{[((\frac{64}{9})^{(\frac{1}{3})} + o(1)) (log(n))^{\frac{1}{3}}(log(log(n))^{\frac{2}{3}})]}

as a function of the integer nn to be factorized.

This scaling is super-polynomial in the number of bits required to represent nn.

Therefore, prime factorization is considered inefficient on classical computers.

At present, the are in the range of 829 bits or 250 decimal digits. Given the exponential growth in classical computing power that has been witnessed over the last several decades, 1024-bit RSA is no longer considered secure in the near term and is now deprecated. Nevertheless, in the foreseeable future, factoring 2048-bit integers whose magnitude are in the range of 1061710^{617} is at present considered infeasible on classical systems, suggesting continued viability. The advent of quantum computers, however, voids this assessment.

Shor's quantum algorithm and RSA

Probably the most well-known quantum algorithm today is for finding the prime factors of integers. When , it was recognized as the first quantum algorithm that offered a relative to classical algorithms on a problem of great practical importance, namely prime factorization.

Shor's algorithm can factor primes with O(n2)O(n^2) where nn is the number of bits.

Mathematical explanation of Shor's algorithm

In the context of , works by exploiting the of the f(x)=ax(mod n)f(x) = a^x (mod~n) and provides a quantum -finding primitive that enables efficient prime factorization of the nn.

A simplified high-level outline of Shor's overall scheme for breaking RSA is as follows:

  1. Given the modulus nn, which is published as part of the public key, choose a number aa coprime to nn i.e., gcd(a,n)=1gcd(a,n) = 1. Since we know that n=pqn = p*q has exactly two prime factors (p,q)(p, q), almost any number less than nn that we pick at random is likely to be coprime to nn.

  2. Having chosen aa, find the exponent rr such that ar1(mod n)a^r \equiv 1 (mod~n). This implies ar10(mod n)a^r - 1 \equiv 0 (mod~n). The existence of an rr such that the above holds is guaranteed by the of .

  3. If rr is even, ar10(mod n)    (ar/21)(ar/2+1)=γna^r - 1 \equiv 0 (mod~n) \implies (a^{r/2} - 1) (a^{r/2} + 1) = \gamma*n for some integer γ\gamma. The left-hand side of this latter equality must contain pp and qq as two of its prime factors since the right-hand side does. If rr is odd, go back to step 1 and try a different choice for aa.

  4. Use the for finding gcd((ar/21),n)gcd((a^{r/2} - 1), n) or gcd((ar/2+1),n)gcd((a^{r/2} + 1), n). The computed is very likely to identify one of the prime factors pp or qq. Divide nn with this prime factor to recover the other.

  5. Once p,qp, q are known, use the steps from the original RSA algorithm for reconstructing the totient ϕ(n)\phi(n) and generating the private key number dd as the of the known public key number ee.

In August 2023 Oded Regev on Shor's original using a multi-dimensional approach resulting in O(n1.5)O(n^{1.5}). There continues to be further research including by in this area which may improve time, cost, or number of qubits needed. Whilst we cannot say when running such algorithms against real-world RSA encryption truly becomes viable, it is becoming closer all the time.

Python example demonstrating breaking RSA encryption

In the following code cells, we illustrate an example of finding a private key given only the public key. This will use brute-force classical computation, but shows how Shor's algorithm could be used - including large keys.

Note: In this section we will show the math calculations in detail as part of the Python code

In the example, we have a public key (5,247)(5, 247), and we will recover the private key.

Step 1: The first step is to pick a number coprime to 247. Almost any number we guess will do the job. Let's pick 6.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Checked 247 and 6 are coprime

Step 2: Next we need to find the rr such that 6r1(mod 247)6^r \equiv 1 (mod~247). In this example, we compute rr classically using brute force, but we could also use Shor's algorithm on a quantum computer using .

Authenticate to run code cells
Reset Copy to clipboard

Output:

period r is: 36
Checked 6^36 mod 247 is 1

Step 3: Since the r=36r = 36 is even, we can compute f1=(ar/21),f2=(ar/2+1)f1 = (a^{r/2}-1), f2=(a^{r/2}+1).

Authenticate to run code cells
Reset Copy to clipboard

Output:

f1 = 101559956668415
f2 = 101559956668417

Step 4: Find the of either of those factors with nn. Simply divide nn with the prime factor already found to obtain the second prime factor.

Authenticate to run code cells
Reset Copy to clipboard

Output:

One possible prime factor of n (247) is: 19
The second prime factor of n (247) is: 13

Step 5: Having recovered the prime factors of n=247n = 247 as pfound=13p_{found}=13 and qfound=19q_{found}=19, we compute the totient ϕfound=(pfound1)(qfound1)\phi_{found} = (p_{found}-1)*(q_{found}-1).

The private key is the of the public key number e=5e=5.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The totient is: 216
Private Key number: 173

In the above scheme, step 2 is the crucial -finding operation for which uses two fundamental quantum primitives, namely the and . For a detailed explanation of the quantum aspects of Shor's algorithm, see the Phase estimation and factoring lesson in the Fundamentals of Quantum Algorithms. Steps 1 and 3 through 5 involve relatively inexpensive operations that can be easily carried out on classical computers.

Optionally, here's a detailed visual walkthrough of implementing Shor's algorithm.

On quantum computers, Shor's algorithm can exhibit as favorable as O((log n)2(log log n))O((log~n)^2 (log~log~n)) in terms of the modulus nn, or in terms of the number of bits needed to represent nn. This is a compared to the classical .

indicate that based on certain assumptions made regarding the hardware configuration, a few tens of thousands to several million will be necessary to break 2048-bit RSA using Shor's algorithm. It is not inconceivable that quantum computers with several tens of thousands of qubits will become available over the next several years, making the lower end of the resource estimate accessible.

Diffie-Hellman key exchange and the Digital Signature Algorithm

In the previous section, we discussed the RSA cryptosystem, whose security is based on the computational hardness of prime factorization. Here, we will discuss two popular asymmetric key cryptographic protocols, (DH) and the (DSA), both of which are based on a different mathematical problem, namely the discrete logarithm problem (DLP).

The discrete logarithm problem

In the following equation we need to find aa given only ee,MM,cc

aea^e modmod M=cM = c

This is believed to be difficult with classic computers due to the use of modulo arithmetic, and therefore is a good mathematical basis for an encryption algorithm.

This is known as the (DLP).

Mathematical details of the discrete logarithm problem

The DLP is typically framed in the context of and is stated as follows.

Consider a GG generated by a group element gGg \in G and given an arbitrary element hGh \in G, find an integer kk such that h=gkh = g^{k}.

Here the integer klogghk \equiv log_{g}h is the discrete logarithm. The cyclic property of GG guarantees that for every hh, a valid integer kk exists.

For cryptography, the DLP on the of integers modulo a prime number pp denoted (Zp)×(\mathbb{Z}_p)^{\times} turns out to be useful. The elements of (Zp)×(\mathbb{Z}_p)^{\times} are congruence classes labeled by integers modulo pp that are coprime to pp.

For example:

(Z5)×={[1],[2],[3],[4]} and (Z7)×={[1],[2],[3],[4],[5],[6]}(\mathbb{Z}_5)^{\times} = \{[1],[2],[3],[4]\}~\mathrm{and}~(\mathbb{Z}_7)^{\times} = \{[1],[2],[3],[4],[5],[6]\}

The operation of multiplication (×)(\times) on these groups is simply ordinary integer multiplication followed by reduction modulo pp and exponentiation by an integer kk is just repeated multiplication kk times and reduction modulo pp.

Let's illustrate an instance of the DLP on (Z7)×(\mathbb{Z}_7)^{\times}.

This multiplicative group has two generators [3],[5]{[3],[5]} also known as primitive roots. We will use [5][5] as the generator; that is, generate every element of (Z7)×(\mathbb{Z}_7)^{\times} using successive integer powers of 5.

Authenticate to run code cells
Reset Copy to clipboard

No output produced

Authenticate to run code cells
Reset Copy to clipboard

Output:

Using generator 5
5**0 mod 7 ≡ 1
5**1 mod 7 ≡ 5
5**2 mod 7 ≡ 4
5**3 mod 7 ≡ 6
5**4 mod 7 ≡ 2
5**5 mod 7 ≡ 3
5**6 mod 7 ≡ 1
5**7 mod 7 ≡ 5
5**8 mod 7 ≡ 4
5**9 mod 7 ≡ 6
5**10 mod 7 ≡ 2
5**11 mod 7 ≡ 3
5**12 mod 7 ≡ 1
5**13 mod 7 ≡ 5
5**14 mod 7 ≡ 4
5**15 mod 7 ≡ 6
5**16 mod 7 ≡ 2
5**17 mod 7 ≡ 3
5**18 mod 7 ≡ 1
5**19 mod 7 ≡ 5
5**20 mod 7 ≡ 4

We see that in modulo 7 arithmetic, raising 5 to successive integer powers yields every element of (Z7)×(\mathbb{Z}_7)^{\times} exactly once before the cycle repeats indefinitely with a period p1=6p-1 =6.

So the DLP on (Z7)×(\mathbb{Z}_7)^{\times} with generator 5 is:

Given h(Z7)×,find k such that 5kh (mod 7) \mathrm{Given}~h \in (\mathbb{Z}_7)^{\times} \mathrm{, find~} k \mathrm{~such~that~} 5^{k} \equiv h~(mod~7).

From the above Python cell output we see that:

h=2    k=4 or 4=log5(2)(mod 7)h = 2 \implies k=4 \mathrm{~or~} 4 = log_5(2) (mod~7)

h=6    k=3 or 3=log5(6)(mod 7)h = 6 \implies k=3 \mathrm{~or~} 3 = log_5(6) (mod~7)

In ordinary real number arithmetic, exponentiation is a , function and finding the logarithm of any number to any base is computationally easy. In contrast, as is apparent from the simple (Z7)×(\mathbb{Z}_7)^{\times} example above, modular exponentiation is non-monotonic, and even though it is periodic with period p1p-1, it is otherwise highly nontrivial. So computing its inverse, the discrete logarithm turns out to be inefficient for large pp on classical computers.

This observation underpins both Diffie-Hellman (DH) key exchange and the Digital Signature Algorithm (DSA), which are discussed in the next section.

The DLP can be extended to cyclic subgroups as follows:

  • Consider (Zp)×(\mathbb{Z}_p)^{\times} defined above and an element g(Zp)×g \in (\mathbb{Z}_p)^{\times} of prime order rr i.e., gr1( mod p)g^r \equiv 1 (~mod~p).
  • The set of integer powers of gg: {gk (mod p)1kr}=g\{g^k~(mod~p) | 1 \le k \le r\} = \langle g \rangle is a cyclic subgroup of (Zp)×(\mathbb{Z}_p)^{\times} with group order rr.
  • A DLP can be specified on g\langle g \rangle by choosing a hgh \in \langle g \rangle and asking for 1ar1 \le a \le r such that ga (mod p)=h g^a~(mod~p) = h

Diffie-Hellman key exchange

In 1976, to enable the creation of a shared secret key over insecure communication channels. The secret key could then be used by parties sharing it for symmetric encryption. The algorithm relies on the DLP.

Figure 1. Diffie-Hellman key exchange

Figure 1. Diffie-Hellman key exchange

Mathematical details of Diffie-Hellman key exchange With Alice and Bob being the two parties communicating, the protocol works as follows:
  • First, Alice and Bob agree on a large prime number pp and a primitive root or generator aa.
  • Then, Alice chooses a secret exponent kAk_A randomly with 1kAp21 \le k_A \le p-2 and calculates hA=akA (mod p)h_A = a^{k_A}~(mod~p). kA,hAk_A, h_A are Alice's private and public keys respectively.
  • Next, Bob chooses a secret exponent kBk_B randomly with 1kBp21 \le k_B \le p-2 and calculates hB=akB (mod p)h_B = a^{k_B}~(mod~p). kB,hBk_B, h_B are Bob's private and public keys respectively.
  • Then, Alice sends Bob hAh_A and Bob sends Alice hBh_B over a reliable but not necessarily secure channel.
  • Then, Alice then uses the hBh_B she received to compute the shared secret key κ=hBkA (mod p) \kappa = h_B^{k_A}~(mod~p).
  • Finally, Bob meanwhile uses the hAh_A he received to compute the shared secret key κ=hAkB (mod p) \kappa = h_A^{k_B}~(mod~p).

With this protocol,

  • Alice and Bob are guaranteed to end up with the same secret key κ\kappa because hBkA (mod p)=(akB)kA (mod p)=akAkB (mod p)=(akA)kB (mod p)=hAkB (mod p)h_B^{k_A}~(mod~p) = (a^{k_B})^{k_A}~(mod~p) = a^{k_A k_B}~(mod~p) = (a^{k_A})^{k_B}~(mod~p) = h_A^{k_B}~(mod~p) .
  • A third party that intercepts hAh_A or hBh_B cannot construct the secret key κ\kappa because they do not have access to kBk_B or kAk_A respectively.
  • Extracting kAk_A or kBk_B using the public information aa, pp, hAh_A and hBh_B is computationally hard as it is equivalent to solving the DLP on (Zp)×(\mathbb{Z}_p)^{\times}.

Illustration of the Diffie-Hellman protocol in Python

Let's look at a simple example of the DH protocol in Python using small prime numbers:

Note: In this section we will show the math calculations in detail as part of the Python code

Step 1: Alice and Bob agree on a prime pp and a primitive root aa. Let's choose p=11,a=7p=11, a=7.

Authenticate to run code cells
Reset Copy to clipboard

Output:

prime: 11
primitive root: 7

Steps 2, 3: Alice chooses a secret exponent kAk_A and calculates hA=akA (mod p)h_A = a^{k_A}~(mod~p). Similarly, Bob chooses a secret exponent kBk_B and calculates hB=akB (mod p)h_B = a^{k_B}~(mod~p).

Authenticate to run code cells
Reset Copy to clipboard

Output:

Alice's private key is 4 and public key is 3
Bob's private key is 8 and public key is 9

Step 4: The two parties broadcast their public keys hAh_A and hBh_B.

Steps 5, 6: Each party combines their private key with the other party's public key to create the shared secret key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The shared secret key is: 5

Alice and Bob can now use the shared secret key for symmetric encryption.

Security of Diffie-Hellman key exchange

As noted above, the security of DH is predicated on the computational difficulty of solving the DLP with large primes pp. In typical applications, NIST recommends 2048- or 3072-bit prime integers for DH key exchange, which is considered sufficiently secure against attempts to solve the DLP using classical computers.

Man-in-the-middle (MITM) attacks: The fact that DH is an interactive scheme where the shared secret depends on combining one party's private key with the other party's public key makes it vulnerable to a so-called Man-in-the-middle (MITM) attack.

Mathematical details of DH security and MITM attacks

In this scenario, a third party — say, Eve — intercepts the public keys hA,hBh_A, h_B during transmission and substitutes her own public key hEh_E for each of hAh_A and hBh_B before forwarding them along to Bob and Alice, respectively.

Then, instead of using hBh_B to create her shared secret, Alice will use hEh_E while thinking that she is using Bob's public key. Similarly, instead of using hAh_A to create his shared secret, Bob will use hEh_E while thinking that he is using Alice's public key.

Because hEh_E was used to create Alice's (Bob's) shared secret, plain text encrypted by Alice (Bob) can be decrypted by Eve.

Thus, DH key exchange is typically used in conjunction with a digital signature algorithm to ensure that each party uses an authenticated public key for creating their shared secret.

The Digital Signature Algorithm (DSA)

Even though generic cryptosystems like provide digital signature functionality, in 1994 NIST adopted a specialized signature scheme based on modular exponentiation and the discrete logarithm problem as the federal standard for digital signatures. This scheme, which came to be known simply as the (DSA), involves four distinct phases:

  1. Key generation:

    DSA keys are generated from:

    • 2 primes that meet certain rules
      • pp - typically 256 bits (we'll call this length NN)
      • qq - typically 3072 bits (we'll call this length LL)
    • A cryptographic hash function that will convert from strings of length LL to NN
    • An additional Parameters gg (see details below)

    From this we choose a random number xx as private key, and are able to calculate a public key yy

    Mathematical details of key generation
    • Parameter generation: Mathematically, DSA involves a cyclic subgroup of (Zp)×(\mathbb{Z}_p)^{\times} generated by an element gg of prime order q < p. The first step in the DSA is therefore to select two primes p, q to set up the necessary mathematical structures.

      • Choose an NN-bit prime qq.
      • Choose an LL-bit prime pp such that p1p-1 is a multiple of qq. The choice of pp specifies the group (Zp)×(\mathbb{Z}_p)^{\times}
      • Pick an integer 1<h<p11 < h < p-1 at random and compute g=h(p1)/q mod pg = h^{(p-1)/q}~mod~p. If g=1g=1 which happens rarely, try another h.
      • Verify that gq mod p=1g^q~mod~p = 1. g is thus a generator of a cyclic subgroup g\langle g \rangle of (Zp)×(\mathbb{Z}_p)^{\times} of prime order q.

      The parameters (q,p,g)(q, p, g) specify an instance of the algorithm and are public information. Typically, N256 N \sim 256 and L3072L \sim 3072 in realistic applications.

      The protocol also requires a cryptographic hash function H:{0,1}L{0,1}NH : \{0,1\}^L \rightarrow \{0, 1\}^N that maps binary LL-bit strings to NN-bit strings, for example, SHA-256.

    • Key-pair generation: The signer needs to generate a key pair on their end.

      • Choose a random secret integer x{1,2...q1} x \in \{1,2...q-1\}. xx is the private key.
      • Generate y=gx mod p y = g^{x}~mod~p. yy is the signer's public key.
  2. Key distribution:

    The following information is shared with anyone who wishes to validate a signature

    • The parameters used (p,q,g)(p,q,g) which describe the algorithm
    • The hash function HH
    • the public key yy
  3. Signing:

    • The signer can now sign a message mm. The resulting signature is (r,s)(r,s)
    • The message and signature are both now sent to the recipient
    Mathematical details of signing a message

    The signer signs a message mm as follows:

    • Choose a secret integer k at random from {1,2...q1}\{1,2...q-1\}
    • Compute r=(gk mod p) mod qr = (g^k~mod~p)~mod~q. In the rare instance when r=0r=0, try another random k.
    • Compute s=(k1(H(m)+xr)) mod qs = (k^{-1} (H(m) + xr))~mod~q. In a rare instances if s=0s=0, try another random k.
    • The signature is the tuple (r,s)(r, s).
    • The signer transmits the message mm as well as the signature (r,s)(r,s) to the verifier.

    Because both r,sr, s are integers, modulo qq the signature size is on the order of NN-bits and not the longer LL-bits.

  4. Verification:

    The recipient now wishes to verify the authenticity of the sender. They have access to the public information (q,p,g,H,y,(r,s),m)(q, p, g, H, y, (r,s), m) They can execute a calculation which proves that the sender had access to the private key xx

    and seeks to ascertain that the signer is someone with access to the private key xx.

    Mathematical details of DSA verification scheme
    • Verify that 0<r<q0 \lt r \lt q and 0<s<q0 \lt s \lt q, i.e., r,sr, s are valid modulo~q integers.
    • Compute w=s1 mod qw = s^{-1}~mod~q.
    • Compute u1=H(m)w mod qu_1 = H(m)w~mod~q.
    • Compute u2=rw mod qu_2 = rw~mod~q.
    • Compute v=(gu1yu2 mod p) mod qv = (g^{u_1}y^{u_2}~mod~p)~mod~q.
    • The signature is verified if v=rv = r.

    For legitimate signatures, the correctness of the DSA algorithm is easily demonstrated as follows:

    • On the signer's end: s=(k1(H(m)+xr)) mod q    k=((H(m)+xr)s1) mod qs = (k^{-1} (H(m) + xr))~mod~q \implies k = ((H(m) + xr)s^{-1})~mod~q
    • Considering the right-hand side of the latter equality, a verifier can compute s1,H(m)s^{-1}, H(m) because s,q,m,Hs, q, m, H are public information.
    • Thus, the verifier computes w=s1 mod qw = s^{-1}~mod~q and u1=H(m)w mod q=H(m)s1 mod qu_1 = H(m)w~mod~q = H(m)s^{-1}~mod~q.
    • The verifier also computes u2=rw mod q=rs1 mod qu_2 = rw~mod~q = rs^{-1}~mod~q as rr is also public.
    • Note that k=(u1+xu2) mod qk = (u_1 + xu_2)~mod~q.
    • However, a verifier does not have access to xx as it is the signer's private key, so kk itself cannot be directly computed. - The scheme instead relies on the fact that even as the verifier cannot compute kk, with a legitimate signer, the verifier should instead be able to re-compute (gk mod p) mod q=r(g^k~mod~p)~mod~q = r with the help of the signer's public key y=gx mod py = g^{x}~mod~p.
    • So, the verifier computes v=(gu1yu2 mod p) mod q=(gu1gxu2 mod p)mod q=(gu1+xu2 mod p)mod q=(gk mod p)mod qv = (g^{u_1}y^{u_2}~mod~p)~mod~q = (g^{u_1}g^{xu_2}~mod~p)mod~q = (g^{u_1+xu_2}~mod~p)mod~q = (g^k~mod~p)mod~q.
    • The last equality follows because gg has order qq and establishes v=rv = r for valid signatures.

Illustration of the DSA in Python

In this example, we'll use small primes to illustrate the DSA process in a scenario where Alice sends a signed message to Bob. We'll use small integers in this example. A real-world scenario would employ much larger primes on the order of 2048-3072 bits.

You can try rerunning this example with different values to see how the algorithm behaves, but ensure you start running from the first cell in this section.

We start by importing necessary libraries and choosing our parameters.

The integer parameters pp, qq, gg are public information, and satisfy the following rules:

  • pp, qq are both prime
  • (p1)mod q=0(p-1) \mod\ q = 0
  • g2g \ge 2
  • g(p2)g \le (p-2)
  • g(p1)/qmod p1g^{(p-1)/q} \mod\ p \neq 1
Authenticate to run code cells
Reset Copy to clipboard

Output:

Public information is good: q=23, p=11, g=4

Next the signer, Alice, generates her private key.

The private key, k (alice-private-key in the python code) must satisfy:

  • k2k \ge 2
  • k(q1)k \le (q-1)
Authenticate to run code cells
Reset Copy to clipboard

Output:

Alice's private key is 7

Alice keeps her private key secret.

Next, Alice computes her public key using modular exponentiation. Inverting this step to recover the private key is an instance of the DLP, so difficult to compute on classical computers where large primes are used.

From the math explanation above, we know is this can be calculated using the formula:

y=gxmod py = g^{x} \mod\ p

Authenticate to run code cells
Reset Copy to clipboard

Output:

Alice's public key is 8

As usual, Alice makes her public key available over a not-necessarily secret channel.

Now Alice can sign a message.

For the digital signature scheme, we need to generate a hash H(m)H(m) of the message mm to be signed.

Let's assume Alice and Bob agree to use a particular hashing algorithm with hash length NN equal to the number of bits in qq. In this simple example, we will bound the outputs of our mock hash function by qq. The hash function here is trivial, simply generating a random integer.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Inspection tomorrow!
Alice's message hash is: 1

Next, Alice needs to generate a random per-message secret number kk as well as its multiplicative inverse modulo qq to compute the signature.

A simple brute-force algorithm can be used to compute the modular inverse. However it's better to use one of the built-in Python function pow as shown below

Authenticate to run code cells
Reset Copy to clipboard

Output:

modular_inverse(3,7) = 5
modular_inverse(4,11) = 3
modular_inverse(7,5) = 3
error! no inverse found! for 2,6
Exception from pow() - no modular inverse found!

Alice can now compute the signature.

  • r=(gkmodp)mod qr = (g^{k} \mod p) \mod\ q
  • s=(k1(H(m)+xr))modqs = (k^{-1} (H(m) + xr)) \mod q

Note that there are some particular conditions that will require us to choose a different value for kk in the case where either rr or ss compute to 00. In this case we will generate a new value and repeat.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Using random k = 1
Alice's signature is : (4, 7)

Alice broadcasts the message and her signature to the receiver or verifier, Bob.

Bob has previously intercepted Alice's public key and can now verify the signature to authenticate her message.

To do so, he performs some checks, and then re-generates a hash of Alice's broadcast message and computes the auxiliary quantities w,u1,u2w, u_1, u_2 and vv.

  • 0<r<q0 < r < q
  • 0<s<q0 < s < q
  • w=s1mod qw = s^{-1} \mod\ q
  • u1=H(m).wmod qu_{1} = H(m) . w \mod\ q
  • u2=r.wmod qu_{2} = r . w \mod\ q
  • v=(gu1yu2mod p)mod qv = (g^{u1}y^{u2} \mod\ p) \mod\ q

Finally, Bob checks if vv equals rr. If they are equal, then the signature is verified.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Inspection tomorrow!
Signature is valid!

Security of the DSA

With classical computing, the security of the DSA can be influenced by several factors:

  1. Key size: As always, key size provides basic protection against brute force attacks. Modern applications that use DSA use key sizes of at least 2048 bits.

  2. Quality of kk: In DSA, each signature needs a unique, random, and secret kk value (see above). The uniqueness, entropy, and secrecy of kk are critical, and weakness in any of these aspects could lead to the private key xx being compromised. The random number generators used for generating kk need to be secure themselves.

  3. Hash function strength: Since DSA uses a hash function as part of the signature generation and verification process, a weak hash function could compromise the security of the digital signature. For example, the use of SHA-1 with DSA is deprecated and SHA-2 or higher is recommended.

In addition to these, robust DSA implementation should also guard against other types of attacks that target asymmetric key cryptography as previously outlined.

Authenticated key exchange with Diffie-Hellman and DSA

Modern network security protocols, such as and many others, involve combining the functionalities of key exchange and authentication. In the context of Diffie-Hellman, authentication is needed to guard against MITM attacks. In the following code cells, we illustrate an example where Alice and Bob perform an authenticated key exchange by combining the Diffie-Hellman and DSA protocols. For this, we will use the cryptography Python library.

Figure 2. Authenticated key exchange with Diffie-Hellman and DSA

Figure 2. Authenticated key exchange with Diffie-Hellman and DSA

First, Alice and Bob agree on a set of DH parameters and generate a set of DH public-private key pairs.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Public and private keys generated for Bob and Alice

The DH public keys are broadcast. Next, both Alice and Bob each generate a separate key pair for use with DSA. These keys will in turn be used to sign the DH public keys to be exchanged.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Additional key pair generated for signing

Alice signs her DH public key using her DSA private key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Alice signed public key

Similarly, Bob signs his DH public key using his DSA private key.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Bob signed public key

The DH public keys and corresponding signatures are now broadcast by both Alice and Bob. Having received their counterparty's public key and signature, each party then verifies the signature is valid. This way, a MITM attack can be prevented, as Alice, for instance, knows that Bob's DH public key was indeed signed by Bob and vice versa.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Signatures are valid

Following signature verification, Alice and Bob generate the shared secret, which completes the key exchange.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Shared secrets generated

Optionally, for additional security, Alice and Bob can use a specialized key derivation function such as HKDF to generate a more secure symmetric key from their shared secret using key stretching techniques.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Keys checked to be the same

Breaking Diffie-Hellman and DSA

Both the (DH) and protocols involve the generation of public keys of the form y=gx mod p y = g^{x}~mod~p, where the private key xx is the discrete logarithm.

An attacker who can solve an instance of the DLP can expose the private key of one of the two parties and, by combining it with the public key of the second party, get access to the shared secret. In the case of DSA, an attacker who can solve the DLP can recover the signer's private key, voiding the basic premise of a digital signature.

On classical computing systems, the DLP is believed not to have a polynomial-time solution. But as shown by Peter Shor in his , Shor's algorithm also solves the DLP in polynomial-time on quantum computers.

Shor's algorithm and the discrete logarithm problem

Shor's algorithm is able to solve the discrete logarithm problem in polynomial time. It primarily achieves its efficiency by using quantum algorithms which can find the period of a function which depends on the input to the problem - which is then combined with more conventional operations.

Mathematical details of Shor's algorithm

Shor's algorithm provides an efficient solution to the DLP by mapping it to a more general problem in group theory known as the (HSP).

In the HSP, one is given an algebraic group GG and a function f:GXf: G \rightarrow X from GG to some set XX such that ff is constant on each coset of some subgroup HH of GG (and distinct on different cosets). Then, the task is to determine HH. It is known that quantum computers can solve the HSP on finite Abelian groups in time polynomial in log Glog~|G| where G|G| is the group order.

In the case of the integer DLP discussed in this lesson, the mapping to the HSP is as follows:

  • Let pp be a prime and consider the DLP given by y=gr mod p y = g^r~mod~p where gg is a generator of (Zp)×(\mathbb{Z}_p)^{\times}. Since gp11 mod pg^{p-1} \equiv 1~mod~p, gg has order p1p-1.
  • Choose G=Zp1×Zp1G = \mathbb{Z}_{p-1} \times \mathbb{Z}_{p-1} where Zp1\mathbb{Z}_{p-1} is the group of integers modulo (p1)(p-1).
  • Choose f:G(Zp)×f : G \rightarrow (\mathbb{Z}_p)^{\times} given as f(a,b)=gayb mod pgarb mod pf(a, b) = g^a y^{-b}~mod~p \equiv g^{a-rb}~mod~p .
  • The kernel of ff is then the subgroup H0=(r,1)={(a,b)Gf(a,b)1 mod p}={(a,b)Garb0 mod (p1)}H_0 = \langle (r, 1) \rangle = \{(a,b) \in G | f(a,b) \equiv 1~mod~p\} = \{ (a, b) \in G | a - rb \equiv 0~mod~(p-1)\}.
  • ff is constant on the cosets (δ,0)+H0(\delta, 0) + H_0 and "hides" the subgroup H0H_0 setting up an HSP.

Given the above, Shor's quantum algorithm for the integer DLP uses a quantum oracle to evaluate the function ff on a state representing a uniform superposition over GG, and then uses the and measurements with to produce quantum states that allow for the identification of the generator (r,1)(r, 1) of H0H_0. From this, rr, the discrete logarithm of interest, can be extracted.

has a detailed description of the algorithm.

Elliptic curve cryptography

, based on the mathematics of , offers a powerful approach to asymmetric key cryptography. ECC is known to provide a similar level of security as methods such as RSA but with shorter keys, making it more efficient and particularly well suited to systems with limited resources, such as embedded systems and mobile devices, where the .

Basic principles of elliptic curve cryptography

An elliptic curve is typically of the form y2=x3+ax+by^2 = x^3 + ax + b with a few interesting properties.

  • It is horizontally symmetric. So for any point (x,y)(x,y) on the curve, it's reflection (x,y)(x,-y) is also on the curve
  • Any non-vertical strain line will only intersect the curve at a maximum of three points

Figure 1. Operations of addition and point doubling on an elliptic curve. Facet 1 defines <b>P + Q = -R</b>. Facet 2 illustrates point doubling <b>2Q = -P</b>. Facet 3 defines the additive inverse of a point as its reflection about the x-axis i.e, <b>P = -Q</b>

Figure 1. Operations of addition and point doubling on an elliptic curve. Facet 1 defines P + Q = -R. Facet 2 illustrates point doubling 2Q = -P. Facet 3 defines the additive inverse of a point as its reflection about the x-axis i.e, P = -Q

Elliptic curve cryptography makes use of applying a series of arithmetic operations to points on the curve. Each effectively navigates to a new point on the curve. This is a simple process to follow in one direction, and with shorter key sizes is more efficient than other algorithms like RSA, but trying to reverse this is very difficult, hence its application to cryptography.

Mathematical principles of elliptic curve cryptography

An over an algebraic field KK is defined by a mathematical equation, typically in the form y2=x3+ax+by^2 = x^3 + ax + b with the coefficients a,bKa, b \in K and describes points (x,y)KK(x, y) \in K \otimes K with the additional requirement that the curve have no kinks or self-intersections.

The properties of elliptic curves allow for operations of "addition" and "scalar multiplication" to be defined involving points on the curve.

Addition: If PP and QQ are two points on an elliptic curve, then P+QP + Q describes a unique third point identified as follows: Draw the line that intersects PP and QQ and find the third point, RR, at which the line intersects the curve again. We then define P+Q=RP + Q = −R, the point opposite RR reflected through the xx-axis (see figure below) . When the line through P,QP, Q does not intersect the curve at any finite (x,y)(x, y), we say it intersects the curve at the point at infinity O\mathbf{O}.

Elliptic curve addition satisfies algebraic group properties with the point at infinity as the additive identity.

Doubling and scalar multiplication: The operation of point doubling which corresponds to scalar multiplication by 22 is obtained by setting P=QP = Q in the addition operation and graphically involves the tangent line at PP. General scalar multiplication by an integer nn defined as nP=P+P+... nnP = P + P + ...~n times is obtained by repeated doubling and addition.

Elliptic curve discrete logarithm problem

The elliptic curve discrete logarithm problem (ECDLP) has many similarities to the discrete logarithm problem discussed previously, being based around the challenges of finding factors.

The operation of allows for the definition of the :

Given points P,QP, Q on an elliptic curve such that Q=nPQ = nP, determine nn.

This problem is known to be intractable on classical computers for large nn and provides a basis for cryptographic security.

Mathematical description of ECDLP

Elliptic curve cryptography is primarily based on the ECDLP formulated on certain algebraic finite fields. In 1999, NIST recommended ten finite fields for use in ECC. These are:

  • Five prime fields Fp\mathbb {F} _{p} for primes pp of size {192,224,256,384,521}\{192, 224, 256, 384, 521\} bits.
  • Five binary fields F2n\mathbb {F} _{2^{n}} for n{163,233,283,409,571}n \in \{163, 233, 283, 409, 571\}.

With the above setup, an ECC-based asymmetric key cryptosystem in the case of prime fields may be specified as follows:

  • Choose a pp from the NIST-recommended list of values to specify Fp\mathbb {F} _{p}.

  • Select parameters a,ba, b of the elliptic curve.

  • Choose a base point GG which "generates" a cyclic subgroup on the curve with order nn; that is, the smallest integer such that nG=OnG = \mathbf{O}.

  • Calculate the cofactor h=E(Fp)/nh = |E(\mathbb {F} _{p})|/n where E(Fp)|E(\mathbb {F} _{p})| is the number of points on the curve. hh is typically a small integer.

  • The domain parameters (p,a,b,G,n,h)(p, a, b, G, n, h) allow the specification of asymmetric keys in this way:

    • The private key is a randomly chosen integer dd with as many bits as in the prime pp. It should be kept secret.
    • The public key is the result of "multiplying" the base point GG by the private key dd. This is typically denoted as Q=dGQ = dG.

Recovering dd is equivalent to solving the ECDLP, which is assumed to be intractable for large dd. The ECDLP therefore forms the basis for key exchange and digital signature schemes in direct analogy with the Diffie-Hellman and DSA schemes defined over (Zp)×(\mathbb{Z}_p)^{\times} discussed previously.

Key exchange with ECC

One of the primary uses of ECC is in the key exchange protocol known as (ECDH). In ECDH, each party generates a private-public key pair and then exchanges public keys. Each party then uses their own private key and the other party's public key to compute a shared secret, which can be used as the key for symmetric encryption.

While it's relatively easy to perform point addition on an elliptic curve and derive a public key from a private key, it's computationally infeasible to reverse the process and derive a private key from a public key. This "one-way" behavior provides the security of the ECDH key exchange.

Here, we will illustrate an example of how one might perform an ECDH key exchange using Python and the library cryptography.

Authenticate to run code cells
Reset Copy to clipboard

Output:

Keys checked to be the same

Digital signatures with ECC

ECC can also be used to generate digital signatures using the (ECDSA). In ECDSA, the signer creates a signature using their private key, and others can verify the signature using the signer's public key. Just like with ECDH, the security of ECDSA relies on the ECDLP. It's computationally infeasible for someone to forge a signature without access to the signer's private key.

The following is an example of a simple sign-and-verify transaction using ECDSA with Python and cryptography.

Authenticate to run code cells
Reset Copy to clipboard

Output:

The signature is valid.

In the above code, if one modifies the message after it has been signed, the verification will fail, providing a guarantee of authenticity and integrity for the message.

Breaking ECDH and ECDSA

In an analogous way to the integer discrete logarithm problem, the ECDLP turns out to be hard on classical computers but has an efficient solution on quantum computers once again via Shor's algorithm. We refer you to recent literature for a discussion related to .

Summary

In this lesson, we started by looking at the main characteristics of asymmetric key cryptography (AKC) and discussed basic security considerations that underpin asymmetric cryptosystems. In particular, we introduced the two main applications of AKC — namely, key exchange and digital signatures — both of which are essential components of modern internet-based communication.

We then looked at the RSA cryptosystem, which since the 1970s, has proven to be of immense value for securing modern digital communications by enabling key exchange and digital signatures within a simple and versatile framework. The cryptographic security of RSA with respect to classical computing is based on the hardness of factoring large integers and requires key sizes in the range of 2048 bits to assure the integers used in practical applications are large enough to resist factorization.

We then looked at Diffie-Hellman (DH) key exchange and the Digital Signature Algorithm (DSA). The security of these schemes is predicated on the integer discrete logarithm (DLP) problem, the private key being computationally hard to extract given the public key, with no polynomial time solution on classical computers. Use of random and unique keys affords additional security against attacks. Both the integer finite field and elliptic curve variants of the DH and DSA protocols currently find widespread use across many modern digital communications protocols such as , , and so on.

Finally we looked at Elliptic curve cryptography. With its efficient key size and strong security properties, it currently represents an excellent choice for many cryptographic applications, from key exchange to digital signatures.

All of these algorithms have exposure to attack from quantum algorithms, since solutions can be developed to solve the mathematical problems which acted as a premise in their design. One such example is Shor's algorithm. Therefore they will need to be replaced by quantum-safe asymmetric cryptosystems which are more resilient to attack from quantum - as well as being as safe with classical algorithms. The mathematical problems they rely on can be solved efficiently by quantum computers, necessitating the development of quantum-safe asymmetric cryptosystems that can withstand quantum attacks while maintaining classical security.

A future lesson will look at quantum-safe cryptosystems and discuss the required approach to maintain cryptographic security.

Was this page helpful?