In a recent story about the U.S. National Security Agency’s controversial Internet surveillance operations, the New York Times reported that “the agency has circumvented or cracked much of the encryption, or digital scrambling, that guards global commerce and banking systems.”
The bolding is mine, because if in fact the agency did crack the encryption schemes used for bank transactions (the Times is somewhat unclear on that point), then in doing so it may have solved a math problem that has long puzzled cryptographers and number theorists alike.
The problem in question is that of integer factorization. It has been shown that every integer (e.g. 1, 2, 3, 4, 5...) can be written as the product of prime numbers. To review, a number is said to be prime if it is divisible only by itself and 1. (The first 10 prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29.)
That means if we pick a number, say 50, we should be able to write it as a product of primes: In this case, it would be 5 x 5 x 2. For small numbers like 50, determining prime factors is within reach of any middle schooler. However, take a sufficiently large number—one that is hundreds of digits long—and the problem quickly becomes intractable, not only for humans but even for modern computers.
To date, there is no known shortcut to quickly factor large integers into primes. It has never been proven that no such shortcut exists. We’ve just never found one.
If the unfactorable nature of these large integers doesn't interest you, consider that it has been the reason many of your most personal messages are kept private as they move across the Internet.
But the Times report about how the NSA penetrated banking encryption seems to suggest the agency may have cracked the problem. Here’s why (and this is going to take some explaining):
In the most basic sense, encryption works like this: Say Alice wants to send Bob a piece of sensitive information. Her computer first asks his to generate two keys: a “public key” and a “private key.” The public key can encrypt the message, and the private key can decrypt it. (Note that the public key cannot decrypt the messages after encrypting it.)
So Bob’s computer creates the two keys and sends Alice the public key. He keeps the private key for himself. Alice then encrypts her message with the public key and sends it to Bob’s computer. Once Bob has the message, he uses his private key (which he has shared with no one) to decipher the encryption.
So, the essence of an encryption algorithm is to create a public key and a private key such that only the private key can decode the messages.
One popular technique is the RSA algorithm, which gets its name from the three students who invented it. Here is how the algorithm accomplishes the task of generating a public and private key.
First, the public key:
- The algorithm randomly selects two prime numbers, let’s call them p and q.
- Next, the algorithm computes p x q. Call that result n. (So, n = p x q).
- Then, it calculates a number z, where z = (p - 1)(q - 1).
- Finally, the algorithm picks an odd number k such that z is larger than k and z is not divisible by k. (For example, if z were 20, we could pick k to be 11, since 20 is not divisible by 11.)
- The public key--the one Alice sends to Bob--is the two numbers n and k.
Now for the private key:
We need to find a number, call it j, such that (k x j) divided by z will give a number with a remainder of 1. Note, that it doesn’t matter what number we choose, as long as it has a remainder of 1.
Note here that “remainder” refers to the remainders computed in long division. So, for example, if you took 30 and divided it by 9 using long division you would get 3 with a remainder of 3, because 9 goes into 30 3 times, leaving a remainder of 3.
Here’s a concrete example: If, say, k = 7 and z = 20, then we need to pick a number j such that (7 x j) / 20 = some number with a remainder of 1. So, j could be 3 because (7 x 3) / 20 = 21/20 = 1 with a remainder of 1. Or we could choose j to be 23, since (7 x 23) / 20 = (161 / 20) = 8 with a remainder of 1. Whatever j we chose, that along with n becomes our private key.
So, returning to the Alice example: Alice has a confidential message she wants to give Bob. Let’s call it M. Before sending it, she asks Bob to generate a public and private key. Using the RSA algorithm, Bob’s computer generates the public key (n,k) and the private is (n,j). Bob then sends Alice the public key and she encrypts her message this way:
- First she computes (M^k) / n.
- The remainder of the resulting number is her encrypted message E. She sends E to Bob. We won’t get into how exactly Bob’s private key decrypts the message because his key would never be intercepted by a hacker. The important point here is why a hacker can’t decrypt Alice’s message with the public key--that is, the numbers (n,k).
Imagine you intercepted the public key and the encrypted message. So, you have k, n and E. The question is, can you solve for M?
(M^k) / n = some number with a remainder of E
In short, the problem is that the encrypted message E is the remainder of some number—but you don’t know which one. When you are dealing with huge numbers, as RSA does, there are a ton of possibilities for M that, plugged into that equation, would have a remainer of E.
(If that doesn’t make sense, consider the two long division calculations 10 / 9 and 19 / 9. The first yields 1 with a remainder of 1. The second gives you 2 with a remainder of 1. So, you can’t definitely determine what number divided by 9 was used to calculate the remainder of 1, since clearly both 10 and 19 satisfy that requirement.)
And here, at last, is where we come to the possibility that the NSA figured out how to crack the RSA algorithm. It turns out if, in the above public key equation, you know the prime factors of n—that is, the p and q we started with, then it is rather easy to solve for the original message.
As I mentioned above, as far as the academic world knows, when n is large, factoring it into the product of prime numbers is all but impossible. But, if the NSA cracked the RSA algorithm, they likely figured out a way to do it. That would be a serious mathematical accomplishment, far beyond being just Internet security.
So how likely is it that the NSA actually solved this long-standing problem?
It’s hard to say. Presumably for security reasons, the Times was quite vague about the NSA’s specific capabilities. In some cases, the NSA got around encryption through partnerships with companies. Other times, they apparently cracked the encryptions with supercomputers.
It is possible the NSA simply partnered with banks to find a backdoor through their encryption schemes. Furthermore, security expert Bruce Schneier, who worked with the Guardian on the same story, said, based on the documents he saw, that the “math is good.” That would seem to imply that he wasn't shown anything to indicate the NSA figured out a prime factorization shortcut.
That said, if anyone could crack it, it’s the U.S. government. The Department of Defense spends $11 billion a year on cryptanalysis, employing some 35,000 people full time to the task—easily one of the most powerful mathematical armies in the world.
If nothing else, the incident stands as a reminder that in many case, our most private transactions on the Internet hinge on simple mathematical problems like prime factorization that, on a small scale, can be solved by a child. Whether or not that simplicity scales up to the large integers that secure online communications, no one knows—except, perhaps, the NSA.
Illustration by Niek Sprakel/Flickr