Number two in our encryption basics series. This time we are going to get into a well-known form of public key encryption, RSA. I plan on giving the same boiler plate warning for each of these; if you promise not to use this for encrypting anything truly important, you are allowed to skip the next couple of lines. The programs contained herein (obligatory lawyer speak) are for educational purposes only. Encryption is an incredibly hard concept to get correct and you shouldn’t assume that the program I wrote is sufficient to protect your data in transit.
RSA Encryption
Public Key Encryption
Public Key Encryption is everywhere. If you have ever paid for anything online, you have used it (I hope!). You probably have asked how it is protected. If I am entering my credit card into this page to purchase something, how does it get to them without anyone being able to read it except for them. In the physical world, I might lock it in a box, but for you to be able to unlock that box you will need the key (or a hammer, but we hopefully are sending hammer resistant boxes and locks). Public Key Encryption is like sharing a lock with the world that is hard to reverse-engineer and only you have the key. Math allows us to set up the constraints to get the protection that we want for it too.
The RSA algorithm is one such method for PKE. It also has another cool thing that we can do with it. The sender can use the algorithm to verify their identity. If you use the private key to encrypt something and send it to another person, that person can use the public key to decrypt it. This allows us to sign messages, verifying that we were the one that actually sent them.
A Simple RSA Program
I have my RSA program here. Running with the -t flag and the -m flag (with an integer of your choice) I have some basic output to demonstrate the process for encrypting a message using PKE.
This uses a properly chosen public / private key pair and shows each step. The public key we see is (3233, 17). The basis for it are two primes multiplied by each other. We select another prime that is coprime (the only integer that divides both numbers is 1) to another number (3233 in this case) made from our two primes. Then we generate a private key using the two primes that we chose. Once we have the private key we can encrypt and decrypt using the public and private keys respectively. If we want to sign a message, then we take just reverse the order we use the private and public keys. Now we can verify we sent something!
The program also has a couple of other simple options, as seen below. They all correspond to the math numbers below so hopefully it makes sense. It allows you to specify your own key and message so you can mess around with it.
One thing to be aware of is that the program fails for a message larger than our public key.
The Math Behind It All
Compared to the first post on DHKE, RSA looks relatively similar. They are both using modular arithmetic, and both look somewhat similar. Here are the steps to generating our private and public keys
- Select two primes, p and q
- Calculate n = p*q (part of our public key)
- Calculate the totient of n, t, which ends up being (p-1)*(q-1) due to our number selection
- Pick a prime e < t and make sure t%e is not 0 (e is the other part of the public key)
- Calculate e ^ -1 mod t which will be our private key d
- e ^ -1 mod t = d where d*e mod t = 1
Now for the steps to encrypt and then decrypt:
- To encrypt we calculate m^e mod n = E (for message m)
- To decrypt we calculate E ^ d mod n
So why does this work? Wikipedia gives us a good proof that Decrypt( Encrypt ( M ) ) = M. This gets into what a lot of people hate in math, mathematical proofs. I personally love this kind of math, but you don’t have to. Try the algorithm with different appropriately calculated keys and a reasonably sized message and you should see that it calculates the correct answer. Between that and the proof provided, it seems to work to scramble and unscramble the data.
I would hazard a guess and say that many of you reading this are at the very least interested in how we attack this system. All an attacker would see when attacking this is n and e. If they can determine what t is, they can then then calculate what the private key is. So the real attack is factoring our public key n. It turns out, integer factorization is a hard problem and that is where the security of the algorithm comes in. There are attacks that are possible on a quantum computer and as time goes on we might see a migration away from the algorithm, but for now, a sufficiently large key should protect our data.
Other Considerations
You may have had some questions when reading the first section on PKE. How do we verify that a public key that we receive is actually the public key for the person that we want to send it to? If we can “pre-share” the public key, then we could have already shared the private key. Moreover, I don’t know Amazon personally, I can’t just walk in to their office and get a public key from them, so I will need to acquire that in some manner. Public Key Infrastructure (PKI) is the method in which we maintain all of these keys. We use encryption to verify our keys for encryption, and from there it is turtles all the way down. This is a whole complex topic that I can in no way cover sufficiently in this one post, but if you want to get ahead of me, go forth and research! That’s half the fun of this profession!
Some more concerns about this algorithm is that it’s pretty slow and if anyone were to determine the two primes, they can decrypt all traffic from you. We instead combine a couple of things to protect against this. Remember how I said that the same algorithm can be used to sign messages? What we can do is exchange our keys using DHKE and sign the message using RSA. Now because of DHKE, the key is ephemeral. It will change between every session, so a new one will need to be calculated for every session to attack the system. Because of RSA, we also have proof that the person sending their side is who they say they are. This gives us a quick algorithm for key exchange and verifies the user. Incredible! I hope to get around to writing something to test that out, just as a proof of concept, so stick around for more code.
Hope you learned something and keep hacking encrypting (or signing) messages!