If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Main content

Summary (what's next?)

Why is factorization hard, yet generating primes easy? Where do we go from here? Created by Brit Cruise.

Want to join the conversation?

Video transcript

Voiceover: Congratulations. We've reached a branching point in our lesson. Now a few different ideas have been introduced, so it's important to orient ourselves here before moving forward in various directions. Now, what motivated this lesson? Well, we learned about RSA encryption, and RSA depended on two things. One, that prime factorization is difficult. So if I multiply two big primes together, P1 and P2 and I give you N, I can trust or feel secure in knowing that you'll take a long time to find those primes, and maybe more than your lifetime. Also two, RSA requires that those large primes I generated was easy, meaning I could just quickly generate a large prime. So let's return to the first problem. Difficulty of prime factorization. Well what is it about prime factorization or just prime numbers themselves which make problems hard? And to find out we begin with a core problem. Given X, is X prime or composite, which is a primality test? Now we ended up building some solutions to this problem. And in doing so, we realize that this problem was computationally expensive. That is, there is no instant solution to this problem. As our input number grew, the amount of time or the amount of steps involved for an algorithm to find the solution also grew. And, how much it grew, we actually understand better now because we can predict this search space using the prime number theorem. Though, the real issue can be thought of as a graph, where on the vertical axis we have the number of steps our algorithm takes, so you can just think of it as time. And on the X-axis is the input size and as input size increased, so did time. And the problem we had is that shape of this curve is unacceptable. Because in our case, once N hit even 20 digits, we can no longer solve the problem in the worst case. And if we threw in input hundreds of digits long at our algorithm we can all agree it would not work. In our case it would crash. So it's practically impossible to check if large input is prime using our current strategies. Now let's return to the first point, factorization. Realize based on our current understanding in this lesson, factorization has to be harder than a primality test. That is there are more steps involved in finding the prime factorization of some number, versus just telling me if a number is prime. Since, remember, factorization requires us to find all the prime factors for some input N, whereas a primality test only requires we find one divisor And a nice reminder of this is that some users have actually turned these primality tests into prime factorization algorithms, simply by repeating after it finds its first divisor. So the primality test is just kind of a sub-routine inside the main factorization algorithm. So if we return to this graph, a factorization algorithm would be something like this. As input grows, our factorization algorithm would be above this primality test line. And realize that in any situation we always have a computational limit, that is the number of primitive steps we can calculate which is based on our computing power in any given situation and the amount of time we have. And you could think of this as a horizontal line, or a threshold on this graph. Anything above this line is out of reach, not feasible to solve. And in this lesson we were limited by the rover's on-board computer which was fairly slow, which is why we couldn't run primality tests on numbers with even 20 digits. However, even if we had, say, 1,000 computers running for a year, this would simply just push this horizontal line up to some other threshold. And this would allow us to run tests on larger numbers, but as you can see, we would always hit some limit where the input is large enough that we can no longer solve the problems again. Now, this leads to many interesting question paths. However, I'll identify the two I'm going to explore next. First of all, so far I have not been very precise about these growth curves. Though, it would be helpful if, imagine for any algorithm you give me, no matter what it's trying to solve, and no matter what machine it's even running on, we could draw some corresponding growth curve on this graph, simply by looking at the description of the algorithm. If we could do this, we could identify categories of certain growth shapes, which tell us how difficult a problem would be to solve. And in this way, we could understand an algorithm before we even run it, very important to think about. And you could hand me your algorithm written on a piece of paper and I could look at it and I'll say, "I know "this algorithm is not solvable with the input size needed." And this leads us to a lesson on time complexity, an incredibly important conceptual tool we will need. And if you've ever heard this runs in polynomial time or exponential time, these are terms which come out of time complexity. Next, many of you have been wondering, "well, how "are we generating these random primes in practice," the second point I made about RSA. Well let's just walk through it quickly. To generate a large random prime number hundreds of digits long, we need to follow these instructions. Generate a random number, test if it's prime, if it's prime, we're done. If not, try again until we hit a prime. Now in this three-step procedure, what's most important is the second step, test if it's prime. If we can't do that step, this won't work. So how is this working today? Well, it's based on a subtle alteration to our problem definition, and more importantly, the use of randomness. And this leads us to another lesson on random algorithms. And now finally, if there are any other lingering question paths you have, please post them below and we can plan lessons. For example, there are some deeper mathematics we have yet to explore in order to speed up our existing trial division primality test. And what else are you thinking of? Please share below.