
Randomized algorithms (intro)

How could random numbers speed up a decision algorithm? Created by Brit Cruise.


  • David Elijah de Siqueira Campos McLaughlin
    As n grows, primes become less and less common. Eventually, could you reach a point that there are no more primes?
    (30 votes)
  • When he says that all primes are of the form 6k±1, where does that come from? I can see that it's true, because I tried it in my head for a bunch of small primes that I know. Does it have something to do with divisibility by 2 and 3? Maybe this has already been mentioned in a previous video, but I don't remember it.
    (28 votes)
    • rjhadley
      6k + 0, 6k + 2, and 6k + 4 are divisible by 2 and therefore composite. 6k + 3 is divisible by 3 and therefore composite.

      That leaves 6k + 1 and 6k + 5. But 6k + 5 = 6(k+1) - 1; i.e. just pick k to be 1 more and it's of the form 6k-1. So you're left with 6k +/- 1.
      (35 votes)
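rjhadley's argument can be checked numerically with a short script (Python, purely as an illustration): every prime greater than 3 leaves a remainder of 1 or 5 when divided by 6, i.e. it is of the form 6k ± 1.

```python
def is_prime(n):
    """Simple trial-division primality check."""
    if n < 2:
        return False
    a = 2
    while a * a <= n:
        if n % a == 0:
            return False
        a += 1
    return True

# Every prime > 3 is of the form 6k ± 1, i.e. n % 6 is 1 or 5.
for n in range(5, 10000):
    if is_prime(n):
        assert n % 6 in (1, 5)
print("verified up to 10000")
```

Note the converse does not hold: 25 = 6·4 + 1 is composite, so 6k ± 1 is a necessary but not sufficient condition for primality.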
  • Alwin Look
    We just need to be 99.9999999999999% sure that it's a prime. So, if we hit that 1.0e-15 chance of an error and the number actually isn't prime, going back to RSA, how does this non-prime weaken the encryption? Would it mess up the encrypted message itself? How could this leak be exploited?
    (14 votes)
    • Cameron
      If p or q is not prime, it breaks the RSA algorithm, since RSA depends on p and q being prime so that we can choose e and d properly to encrypt and decrypt messages.

      What will happen is:
      -you will encrypt some plain text M using the public key e to get a cipher text C
      -the person will decrypt C using the private key d to get a plain text M'
      -M and M' probably won't be the same

      Why does this happen?

      RSA relies on properties of Fermat's little theorem, or Euler's theorem, both of which assume p and q are prime.

      C = M^e mod n   (encryption)
      M' = C^d mod n  (decryption)

      M' = (M^e mod n)^d mod n = (M^e)^d mod n = M^(e * d) mod n

      If p, q are prime, Fermat's little theorem (or Euler's theorem) can be used to show that M^(e * d) mod n = M mod n, so M' = M.
      (I've excluded the gory math details, but they also rely on how we chose e and d.)
      If p and q are not both prime, the value M^(e * d) mod n will likely not be M mod n, so M' will likely not equal M.

      Let's suppose, for a moment, that RSA still worked even though we chose p as a composite.
      Having p as a composite would mean we had, at least, two factors >1 for p.
      This means we have, at least, 3 factors for n now.
      If n had only two prime factors, the smallest factor could be as large as the square root of n.
      Now, with 3 factors, the smallest factor can be at most the cube root of n, which is much smaller.
      This would make a brute force attack to factor n much easier. (factoring n can be used to reveal the private key)

      Hope this makes sense
      (21 votes)
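Cameron's point can be demonstrated with toy numbers (purely illustrative, far too small for real RSA): if we mistakenly treat the composite p = 9 as prime, the keys derived from the wrong totient fail to decrypt some messages, while genuine primes round-trip correctly.

```python
# Toy RSA with a composite "prime": p = 9 (composite!), q = 5.
p, q = 9, 5
n = p * q                       # 45
phi_wrong = (p - 1) * (q - 1)   # 32 -- wrong, since p is not prime
e = 3                           # gcd(3, 32) == 1
d = pow(e, -1, phi_wrong)       # 11, since 3 * 11 = 33 = 1 (mod 32)

M = 2
C = pow(M, e, n)                # encrypt: 2^3 mod 45 = 8
M_prime = pow(C, d, n)          # decrypt: 8^11 mod 45 = 17, not 2!
print(M, M_prime)               # 2 17 -- decryption failed

# With genuine primes p = 7, q = 5, the same scheme works.
p, q = 7, 5
n = p * q                       # 35
phi = (p - 1) * (q - 1)         # 24
e = 5
d = pow(e, -1, phi)             # 5, since 5 * 5 = 25 = 1 (mod 24)
print(pow(pow(M, e, n), d, n))  # 2 -- round-trips correctly
```

(The three-argument `pow(e, -1, m)` modular inverse requires Python 3.8+.)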
  • Dipen Mehta
    Are there "primes and composites" in negative numbers?
    (10 votes)
  • Bhavesh
    There is a theorem called Fermat's little theorem, which can be used to test for primes.

    Fermat's little theorem states that if p is prime and 1 ≤ a < p, then
    a^(p-1) ≡ 1 (mod p)
    If we want to test whether p is prime, we can pick random a's in that interval and see if the congruence holds. If it fails for some value of a, then p is composite. If it holds for many values of a, then we can say that p is probably prime.
    It might be that in our tests we never pick a value of a for which the congruence fails. Any a such that
    a^(n-1) ≡ 1 (mod n)

    when n is composite is known as a Fermat liar, and in this case n is called a Fermat pseudoprime to base a.
    If we do pick an a such that
    a^(n-1) ≢ 1 (mod n)
    then a is known as a Fermat witness for the compositeness of n.

    Thanks to Wikipedia!
    (7 votes)
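The test Bhavesh describes can be sketched in a few lines (an illustrative sketch, not the exact algorithm from the video). One caveat worth knowing: Carmichael numbers such as 561 fool this test for every base coprime to n, which is why practical systems use stronger variants like Miller–Rabin.

```python
import random

def fermat_test(n, trials=20):
    """Probabilistic primality test based on Fermat's little theorem.

    Returns False if n is definitely composite (a Fermat witness was
    found), True if n is probably prime (no witness in `trials` tries).
    """
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randint(2, n - 2)
        # If a^(n-1) mod n != 1, then a is a Fermat witness: n is composite.
        if pow(a, n - 1, n) != 1:
            return False
    return True  # probably prime; each passing trial shrinks the error

print(fermat_test(101))  # True  (101 is prime)
print(fermat_test(100))  # False (a witness is always found)
```

The built-in three-argument `pow` does fast modular exponentiation, so each trial is cheap even for very large n.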
  • Salman Ahmed
    Can we see the video lectures on any other site? Here in Pakistan, YouTube is banned by the government.
    (5 votes)
  • Rushabh
    How are random numbers actually generated?
    (3 votes)
    • jkimball
      That assumes your dice are perfect. Over millions of dice rolls, you would find some numbers are more common than others because tiny imperfections in your dice would result in them favoring one instead of six, or two instead of five. The more you used this method, the more obvious that bias would become.
      (2 votes)
  • Philipp Gühring
    What is the chance that I win the lottery twice in a row? Isn't it less likely that I win it twice in a row than just twice at any two of the weekly draws? And isn't it less likely than winning the first one and then having a 1:2 chance of winning the second one? I think the calculation at the beginning of the video is wrong, but I am not sure.
    (2 votes)
    • brit cruise
      It's the same as rolling a die twice in a row. Let's say winning the lottery = rolling a 6. Rolling two sixes at the same time, or one six followed by another six, is the same: 1/6 * 1/6 = 1/36.
      It doesn't really matter how much time is in between draws (or rolls).

      However, if you are asking about the probability of winning twice over many draws (say inside of a year), then yes that is a different question (and more likely)
      (3 votes)
  • Zachary Stark
    That machine said that if a number isn't expressible in the form 6k±1, it's 100% proof that the number is composite. But 2 and 3 can't be expressed in that form, and they're prime. So is this little test still proof of compositeness?
    (3 votes)
  • Creative_Learner
    How are the odds of hitting the lottery twice in a row 6 * 10^-14? Can anybody explain?
    (1 vote)
    • Cameron
      Probability of winning the 6/49 lottery once is:
      1/(49 C 6)
      where:
      49 C 6 is the number of ways we can choose 6 items from 49 (order doesn't matter).
      49 C 6 = 49!/(43! * 6!) = 13,983,816
      So, the odds of winning once are: 1/13,983,816

      Winning the lotto twice (assuming the events are independent) is :
      1/13,983,816 * 1/13,983,816 = 1/195,547,109,921,856 ~= 5 * 10^-15

      So, 6 * 10^-14 appears to be off a little.
      (4 votes)
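Cameron's arithmetic is easy to reproduce (a quick check in Python, not from the video):

```python
from math import comb

# Probability of matching all 6 numbers in a single 6/49 lottery draw.
single = 1 / comb(49, 6)   # comb(49, 6) = 49! / (43! * 6!) = 13,983,816
print(comb(49, 6))         # 13983816

# Winning two independent draws in a row: square the single-draw probability.
print(single ** 2)         # ~5.1e-15
```

This confirms the answer's figure of roughly 5 * 10^-15, slightly smaller than the 6 * 10^-14 quoted in the video.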

Video transcript

Voiceover: I have an update. NASA has said that there will be a hardware random number generator on the rover that we have access to. After that, they made one more comment: they reminded me that we just need our algorithm to work in practice. For something to work in practice, it means that there's always some possibility of error, but perhaps the possibility is so small that we ignore it in practice. If this sounds crazy, just realize that in the physical world nothing is ever certain; there are always chances for error. For example, chip packaging contains small amounts of radioactive contaminants, and when these decay, they release alpha particles which can actually flip bits in memory and perhaps change a number unexpectedly. Even more interesting, cosmic rays can also cause soft errors. We can never remove the chance of error completely. I asked NASA exactly what error probability is acceptable. They said, "We just need to make sure that the probability of error, for a given trial, is less than the odds of hitting the lottery twice in a row." The odds of hitting the lottery, say the 6/49 lottery, twice in a row are six times ten to the negative fourteen, so let's just be safe, round it down, and say our error probability is ten to the minus fifteen. Safe enough; we would not expect to see an error even if the algorithm ran hundreds or thousands of times. Now my question is: would access to randomness help us speed up a decision algorithm such as this primality test? This is a deep question, so let's step back and look at the big picture. Given some collection of integers, from one up to some limit N say, let's think of it as our universe of integers. We can always divide a collection into two sets. A decision problem can actually be thought of as a partition into two sets. For example, the decision "is N less than 100?" will output yes for all integers less than 100, which is one collection, and no for all others, which is another collection.
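The partition idea from the transcript can be sketched in a few lines (an illustrative sketch, not code from the video): a decision algorithm splits a universe of integers into a YES set and a NO set.

```python
# A decision problem partitions a universe of integers into two sets.
def decides(n):
    return n < 100          # the decision: "is n less than 100?"

universe = range(1, 201)
yes_set = {n for n in universe if decides(n)}
no_set = {n for n in universe if not decides(n)}

# The two sets are disjoint and together cover the whole universe.
assert yes_set.isdisjoint(no_set)
assert yes_set | no_set == set(universe)
print(len(yes_set), len(no_set))  # 99 101
```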
Okay, so let's now focus on recognizing primes with a decision. Ideally, what we would like is some simply computed criterion that all primes satisfy and no composites satisfy. Now if we knew some simple pattern which describes the location of all primes, and only primes, then given some number N we could simply check if N follows that pattern. The problem is that so far no exhaustive and simply computed pattern has been found for all primes and no composites, or for all composites and no primes. But we do know fast tests for most composite numbers. For example, a simply computed criterion would be a test for evenness: all even numbers are divisible by two. It's fast because we can tell if something's even by just looking at the last digit, and even as N grows, the problem doesn't get any harder; we just look at that last digit, also known as the low-order bit. Now realize that we can use this evenness test as a low-quality composite test, in that we could have some algorithm accept our integer N, and all our algorithm does is check if it's even. If it is even, and greater than two, then we know with 100 percent certainty it is composite, because we have proof. Think of two as our composite witness. However, if not, then we aren't exactly sure. If it's odd it could be prime, since all primes except two are odd, or it could be one of the many composites which are odd, such as nine or fifteen. This massive region of odd composites makes for an unacceptable test. Now to be clear, let's draw this. Provided some N, N can be either prime or composite. If N is prime, our algorithm will say prime 100 percent of the time, since no primes greater than two are even. It will never say composite when a prime is provided. However, if N is composite, our algorithm will say composite about fifty percent of the time, and prime fifty percent of the time. This means that if our algorithm outputs composite, it has found proof: a composite witness.
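The evenness check described above can be written as a tiny function (a sketch for illustration): it proves compositeness for even numbers greater than two but says nothing useful about odd inputs.

```python
def evenness_composite_test(n):
    """A fast but low-quality composite test based on evenness.

    Checking the low-order bit is cheap no matter how big n gets.
    """
    if n > 2 and n % 2 == 0:
        return "composite"   # proof: 2 is our composite witness
    return "unknown"         # prime, or one of the many odd composites

print(evenness_composite_test(10))  # composite
print(evenness_composite_test(9))   # unknown (9 is an odd composite)
print(evenness_composite_test(7))   # unknown (7 is prime)
```

Notice the asymmetry: "composite" comes with proof, while "unknown" leaves roughly half of all composites undetected, which is exactly why this test alone is unacceptable.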
However, if our algorithm outputs prime, we know there is an unacceptably large chance of error. Until now, our strategy has dealt with this unacceptably large error by iteration. Let's draw the flow of our machine. Given some N, our machine does a series of divisibility tests starting at A equals two. It asks: does A divide N? If it doesn't divide N, then we can eliminate all of this region. Then it asks the next question: does three divide N? If not, then we eliminate that whole section. Does five divide N? Seven? And so on. Notice these are disjoint sets which gradually fill up all possible composites. Now if we ever answer yes along the way, then we have proof that N is composite. A, whatever it is at that point, is our witness. We halt and output composite from our algorithm. Otherwise, we continue trying until we exhaust all possible composites, until we pass the square root of N, since we know, remember, every composite number N must have a factor less than or equal to the square root of N. This leads to the final question in our algorithm: if no witnesses are found, and A is greater than the square root of N, then we suddenly have proof and we halt and output prime. Realize we have two exit paths through our algorithm. Both paths represent proof of certainty that N is either prime or composite. When N is prime, we try all possible divisors and we basically rule them all out, and that is our proof that N is prime. The problem with this strategy is that our test value A at minimum requires us to cycle through every prime from two up to the square root of N. As N grows, the number of primes grows with it. This results in too many iterations in the worst case, such as when we provide a prime to our algorithm. Looking at the big picture now, realize the algorithm is providing proof when N is prime, which we now know we don't exactly need. No one said we needed to prove it. We just need to be 99.9999999999999 percent (fifteen nines) certain.
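The machine described in this paragraph, with its two exit paths, can be sketched as a trial-division loop (an illustrative sketch, not the video's exact implementation):

```python
import math

def trial_division(n):
    """Exhaustive primality test mirroring the machine in the video.

    Assumes n >= 2. Both return paths carry proof: a composite witness,
    or the exhaustion of all candidate divisors up to sqrt(n).
    """
    a = 2
    while a <= math.isqrt(n):   # every composite n has a factor <= sqrt(n)
        if n % a == 0:
            return "composite"  # a is our composite witness; halt
        a += 1
    return "prime"              # no witness found: proof of primality

print(trial_division(91))       # composite (7 * 13)
print(trial_division(97))       # prime
```

The worst case is exactly the one the transcript complains about: when n is prime, the loop must run all the way to the square root of n before it can halt.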
Hmm, so we actually need to think about the composite test that's being used in our algorithm. Remember, our fastest trial division primality tests thus far have tried to use prime patterns, such as all primes being of the form 6k plus or minus one, to help walk along the primes only and eliminate many of the composites to save time. Remember, a test like this can be turned into a composite test. That is, if some integer N is of the form 6k plus or minus one, then we can say it's probably prime, but there are many composites which still follow this pattern. The pattern doesn't include primes only, just all primes and some composites, and we can think of these extra composites as a leak, and this leak is our source of error. Now if we invert it and ask "is N not of the form 6k plus or minus one?", then we can say with 100 percent certainty it is composite. So the leak is our source of error on the prime side, but in this case it's unacceptable: it's a non-negligible error with a much larger probability. We need to define a new composite testing procedure which is able to shrink this space and make the chance of error negligible, which means we'll need to review how we can actually shrink errors with iterations. I think we should now post our ideas below regarding what kinds of tests we might want to perform, and also, more importantly, how could a random number generator really help us?
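Shrinking error with iterations, as the transcript suggests, works because independent trials multiply their error probabilities. Assuming (hypothetically) a composite test that wrongly reports "prime" at most half the time, a few lines show how many trials reach NASA's 10^-15 target:

```python
# If one trial of a composite test wrongly says "prime" at most half
# the time, k independent trials have error at most (1/2)^k.
# How many trials until the error drops below the target of 10^-15?
target = 1e-15
error, trials = 1.0, 0
while error > target:
    error *= 0.5    # each independent trial halves the worst-case error
    trials += 1
print(trials)       # 50, since 2^-50 ~ 8.9e-16 < 1e-15
```

So even a test that is only "half right" about composites becomes overwhelmingly reliable after a modest number of independent random trials.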