Idea behind hypothesis testing

The lesson explores hypothesis testing in statistics, demonstrating how varied outcomes from a test with a known accuracy rate can influence the acceptance or rejection of a hypothesis. This understanding is key to evaluating the reliability of statistical tests.
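
To make the idea of "varied outcomes" concrete, here is a minimal simulation sketch. It assumes each test application is independent and accurate with probability 0.99, as in the video. The function name `simulate_batches`, the batch count, and the seed are illustrative choices, not part of the lesson.

```python
import random


def simulate_batches(p_accurate=0.99, tests_per_batch=100,
                     num_batches=20_000, seed=0):
    """Return the number of accurate results in each simulated batch.

    Assumes every test application is independent and is accurate
    with probability p_accurate (the setup described in the video).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    counts = []
    for _ in range(num_batches):
        accurate = sum(rng.random() < p_accurate
                       for _ in range(tests_per_batch))
        counts.append(accurate)
    return counts


if __name__ == "__main__":
    counts = simulate_batches()
    total = len(counts)
    print("average accurate results per batch:", sum(counts) / total)
    for k in range(100, 94, -1):
        share = counts.count(k) / total
        print(f"share of batches with {k}/100 accurate: {share:.3f}")
```

With these settings the average should land near 99 accurate results per batch, while only roughly a third of the batches come out with all 100 accurate. The exact figures depend on the seed and the number of batches.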

Want to join the conversation?

  • Vinay
    Since P(accurate)=.99.
    for 100 test runs P(100/100 accurate)= .99^100
    similarly P(99/100 accurate)= .99^99
    and P(98/100 accurate)= .99^98 and so on.. What am I doing wrong that I am not getting P(98/100 accurate)= 18.5%?
    (6 votes)
    • Dr C
      > "What am I doing wrong"

      If P(accurate) = 0.99, then:
      P( 100 accurate) = (100 nCr 0) * 0.99^100 * 0.01^0 = 0.366
      P( 99 accurate) = (100 nCr 1) * 0.99^99 * 0.01^1 = 0.3697
      P( 98 accurate) = (100 nCr 2) * 0.99^98 * 0.01^2 = 0.1849

      You're failing to account for:
      1. The probability of the non-accurate trials (hence the 0.01^(n-x) factors)
      2. The number of ways to arrange the accurate/non-accurate trials.

      The fact that you got the right answer for "99 accurate" is pure coincidence. The 0.01^1 and the 100 cancel each other out. This is NOT a general rule, just a coincidence of the numbers in this particular case.
      (42 votes)
  • Webber Huang
    Excuse me, how do you compute P(accurate #/100)? I'm lost here. Could anybody give me some help?
    (65 votes)
    • Mikhail
      Don't forget about combinatorics!
      C(n, k) = n! / ((n-k)! * k!)
      P(100/100) = 0.99^100 * C(100, 100) = 0.99^100 * 1 ≈ 0.366
      P(99/100) = (0.99^99) * (0.01^1) * C(100, 99) = 0.99^99 * 0.01 * 100 ≈ 0.37
      P(98/100) = (0.99^98) * (0.01^2) * C(100, 98) = (0.99^98) * (0.01^2) * 100! / (2! * 98!) ≈ 0.185
      P(n/100) = (0.99^n) * (0.01^(100-n)) * C(100, n)
      (136 votes)
  • ace dean
    How did Sal get 36%?
    I watched the combinatorics videos he mentions but I still don't understand...

    >>>>>>>>>>>> EDIT (7 months later) <<<<<<<<<<
    The videos I watched were not the ones Sal was referencing. As per @greentree096 's answer, the correct videos are the ones on Binomial Distribution https://www.khanacademy.org/v/binomial-distribution
    (20 votes)
    • asuk
      He took the probability the test is accurate (each individual time) and raised it to the 100th power. This is because each test should be independent, so he can multiply the probability of these events. This leads to:
      0.99^100 (which represents getting an accurate test each of the 100 times, all in a row).

      As he starts to throw inaccurate tests into the mix, he has to start to multiply by the number of ways it can happen. For example, with the 99 accurate tests he takes 0.99^99 * 0.01^1 (this represents 99 accurate tests and 1 inaccurate test) but he has to multiply this result by the number of ways it can happen in order to represent the complete answer. This leads to:
      0.99^99 * 0.01^1 * 100
      (9 votes)
  • Kevin George Joe
    How does Sal calculate these values? Is there some kind of formula I don't know?
    (5 votes)
    • Jun Long Goh
      Short Answer: It's the binomial distribution formula: P(n, r) = nCr * p^r * q^(n-r)
      where n is the number of trials,
      r is the number of successes,
      p is the probability of success,
      q is the probability of failure.
      Long Answer: The binomial distribution formula is used to calculate the probability that an event will be successful exactly r times in n trials. To use the example in the video, we are given that the probability the event is successful is 99% or 0.99. We run 100 trials, so n will equal 100.
      The question is: of these 100 trials, what is the probability that it will be successful 100 times?
      If we use the formula, n=100, r=100, p=0.99 and q, the probability that the event will fail, is
      1 - 0.99 = 0.01
      So, the probability that in 100 trials all of them are successful is
      P(100,100) = 100C100 * 0.99^100 * 0.01^0 = 0.366 or 36.6%
      (22 votes)
  • Paul Merck
    At one point in the video, Sal has just finished writing the approximate probabilities of a test that is generally 99% accurate delivering differing levels of accuracy for a given sample size of 100. I noticed that there was about a 5 times greater likelihood of 96/100 versus 95/100, a 4 times greater likelihood of 97/100 versus 96/100, a 3 times greater likelihood of 98/100 versus 97/100, and a 2 times greater likelihood of 99/100 versus 98/100. My question is: Is there an underlying rule of probability or math at work here that causes this 5, 4, 3, 2 pattern, or is this simply noise? Thank you. I hope my question makes sense.
    (5 votes)
  • Alfred
    Suppose I have a test that was administered to 1,000 people. The results of my testing were:
    990 trials proved to be true
    10 trials proved to be false
    Thus, my test is 99% accurate.
    I then give the same test to 100 more people, and I should expect that the results would be 99 true and 1 false, but according to this hypothesis, I should only expect approximately 36 out of the 100 to be true.
    Seems strange.
    (2 votes)
  • Bruce Wintman
    Doesn't one need to account for combinatorics with the inaccurate results? In other words, what about the small group of results with a 0.01 chance? For 97/100, the 3/100 inaccurate results can also be rearranged even if the 97 accurate results do not change position in the permutation count.
    (1 vote)
    • Christopher
      Yes, one does. I think Mikhail's answer to Webber Huang's question is correct.

      Consider the case of getting 97 accurate results:

      Simply calculating 0.99^97 isn't correct; it works out to about 38% because it doesn't account for the inaccurate results.

      Including the three inaccurate results means multiplying 0.99^97 by 0.01^3. That calculation yields a much smaller number (about 3.8 x 10^-7), but that represents the probability of just one particular way of getting 97 accurate and 3 inaccurate results.

      The number of ways to have 97 accurate and 3 inaccurate ("100 choose 97") works out to 161,700.

      So, to find the probability of any run including 97 accurate and 3 inaccurate, the calculation is

      0.99^97 * 0.01^3 * 161,700 = (about) 0.061

      which is pretty close to the value Sal gave in the vid.
      (6 votes)
  • jma
    What is hypothesis testing? The title advertises the "idea behind" it, but IMHO I didn't see any explanation of that in this video or the previous one (Simple hypothesis testing), only a bunch of pre-calculated percentages.
    (2 votes)
    • Matthew Daly
      Yeah, it might be clearer to skip the set-up and start watching partway through the video. Here's the gist of it. We have some experimental data that we hope confirms our hypothesis that the test is 99% accurate. So we test that hypothesis by assuming that it is true, which then gives us the ability to do the binomial distribution calculation. That calculation leads us to conclude that our experimental results would have been very unlikely to have arisen from a test with 99% accuracy. Therefore, we are left with very low confidence in the hypothesis that the test is 99% accurate, so we should reject that hypothesis.
      (4 votes)
  • darshshah400
    I don't understand how the probability of getting an accuracy of 99% in 100 test cases is less than the probability of getting an accuracy of 99% in 99 test cases. The confusion increases for me because in the following cases of 98, 97, 96, and 95, the probability decreases as it should.
    (3 votes)
  • Danish Amjad
    Suppose I have 100 data points and a hypothesis. If my hypothesis holds true on 95 points, should I reject it on the basis of the 0.3% chance of getting the same results on some other data set? Am I understanding it correctly?
    (3 votes)
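
The binomial formula quoted in the replies above, P(n/100) = C(100, n) * 0.99^n * 0.01^(100-n), can be checked with a short sketch. The function name `binomial_pmf` and its defaults are illustrative. The code also prints the ratio between neighbouring probabilities, which is one way to look at the 5, 4, 3, 2 pattern raised in the thread.

```python
from math import comb


def binomial_pmf(k, n=100, p=0.99):
    """Probability of exactly k accurate results in n independent tests,
    each accurate with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


if __name__ == "__main__":
    for k in range(100, 94, -1):
        prob = binomial_pmf(k)
        # Ratio of this outcome's probability to the next-lower count.
        ratio = prob / binomial_pmf(k - 1)
        print(f"P({k}/100) = {prob:.4f}   P({k})/P({k - 1}) = {ratio:.2f}")
```

The printed probabilities should agree with the rounded figures in the video (about 36.6%, 37.0%, 18.5%, 6%, 1.5%, and 0.3%). Algebraically, the ratio P(k)/P(k-1) simplifies to ((n - k + 1) / k) * (p / (1 - p)). With n = 100 and p = 0.99 that is about 5.2, 4.1, 3.0, and 2.0 for k = 96 through 99, so the pattern is not noise. For k = 100 the ratio is 0.99, which is why 100/100 is slightly less likely than 99/100.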

Video transcript

- [Voiceover] Let's say that you have a cholesterol test, and you know, you somehow magically know that the probability that it is accurate, that it gives the correct results is 99, 99%. You have a 99 out of 100 chance that any time you apply this test, that it is going to be accurate. Now let's say that you, and you just magically know that, we're just assuming that. Now let's just say that you get 100 folks into this room and you apply this test to all 100 of them. So apply, apply test 100 times. So what are some of the possible outcomes here? Is it for sure that 99, exactly 99 out of the 100 are going to be accurate and that 1 out of the 100 is gonna be inaccurate? Well that's definitely a likely possibility, but it's also possible you get a little lucky and all 100 are accurate, or you get a little unlucky and that 98 are accurate and that two are inaccurate. And actually, I calculated the probabilities ahead of time, and the goal of this video isn't to go into the probability and combinatorics of it, but if you're curious about it, there's a lot of good videos on probability and combinatorics on Khan Academy, but I calculated ahead of time, and the probability, if you have something that has a 99% chance of being accurate, and you apply it 100 times, the probability that it is accurate, that it is accurate 100 out of the 100 times, is approximately equal to, approximately equal to 36.6%. I rounded to the nearest tenth of a percent. So it's a little better than a third chance that you'll actually get, all of the people are going to get an accurate result, even though for any one of them there's a 99% chance that it is accurate. Now we could keep going, the probability that it is accurate, I'm just gonna put these quotes here so I don't have to rewrite accurate over and over again, the probability that it is accurate 99 out of 100 times, I calculated it ahead of time, it is approximately 37.0%. 
So this is what you would expect, getting 100 out of 100 doesn't seem that unlikely if each of the times you apply it has a 99% chance of being accurate, but it makes sense that you would expect 99 out of 100 to be even more likely, slightly more likely. And we can of course keep going, the probability that it is accurate 98 out of 100 times is approximately 18.5%. And I'm just gonna do a few more. The probability that it is accurate 97 out of 100 times, and once again I calculated all of these ahead of time, is 6%, so it's definitely in the realm of possibility, but it's, the probability is much lower than having 99 out of 100 or 100 out of 100 being accurate, and then the probability, let me put the double quotes here, the probability that it is accurate 96 out of 100 times is approximately 1.5%, and then the probability, and I'll just do one more, I could keep going, the probability, you know, there's some probability that even though each test has a 99% chance you just get super unlucky and that, you know, very few of them are accurate, well I'll just, and you see, you see what's happening to the probabilities as we have fewer and fewer of them being accurate, it becomes less and less probable. So the probability that 95 out of the 100 are accurate is, is approximately 0.3%.
So this was just kind of a, I guess you could say a thought experiment. If we had a test that we know for sure that every time you administer it, the probability that it is accurate is 99%, then these are the probabilities that if you administered it 100 times, that you get 100 out of 100 accurate, the probability that you get 99 out of 100 accurate, and so on and so forth. So let's just keep that in mind, and then think a little bit about hypothesis testing, and how we can use this framework.
So let's put all that in the back of our minds, and let's say that you have devised a new test, you have a new test, and you don't know how accurate it is.
You have a new cholesterol test, you don't know how accurate it is, you know that in order for it to be approved by whatever governing body it has to be accurate 99, the probability of it being accurate has to be 99%. So needs to have probability of accurate, accurate equal to 99%. You don't know if this is true, you just know that that's what it needs to be. And so you have your test, and let's say you set up a hypothesis, and your hypothesis could be a lot of things, and once you get deeper into statistics, there's, you know, null hypothesis and alternate hypotheses, but let's just start with just a simple hypothesis, you're hopeful, your hypothesis is that the probability that your new test is accurate is, this is your hypothesis, because you want that to be your hypothesis cause if you feel good about it, then you're like, okay, yeah, maybe I'll get approved by the appropriate governing body. So you say, "Hey, my hypothesis is that my new test is accurate 99, the probability of it being accurate is 99%."
So then you go off and you apply it 100 times. So you apply your new test, you don't know the actual probability of it being accurate, you apply the test 100 times. And let's say out of those 100 times you get that they are accurate, you get that it is accurate, and you're able to use some other test that you, you know, some for-sure test, some super accurate test, to verify your own test results, and you see that it is accurate 95 out of the 100 times. So the question you have is, well, does the hypothesis make sense to you? Will you accept this hypothesis? Well what you say is, well, "If my hypothesis was true, if my test were accurate 99, if the probability of my test being accurate is 99%, what's the probability of me getting this outcome?" Well, we figured that out. If it really was accurate 99% of the time, then the probability of getting this outcome is only 0.3%.
So if you assume true, if you assume hypothesis, hypothesis, I'll just write "hyp," if you assume the hypothesis is true, the probability of the outcome you got, probability of observed outcome is approximately 0.3%. And so you say, "Look, you know, maybe, it's definitely possible that I just got very, very, very, very unlucky, but based on this, I probably should reject my hypothesis, because the probability of me getting this outcome, if the hypothesis was true, is very, very, very, very low."
And as we go deeper into statistics, you'll see that there are thresholds that people often set, for, you know, if the probability of something happening or not happening is above or below some threshold, then we might reject a certain hypothesis. But in this world, you could see that, look, if my test really was accurate 99% of the time, for me to get, when I apply it to 100 people, it's only accurate 95 out of 100, if my hypothesis is true, that would have only, there's only a 0.3% chance that I would have seen this observation. So based on that, it might be completely reasonable to say, "You know what, I might reject my hypothesis, look for a new test, I don't feel good about this new cholesterol test that I constructed."
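
The reasoning in the transcript can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names and the 0.05 threshold are hypothetical choices (the video mentions that thresholds exist but does not name one), and the decision uses the probability of an outcome at least as bad as the one observed, which goes one step beyond the single-outcome probability quoted in the video.

```python
from math import comb


def prob_exactly(k, n, p):
    """Probability of exactly k accurate results in n independent tests."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def prob_at_most(k, n, p):
    """Probability of k or fewer accurate results in n independent tests."""
    return sum(prob_exactly(i, n, p) for i in range(k + 1))


def should_reject(observed_accurate, n, hypothesized_p, threshold=0.05):
    """Reject the hypothesis when a result this low (or lower) would be
    improbable if the hypothesis were true.

    threshold=0.05 is an assumed cutoff, not a value from the video.
    """
    return prob_at_most(observed_accurate, n, hypothesized_p) < threshold


if __name__ == "__main__":
    n, p, observed = 100, 0.99, 95
    print(f"P(exactly {observed}/{n}) = {prob_exactly(observed, n, p):.4f}")
    print(f"P(at most {observed}/{n}) = {prob_at_most(observed, n, p):.4f}")
    print("reject hypothesis:", should_reject(observed, n, p))
```

Under the hypothesis that the test is 99% accurate, both probabilities come out to roughly 0.3%, so this sketch rejects the hypothesis for 95 out of 100, in line with the conclusion in the video. For an observation such as 99 out of 100 it would not reject.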