## Statistics and probability

© 2023 Khan Academy

# Idea behind hypothesis testing

## Want to join the conversation?

- Excuse me, how do I compute P(accurate #/100)? I'm lost here; could anybody give me some help? (64 votes)
- Don't forget about combinatorics!

C (n / k) = n! / ((n-k)!*k!)

P(100/100) = 0.99^100 * C (100 / 100) = 0.99^100 * 1 =~ 0.366

P(99/100) = (0.99^99)*(0.01^1) * C (100 / 99) = 0.99^99 * 0.01 * 100 =~ 0.37

P(98/100) = (0.99^98)*(0.01^2) * C (100 / 98) = (0.99^98)*(0.01^2) * 100! / (2! * 98!) =~ 0.185

.....

P(n/100) = (0.99^n) * (0.01^(100-n)) * C(100 / n) (130 votes)
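The formula above can be checked numerically. Below is a minimal Python sketch, assuming the tests are independent and each is accurate with probability 0.99. The function name `binom_pmf` is made up for this illustration; `math.comb` is the standard-library "n choose k".

```python
from math import comb

def binom_pmf(k, n=100, p=0.99):
    """Probability of exactly k accurate results in n independent tests."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Print P(k/100) for the counts discussed in the video
for k in range(100, 94, -1):
    print(f"P({k}/100) = {binom_pmf(k):.3f}")
```

The printed values should line up with the rounded percentages quoted in the video (36.6%, 37.0%, 18.5%, and so on).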

- Since P(accurate)=.99.

for 100 test runs P(100/100 accurate)= .99^100

similarly P(99/100 accurate)= .99^99

and P(98/100 accurate)= .99^98 and so on. What am I doing wrong that I am not getting P(98/100 accurate)= 18.5%? (5 votes)

- > "What am I doing wrong"

If P(accurate) = 0.99, then:

`P( 100 accurate) = (100 nCr 0) * 0.99^100 * 0.01^0 = 0.366`

`P( 99 accurate) = (100 nCr 1) * 0.99^99 * 0.01^1 = 0.3697`

`P( 98 accurate) = (100 nCr 2) * 0.99^98 * 0.01^2 = 0.1848`

You're failing to account for:

1. The probability of the non-accurate trials (hence the 0.01^(n-x) factors)

2. The number of ways to arrange the accurate/non-accurate trials.

The fact that you got the right answer for "99 accurate" is pure coincidence. The 0.01^1 and the 100 cancel each other out. This is NOT a general rule, just a coincidence of the numbers in this particular case. (38 votes)

- At 1:09, how did Sal get 36%?

I watched the combinatorics videos he mentions but I still don't understand...

**EDIT** *(7 months later)*

The videos I watched were not the ones Sal was referencing. As per @greentree096's answer, the correct videos are the ones on **Binomial Distribution**: https://www.khanacademy.org/v/binomial-distribution (18 votes)

- He took the probability the test is accurate (each individual time) and raised it to the 100th power. This is because each test should be independent, so he can multiply the probability of these events. This leads to:

0.99^100 (which represents getting an accurate test each of the 100 times, all in a row).

As he starts to throw inaccurate tests into the mix, he has to start to multiply by the number of ways it can happen. For example, with the 99 accurate tests he takes 0.99^99 * 0.01^1 (this represents 99 accurate tests and 1 inaccurate test) but he has to multiply this result by the number of ways it can happen in order to represent the complete answer. This leads to:

0.99^99 * 0.01^1 * 100 (9 votes)

- How does Sal calculate these values? Is there some kind of formula I don't know? (4 votes)
- Short Answer: It's the binomial distribution formula. P(n,r) = nCr * p^r * q^(n-r)

where n is the number of trials,

r is the number of successes,

p is the probability of success,

q is the probability of failure.

https://www.khanacademy.org/math/probability/random-variables-topic/binomial_distribution/v/binomial-distribution

Long Answer: The binomial distribution formula is used to calculate the probability that an event will be successful r times in n trials. To use the example in the video, we are given that the probability the event is successful is 99% or 0.99. We run 100 trials, so n will equal 100.

The question is: of these 100 trials, what is the probability that it will be successful 100 times?

If we use the formula, n=100, r=100, p=0.99 and q, the probability that the event will fail, is

1 - 0.99 = 0.01

So, the probability that in 100 trials, all of them are successful is

P(100,100) = 100C100 * 0.99^100 * 0.01^0 = 0.366 or 36.6% (14 votes)

- At approximately the 4:00 minute mark, Sal has just finished writing the approximate probabilities of a test that is generally 99% accurate delivering differing levels of accuracy for a given sample size of 100. I noticed that there was about a 5 times greater likelihood of 96/100 versus 95/100, a 4 times greater likelihood of 97/100 versus 96/100, a 3 times greater likelihood of 98/100 versus 97/100, and a 2 times greater likelihood of 99/100 versus 98/100. My question is: Is there an underlying rule of probability or math at work here that causes this 5,4,3,2 pattern or is this simply noise? Thank you. I hope my question makes sense. (5 votes)
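The pattern follows from the binomial formula rather than from noise. Dividing consecutive probabilities gives P(k+1)/P(k) = ((n-k)/(k+1)) * (p/(1-p)). With n = 100 and p = 0.99, the factor p/(1-p) = 99 nearly cancels k+1, leaving roughly n - k. A small Python sketch to check this, where the helper names `binom_pmf` and `ratio` are only for illustration:

```python
from math import comb

def binom_pmf(k, n=100, p=0.99):
    """Probability of exactly k accurate results in n independent tests."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def ratio(k, n=100, p=0.99):
    """P(k+1 accurate) / P(k accurate), from the closed form above."""
    return (n - k) / (k + 1) * p / (1 - p)

for k in range(95, 99):
    direct = binom_pmf(k + 1) / binom_pmf(k)
    print(f"P({k + 1}/100) / P({k}/100): closed form {ratio(k):.2f}, direct {direct:.2f}")
```

The ratios come out close to 5, 4, 3 and 2, because 99/(k+1) is close to 1 for k near 100.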
- Suppose I have a test that was administered to 1,000 people. The results of my testing were:

990 trials proved to be true

10 times proved to be false

Thus, my test is 99% accurate.

I then give the same test to 100 more people and I should expect that the results would be 99 true and 1 false but according to this hypothesis, I should only expect approximately 36 out of the 100 to be true.

Seems strange. (2 votes)

- Hey! I think the closest example from Sal of how to calculate P(100/100), P(99/100) and so forth is this one - https://www.khanacademy.org/math/probability/probability-and-combinatorics-topic/probability_combinatorics/v/probability-and-combinations-part-2 (5 votes)
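The two numbers in this thread measure different things. Under the video's assumption that each test is independently accurate with probability 0.99, the 36.6% figure is the chance that all 100 results are accurate, while the expected count of accurate results is still 99. A minimal Python sketch showing both, with illustrative variable names:

```python
from math import comb

n, p = 100, 0.99

def binom_pmf(k):
    """Probability of exactly k accurate results out of n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Chance of a perfect 100/100 run
p_all_accurate = binom_pmf(n)

# Average number of accurate results, weighting each count by its probability
expected_accurate = sum(k * binom_pmf(k) for k in range(n + 1))

print(f"P(100/100 accurate) = {p_all_accurate:.3f}")
print(f"Expected accurate results = {expected_accurate:.1f}")
```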

- What **is** hypothesis testing? The title advertises "idea behind" it, but IMHO I didn't see any explanation of that (and the same goes for the previous video, Simple hypothesis testing), only a bunch of pre-calculated percentages. (2 votes)

- Yeah, it might be clearer to start watching the video at around 3:50 and ignore the set-up. Here's the gist of it. We have some experimental data that we *hope* confirms our hypothesis that the test is 99% accurate. So we test that hypothesis by assuming that it is true, which then gives us the ability to do the binomial distribution calculation. That calculation leads us to conclude that our experimental results would have been very unlikely to have arisen from a test with 99% accuracy. Therefore, we are left with a very low confidence in the hypothesis that the test is 99% accurate, so we should reject that hypothesis. (4 votes)

- Doesn't one need to account for combinatorics with the inaccurate results? In other words, what about the small group of results with 0.01 chance? For 97/100, the 3/100 inaccurate chances can also be rearranged even if the 97 accurate results do not change position in the permutation count. (1 vote)
- Yes, one does. I think Mikhail's answer to Webber Huang's question is correct.

Consider the case of getting 97 accurate results:

Simply calculating 0.99^97 isn't correct; it works out to 38% because it doesn't include the inaccurate results.

Including the three inaccurate results means multiplying 0.99^97 by 0.01^3. That calculation yields a much smaller number (about 3.8 x 10^-7), but that represents the probability of just one particular way of getting 97 accurate and 3 inaccurate results.

The number of ways to have 97 accurate and 3 inaccurate ("100 choose 97") works out to 161,700.

So, to find the probability of *any* run including 97 accurate and 3 inaccurate, the calculation is

0.99^97 * 0.01^3 * 161,700 = (about) 0.061

which is pretty close to the value Sal gave in the vid. (5 votes)
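The three steps in this answer can be reproduced one factor at a time. This is a small Python sketch with illustrative variable names, assuming independent tests as in the video:

```python
from math import comb

p_accurate, p_inaccurate = 0.99, 0.01

# Probability of one specific ordering of 97 accurate and 3 inaccurate results
one_arrangement = p_accurate**97 * p_inaccurate**3

# Number of distinct orderings: "100 choose 97"
arrangements = comb(100, 97)

# Probability of any run with exactly 97 accurate results
p_97 = one_arrangement * arrangements

print(f"0.99^97 alone          = {p_accurate**97:.3f}")
print(f"one arrangement        = {one_arrangement:.2e}")
print(f"number of arrangements = {arrangements}")
print(f"P(97/100 accurate)     = {p_97:.3f}")
```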

- Suppose I have 100 data points and have a hypothesis. If I get my hypothesis to be true on 95 points, should I reject it on the basis of a 0.3% chance of getting the same results on some other data set? Am I understanding it correctly? (3 votes)
- I do not believe the 98% figure. It cannot be correct; I would expect it to be almost the same as the 100%. The jump is far too large! It defies logic. (2 votes)
- Casey's answer to Vinay's question (from 1 month before you) answers your question pretty well, I think. I had the same reaction as you did. (1 vote)

## Video transcript

- [Voiceover] Let's say that you have a cholesterol test, and you know, you somehow magically know that the probability that it is accurate, that it gives the correct results is 99, 99%. You have a 99 out of 100 chance that any time you apply this test, that it is going to be accurate. Now let's say that you, and you just magically know that, we're just assuming that.

Now let's just say that you get 100 folks into this room and you apply this test to all 100 of them. So apply, apply test 100 times. So what are some of the possible outcomes here? Is it for sure that 99, exactly 99 out of the 100 are going to be accurate and that 1 out of the 100 is gonna be inaccurate? Well that's definitely a likely possibility, but it's also possible you get a little lucky and all 100 are accurate, or you get a little unlucky and that 98 are accurate and that two are inaccurate.
and that two are inaccurate. And actually, I calculated the
probabilities ahead of time, and the goal of this video isn't to go into the probability
and combinatorics of it, but if you're curious about it, there's a lot of good
videos on probability and combinatorics on Khan Academy, but I calculated ahead of time, and the probability, if you have something that has a 99% chance of being accurate, and you apply it 100 times, the probability that it is accurate, that it is accurate 100
out of the 100 times, is approximately equal to, approximately equal to 36.6%. I rounded to the nearest
tenth of a percent. So it's a little better
than a third chance that you'll actually
get, all of the people are going to get an accurate result, even though for any one of them there's a 99% chance that it is accurate. Now we could keep going, the probability that it is accurate, I'm just gonna put these quotes here so I don't have to rewrite
accurate over and over again, the probability that it is
accurate 99 out of 100 times, I calculated it ahead of time,
it is approximately 37.0%. So this is what you would expect, getting 100 out of 100
doesn't seem that unlikely if each of the times you apply it has a 99% chance of being accurate, but it makes sense that
you would expect 99 out 100 to be even more likely,
slightly more likely. And we can of course keep going, the probability that it is accurate 98 out of 100 times is
approximately 18.5%. And I'm just gonna do a few more. The probability that it is accurate 97 out of 100 times, and once again I calculated all of these ahead of time, is 6%, so it's definitely
in the realm of possibility, but it's, the probability is much lower than having 99 out of 100 or
100 out of 100 being accurate, and then the probability, let
me put the double quotes here, the probability that it is accurate 96 out of 100 times is approximately 1.5%, and then the probability,
and I'll just do one more, I could keep going, the probability, you know, there's some probability that even though each
test has a 99% chance you just get super unlucky
and that, you know, very few of them are
accurate, well I'll just, and you see, you see what's happening to the probabilities as we have fewer and fewer of them being accurate, it becomes less and less probable. So the probability that 95 out of the 100 are accurate is, is approximately 0.3%. So this was just kind of a, I guess you could say
a thought experiment. If we had a test that we know for sure that every time you administer it, the probability that
it is accurate is 99%, then these are the probabilities that if you administered it 100 times, that you get 100 out 100 accurate, the probability that you
get 99 out of 100 accurate, and so on and so forth. So let's just keep that in mind, and then think a little bit
about hypothesis testing, and how we can use this framework. So let's put all that in
the back of our minds, and let's say that you
have devised a new test, you have a new test, and you
don't know how accurate it is. You have a new cholesterol test, you don't know how accurate it is, you know that in order
for it to be approved by whatever governing body
it has to be accurate 99, the probability of it being
accurate has to be 99%. So needs to have probability of accurate, accurate equal to 99%. You don't know if this is true, you just know that that's
what it needs to be. And so you have your test, and let's you set up a hypothesis, and your hypothesis
could be a lot of things, and once you get deeper into
statistics, there's, you know, null hypothesis and alternate hypotheses, but let's just start with
just a simple hypothesis, you're hopeful, your hypothesis
is that the probability that your new test is accurate is, this is your hypothesis,
because you want that to be your hypothesis cause
if you feel good about it, then you're like, okay,
yeah, maybe I'll get approved by the appropriate governing body. So you say, "Hey, my hypothesis is that "my new test is accurate
99, the probability "of it being accurate is 99%." So then you go off and
you apply it 100 times. So you apply your new test, you don't know the actual
probability of it being accurate, you apply the test 100 times. And let's say out of those 100 times you get that they are accurate, you get that it is
accurate, and you're able to use some other test that you, you know, some for-sure test, some
super accurate test, to verify your own test results, and you see that it is accurate
95 out of the 100 times. So the question you have is, well, does the hypothesis make sense to you? Will you accept this hypothesis? Well what you say is, well,
"If my hypothesis was true, "if my test were accurate 99, "if the probability of my
test being accurate is 99%, "what's the probability of
me getting this outcome?" Well, we figured that out. If it really was accurate 99% of the time, then the probability of getting
this outcome is only 0.3%. So if you assume true,
if you assume hypothesis, hypothesis, I'll just write "hyp," if you assume the hypothesis is true, the probability of the outcome you got, probability of observed
outcome is approximately 0.3%. And so you say, "Look, you know, "maybe, it's definitely possible "that I just got very,
very, very, very unlikely, "but based on this, I probably
should reject my hypothesis, "because the probability
of me getting this outcome, "if the hypothesis was true, "is very, very, very, very low." And as we go deeper into statistics, you'll see that there are
thresholds that people often set, for, you know, if the probability
of something happening or not happening is above
or below some threshold, then we might reject a certain hypothesis. But in this world, you
could see that, look, if my test really was
accurate 99% of the time, for me to get, when I
apply it to 100 people, it's only accurate 95 out of 100, if my hypothesis is true,
that would have only, there's only a 0.3% chance that I would have seen this observation. So based on that, it might be
completely reasonable to say, "You know what, I might
reject my hypothesis, "look for a new test, I don't feel good "about this new cholesterol
test that I constructed."
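
The reasoning in the transcript can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5% cutoff is an assumed threshold (the video says thresholds exist but does not name one), the variable names are made up, and the tail probability of 95 or fewer accurate results is an extra quantity that the video does not compute.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k accurate results in n independent tests."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_tests = 100
observed_accurate = 95
hypothesized_p = 0.99   # the hypothesis: the new test is accurate 99% of the time
threshold = 0.05        # assumed cutoff, not taken from the video

# Probability of exactly the observed outcome if the hypothesis is true
p_observed = binom_pmf(observed_accurate, n_tests, hypothesized_p)

# Probability of an outcome at least this bad (95 or fewer accurate)
p_tail = sum(binom_pmf(k, n_tests, hypothesized_p)
             for k in range(observed_accurate + 1))

reject = p_tail < threshold

print(f"P(exactly 95/100 | hypothesis) = {p_observed:.4f}")
print(f"P(95 or fewer    | hypothesis) = {p_tail:.4f}")
print("Reject hypothesis" if reject else "Do not reject hypothesis")
```

Using the tail probability instead of the single-outcome probability is a design choice here: it asks how surprising a result this poor or worse would be, and in this example both quantities are well below the assumed cutoff.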