Probability of sample proportions example

Want to join the conversation?

  • Kaitlyn Anderson
    How would we solve this if we aren't using a fancy calculator?
    (23 votes)
    • Jing Qian
      Figure out how many standard deviations away from the mean your proportion is, then consult a z-table to find the probability.

      In other words, since the mean is 0.15 and we want the probability that the sample proportion is greater than 0.10, the distance from our proportion to the mean is 0.05. Divide this number by the standard deviation to see how many standard deviations it is away from the mean: 0.05/0.0282 ≈ 1.77.

      We know that this is to the left of the mean, so we're going to use -1.77 when we consult our z-table, which gives us a value of 0.0384. That number is the probability which is BELOW the 0.10 line, so we just subtract that number from 1, and we get approximately 96%.
      (37 votes)
  • Parthiban Rajendran
    But if we need the true proportion to calculate np, then we already know the true proportion, so why take samples at all? It's contradictory. As a sampler, all I have is sample data, not true proportions. So how is the np threshold a valid approach?
    (13 votes)
  • Vyome
    What happens when a distribution is not normal?
    (6 votes)
  • Cary Wang
    How would I do this if I were to standardize the distribution?
    (3 votes)
  • Patrick Batoon
    When estimating normality of a sampling distribution do you use the SAMPLE PROPORTION (p̂=0.10) or POPULATION PROPORTION (p=0.15)?

    In this case, the surveyors only know that p̂=0.10, so they can only check normality using that value.
    n*p̂=(160)(0.10)=16
    n*(1-p̂)=(160)(0.90)=144

    Based on the results using p̂, I conclude the sampling distribution is normal. Additionally, using p̂, the cumulative probability of p>0.10 is approximately 50%. Can someone explain to me why the logic of using p̂ is incorrect?
    (2 votes)
    • Bryan
      If you think about it, the sample proportion could be crazily unrepresentative of the actual population proportion. The SRS could have all 160 be really stressed out, and so p-hat would be 1. Obviously, it would be crazy to use p-hat then, since it's so far off. We could instead take a bunch of SRSs, find each p-hat, then take the mean of all these individual sample proportions; this would make it extremely unlikely that p-hat is very far off from the truth. But the mean of a bunch of p-hats is just p, the population proportion!

      Honestly, I don't completely get it either so someone else could come along and help!
      (3 votes)
  • cd024
    Hi, is there a proof of the "expected success and failure number being greater than 10" rule of thumb? Maybe using the Central Limit Theorem or something? It seems very cookbookish, and it frustrates me that I don't have any deeper intuition for the rule.
    (3 votes)
    • Shishir Iyer
      Since this rule is a convention adopted by statisticians, it can't really be "proved." They chose 10 because when the expected number of successes or failures is smaller than that, the normal approximation tends to become poor. But they could just as easily have chosen 11 or 12 for the cutoff.
      (2 votes)
  • 8091467
    Where did the 8 come from at ? I'm confused.
    (2 votes)
  • EyeDas
    All problems in this category appear impractical. You can't answer these without knowing the population proportion, but if you knew that, why would you be drawing a random sample to estimate the statistic anyway? All fake examples - impractical
    (2 votes)
    • daniella
      These examples are designed to teach statistical concepts and may not always represent practical real-life scenarios. They're useful for understanding how to apply statistical methods and for learning to interpret statistical results. In real-world applications, the true population proportion is often unknown, and these methods are used to estimate it from samples.
      (1 vote)
  • Yao
    What is the difference between the binomial distribution and a sampling distribution? I know that a sampling distribution involves taking a lot of samples and calculating their statistics, while a binomial distribution may only have one sample and the distribution we use is related to the population. But I don't know how to tell from the problem statement whether it is a binomial distribution or a sampling distribution.
    (2 votes)
    • daniella
      Binomial Distribution is used for experiments with two outcomes (success or failure) and is concerned with the number of successes in a fixed number of trials, with each trial being independent and having the same probability of success. If you're dealing with a scenario where you're counting the number of successes in a certain number of trials, and these trials are independent with a fixed probability of success, you're likely looking at a binomial distribution situation.

      Sampling Distribution, on the other hand, refers to the distribution of a particular statistic (like the mean or proportion) obtained from a large number of samples drawn from the same population. It is about understanding the behavior of a statistic across different samples from the same population. When a question involves drawing samples and then calculating a statistic (mean, proportion) for each sample to understand the variability of that statistic, it's dealing with a sampling distribution.
      (1 vote)
  • pankaj3856
    How can I calculate the probability value without a calculator?
    (1 vote)
    • David Bryant
      You probably can't. At the very least you will need a table of the cumulative standard normal probability distribution. There are lots of these on the web. Here, for instance.

      https://www.thoughtco.com/standard-normal-distribution-table-3126264

      In this example, the population proportion is given as 0.15. Assuming your sample is drawn randomly, this is also the mean of the sampling distribution of the sample proportion. The standard deviation is the square root of (0.15 * 0.85 / 160) ... you'll need a calculator for that, unless you're good at finding square roots with pencil and paper. That can be done, but it isn't easy. Anyway, the square root mentioned is 0.0282, very nearly. The difference between the population proportion and the observation is 0.15 less 0.10, or 0.05, which is 1.77 standard deviations (0.05 / 0.0282 = 1.77). If you look in the table mentioned above, you'll find the probability 0.962 listed for Z = 1.77. So there's your answer of 96%.
      (3 votes)
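
Several questions above ask how to reach the answer without a calculator's normal CDF function. The z-score route described in the replies can be sketched in Python as follows. This is only an illustration: the variable names are made up for this example, and `NormalDist().cdf` from the standard library stands in for looking up a value in a z-table.

```python
from math import sqrt
from statistics import NormalDist

p = 0.15       # population proportion given in the problem
n = 160        # sample size
p_hat = 0.10   # sample proportion we are comparing against

# Standard deviation of the sampling distribution of the sample proportion
sd = sqrt(p * (1 - p) / n)

# How many standard deviations 0.10 lies from the mean 0.15 (negative: left of the mean)
z = (p_hat - p) / sd

# A z-table lists the area to the LEFT of z; the standard normal CDF plays that role here
below = NormalDist().cdf(z)

# The question asks for the area to the RIGHT of 0.10
above = 1 - below

print(f"z = {z:.2f}, P(p_hat > 0.10) is about {above:.2f}")
```

With the numbers from the problem, z comes out near -1.77 and the probability near 0.96, matching the replies above.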

Video transcript

- [Instructor] We're told: suppose that 15% of the 1,750 students at a school have experienced extreme levels of stress during the past month. A high school newspaper doesn't know this figure, but they are curious what it is, so they decide to ask a simple random sample of 160 students if they have experienced extreme levels of stress during the past month. Subsequently, they find that 10% of the sample replied "yes" to the question.

Assuming the true proportion is 15%, which they tell us up here (they say 15% of the population of the 1,750 students actually have experienced extreme levels of stress during the past month, so that is the true proportion; let me just write that, the true proportion for our population is 0.15), what is the approximate probability that more than 10% of the sample would report that they experienced extreme levels of stress during the past month? So pause this video and see if you can answer it on your own. There are four choices here; I'll scroll down a little bit.

So the way that we're going to tackle this is we're going to think about the sampling distribution of our sample proportions. First we're going to ask: is this sampling distribution approximately normal? If it is, then we can use its mean and standard deviation and create a normal distribution that has that same mean and standard deviation in order to approximate the probability that they're asking for.

So this first part, how do we decide this? Well, the rule of thumb we have here, and it is a rule of thumb, is that if our sample size times our population proportion is greater than or equal to ten, and our sample size times one minus our population proportion is greater than or equal to ten, then our sampling distribution of our sample proportions is going to be approximately normal.
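
The rule of thumb just described can be written as a small check. This is a minimal sketch, not part of the video: the function name `normal_approximation_ok` and the `threshold` parameter are hypothetical names chosen for illustration.

```python
def normal_approximation_ok(n, p, threshold=10):
    """Rule-of-thumb check: both expected counts should be at least `threshold`."""
    expected_successes = n * p        # sample size times population proportion
    expected_failures = n * (1 - p)   # sample size times (1 - population proportion)
    return expected_successes >= threshold and expected_failures >= threshold


# The newspaper's survey: 160 students, true proportion 0.15
print(normal_approximation_ok(160, 0.15))

# A much smaller sample would fail the check, since 20 * 0.15 is only 3
print(normal_approximation_ok(20, 0.15))
```

The cutoff of ten is a convention, so it is kept as a parameter here rather than hard-coded.
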
So in this case the newspaper is asking 160 students, that's the sample size, so 160. The true population proportion is 0.15, and the product needs to be greater than or equal to ten. Let's see, this is going to be 16 plus eight, which is 24, and 24 is indeed greater than or equal to ten, so that checks out. Then if I take our sample size times one minus P, well, one minus 15 hundredths is going to be 85 hundredths, and this is definitely going to be greater than or equal to ten. Let's see, this is going to be 24 less than 160, so this is going to be 136, which is way larger than ten, so that checks out. And so the sampling distribution of our sample proportions is going to be approximately normal.

And so what is the mean and standard deviation of our sampling distribution? The mean of our sampling distribution is just going to be our population proportion, we've seen that in other videos, which is equal to 0.15. And the standard deviation of our sampling distribution of our sample proportions is going to be equal to the square root of P times one minus P over N, which is equal to the square root of 0.15 times 0.85, all of that over our sample size, 160. So now let's get our calculator out. I'm going to take the square root of .15 times .85 divided by 160, and we close those parentheses, and what is this going to give me? It's going to give me approximately 0.028, and I'll go to the thousandths place here. So this is approximately 0.028. This is going to be approximately a normal distribution, so you could draw your classic bell curve for a normal distribution, something like this.
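
The arithmetic in this step can be reproduced in a few lines of Python instead of on a graphing calculator. This is a sketch under the problem's stated values; the variable names are illustrative.

```python
from math import sqrt

n = 160    # sample size
p = 0.15   # true population proportion

# Expected counts used in the rule-of-thumb check (24 and 136 in the video)
expected_yes = n * p
expected_no = n * (1 - p)

# Mean and standard deviation of the sampling distribution of the sample proportion
mean = p
sd = sqrt(p * (1 - p) / n)

print(f"expected yes = {expected_yes:.0f}, expected no = {expected_no:.0f}")
print(f"mean = {mean}, sd rounded to thousandths = {round(sd, 3)}")
```
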
And our normal distribution is going to have a mean right over here, so this is the mean of our sampling distribution, and it is going to be equal to our population proportion, 0.15. We also know that our standard deviation here is going to be approximately equal to 0.028. What we want to know is the approximate probability that more than 10% of the sample would report that they experienced extreme levels of stress during the past month. So we could say that 10% would be right over here, I'll say 0.10, and so the probability that in a sample of 160 you get a sample proportion that is larger than 10% would be this area right over here. So this right over here would be the probability that your sample proportion is more than 0.10, just like that.

And then to calculate it, I can get out our calculator again. Here I'm going to go to my distribution menu right over there, and then I'm going to do a normal cumulative distribution function, so let me click Enter there. What is my lower bound? Well, my lower bound is 10%, 0.1. What is my upper bound? Well, we'll just make this one, 'cause that is the highest proportion you could have for a sampling distribution of sample proportions. Now what is our mean? Well, we already know that's 0.15. What is the standard deviation of our sampling distribution? Well, it's approximately 0.028. Then I can click Enter, and if you're taking an AP exam you actually should write this down; you should tell the graders what you're actually typing into your normal CDF function. But if we click Enter right over here, and then Enter, there we have it, it's approximately 96%. So this is approximately 0.96, and then out of our choices it would be this one right over here.
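
For readers without a graphing calculator, the normal CDF call described here can be imitated with Python's standard library. This is a rough stand-in, not the calculator's actual implementation: the helper name `normal_cdf_between` is hypothetical, and its argument order (lower bound, upper bound, mean, standard deviation) simply mirrors the order in which the instructor enters the values.

```python
from statistics import NormalDist


def normal_cdf_between(lower, upper, mean, sd):
    """Area under a normal curve with the given mean and sd between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


# Same inputs as in the video: lower 0.10, upper 1, mean 0.15, sd 0.028
prob = normal_cdf_between(0.10, 1.0, 0.15, 0.028)
print(f"{prob:.2f}")  # prints 0.96
```

Using an upper bound of 1 rather than infinity makes almost no difference here, because 1 lies about thirty standard deviations above the mean.
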
If you were taking this on the AP exam, you would say that you called normal CDF, where you have your lower bound, and you would put in your 0.10; you would say that you used an upper bound of one; you would say that you gave a mean of 0.15 and a standard deviation of 0.028, just so people know that you knew what you were doing. But hopefully this is helpful.