Data inferences — Harder example

Watch Sal work through a harder Data inferences problem. 

We make a confidence interval by starting with a sample result and adding and subtracting the margin of error. Consequently, the sample result is the exact middle of the interval. If we were to build a new interval based on the same data but with a reduced confidence level, the new interval would have the same center but a smaller margin of error. Here's a diagram illustrating this problem another way.
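The same relationship can be written symbolically. This is only a sketch: the symbols $a$, $b$ (interval endpoints), $c$ (center, the sample result), and $m$ (margin of error) are labels introduced here, not notation from the problem. The numbers are the 95% interval quoted in the problem, \$22.76 to \$59.24.

```latex
\begin{align*}
[a,\, b] &= [c - m,\; c + m], \qquad c = \frac{a + b}{2}, \qquad m = \frac{b - a}{2} \\
c &= \frac{22.76 + 59.24}{2} = 41, \qquad m = \frac{59.24 - 22.76}{2} = 18.24
\end{align*}
```

Under this reading, a 90% interval from the same sample would keep $c = 41$ and have some $m < 18.24$.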


Want to join the conversation?

  • ~M
    Is there any way to solve this via calculation, rather than 'These sound shady, and I like the sound of this choice, so we'll go with this' ??
    (126 votes)
    • Anthony Jacquez
      Yes, a confidence interval is defined by:
      sample mean ± margin of error

      It said the 95% confidence interval is from 22.76 to 59.24
      Using this we can find the sample mean which would just be the mean of these two numbers: (59.24 + 22.76) / 2 = 41

      Moving across confidence levels will only affect the margin of error but won't affect the sample mean, so the 90% confidence interval would still have to have a sample mean of 41, and the range would decrease because we're decreasing the confidence level. Now we just test the answers given:

      17.10 to 64.90: (17.10 + 64.90) / 2 = 41
      This solution does have the same sample mean, but it has a much greater range than the original interval, so it would correspond to a higher confidence level rather than a lower one. This can't be the answer.

      20.48 to 53.32: (53.32 + 20.48) / 2 = 36.9
      This isn't 41 so we can rule this answer out.

      21.56 to 56.12: (21.56 + 56.12) / 2 = 38.84
      This isn't 41 so we can rule this answer out.

      25.65 to 56.35: (25.65 + 56.35) / 2 = 41
      This answer does have the same sample mean and has a smaller range than the 95% confidence interval, so it is the correct answer.
      (211 votes)
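The midpoint-and-width check described in this reply can be sketched in Python. This is a minimal illustration, not an official solution method: the helper names (`center`, `width`, `plausible_lower_confidence`) and the tolerance are assumptions made here, and the check relies on the interval being symmetric around the sample result.

```python
import math

# 95% interval quoted in the problem, and the four answer choices.
ORIGINAL = (22.76, 59.24)
CHOICES = {
    1: (17.10, 64.90),
    2: (20.48, 53.32),
    3: (21.56, 56.12),
    4: (25.65, 56.35),
}

def center(interval):
    """Midpoint of the interval (the sample result, if the interval is symmetric)."""
    low, high = interval
    return (low + high) / 2

def width(interval):
    """Distance between the endpoints (twice the margin of error)."""
    low, high = interval
    return high - low

def plausible_lower_confidence(original, candidate, tol=0.005):
    """True if candidate shares the original's center and is strictly narrower."""
    same_center = math.isclose(center(original), center(candidate), abs_tol=tol)
    narrower = width(candidate) < width(original)
    return same_center and narrower

matches = [n for n, c in CHOICES.items() if plausible_lower_confidence(ORIGINAL, c)]
print(matches)  # [4]
```

Choice 1 passes the center test but fails the width test; choices 2 and 3 fail the center test.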
  • Hayato Shiotsu
    Hi, I thought that the correct choice would be 3 due to the following reasons:
    Basically, the range (difference between max possible value and min possible value) which the median could be in with a 95% confidence level is
    59.24 - 22.76 = 36.48
    Therefore, the range which the median could be in with a 90% confidence level should be
    90/95 * 36.48 = 34.56

    Now, if we calculate the range for all the choices:
    Choice 1 : 64.90 - 17.10 = 47.8
    Choice 2: 53.32 - 20.48 = 32.84
    Choice 3: 56.12 - 21.56 = 34.56
    Choice 4: 56.35 - 25.65 = 30.7

    Therefore, we see that the range in choice 3 is exactly the same as the range I calculated previously for 90% confidence level, therefore the correct answer is choice 3. Can anyone explain what I am getting wrong here?
    (43 votes)
    • Ishan Nandal
      You are pretty accurate about the difference of the two numbers, but an important factor you didn't consider is where the numbers are on the number line. Example: 41.56 and 76.12 also have the same difference, i.e. 34.56 but we know that our confidence level should be much less than 90% in this case.
      (15 votes)
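A further point on the 90/95 scaling: interval width does not generally shrink in proportion to the confidence level. As a rough sketch, assume the interval has the form estimate ± z × (standard error) with a normal critical value z. The problem does not say this, and intervals for a median are often built differently, so treat the numbers as illustrative only. The helper name `z_critical` is made up for this example.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided standard normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

# Width of the 95% interval quoted in the problem.
width_95 = 59.24 - 22.76

# Under the normal-approximation assumption, the width scales with z,
# not with the confidence level itself.
scale = z_critical(0.90) / z_critical(0.95)
width_90 = width_95 * scale

print(round(scale, 3), round(width_90, 2))
```

Under that assumption the scale factor is about 0.84 rather than 90/95 ≈ 0.947, giving a width of roughly 30.6 instead of 34.56. That happens to be close to choice 4's width of 30.7.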
  • Nitish Gupta
    I think that if your confidence level is lower, then you will choose a broader range; you wouldn't be sure of the exact value and would make assumptions. If your confidence level is 100%, you will know the exact median, but if it is 0%, then you may say that any observation could be the median. As your confidence level increases, you would narrow down to get closer to the median until you find the exact value. At that point, you will obviously have 100 percent confidence. But if you follow the video's logic, when you have 100% confidence (which means you know the answer), you will instead consider the whole data set as the range for the median! What do you think?
    (26 votes)
  • mishra.umang2205
    could you please explain how do we actually calculate the confidence level for a given data estimate?
    (17 votes)
  • Onetwothree45
    So...what makes choices 2 and 3 incorrect? Sal just said, "I like this choice" & then picked choice 4....
    (11 votes)
    • خالد (Khaled) Allen
      The main issue with 2 and 3 is that they offer values that are outside the range given in the question, which brings up problems about the distribution of numbers which we can't answer with the info given.

      For example, it's possible there is a big gap between the lowest score in the 95% confidence range ($22.76) and the next lowest data point. It could be $15. If that were the case, expanding the range to $20 would have no effect on the median. If there are a ton of data points between 22.76 and 20 compared to the other end of the range, it could shift the median. But again, we have no way of knowing, so we can't assume.

      Choice 4 offers two numbers that are inside the range given for 95% confidence. That way, we haven't added any new values or the potential statistical issues that come with them; we're just narrowing the range of numbers we know contain the answer (95% sure). Thus, it's the only answer that MUST have a lower confidence than the original set.
      (7 votes)
  • Halah B
    what in the world is a confidence level
    (9 votes)
  • Nipurna kunwar
    Can we do this type of question in the following way?
    solution:
    Let x be the median hourly wage at 100% confidence level.
    By question,
    95% of x = 22.76 or 59.24 ( you can do any one part of it)
    so x = approx. 23.96 or 62.36
    Now,
    90 % of x ( approx 23.96 or 62.36) = approx. 21.56 or 56.12
    (9 votes)
  • Nazish
    Alright, I'm a bit conflicted now. Because I tried doing the question before watching, for practice, and I used a calculator to be specific (though I still don't know whether we're allowed to use calculators in Sal's examples.)
    My working was simply: 90/95 * $22.76, and 90/95 * $59.24.
    My answer was the third one, not the fourth.
    (6 votes)
  • rhamza3899
    Bro you can't just eyeball it and tell us the answer, teach us the equation and how to apply it because we obviously can't eyeball it on the real SAT like you did here.
    (8 votes)
    • KnightStryke
      Actually, you can just eyeball it. Check out the link; at its most basic level, we need to find the answer choice that-
      A) Has the same sample median and B) has a smaller range

      A) needs to be true because it is based on the same sample, this means that the range is not going to be shifting left and right as a whole, it's just going to be getting "wider" or "smaller".
      B) needs to be true because there is less of a chance that the range contains the true median hourly wage. ("Lower confidence level") The only way there can be less of a chance is for the range to be smaller.
      (0 votes)
  • aya baydoun
    Hi.
    I used a cross product to find the answer, and my calculation led to choice C.
    I am not sure if that is correct. Can someone help me? And if it's not, please help me understand why the answer was another choice.
    (6 votes)

Video transcript

- [Instructor] A researcher collecting information about 1,000 randomly selected physical therapists concluded that the median hourly wage for physical therapists in the United States at the time of the study was between $22.76 and $59.24 with a 95% confidence level. Which of the following could represent the median hourly wage, based on the same sample, for physical therapists in the United States with a 90% confidence level?

So let's just think about what confidence level means. Remember, the true median is gonna be some number; the actual median hourly wage for physical therapists might be $30 an hour, or $25 an hour. They're trying to estimate it by doing this random selection, and then they're providing a range and saying, hey, there's some confidence level that this range captures that true median hourly wage. So when they say there's a 95% confidence level, they're saying that there's a 95% probability that the true median is between these two numbers. Now, if we're talking about a 90% confidence level, that means we are less confident that the true median is between these two numbers. In order to be less confident, you would wanna have an even narrower range. You would want a range that is a subset of this range right over here.

Let me make this clear. So the range is $22.76 all the way to $59.24, and they say it's a 95% confidence level. I'm gonna actually draw a number line here, and let's say these are just points on the number line. So there's 22.76, this is 59.24, and they're 95% confident that the true median is going to be between these two values. So they're 95% confident that the true median is there, or there, or there, but there's still a 5% chance that maybe the true median could be here. It could be below this range or above this range.

Now if we're talking about a lower confidence level, a 90% confidence level, that means this range should be narrowed. The only way you're gonna be less confident is if you have a narrower range. If you had a broader range, if it went from, say, here to here, you'd be even more confident. With this type of range you might have a 97% confidence level. So at 90%, less confident, you're looking at something that might look something like that.

Now let's look at the choices. So this is $17.10 to $64.90, so this is actually more like this one. You're starting lower and you're ending higher. So if you're 95% confident that this range captures it, you should be even more confident that this range captured it. So this would actually maybe be a 97 or 98, who knows, confidence level. Not a 90% confidence level, so we can rule that out.

20.48 to 53.32. So this one's interesting because it starts lower but then it ends lower too. So I don't know what you could actually say; it depends kind of on what the distribution that you selected was, and they didn't tell us a lot about that. This one feels a little bit shady.

21.56 to 56.12. This is also similar: it starts a little bit lower and then it ends a little bit lower. So it's kind of shifted the range, and so you don't know for sure. Well, they don't tell us a lot about the distribution, so I'm not gonna make too many statements there.

Now this last one goes from 25.65 to 56.35. So that's going to be, actually, like from here to here. This is going to be a narrower range, so you would be less confident. So this could be something that represents a 90% confidence level. So actually I would go with this one, because this is a subset of the previous range, so I like this choice right over here.
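The "subset of the previous range" test from the transcript can be sketched as a simple containment check. This is an illustration only; the function name `is_strict_subset` is an assumption introduced here, and the answer choices are the four intervals read out in the video.

```python
# 95% interval from the problem and the four answer choices read out in the video.
ORIGINAL = (22.76, 59.24)
CHOICES = {
    1: (17.10, 64.90),
    2: (20.48, 53.32),
    3: (21.56, 56.12),
    4: (25.65, 56.35),
}

def is_strict_subset(inner, outer):
    """True if inner lies entirely within outer and is not the same interval."""
    return outer[0] <= inner[0] and inner[1] <= outer[1] and inner != outer

nested = [n for n, c in CHOICES.items() if is_strict_subset(c, ORIGINAL)]
print(nested)  # [4]
```

Choices 1, 2, and 3 all start below $22.76, so only choice 4 sits entirely inside the original range.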