Reasonable samples

To make a valid conclusion, you'll need a representative, not skewed, sample. Created by Sal Khan.

Want to join the conversation?

  • georgefosberry
    At , why would asking the whole class be less efficient? Wouldn't the class be the only relevant demographic in this case?
    (4 votes)
  • DabbleOPS
    I thought that when a computer generates something random, it's not truly random, but rather pseudo-random?
    (4 votes)
  • DanikaHunt
    So what percent of a population is reasonable to sample? For example, if there were 60,000 students in a school, would 50 be a reasonable number of students to ask for their opinions?
    (4 votes)
    • David Severin
      Rather than an exact percentage, the more important requirement is random sampling, so 50 may be a reasonable number if it is truly a random sample. Think about a presidential race: there is no easy way to get even close to 1% of possible voters (138+ million people voted in 2016, so 1% is about 1.38 million people, 0.1% is 138,000, and 0.01% is still 13,800 voters, which is still a lot), so pollsters have to define the parameters for whose opinions they choose to sample.
      Statistics can be misleading if the sample is not random, as with the number of toothpaste brands that 4 out of 5 dentists recommend.
      (1 vote)
  • Matthew Gavilan
    What is the main difference between random and systematic?
    (4 votes)
    • deka
      In a statistical context:

      Systematic means there is a fixed time or spatial interval between sampled data points (say, checking the humidity in a room every hour on the hour), so the sampling is predictable (a 1-hour interval).

      Random means there is no predictable or biased way of drawing the sample (say, picking 6 balls out of 45 in a covered box).

      But some sampling schemes are more complicated and combine both (say, picking the digit in every odd decimal place of pi: 3.'1' 4 '1' 5 '9' 2 and so forth).

      That case seems systematic since it has a predictable interval, but it also looks random, because after a (quite long) sequence of picks you have no practical way to predict the next digit of pi; so it is random in some sense.

      In a word, predictability is the key difference between random sampling and systematic sampling.
      (0 votes)
  • Buk Jhala
    I had exam questions that are hard to solve. Is there somewhere I can seek help? I could attach the 6-10 questions I need answered. Do you think I can get help with this?
    (1 vote)
  • Shlok
    what website do you get these questions on?
    (1 vote)
  • yay95
    A magazine asks people to visit its website to vote for Australia's most popular TV star.

    Why would this survey's sample be biased?
    (1 vote)
  • flashbolt
    With the first problem, wouldn't it be better to ask the parents, because they would be most affected by the plan?
    (0 votes)
  • aubriana.small
    what is the difference
    (0 votes)
  • bubbabumblebee1000
    what is your favorite thing you like
    (0 votes)

Video transcript

City Councilwoman Kelly wants to know how the residents of her district feel about a proposed school redistricting plan. Which of the following survey methods will allow Councilwoman Kelly to make a valid conclusion about how residents of her district feel about the proposed plan?

So before we even look at these, we just have to realize that if you're trying to make a valid conclusion about how the residents of her entire district feel about the proposed plan, she has to find a representative sample, not a skewed sample that would just sample parts of her district. So let's look at her choices.

Should she just ask her neighbors? Well, she might live in a part of the neighborhood that might unusually benefit from the redistricting plan or might get hurt by the redistricting plan. And so just her neighbors wouldn't be representative of the district as a whole. So just asking her neighbors probably does not make sense.

Ask the residents of Whispering Pines Retirement Community. So once again, the first one skews by geography. She's oversampling her neighbors and not the entire district. Here, she's oversampling a specific age demographic. So here she is oversampling older residents who might have very different opinions than middle-aged or younger residents. So that doesn't make sense either.

Ask 200 residents of her district whose names are chosen at random. Well, that seems reasonable. There's still some chance that you somehow oversampled in one direction or another, but it's most likely to give a reasonably representative sample. And this is a pretty large sample size. So it's important to ask, what is the random process, and where is she getting these names from? But this actually does seem reasonable.

Ask a group of parents at the local playground. Well, once again, this is just like asking your neighbors. And it's also sampling a specific demographic. Now, this might be the demographic that cares most about the schools. But she wants to know how the whole district feels about the redistricting plan. And once again, this is at a local playground. This isn't at all the playgrounds in the district somehow. So I wouldn't do this one either.

Let's do one more of these. Mimi wants to conduct a survey of her 300 classmates to determine which candidate for class president, Napoleon Dynamite or Blair Waldorf, is in the lead in the upcoming election. Mimi will ask the question, if the election were today, which candidate would get your vote? Which of the following methods of surveying her classmates will allow Mimi to make valid conclusions about which candidate is in the lead?

So let's see, ask all of the students at Blair's lunch table? No. That would skew it in Blair's favor, probably. That's not a representative sample.

Ask all the members of Napoleon's soccer team? No, same thing. They're likely to go Napoleon's way, or maybe they don't like Napoleon and they'll go against him. But either way this seems like a skewed sample.

Put the names of all the students in a hat and draw 50 names. Ask those students whose names are drawn. Well, this seems like a nice random sample that could be nicely representative of the entire population.

Ask all students whose names begin with N or B? Well, this could be perceived as kind of random. But notice, N is the same starting letter as Napoleon, and B is the same starting letter as Blair. You might say, well, that's fair. You're doing it for each of their letters. But maybe there are 10 people whose names start with an N and only two people whose names start with a B. Once again, you're not even getting a large sample. And then on top of that, maybe people with the same starting letter somehow like each other more. So I would steer clear of this one.

Ask every student in the class? Well, that would work. There are 300 classmates, so that might not be that time consuming. You can't get a better sample than asking everyone in the population. Which of the following methods of surveying her classmates will allow Mimi to make a valid conclusion about which candidate is in the lead? Well, that would give a pretty good conclusion. People might change their minds, so it's not a done deal. But you can't get a better sample size than the entire population.

Assign numbers to each student in the class and use a computer program to generate 50 random numbers between 1 and 300. Ask those students whose numbers are selected. Well, this is pretty close to putting the names of all the students in a hat and drawing 50 names. So I would choose this one. That seems reasonable as well.
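The last option, numbering the students and letting a computer pick 50 of them, might look something like the sketch below. This is only an illustration, not the program from the problem: the function name `draw_sample` and the `seed` parameter are made up here, and it assumes the 50 numbers should all be different (drawing without replacement, like names from a hat). Python's `random` module is a pseudo-random generator, which is generally considered fine for a class survey like this one.

```python
import random

def draw_sample(population_size=300, sample_size=50, seed=None):
    """Pick distinct student numbers from 1..population_size.

    Hypothetical helper for illustration. Sampling is without
    replacement, so no student is selected twice.
    """
    rng = random.Random(seed)  # seed=None gives a different draw each run
    student_numbers = range(1, population_size + 1)
    return sorted(rng.sample(student_numbers, sample_size))

# Example: a repeatable draw (the seed value is arbitrary)
chosen = draw_sample(seed=2024)
print(len(chosen), "students selected; first few:", chosen[:5])
```

Every student has the same chance of being picked, which is what makes this equivalent to the hat method. Fixing the seed just makes the draw repeatable so someone else can check it.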