
© 2024 Khan Academy

# Identifying a sample and population

Identifying a sample and population.

## Want to join the conversation?

- If I have a data set of one individual's running time over 10 weeks, will it be sample or population data? (10 votes)
- This will be a sample, as you are observing the running time of only one individual. (10 votes)

- I feel some of these problems on population/sample are ambiguously worded.

There's one about interest in out-of-state cars crossing a multi-lane toll bridge, and they sample every tenth car in one lane via a camera. The correct choice was population = all the cars in the one lane.

To me, this is incorrect. The setup clearly said they were interested in out-of-state cars *crossing the toll bridge*. We weren't told that out-of-state traffic only uses one lane, and it would be a weird leap to infer that as well.

We also weren't told there was any difference between the lanes. For all we know, they only had one camera available for this and they picked a lane at random from which to sample. So it shouldn't matter from which lane or lanes they are drawing their sample. To me, the sample is every tenth car in the photographed lane, and the population they were after was all out-of-state traffic on the bridge. The population is the ENTIRE group we are interested in, and as phrased that means all the out-of-state traffic on the whole bridge.

There is nothing in the question as it is phrased to tell me that only traffic in one lane is their interest. They sampled one portion of one lane of the bridge to extrapolate to all traffic on the bridge.

Where am I wrong on this? (8 votes)

- I feel that since the camera doesn't move from lane to lane periodically, it only takes the one lane into account as the population. If you were, for instance, measuring all the cars in that lane, you would have a measurement of the population and not a sample.

The misconception comes from the interpretation of what a sample is: it is a randomly chosen selection from a population. The question is trying to trick you into thinking that the cars on the entire bridge are the population, but the cars in the other lanes have no chance of being randomly chosen, which means they are not part of the population. (2 votes)

- Tell me if I am correct. So basically, the population is the total? Then the sample is the people getting surveyed? (5 votes)
- Hi Mitchell. Yes, the population is the total, and the 100 seniors chosen to be surveyed are the sample. (1 vote)

- Is a research question different from a hypothesis?

Do we employ different strategies to gauge outcomes in both cases?

I read an article, and there was a definition of statistical analysis, and it goes like this;

Statistical analysis means investigating trends, patterns, and relationships using quantitative data. To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process.

You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

If you have noticed, in the second paragraph, it says that you need to specify your hypothesis first in order for the analysis to even take place. So does that mean we can't use statistical analysis with research questions?

But that is not really the case, since we do, if seldom, collect data for research questions as well, don't we? So, what's this conundrum? (2 votes)

- Yes, a research question and a hypothesis are distinct concepts. A research question is a broad inquiry into a topic, seeking to understand or explore a phenomenon. A hypothesis, on the other hand, is a specific, testable statement that predicts the relationship between variables. While both are integral to the research process, they serve different purposes. Strategies for gauging outcomes may vary depending on the nature of the research question or hypothesis, but statistical analysis can be applied in both scenarios. (2 votes)

- Which of these samples is most representative of the entire school population?(1 vote)
- The most representative sample of the entire school population would be one that includes a diverse selection of students across different grades and demographics. A random sample of students from each grade level, ensuring proportional representation, would provide a more comprehensive understanding of the school population's sentiments.(1 vote)
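As a rough illustration of the proportional, per-grade random sampling described in the answer above, here is a minimal Python sketch. The enrollment figures, the sample size of 100, and the student identifiers are all made-up assumptions, chosen so that each grade's share divides evenly; with other numbers, rounding could leave the total off by one or two.

```python
import random
from collections import Counter

# Hypothetical enrollment per grade (assumed numbers, not from the exercise).
enrollment = {9: 300, 10: 280, 11: 220, 12: 200}
total_students = sum(enrollment.values())
sample_size = 100  # assumed overall sample size

rng = random.Random(1)  # fixed seed so the sketch is repeatable

sample = []
for grade, count in enrollment.items():
    # Hypothetical student records: (grade, index within grade).
    students = [(grade, i) for i in range(count)]
    # Each grade contributes in proportion to its share of the school.
    k = round(sample_size * count / total_students)
    sample.extend(rng.sample(students, k))

per_grade = Counter(grade for grade, _ in sample)
print(len(sample))
print(dict(per_grade))
```

Each grade's share of the sample matches its share of the school, so no grade is over- or under-represented by design.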

- What does senior mean exactly in the USA? All of the students in a high school, or just those in the final year of high school? (1 vote)
- Senior means the final year of high school (12th grade) or of college in the US. This scenario is about a US high school. (1 vote)

- Wouldn't the population be all students at Riverview High? Even though we are sampling the seniors, all Riverview High students still eat lunch there, so the data would be relevant to them, no?(1 vote)
- The population is still all of the seniors. (1 vote)
- If I have a data set of one individual's running time over 10 weeks, will it be sample or population data? (1 vote)
- You should use the population variance if you have all the data. If you only have a sample, you do not know the true mean, and as a result the variance computed around the sample mean tends to be slightly lower than the true variance. To account for this, we use the sample variance.

So to summarise, it will depend on the precise context. (1 vote)
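To make the population-versus-sample variance distinction above concrete, here is a small sketch using Python's standard `statistics` module. The ten running times are invented for illustration. `pvariance` divides by n (treating the data as the whole population), while `variance` divides by n - 1 (treating the data as a sample).

```python
import statistics

# Hypothetical running times in minutes, one per week (made-up values).
times = [31.0, 30.5, 29.8, 30.2, 29.5, 29.9, 29.1, 28.8, 29.0, 28.6]
n = len(times)

# Treat the 10 weeks as the entire population of interest: divide by n.
pop_var = statistics.pvariance(times)

# Treat the 10 weeks as a sample of the runner's times: divide by n - 1.
samp_var = statistics.variance(times)

print(pop_var, samp_var)
print(samp_var > pop_var)  # the sample variance is the larger of the two
```

Which function is appropriate depends on whether those 10 weeks are everything you care about or a sample from a longer stretch of running.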

- A population is the entire group that you want to draw conclusions about. A sample is the specific group that you will collect data from.(1 vote)

## Video transcript

- [Narrator] Administrators at Riverview High School surveyed a random sample of 100 of their seniors to see how they felt about the lunch offerings at the school's cafeteria. So you have all of the seniors, I'm assuming there's more than a hundred of them, and then they sampled a hundred of them. So the population is all of the seniors at the school. That's the population, all of the seniors. And they sampled a hundred of them. So the hundred seniors that they talked to, that is the sample.

So they tell us, identify the population and the sample in this setting. So let's just see which of these choices actually match up to what I just said. And like always, I encourage you to pause the video and see if you can work through it on your own.

So, the population is all high school seniors in the world; the sample is all of the seniors at Riverview High. No, this is not right. We're not trying to get an indication of how all of the high school seniors in the world feel about the food at Riverview High School. We're trying to get an indication of how the seniors at Riverview High School feel about the lunch at the school's cafeteria. So they did a sample of a hundred of them. So this is definitely not going to be-- let me cross this one out.

The population is all students at Riverview High; the sample is all of the seniors at Riverview High. Well, they clearly didn't sample all of the seniors, they sampled a hundred of the seniors. So this isn't gonna be right either. Let's hope that the third choice is working out.

The population is all seniors at Riverview High; the sample is the hundred seniors surveyed. Yep, that's exactly what we talked about here. We're trying to get an indication about how all of the seniors at Riverview High feel about the food, the lunch offerings. We probably think it's impractical, or the administrators feel it's impractical, to talk to everyone to get exactly what the population thinks. So instead they're gonna do a random sample of a hundred of them. So the sample is the hundred seniors who are actually surveyed.
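The setup in the video can be sketched in a few lines of Python. This is only an illustration: the roster size of 450 and the `senior_NNN` labels are assumptions (the video only suggests there are more than a hundred seniors), and `random.sample` stands in for whatever method the administrators actually used to pick students.

```python
import random

# Population: every senior at the school.
# The roster size of 450 is an assumption made for this sketch.
population = [f"senior_{i:03d}" for i in range(1, 451)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Sample: 100 seniors chosen at random, without replacement.
sample = rng.sample(population, k=100)

print(len(population))                 # size of the population
print(len(sample))                     # size of the sample
print(set(sample) <= set(population))  # every sampled senior is in the population
```

The sample is always a subset of the population, and conclusions drawn from it are meant to describe the whole roster, not just the 100 students surveyed.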