# Techniques for random sampling and avoiding bias

## Want to join the conversation?

• what is the difference between a cluster sample and a stratified random sample?
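• In cluster sampling, there are many groups, each with a balanced mix of levels/genders, and you survey whole groups. In stratified sampling, there are also many groups, but each group consists entirely of one level/gender, and you sample from within every group. Other than that, the two are similar.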
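To make the distinction concrete, here is a minimal Python sketch. The function names, year levels, room names, and group sizes are all made up for illustration; none of them come from the video.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Draw a random sample of `per_stratum` members from EVERY stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick `n_clusters` whole clusters at random and keep ALL their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical strata: each group is entirely one year level.
strata = {year: [f"{year}-{i}" for i in range(50)]
          for year in ("freshman", "sophomore", "junior", "senior")}

# Hypothetical clusters: classrooms, each assumed to hold a mix of students.
clusters = {f"room-{r}": [f"room-{r}-student-{i}" for i in range(25)]
            for r in range(8)}

by_year = stratified_sample(strata, 25, rng)   # some students from every year
by_room = cluster_sample(clusters, 4, rng)     # every student from a few rooms
```

In the stratified case every group contributes a few members; in the cluster case a few groups contribute all of their members.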
• Is it possible that the clustering technique itself can introduce bias? Sal's example of sampling by classroom might allow selection of an even male/female sample, but isn't this a bit risky? Factors that affect the outcome (maybe more strongly than gender) may cluster in classrooms, e.g. teacher quality, classroom resources, social groups, or some unknown factor(s). This may bring us back to the problem of sampling randomly and fairly from among the clusters.
• Yes, the clustering technique itself can introduce bias if factors that affect the outcome are clustered within the groups being sampled (in this case, classrooms). For example, if classrooms differ significantly in teacher quality, resources, or peer influences, sampling by classroom may not adequately represent the diversity within the school population. To mitigate this risk, careful consideration should be given to how clusters are defined and whether each cluster reasonably reflects the variety found in the whole population.
• When would you use non-random sampling?
• To be realistic (and a bit cynical): when you want to win an election or a competition, or simply to make yourself look good to others.

I mean, non-random sampling lets you massage the statistics however you like.
• Sal mentions that in a stratified sample he could take 25 students from each year to make up the 100 student sample. But what if there are say 50% more seniors than juniors. Wouldn't you have to take more from the seniors in order to reduce the bias?
• In a stratified sample, the goal is to ensure proportional representation of each stratum (e.g., each year level). If there are more seniors than juniors, then indeed, you would need to sample more seniors to maintain proportionality and reduce bias. The sample size from each stratum should be proportional to the size of that stratum within the population to ensure accurate representation.
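As a rough sketch of that proportional allocation in Python: the enrollment figures below are invented so that there are 50% more seniors than juniors, and `proportional_allocation` is a hypothetical helper, not something from the video. Rounding uses the largest-remainder rule so the pieces still add up to the total sample size.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split `total_sample` across strata in proportion to each stratum's size."""
    population = sum(stratum_sizes.values())
    # Exact (possibly fractional) share for each stratum.
    exact = {k: total_sample * n / population for k, n in stratum_sizes.items()}
    # Round down first, then hand leftover slots to the largest remainders.
    alloc = {k: int(v) for k, v in exact.items()}
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

# Assumed enrollment: 300 seniors vs. 200 juniors (50% more seniors).
sizes = {"freshman": 200, "sophomore": 200, "junior": 200, "senior": 300}
allocation = proportional_allocation(sizes, 100)
print(allocation)
# {'freshman': 22, 'sophomore': 22, 'junior': 22, 'senior': 34}
```

With these made-up numbers, an equal split of 25 per year would under-sample the seniors by nine students.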
• Can't you just split your age group sample into genders too?
• That's definitely a possibility.

Cluster surveys are quick and effective, though. Instead of tracking people down one by one, half of whom will probably say they don't have time to answer, you just go into a classroom and wait five minutes. Because the students are in a group, it's much easier to get a response from everyone.
• Doesn't the clustered system introduce a lot of bias?

For instance, in the example in the video they seem to choose a single class in each of the 4 years.
- Students within a single class share a lot more history than randomly chosen students do, so that will probably influence the results.
- If one of the chosen classes is significantly smaller or bigger than the other chosen classes, that year will be over- or under-represented ...
• Indeed, the clustered sampling method, as described, may introduce biases due to shared experiences within selected classes and variations in class sizes. By choosing only one class from each year level, the survey may inadvertently reflect the unique dynamics and characteristics of those specific classes rather than providing a representative sample of the entire student population. Additionally, unequal class sizes could lead to disproportionate representation of certain year levels, further skewing the results. To mitigate these biases, it's crucial to implement random or systematic approaches for class selection, consider stratification based on relevant factors, increase the number of sampled clusters, and employ robust data analysis techniques to account for discrepancies and ensure the reliability of the survey findings.
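One possible way to handle two of those suggestions (choosing classes at random and correcting for unequal class sizes) is sketched below in Python. This is only an illustration under assumptions: `sample_classes`, `weighted_mean`, the survey responses, and the enrollment numbers are all hypothetical. Each sampled year's average is weighted by that year's share of total enrollment, so a large or small class does not distort how much its year counts.

```python
import random

def sample_classes(classes_by_year, per_year, rng):
    """Randomly pick `per_year` whole classes from each year level."""
    return {year: rng.sample(classes, per_year)
            for year, classes in classes_by_year.items()}

def weighted_mean(sampled, year_sizes):
    """Average responses within each year, then weight years by enrollment."""
    total = sum(year_sizes.values())
    estimate = 0.0
    for year, classes in sampled.items():
        responses = [r for cls in classes for r in cls]
        year_mean = sum(responses) / len(responses)
        estimate += (year_sizes[year] / total) * year_mean
    return estimate

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Made-up survey answers (1 = yes, 0 = no); each inner list is one class.
classes_by_year = {
    "junior": [[1, 1], [1, 1, 1]],
    "senior": [[0, 0, 0, 0], [0]],
}
year_sizes = {"junior": 100, "senior": 300}  # assumed enrollment per year

picked = sample_classes(classes_by_year, 1, rng)
estimate = weighted_mean(picked, year_sizes)
```

Weighting fixes the over- or under-representation of a year, but not the shared-history problem; for that, sampling more classes per year (a larger `per_year`) is the usual remedy.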
• When do you use stratified sampling vs clustered sampling besides cluster sampling being more for geographical purposes?
• Doesn't the stratified method also introduce some bias?

For instance, the method used in the example in this video assumes that the school has an equal number of students in each of the 4 years ...
• While the stratified method aims to reduce bias by ensuring representation of different subgroups, it can still introduce bias if the stratification criteria are not chosen appropriately or if the sampling within each stratum is not conducted properly. For example, if the strata are defined based on characteristics that are not relevant to the research question or if the sample size within each stratum is not proportional to its size in the population, bias may occur. Additionally, stratification does not eliminate bias entirely but rather aims to minimize its impact by providing more accurate estimates for each subgroup.