## AP®︎/College Statistics

Lesson 2: Potential problems with sampling

# Example of undercoverage introducing bias

Nonresponse, undercoverage, and voluntary responses can all introduce bias when we sample a population for a study. Given the description of a study, we can think about potential sources of bias, and how they may have impacted the results of the study.

• It isn't made clear how to tell whether the result is an overestimate or an underestimate in a typical case of undercoverage bias.

However, I can think of an argument that it could be an underestimate in this example. Because only landline users were sampled, some of the respondents may not use the internet at all, while the people left out are mobile phone users, who are more likely to use the internet on their phones. A separate study comparing internet usage between landline users and mobile phone users could give us a better picture. Assuming that mobile users are more likely to use the internet than landline users, and given that a sample that may include non-internet users shows 42% concerned about internet privacy, you could argue that the number would go up if mobile phone users were included. Your thoughts on this?
• If undercoverage is a concern, is there also such a thing as overcoverage? Since undercoverage means leaving out individuals who should be included, would overcoverage mean including individuals who don't belong in the population (e.g., counting men as part of the population for a pregnancy study)?
• If you're overcovering some groups, you're also undercovering other groups. So overcoverage and undercoverage are the same thing. You just look at it from a different perspective.
• What are all the types of biases?
• If the choices for the question presented here were 'Convenience sampling' and 'Undercoverage', which one would have been the correct answer?
• It would still be undercoverage. It can't be convenience sampling because it is not convenient to continuously call people until they respond to your survey.
• The videos on bias start off without an introduction to what types of bias exist and what each one means. Perhaps I missed the video.
I would like to know the kinds of bias that exist and how each is explained. Can someone point me in the right direction?
• If there is undercoverage bias, how do we know that 42% is an underestimate? What if the rest of those unsampled people (the unlisted people) were also 42% concerned? Or what if they brought the percentage up? Is that still an "underestimate"?
• Sal was saying that it is likely an underestimate because of the nature of the particular scenario. The people who were not included in the survey (mobile and unlisted phone numbers) may, according to Sal's logic, be more concerned about internet privacy than the people who are in the yellow pages. Therefore, if you took a broader survey including mobile and unlisted numbers, you would be likely to see that the percentage of people who are "very concerned about internet privacy" is more like 43% or 44%, if not higher. That is what is meant by the statement that 42% is an underestimate. I hope this has helped more than it has confused. :)
• A survey of high school students to measure teenage use of illegal drugs will use a biased sample because it does not include home-schooled students or dropouts. A sample is also biased if certain members are underrepresented or overrepresented relative to others in the population.
• Don't you also get pro-privacy bias from the repeated calling until you get a response? After all, you invaded people's privacy and made them mad by calling them a bunch of times, so they'll value privacy more than they usually do.
• Interesting point. That may definitely affect the responses. This is called response bias, since the way the survey is conducted affects the response of the individual.
• I have a question about the exercise. Here is what is confusing: "People who listen to David's show probably like it in the first place, and those that choose to take the time to visit the website and respond to the poll probably feel even stronger than the typical listener. 89% is probably an overestimate of the percentage of all listeners that love the show." Why is it an overestimate and not an underestimate? If more people who like the show did not respond, then 89% should be an underestimate, not an overestimate. Does anyone know?
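
Several of the questions above come down to the same arithmetic: the true population percentage is a weighted average of the group that was sampled and the group that was missed, so the direction of the bias depends on how the missed group compares with the sampled one. Here is a minimal Python sketch of that idea. Only the 42% and 89% figures come from the examples discussed above; the group shares (60%, 10%) and the rates for the missed groups (50%, 60%) are made-up assumptions for illustration, and the function names are hypothetical.

```python
def population_proportion(share_sampled, p_sampled, p_missed):
    """Overall proportion as a weighted average of the sampled and missed groups."""
    if not 0.0 <= share_sampled <= 1.0:
        raise ValueError("share_sampled must be between 0 and 1")
    return share_sampled * p_sampled + (1.0 - share_sampled) * p_missed


def bias_direction(sample_estimate, true_value, tol=1e-9):
    """Say whether the sample estimate falls below, above, or on the true value."""
    if sample_estimate < true_value - tol:
        return "underestimate"
    if sample_estimate > true_value + tol:
        return "overestimate"
    return "no bias"


# Undercoverage example: 42% of listed landline users are very concerned.
# ASSUMPTION: listed landlines are 60% of the population, and 50% of the
# missed group (mobile and unlisted numbers) is very concerned.
privacy_true = population_proportion(0.60, 0.42, 0.50)
privacy_bias = bias_direction(0.42, privacy_true)

# If the missed group were also 42% concerned, there would be no bias.
privacy_same = population_proportion(0.60, 0.42, 0.42)
privacy_same_bias = bias_direction(0.42, privacy_same)

# Voluntary response example: 89% of poll respondents love the show.
# ASSUMPTION: respondents are 10% of all listeners, and 60% of the
# listeners who did not respond love the show.
show_true = population_proportion(0.10, 0.89, 0.60)
show_bias = bias_direction(0.89, show_true)

print(f"privacy: true = {privacy_true:.3f}, 42% is an {privacy_bias}")
print(f"privacy, same rate in both groups: {privacy_same_bias}")
print(f"show: true = {show_true:.3f}, 89% is an {show_bias}")
```

Under these assumptions, 42% is an underestimate only because the missed group is more concerned than the sampled group. If both groups had the same rate there would be no bias, and if the missed group were less concerned, 42% would be an overestimate. The radio show case runs the other way: the people who chose to respond are assumed to like the show more than the listeners who stayed silent, so the sample figure sits above the weighted average for all listeners.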