# Random sampling vs. random assignment (scope of inference)

Hilary wants to determine if any relationship exists between Vitamin D and blood pressure.
She is considering using one of a few different designs for her study.
Determine what type of conclusions can be drawn from each study design.

## Scenario 1

Hilary obtains a random sample of residents from her town. She surveys those residents on whether or not they consume Vitamin D and how much Vitamin D they get. She also measures their blood pressures.
Suppose Hilary finds that, among the people sampled, those who consumed higher amounts of Vitamin D had significantly lower blood pressure than those who consumed less.
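As a minimal sketch of what "obtains a random sample" means in Scenario 1, the snippet below draws a simple random sample in Python. The resident labels, the town size of 5,000, and the sample size of 100 are illustrative assumptions; the scenario does not state any of them.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct members of population, each equally likely to be chosen."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    return rng.sample(population, n)

# Hypothetical town roster; the scenario gives no population size.
residents = [f"resident_{i}" for i in range(5000)]
sample = simple_random_sample(residents, 100, seed=1)
print(len(sample), len(set(sample)))  # 100 100
```

Because every resident has the same chance of being chosen, nothing about a person (health, diet, willingness to volunteer) influences whether they end up in the sample.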
Problem A (Scenario 1)
Based on this study, for which group can we safely say this result probably holds true?

Problem B (Scenario 1)
Can we conclude that the difference in blood pressures is caused by the Vitamin D?

## Scenario 2

Hilary recruits residents from her town who have physical exams scheduled in the next month with the local doctor's office. She randomly assigns the volunteers to either a Vitamin D supplement pill or a placebo pill. Participants do not know which pill they are taking. They have their blood pressures measured before the study begins and at the end of the study.
Suppose Hilary finds that the group who took the Vitamin D supplements had a significant decrease in blood pressure, while the placebo group showed no significant change in blood pressure.
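The random assignment step in Scenario 2 can be sketched in a similar way. This is one possible approach, not necessarily how Hilary would do it: shuffle the volunteers and split the list in half. The count of 40 volunteers and the group labels are hypothetical.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"vitamin_d": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical volunteers; the scenario does not say how many there are.
volunteers = [f"volunteer_{i}" for i in range(40)]
groups = randomly_assign(volunteers, seed=2)
print(len(groups["vitamin_d"]), len(groups["placebo"]))  # 20 20
```

Note that the volunteers themselves were not chosen at random from the town. Only their placement into groups is random.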
Problem A (Scenario 2)
Based on this study, for which group can we safely say this result probably holds true?

Problem B (Scenario 2)
Can we conclude that the difference in blood pressures is caused by the Vitamin D?

Note: In the real world, we can't ethically take a random sample of people and make them participate in a study involving drugs. However, there are more advanced methods for controlling for this type of selection bias. When we rely on volunteers for testing new drugs and we see significant results, we need to be willing to assume that the volunteers are representative of the larger population. We can also repeat the study on a different group of volunteers to see if we get the same results.
Key idea: If a sample isn't randomly selected, it may not be representative of the larger population. On the AP test, be ready to apply this concept and some nuance when it comes to discussing if a sample is representative of the larger population.
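To illustrate the key idea, here is a small simulation sketch. Every number in it is made up: a hypothetical town of 10,000 residents with roughly normal blood pressures, and a deliberately exaggerated non-random sample in which only residents with above-median blood pressure volunteer.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical town: 10,000 systolic blood pressures (mmHg), roughly normal.
population = [rng.gauss(120, 15) for _ in range(10_000)]
pop_mean = statistics.mean(population)

# Random sample: every resident is equally likely to be chosen.
random_sample = rng.sample(population, 200)

# Non-random sample (exaggerated on purpose): only residents above the
# median blood pressure volunteer.
cutoff = statistics.median(population)
volunteer_pool = [bp for bp in population if bp > cutoff]
volunteer_sample = rng.sample(volunteer_pool, 200)

# How far each sample mean lands from the true population mean.
random_error = abs(statistics.mean(random_sample) - pop_mean)
volunteer_error = abs(statistics.mean(volunteer_sample) - pop_mean)
print(f"random sample error:    {random_error:.1f} mmHg")
print(f"volunteer sample error: {volunteer_error:.1f} mmHg")
```

In this setup the random sample's mean should land much closer to the town's true mean than the volunteer sample's mean does. Real selection effects are usually subtler than this, which is why judging representativeness takes some nuance.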

## Summary

The table below summarizes what type of conclusions we can make based on the study design.
| | Random sampling | Not random sampling |
|---|---|---|
| Random assignment | Can determine causal relationship in population. This design is relatively rare in the real world. | Can determine causal relationship in that sample only. This design is where most experiments would fit. |
| No random assignment | Can detect relationships in population, but cannot determine causality. This design is where many surveys and observational studies would fit. | Can detect relationships in that sample only, but cannot determine causality. This design is where many unscientific surveys and polls would fit. |
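The summary can also be written as a small lookup, which may help when checking a study design against it. The function name `scope_of_inference` and its parameters are illustrative, and the returned strings paraphrase the summary above.

```python
def scope_of_inference(random_sampling: bool, random_assignment: bool) -> str:
    """Return the kind of conclusion a study design supports."""
    # Random sampling decides WHO the result applies to.
    who = "the population" if random_sampling else "that sample only"
    # Random assignment decides WHETHER a causal claim is justified.
    if random_assignment:
        return f"Can determine a causal relationship in {who}."
    return f"Can detect relationships in {who}, but cannot determine causality."

# Scenario 1: random sample of residents, no random assignment.
print(scope_of_inference(random_sampling=True, random_assignment=False))
# Can detect relationships in the population, but cannot determine causality.

# Scenario 2: volunteers (not a random sample), randomly assigned to pills.
print(scope_of_inference(random_sampling=False, random_assignment=True))
# Can determine a causal relationship in that sample only.
```

The two inputs are independent questions: how participants were selected sets how far the result generalizes, and how treatments were allocated sets whether causation can be claimed.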

## Want to join the conversation?

• what is the meaning of life
(11 votes)
• Can you delve a bit deeper into generalizability please?

~HarleyQuinn
(9 votes)
• Generalizability is a measure of how useful the results of a study are for a wider group of people. For example, if the results of a study are broadly applicable to several different types of people or situations, the study is said to have good generalizability. I hope this answers your question.
(2 votes)
• I understand the basic idea of why randomization is so important for drawing valid conclusions in any study design, but what I don't really get is, for Scenario 1, how can we be so sure that Hilary's random sample is truly representative of all residents in the town? She could've randomly sampled 3 people in her town, which I think may be an insufficient amount of data to draw any valid conclusion. Conversely, she could have sampled, say, 100 or even 1,000 people. But we just don't know, because the prompt doesn't say. Do we just assume that, when it says "obtains a random sample," the sample has a sufficient number of observations? Or does this even matter at all? Appreciate the help! :)
(8 votes)
• By the law of large numbers, the larger the sample size, the closer the sample mean tends to be to the population mean. A common rule of thumb, tied to the Central Limit Theorem, is that a sample of 30 or more is large enough for many inference procedures. Randomization is what prevents bias. So if the sample is random and has 30 or more observations, we can usually treat it as adequate.
(1 vote)
• For Scenario 2, why does the result only hold true for the people involved in the experiment and not the whole town? I'm not sure I understand this part.
(2 votes)
• Because she only recruited residents of the town who had physical exams scheduled. That group cannot represent the whole town, because it was not obtained by random sampling.
Hope this helps!
(7 votes)
• Problem A scenario 2 is absolutely ridiculous. Coincidence does not equal causality. Just because the ones taking vitamin D happened to have lower blood pressure absolutely does not unequivocally make one the cause of the other. This is simply incorrect.
(4 votes)
• Actually, it is correct. Perfectly! Check it out:
OK, so we have a group of adults with something in common: physical exams in the coming month. Because they were randomly assigned to the two pills, the groups should be at about the same health level on average. The only systematic difference between them was whether they took the Vitamin D pill, so this was a very effective experiment.
The pill caused blood pressure to go down.
Do you get it now? Hope this helps!
(1 vote)
• I got myself into a pretty deep hole by taking ap stats.
(3 votes)
• I do not agree with your contention that mere correlation (the result of a statistical analysis of a limited number of human subjects) can ever establish "causation". Cause and effect are categories of human action. One ought never conflate mere statistical correlation, no matter how "perfect" it appears to be, with causation. It is an epistemological error.
(2 votes)
• While causation might not be perfectly established, multiple experiments showing the same results can, as you say, shrink the error bars to the point where they are no longer relevant. There is a small chance that the sun might not rise tomorrow, but would you change your plans for tomorrow based on that extremely small chance?
(1 vote)
• How can we differentiate between a randomized block design (RBD) and a completely randomized design (CRD) by looking at an experimental design layout?
(2 votes)
• Regarding representation: in random sampling, each person has an equal chance of being selected, and nothing about their conditions is known in advance. The sample could include anyone from the population, with or without particular conditions, so it is not selective.

For random assignment, researchers can be selective and choose which group from the population to conduct their study on, so how can this be representative of the population? It's not, because the selected group's condition(s) don't apply to the whole population.

Regarding causality: with random sampling alone, many other factors apply to each person (taking vitamins D and C, sleeping early, etc.), so no causality can be inferred, since confounding factors exist.

For random assignment, causality can be inferred. The treatment caused the effect because random assignment balances other factors, on average, across the placebo group and the treatment group. A causal conclusion can therefore be drawn, but only for those two groups.
(1 vote)
• Jared is interested in finding out which of two types of soda the students in his school prefer. To find out, he wants to randomly select 50 students to participate in a study. Read the following options and determine if they do or do not represent a random selection.

1. All of the students select a marble from a bag, and the 50 students with green marbles participate.
2. Jared asks 50 of his friends to participate in the study.
3. The names of all of the students in the school are put in a bowl and 50 names are drawn.
4. The first 50 students who come into the cafeteria are asked to participate.
5. All of the students vote, and the 50 students with the most votes participate.

Which of these represent a random selection, and which do not?
(1 vote)