
## AP®︎/College Statistics

Lesson 4: Statistics for two categorical variables

# Marginal and conditional distributions

Marginal distributions are the row or column totals of a two-way table (or joint distribution table); they show the distribution of one variable on its own, as counts or as percentages of the grand total. Conditional distributions show the distribution of one variable given a condition on the other, and are usually expressed as percentages.
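As a rough illustration of the two definitions above, here is a minimal Python sketch. The table, its labels, and the helper names `marginal_rows` and `conditional_on_column` are all hypothetical, chosen only for this example; the counts are made up and do not come from the lesson.

```python
# Hypothetical two-way table of counts (made-up numbers, not from the lesson).
# Outer keys are the row variable, inner keys are the column variable.
table = {
    "cat": {"child": 10, "adult": 30},
    "dog": {"child": 20, "adult": 40},
}


def marginal_rows(table):
    """Marginal distribution of the row variable, as percentages of the grand total."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {label: 100 * sum(row.values()) / grand_total for label, row in table.items()}


def conditional_on_column(table, column):
    """Distribution of the row variable given one column, as percentages of that column's total."""
    column_total = sum(row[column] for row in table.values())
    return {label: 100 * row[column] / column_total for label, row in table.items()}


row_marginal = marginal_rows(table)                 # divides row totals by the grand total
given_adult = conditional_on_column(table, "adult")  # divides cells by the "adult" column total

print(row_marginal)
print(given_adult)
```

The only difference between the two helpers is the denominator: the marginal distribution divides by the grand total, while the conditional distribution divides by the total of the row or column named in the condition.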

## Want to join the conversation?

- I don't understand this at all. I need serious help (51 votes)
- From what I've understood, marginal distribution is the percentage of a certain margin/bucket over the total. Conditional distribution is observing the data following the given condition. (57 votes)

- I thought that I understood, but once I got to the exercise, I got none of them correct. (22 votes)
- That's actually such a classic (8 votes)

- Why do we need to find Marginal and Conditional Distribution? (15 votes)
- They are both different types of relative frequency and they both have two different functions. (13 votes)

- Is marginal distribution just the percentage of a certain margin or bucket out of the total? For reference, Sal starts talking about marginal distributions at 1:00 (4 votes)
- Well, basically yes. A marginal distribution is the percentages out of totals, and conditional distribution is the percentages out of some column.

UPD: Marginal distribution is the probability distribution of the **sums** of rows or columns expressed as percentages out of the grand total. Conditional distribution, on the other hand, is the probability distribution of certain **values** in the table expressed as percentages out of sums (or local totals) of certain rows or columns. So you're basically going one level down here. These row and column totals are what's **given** in the conditional distribution. Intuitively, when you hear the word **given**, think of it as your new total, out of which you'll calculate the percentage or the probability.

Hope this clears things up! (38 votes)

- So the name "marginal" here is from being at the edge of the table? (10 votes)
- Yes that would be correct. (7 votes)

- I kind of think of it like, marginal distribution is the distribution of one factor and conditional is two or more factors or circumstances in the data collected in the table. (8 votes)
- At 0:49, why do we need to know about the other types of distributions? (3 votes)
- Well, from what I see, you need to know about different types of distributions, to gain more knowledge, so you can build off of that knowledge... (8 votes)

- At 3:52, Sal mentioned one of the differences between marginal and conditional distributions in terms of representation. Specifically, at 3:57, the standard practice for representing a conditional distribution is to think in terms of percentages.

Is it safe to say a conditional distribution must be represented in terms of percentages? (6 votes)
- Yes that is what he says. (2 votes)

- What is the joint relative frequency of 51% (6 votes)
- I can't tell them apart at all, this is way too confusing. Aren't they basically the same thing? (6 votes)

## Video transcript

- [Instructor] Let's say that we are trying to understand a relationship in a classroom of 200 students between the amount of time studied and the percent correct. What we could do is we could set up some buckets of time studied and some buckets of percent correct and then we could survey the students and/or look at the data from the scores on the test. And then we can place students in these buckets.

So what you see right over here, this is a two-way table. And you can also view this as a joint distribution along these two dimensions. So one way to read this is that 20 out of the 200 total students got between a 60 and 79% on the test and studied between 21 and 40 minutes.

So there's all sorts of interesting things that we could try to glean from this, but what we're going to focus on in this video is two more types of distributions other than the joint distribution that we see in this data.

One type is a marginal distribution. And a marginal distribution
is just focusing on one of these dimensions. And one way to think about it is you can determine it by looking at the margin. So, for example, if you wanted to figure out the marginal distribution of the percent correct, what you could do is look at the total of these rows. So these counts right over here give you the marginal distribution of the percent correct. 40 out of the 200 got between 80 and a hundred. 60 out of the 200 got between 60 and 79, so on and so forth.

Now, a marginal distribution could be represented as counts or as percentages. So if you represent it as percentages, you would divide each of these counts by the total, which is 200. So 40 over 200, that would be 20%. 60 out of 200, that would be 30%. 70 out of 200, that would be 35%. 20 out of 200 is 10%. And 10 out of 200 is 5%. So this right over here in terms of percentages gives you the marginal distribution of the percent correct based on these buckets. So you can say 10% got between a 20 and a 39.

Now, you could also think about marginal distributions the other way. You could think about the marginal distribution for the time studied in the class. Then you would look at these counts right over here. You would say a total of 14 students studied between zero and 20 minutes. You're not thinking about the percent correct anymore. A total of 30 studied between 21 and 40 minutes. And likewise, you could write these as percentages. This would be 7%. This would be 15%. This would be 43%. And this would be 35% right over there.

Now, another idea that
you might sometimes see when people are trying to interpret a joint distribution like this or get more information or more realizations from it is to think about something known as a conditional distribution. Conditional distribution. And this is the distribution of one variable given something true about the other variable.

So, for example, an example of a conditional distribution would be the distribution of percent correct given that students study between, let's say, 41 and 60 minutes. Between 41 and 60 minutes. Well, to think about that, you would first look at your condition. Okay, let's look at the students who have studied between 41 and 60 minutes. That would be this column right over here. And then that column, the information in it, can give you your conditional distribution.

Now, an important thing to realize is a marginal distribution can be represented as counts for the various buckets or percentages, while the standard practice for conditional distribution is to think in terms of percentages. So the conditional distribution of the percent correct given that students study between 41 and 60 minutes, it would look something like this. Let me get a little bit more space.

So if we set up the various categories, 80 to 100, 60 to 79, 40 to 59, continue it over here, 20 to 39, and zero to 19, what we'd wanna do is calculate the percentage that fall into each of these buckets given that we're studying between 41 and 60 minutes. So this first one, 80 to a hundred, it would be 16 out of the 86 students. So we would write 16 out of 86, which is equal to 16 divided by 86 is equal to, I'll just round to one decimal place. It's roughly 18.6%. 18.6. Approximately equal to 18.6%.

And then to get the full conditional distribution, we would keep doing that. We would figure out the percentage. 60 to 79, that would be 30 out of 86. 30 out of 86, whatever percentage that is, and so on and so forth in order to get that entire distribution.
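The arithmetic in the transcript can be reproduced with a short Python sketch. It uses only the counts that are read out loud: the five row totals for percent correct, the total of 86 for the 41-to-60-minute column, and the two cells (16 and 30) mentioned for that column. The remaining cells of the table are not stated in the transcript, so they are left out rather than guessed. The helper name `as_percent` and the variable names are assumptions made for this example.

```python
TOTAL_STUDENTS = 200

# Row totals for percent correct, as read out in the transcript.
percent_correct_counts = {"80-100": 40, "60-79": 60, "40-59": 70, "20-39": 20, "0-19": 10}


def as_percent(count, total):
    """Share of `total` represented by `count`, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Marginal distribution: each row total divided by the grand total of 200.
marginal_pct = {
    bucket: as_percent(count, TOTAL_STUDENTS)
    for bucket, count in percent_correct_counts.items()
}

# Conditional distribution given 41 to 60 minutes studied: each cell divided by
# that column's total (86). Only the two cells stated in the transcript appear here.
COLUMN_TOTAL_41_TO_60 = 86
known_cells_41_to_60 = {"80-100": 16, "60-79": 30}
conditional_pct = {
    bucket: as_percent(count, COLUMN_TOTAL_41_TO_60)
    for bucket, count in known_cells_41_to_60.items()
}

print(marginal_pct)
print(conditional_pct)
```

Under these assumptions the marginal percentages come out to 20%, 30%, 35%, 10%, and 5%, matching the transcript, and the two known conditional values are about 18.6% and 34.9%. Filling in the other three buckets would require the rest of the 41-to-60-minute column from the table shown in the video.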