Question 1

This explanation of the distinction seems really confusing. If the population is Bernoulli distributed then the population proportion and population mean are the same thing! And yet we can estimate one with a Z-stat but the other needs a T-stat?

Also, when Sal calculates confidence intervals for the sample mean he uses the sample variance, which is presumably Bessel corrected and therefore less biased. But when he calculates the intervals for the sample proportion there's no Bessel correction!

Again, the population proportion and population mean are the same for a Bernoulli distribution. And the sample proportion and sample mean are also the same. Yet, when calculating confidence intervals, why do we use Z-stats for one and T-stats for the other? Why do we use Bessel's correction for one, and not for the other?

Finally, why is there no mention of the sample size? I thought that small n is the determining factor for when to use T-stats instead of Z-stats.

Accepted Answer

This should help. https://www.quora.com/Why-doesnt-the-sample-proportion-have-a-Students-t-distribution-like-the-sample-mean

Question 2

Why don't we use a t-score when calculating p-hat (the sample proportion)?

Accepted Answer

When calculating p-hat, we know sigma. Here, however, we don't, as mentioned at 3:12, so we use a t-score.

EDIT:
Sorry for my original unclear answer. Looking at the Edexcel S3 and S4 manuals, I am pleased to confirm that JW and chris are correct. When n is large (>30 for IAL), the Student-t tends toward a normal. Also remember that the t- and z-statistics are basically the same thing (s is an estimate of \sigma); the difference is that in one case s (the sample standard deviation) is itself an r.v., and in the other the standard deviation is a known constant because of the extra data given. So which one to use ultimately depends on whether you want to make the approximation that s == \sigma (which is reasonably accurate when n > 30).

PS this vid is an intro to t-score so presumably he wants to connect the z- and t-scores first.
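
To make the "Student-t tends toward a normal" point concrete, here is a small sketch comparing two-sided 95% critical values. The t* values are hardcoded from a standard t-table (an assumption of this sketch, they are not computed here), while z* comes from Python's standard library.

```python
from statistics import NormalDist

# Two-sided 95% critical value for the standard normal (about 1.960).
z_star = NormalDist().inv_cdf(0.975)

# Two-sided 95% t* values keyed by degrees of freedom (n - 1),
# copied from a standard t-table rather than computed here.
t_star_by_df = {4: 2.776, 9: 2.262, 29: 2.045, 99: 1.984}

for df, t_star in t_star_by_df.items():
    gap = t_star - z_star
    print(f"n = {df + 1:3d}  t* = {t_star:.3f}  z* = {z_star:.3f}  gap = {gap:.3f}")
```

The gap between t* and z* shrinks as n grows, which is why treating s as if it were \sigma matters less for large samples.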

Question 3

What is the difference between a statistic and a parameter? Please explain like you would to someone who barely knew anything about statistics.

Accepted Answer

We use a 'statistic' to estimate a 'parameter'.

Let's say we want to know what percent of the entire male population of the USA (or any other country) goes jogging in the morning. This percent is called a 'parameter'.
Can we really survey and analyze every male in the USA?
Well, maybe we can, but it would be too costly in terms of time, money, or infringing on the rights of those who don't want to share what they do in the morning.

So, in practice, we just randomly select some men from all over the country and count what percent of them run in the morning. This percent is called a 'statistic', and it approximately estimates the 'parameter'.
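
The jogging example can be sketched in a few lines of Python. Everything here is hypothetical: the population size, the 12% jogging rate, and the sample size of 500 are made-up numbers chosen only to show the two quantities side by side.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = jogs in the morning, 0 = does not.
# The 12% figure is invented purely for illustration.
population = [1] * 12_000 + [0] * 88_000

# Parameter: computed from every member of the population.
parameter = sum(population) / len(population)

# Statistic: computed from a random sample of 500 members.
sample = rng.sample(population, 500)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.3f}, statistic = {statistic:.3f}")
```

The statistic lands close to the parameter but usually not exactly on it, and a different random sample would give a slightly different statistic.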

Question 4

At 3:10, Sal claims that using z* as part of making the confidence interval for a sample mean actually leads to an underestimate for the confidence interval. Why is that?

Accepted Answer

When the sample standard deviation stands in for the unknown population standard deviation, the standardized sample mean doesn't really follow a normal distribution (which is what z is based on). Its distribution has more "extreme" values than the normal distribution does, particularly when you use small samples to estimate the mean. This means more of the samples will land further from the population mean, in standard-error units, than they would if the distribution were normal. So the z-based confidence interval is narrower than it should be, and the intervals don't contain the parameter the "correct" proportion of the time. The t-distribution accounts for these "fatter tails".
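
A quick simulation can illustrate the undercoverage. This is a minimal sketch under stated assumptions: a normal population with a made-up mean of 10 and standard deviation of 2, samples of size 5, and the helper name `coverage` is hypothetical. The value 2.776 is the two-sided 95% t* for 4 degrees of freedom from a standard t-table.

```python
import random
import statistics

def coverage(n, critical_value, trials=20_000, mu=10.0, sigma=2.0, seed=0):
    """Fraction of intervals xbar +/- critical_value * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
        margin = critical_value * s / n ** 0.5
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

z_cov = coverage(5, 1.960)  # z* for 95%
t_cov = coverage(5, 2.776)  # t* for 95% with 4 degrees of freedom
print(f"z-based coverage: {z_cov:.3f}")
print(f"t-based coverage: {t_cov:.3f}")
```

With samples this small, the z-based intervals should capture the true mean noticeably less often than 95% of the time, while the t-based intervals should come out close to 95%.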

Question 5

Why does using the sample standard deviation lead to an underestimate?!
It's the square root of the sample variance, right? And the sample variance is divided by "n-1" rather than "n", so it seems to have a larger value. Why doesn't it lead to an overestimate?

Accepted Answer

It's not about the size of the sample standard deviation; it's about the shape of the sampling distribution of the standardized mean (taken over all the possible samples of a particular size). Because s itself varies from sample to sample, this distribution is not a normal distribution, particularly if you have small samples. It actually follows what's called Student's t-distribution. This distribution has "fatter tails" (i.e., more values that are far from the mean) than the normal distribution, and this is what causes the underestimation.

Question 6

Where does sigma over square root of n come from? Why and how did we put it there?

Accepted Answer

Interesting question!  In this discussion, we use theoretical (or population) standard deviation and variance.

To derive this, we use the following properties:
1) The variance of a sum of *independent* random variables is the sum of their variances.
2) When a random variable is multiplied by a factor that doesn't depend on the random variable, the variance is multiplied by the *square* of this factor.
3) The standard deviation is the (non-negative) square root of the variance, and so the variance is the square of the standard deviation.

Let the random variables X_1, X_2, X_3, ... , X_n represent a random sample of n data values, each of which has standard deviation sigma >= 0 (and therefore variance sigma^2).  If we assume the population is very large, then it's reasonable to call these n random variables independent.

The sample mean is (X_1 + X_2 + X_3 + ... + X_n)/n, and the standard error of the mean is the standard deviation of the sample mean. Therefore, the standard error of the mean is

standard deviation[(X_1 + X_2 + X_3 + ... + X_n)/n]
= sqrt{variance[(X_1 + X_2 + X_3 + ... + X_n)/n]}
= sqrt[variance(X_1 + X_2 + X_3 + ... + X_n)/(n^2)]
= sqrt{[variance(X_1) + variance(X_2) + variance(X_3) + ... + variance(X_n)]/(n^2)}
= sqrt[n sigma^2 / (n^2)]
= sqrt(sigma^2 / n)
= sigma/sqrt(n).
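
The result can also be checked numerically. The sketch below is an illustration, not part of the derivation: it draws many samples from a normal population with an assumed sigma of 3, and compares the observed standard deviation of the sample means with sigma/sqrt(n). The helper name `sd_of_sample_means` is hypothetical.

```python
import random
import statistics

def sd_of_sample_means(n, trials=20_000, mu=0.0, sigma=3.0, seed=1):
    """Standard deviation of the means of many independent samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)

sigma = 3.0
for n in (9, 36):
    observed = sd_of_sample_means(n, sigma=sigma)
    predicted = sigma / n ** 0.5
    print(f"n = {n:2d}  observed = {observed:.3f}  sigma/sqrt(n) = {predicted:.3f}")
```

The observed values should sit very close to the predicted ones (1.0 for n = 9 and 0.5 for n = 36), up to simulation noise.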

Have a blessed, wonderful day!


Introduction to t statistics
