# Confidence interval for a mean with paired data

Confidence interval for a mean difference with paired data.

## Want to join the conversation?

• Could anyone explain why the standard deviation of the difference has this strange value? As I understood from previous videos, the variance of a difference equals the sum of the variances. Is that correct?
• Because the two measurements are dependent: the same person produced both. For example, if one person did 50 flips in total and 30 were with the dominant hand, the other 20 must have been with the non-dominant hand. The rule that the variance of a difference equals the sum of the variances holds only for independent variables.
When would they be independent? If we had two separate groups of people, we could build the set of differences from a two-way table (columns named for the people in group 1, rows named for the people in group 2) and then calculate the standard deviation.
(1 vote)
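To illustrate the point about dependence, here is a minimal sketch of the general identity Var(D - N) = Var(D) + Var(N) - 2·Cov(D, N). The counts are made up for illustration and are not the data from the video; `sample_cov` is a hypothetical helper.

```python
from statistics import mean, variance

# Hypothetical paired counts, one pair per person (not the video's data).
dominant = [22, 28, 31, 25, 34]
non_dominant = [18, 25, 26, 20, 31]

def sample_cov(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Differences are taken within each person, never across people.
diffs = [d - n for d, n in zip(dominant, non_dominant)]

var_diff = variance(diffs)
var_sum = variance(dominant) + variance(non_dominant)
cov = sample_cov(dominant, non_dominant)

print("variance of the differences:", var_diff)
print("sum of variances minus 2*cov:", var_sum - 2 * cov)
print("sum of variances alone:", var_sum)
```

When the two hands' counts rise and fall together across people, the covariance is positive, so the variance of the differences comes out smaller than the plain sum of variances.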
• Why are we using a t table when we have the population statistic in this case, given that the study was designed for and conducted on only these five people? Also, why is the standard deviation computed for the difference (dominant minus non-dominant)? Kindly clarify. Thanks!
• We're estimating a mean, not a proportion, and we don't know the population standard deviation, so a critical z value would give an interval that is too narrow. Because the sample standard deviation stands in for the population one, we use a t table instead. Also, we do not have the population statistic: the population would be people in general, and the sample is those five people.
(1 vote)
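As a rough sketch of how much the choice matters with five people, the snippet below compares the 95% critical values. The z value comes from Python's standard library; the t value for 4 degrees of freedom (2.776) is hard-coded from a standard t table, which is an assumption of this sketch rather than something computed here.

```python
from statistics import NormalDist

n = 5
df = n - 1

# 95% two-sided critical value from the standard normal distribution.
z_star = NormalDist().inv_cdf(0.975)   # about 1.96

# 95% two-sided critical value for df = 4, copied from a t table.
t_star = 2.776

print(f"z* = {z_star:.3f}, t* (df={df}) = {t_star}")
print(f"the t interval is {t_star / z_star:.2f} times as wide as the z interval")
```

The wider t interval reflects the extra uncertainty from estimating the standard deviation with so few observations.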
• Why does he say that we are 95% confident in capturing the true mean difference "for these friends"? Aren't we using these friends as a sample to find an interval that we are 95% confident captures the true mean difference for a wider population? If it is only for these friends, then surely either we are using the full population and don't need t tables, or we are at the maximum sample size, because if people can participate twice we lose independence.
• Perhaps the population is people similar to these friends in age, hand size, and so on.
• How would you calculate the standard deviation of the difference if you only had the standard deviation and mean of each set (dominant and non-dominant)?
• Why don't we have to add the variances of the dominant and non-dominant sets to find the variance, and thereby the standard deviation, of the difference set, and then use that number for our confidence interval? For example, I am right-handed but I snap with my left hand, and when I did this experiment I snapped almost 70 times with my left hand but only 22 times with my right hand. The standard deviation of the difference would suggest that this is essentially impossible. Even if the difference isn't always that extreme, one would think that a number of people can snap more with their non-dominant hand than with their dominant hand.
• Why does Sal divide s (the sample SD) by n instead of by the df? I am confused. Isn't df used in the denominator when we are working with sample data?
(1 vote)
• The df is used only for selecting the critical t value. The standard deviation of the sampling distribution of the sample mean is σ/sqrt(n), which we estimate with s/sqrt(n), so the denominator there is sqrt(n) rather than the df.
(1 vote)
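The two roles can be seen side by side in a short sketch of the interval calculation. The differences below are hypothetical, and the critical value 2.776 (95%, df = 4) is taken from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical within-person differences (dominant minus non-dominant).
diffs = [4, 3, 5, 5, 3]

n = len(diffs)
df = n - 1                 # used only to look up the critical value
t_star = 2.776             # 95% critical value for df = 4, from a t table

x_bar = mean(diffs)
s = stdev(diffs)           # sample SD; this is where n - 1 is the denominator
se = s / sqrt(n)           # standard error of the mean; sqrt(n) goes here

margin = t_star * se
lower, upper = x_bar - margin, x_bar + margin
print(f"mean difference = {x_bar}, SE = {se:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Note that n - 1 does appear in a denominator, but inside the calculation of s itself, not in the standard error.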
• Why is this a matched-pair design? What is matching?
(1 vote)
• A matched-pair design is one in which a researcher pairs each subject in one group with a subject in the other based on similar qualities. A researcher would likely match two athletes based on exercise, weight, work ethic, etc. Sometimes a researcher can match a person with themselves (e.g., first try on a Nike shoe, then an Adidas shoe, and compare the comfort level). With a matched-pair design you can do a paired t-test and compare the differences within each pair.
(1 vote)
• I am confused as to why we cannot directly use S_diff in our interval, could someone explain what S_diff/sqrt(n) represents compared to S_diff? Does it represent the standard deviation of the sampling distribution of the mean of X_diff?
(1 vote)
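One way to see what S_diff/sqrt(n) stands for is a small simulation, sketched below under assumed values: differences are drawn from a normal population with a made-up mean and standard deviation, many samples of size 5 are taken, and the spread of the resulting sample means is compared with sigma/sqrt(n).

```python
import random
from math import sqrt
from statistics import pstdev

random.seed(0)

mu, sigma = 5.0, 4.0   # hypothetical population of differences
n = 5                  # sample size, as in the five-friends setup
trials = 20000

# Draw many samples of size n and record each sample's mean.
sample_means = []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(sum(sample) / n)

sd_of_means = pstdev(sample_means)
theoretical = sigma / sqrt(n)

print(f"SD of individual differences: {sigma}")
print(f"SD of the sample means (simulated): {sd_of_means:.3f}")
print(f"sigma / sqrt(n): {theoretical:.3f}")
```

Individual differences vary with standard deviation sigma, but the mean of five of them varies less, by a factor of sqrt(5). The interval is about the mean, so it uses the smaller quantity.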
• In previous lectures we saw that s = sqrt(s_d^2 + s_n^2), but that doesn't work here because the data are paired: you never take one person's dominant count and subtract someone else's non-dominant count. I was curious whether it was still possible to calculate s in terms of the other summary statistics. It took a fair amount of work, but I arrived at:

(N-1)*s^2 = (N-1)(s_d^2 + s_n^2) + 2N*mu_d*mu_n - 2*sum(d_i*n_i)

Here mu_d and mu_n are the two sample means. As far as I can tell, the answer is no, because of that summed term that multiplies each d with its corresponding n.
(1 vote)
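The identity in the comment above can be checked numerically. This is a minimal sketch with made-up counts (not the video's data); it also re-pairs the same numbers to show that identical means and standard deviations can give a different s for the differences.

```python
from statistics import mean, variance

# Hypothetical paired counts (d_i, n_i), one pair per person.
d = [22, 28, 31, 25, 34]
n_ = [18, 25, 26, 20, 31]
N = len(d)

diffs = [a - b for a, b in zip(d, n_)]

# Left side: (N-1) * s^2, with s^2 the sample variance of the differences.
lhs = (N - 1) * variance(diffs)

# Right side: built from the summary statistics plus the cross term.
cross = sum(a * b for a, b in zip(d, n_))
rhs = (N - 1) * (variance(d) + variance(n_)) + 2 * N * mean(d) * mean(n_) - 2 * cross
print(lhs, rhs)

# Same two lists, different pairing: every mean and SD is unchanged,
# but the cross term (and so the variance of the differences) changes.
n_reversed = list(reversed(n_))
diffs_repaired = [a - b for a, b in zip(d, n_reversed)]
print(variance(diffs), variance(diffs_repaired))
```

Because the re-paired data share every summary statistic yet produce a different variance, the means and standard deviations alone cannot determine s.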
• Is the sample of 5 here the whole population? If so, can we use z* to calculate the margin of error, using the population standard deviation (which would be the same as the sample standard deviation here)?
(1 vote)