# How parameters change as data is shifted and scaled

See how transforming a data set by adding, subtracting, multiplying, or dividing a constant affects measures of center and spread.

## Want to join the conversation?

• I don't understand why the 1st and 3rd quartiles are different on the spreadsheet than when I do it by hand.
First I ordered the numbers
2,3,3,5,5,5,6,7,7,8,10,13
I got the same median = 5.5
but for the IQR I got = 3.5
1st quartile = 4
3rd quartile = 7.5
IQR= 3.5

In the video you can see the results of the spreadsheet:

1st quartile = 4.5
3rd quartile = 7.25
IQR = 2.75

Is there a difference in how the spreadsheet computes the 1st and 3rd quartiles?
• The Excel function QUARTILE is considered inaccurate. It treats the quartile like a percentile and then uses linear interpolation to get the output. Its newer version, QUARTILE.EXC, is preferable.

https://superuser.com/questions/343339/excel-quartile-function-doesnt-work

The algorithm of QUARTILE.EXC can be described roughly as follows:
- Calculate i = (n+1)/4, where n is the number of data points.
- 1st quartile = [i]th number + {i} * ([i]+1th number - [i]th number),
where [i] = the integer part of i and {i} = the fractional part of i.
- 2nd (or 3rd) quartile: multiply i by 2 (or 3), then repeat the same process.

For example, in the lesson we have a set of data: 2,3,3,5,5,5,6,7,7,8,10,13

i = (12 + 1)/4 = 3.25
3 * i = 9.75

1st quartile = 3 + 0.25 * (5 - 3) = 3.5
3rd quartile = 7 + 0.75 * (8 - 7) = 7.75
IQR = 7.75 - 3.5 = 4.25
* Tested using QUARTILE.EXC in Excel.

An alternative recursive algorithm splits the data set into halves and finds the median of each half, which is similar to what Sal taught.

Both algorithms produce values that separate the data set into groups of 25%. The algorithm implemented in QUARTILE doesn't.
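
The three sets of quartiles in this thread can be reproduced with a short Python sketch. It assumes Python 3.8+ for `statistics.quantiles`; the helper `quartiles_by_halves` is a hypothetical name for the by-hand method, and the `inclusive` and `exclusive` methods are used here only because they reproduce the spreadsheet and QUARTILE.EXC numbers quoted above for this data set.

```python
import statistics

data = [2, 3, 3, 5, 5, 5, 6, 7, 7, 8, 10, 13]

def quartiles_by_halves(values):
    """By-hand method: Q1 and Q3 are the medians of the lower and upper halves.

    For an odd number of points the middle value is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[(n + 1) // 2 :]
    return statistics.median(lower), statistics.median(upper)

# Median-of-halves, as done by hand in the question.
q1_h, q3_h = quartiles_by_halves(data)

# Interpolation at position (n - 1) * p + 1 (matches the spreadsheet numbers).
q1_inc, _, q3_inc = statistics.quantiles(data, n=4, method="inclusive")

# Interpolation at position (n + 1) * p (matches the QUARTILE.EXC numbers).
q1_exc, _, q3_exc = statistics.quantiles(data, n=4, method="exclusive")

print("halves:   ", q1_h, q3_h, q3_h - q1_h)        # 4.0 7.5 3.5
print("inclusive:", q1_inc, q3_inc, q3_inc - q1_inc)  # 4.5 7.25 2.75
print("exclusive:", q1_exc, q3_exc, q3_exc - q1_exc)  # 3.5 7.75 4.25
```

All three methods agree on the median (5.5); they differ only in where they place the quartile between neighboring data points.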
• How does the standard deviation scale when we scale the data? If I add 5 to every data point, the SD doesn't increase, but if I multiply by 5, it does. Multiplication is repeated addition, like adding 5 five times, so why does the SD scale?
• Because...

Say for example I have 4 and 8. The difference is 4, or in other words they are 4 apart. The ratio between them is 1 to 2 (8 = 2x4).

Now let's say I multiply both of those numbers by 5. They become 20 and 40. The ratio is exactly the same: 1 to 2. However, the difference between them is much bigger (20 now), because multiplying by the same number doesn't mean that you are adding the same thing. 5 x 4 = 4+4+4+4+4, and 5 x 8 = 8+8+8+8+8.

So, if we imagine that 4 is the mean in the original set, and 8 is another data point called X, X is now a lot farther away from the mean after scaling (remember that standard deviation is roughly the typical distance from the mean).

What DOESN'T change (I think) is the number of standard deviations away from the mean that data point x is. Say the standard deviation of the dataset is 4. 8 is 1 SD from 4. If we scale the data by 5, then the SD becomes 20, the mean is now 20, and data point x is now 40. Data point x is still 1 SD away.
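
A quick numerical check of this explanation, as a sketch: it uses the lesson's data set and the population standard deviation (`statistics.pstdev`), and the helper name `z_scores` is made up for illustration.

```python
import math
import statistics

data = [2, 3, 3, 5, 5, 5, 6, 7, 7, 8, 10, 13]
shifted = [x + 5 for x in data]  # add 5 to every point
scaled = [x * 5 for x in data]   # multiply every point by 5

def z_scores(values):
    """Number of standard deviations each point lies from the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(x - mu) / sigma for x in values]

sd = statistics.pstdev(data)
sd_shifted = statistics.pstdev(shifted)
sd_scaled = statistics.pstdev(scaled)

# Shifting leaves the spread alone; scaling by 5 multiplies it by 5.
print(math.isclose(sd_shifted, sd))     # True
print(math.isclose(sd_scaled, 5 * sd))  # True

# Each point stays the same number of SDs from the mean after scaling.
same_z = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(z_scores(data), z_scores(scaled))
)
print(same_z)  # True
```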
• Is there a Khan Academy-like course for learning Excel?
• I'm not sure about that but there are lots of courses on this kind of stuff on LinkedIn Learning.
• Why does Sal evaluate the mean and the standard deviation on the population rather than on samples?
I think it would be more practical to evaluate them on a sample.
• Evaluating mean and standard deviation on the population rather than on samples provides a complete understanding of the entire dataset. While sample statistics can be useful for making inferences about a larger population, calculating parameters on the entire population allows for a more accurate representation of the data without potential sampling biases.
• Can you tell us how to convert units of measurement?
• I think the measurement scale stays meaningful after scaling or shifting, because the transformed values are still defined on the real numbers.
The unit itself may change depending on the context, for example when the transformation converts the sample from degrees Fahrenheit to degrees Celsius.
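
A unit conversion of this kind is a shift plus a scale, so it fits this lesson directly. The sketch below takes the Fahrenheit example and assumes the target unit is Celsius, using the standard formula C = (F − 32) × 5/9. The temperature readings are made up for illustration, and `f_to_c` is a hypothetical helper name.

```python
import math
import statistics

# Hypothetical readings in degrees Fahrenheit (illustrative values only).
temps_f = [50.0, 59.0, 68.0, 77.0, 86.0]

def f_to_c(f):
    """Convert Fahrenheit to Celsius: shift by -32, then scale by 5/9."""
    return (f - 32) * 5 / 9

temps_c = [f_to_c(t) for t in temps_f]

mean_f = statistics.fmean(temps_f)
mean_c = statistics.fmean(temps_c)
sd_f = statistics.pstdev(temps_f)
sd_c = statistics.pstdev(temps_c)

# The mean is converted like any single reading (shift and scale both apply).
print(math.isclose(mean_c, f_to_c(mean_f)))  # True

# The standard deviation only picks up the scale factor 5/9, not the shift.
print(math.isclose(sd_c, sd_f * 5 / 9))      # True
```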
• Why does the SD change when we scale the data, even though multiplication is repeated addition?
• I was never taught that adding/subtracting is shifting and multiplying/dividing is scaling. What does it mean here? To me, both just increase the numbers, and that's it!
• In statistics, "shifting" refers to adding or subtracting a constant value from each data point, which moves the entire dataset up or down along the number line without changing its relative distribution. "Scaling," on the other hand, involves multiplying or dividing each data point by a constant, which changes the spread or dispersion of the data. The distinction between the two operations is important because they have different effects on the statistical properties of the data.
• So basically:

`The standard deviation and IQR do not change if you shift (+ or −) the data, but they do change if you scale (×) it. The mean and median change if you either shift or scale the data. When a measure does change, it changes by exactly the shift amount or the scale factor. For instance, if every X_i becomes X_i + 5, the mean also increases by 5. If the data is scaled up by 5, the mean becomes 5 times the original mean, and the standard deviation and IQR also become 5 times their original values.`

data = X_i

tl;dr
+/− changes the mean and median only
× changes the mean, median, SD, and IQR.
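
The summary above can be written out as a short derivation. This is a sketch using notation not in the original post: each data point is transformed as y_i = a·x_i + b, with a the scale factor and b the shift, and σ is the population standard deviation.

```latex
\begin{align*}
y_i &= a\,x_i + b \\
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} (a\,x_i + b) = a\,\bar{x} + b \\
\sigma_y &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2}
          = \sqrt{\frac{1}{n}\sum_{i=1}^{n} a^2 (x_i - \bar{x})^2}
          = |a|\,\sigma_x
\end{align*}
```

The shift b cancels in every difference y_i − ȳ, which is why it never reaches the standard deviation. For a > 0 the transformation preserves the order of the data, so the median follows the same rule as the mean and the IQR follows the same rule as the standard deviation.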
• Why does Sal suddenly skip to a completely different definition of the IQR?

For the standard deviation, he explained which function from the spreadsheet program he selected.
But during the calculation of the IQR he just used some function, without explaining why he chose that one or why it leads to a different answer.

(Within Excel, I can choose between 2 different quartile functions, but each one leads to an IQR which is far away from the IQR we learned to compute in the previous unit ...)
• The choice of function for computing quartiles can indeed lead to different results. Different functions use different conventions for where a quartile falls between data points (for example, whether positions are based on n − 1 or n + 1, or whether each quartile is taken as the median of one half of the data), so they can return different quartiles for the same data set. It's essential to know which function is being used and to check whether it matches the method taught in previous units.