# Transforming data problem

It is very common to take data and apply the same transformation to every data point in the set. For example, we may take a set of temperatures measured in degrees Fahrenheit and convert them all to degrees Celsius. How would this conversion affect the measures of center and spread in the data set? Let's look at a simpler example to think about this situation.

## Part 1: Adding a constant

Five friends took a 10 question multiple choice quiz in class. Their raw scores on the quiz are shown in the dotplot below along with summary statistics.

| | $\bar{x}$ | $s_x$ | $M$ | $\text{IQR}$ | range |
|---|---|---|---|---|---|
| Scores | 8 | 1.41 | 8 | 3 | 4 |

The teacher told everyone that she would add 1 to every student's score as extra credit. Their new scores are shown below.
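The effect of adding a constant can be checked numerically. The sketch below is only illustrative: the dotplot is not reproduced here, so the raw scores 6, 7, 8, 9, 10 are an assumption, chosen because they are consistent with the summary statistics in the table. It also assumes $s_x$ is the population standard deviation (`pstdev`), since that is what gives 1.41 for these values.

```python
from statistics import mean, median, pstdev

# Assumed raw scores (not given explicitly in the text); they match the
# table: mean 8, SD 1.41, median 8, range 4.
scores = [6, 7, 8, 9, 10]

# The teacher adds 1 point of extra credit to every score.
new_scores = [x + 1 for x in scores]


def summarize(data):
    """Return the mean, population SD (rounded), median, and range."""
    return {
        "mean": mean(data),
        "sd": round(pstdev(data), 2),
        "median": median(data),
        "range": max(data) - min(data),
    }


print(summarize(scores))
print(summarize(new_scores))
```

With these assumed scores, the mean and median each go up by 1, while the standard deviation and range are unchanged.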

## Part 2: Multiplying a constant

The teacher always scores her quizzes out of 100 points. For this 10-question quiz, she multiplies the new scores by 10 to get the students' final grades, which are shown in the dotplot below.

| | $\bar{x}$ | $s_x$ | $M$ | $\text{IQR}$ | range |
|---|---|---|---|---|---|
| Scores | 8 | 1.41 | 8 | 3 | 4 |
| New scores | 9 | 1.41 | 9 | 3 | 4 |
| Final grades | ? | ? | ? | ? | ? |
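One way to explore the missing row is to multiply a concrete data set by 10 and compare the statistics before and after. This is a sketch under an assumption: the new scores 7, 8, 9, 10, 11 are not listed in the text, but they are consistent with the "New scores" row of the table.

```python
from statistics import mean, median, pstdev

# Assumed new scores (raw scores 6..10 plus 1 point of extra credit).
new_scores = [7, 8, 9, 10, 11]

# The teacher multiplies every new score by 10.
final_grades = [10 * x for x in new_scores]

# Compare each statistic before and after scaling.
for name, stat in [("mean", mean), ("SD", pstdev), ("median", median)]:
    print(name, stat(new_scores), "->", stat(final_grades))

range_before = max(new_scores) - min(new_scores)
range_after = max(final_grades) - min(final_grades)
print("range", range_before, "->", range_after)
```

With these assumed values, every statistic is multiplied by 10. Unlike adding a constant, scaling changes the measures of spread as well as the measures of center, and the IQR scales the same way.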

## Part 3: Putting it all together

Let's look at a temperature conversion example. Suppose a set of temperature measurements has a mean of $104^\circ\text{F}$ and a standard deviation of $9^\circ\text{F}$, and we convert all of the temperatures to degrees Celsius.

Here's the conversion formula: ${}^\circ\text{C} = \left({}^\circ\text{F} - 32\right) \times \frac{5}{9}$
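The conversion combines a shift (subtracting 32) with a scaling (multiplying by 5/9), so it can be tested on a small data set. The readings below are hypothetical, chosen only so that their mean is 104 °F and their population standard deviation is 9 °F; the helper name `f_to_c` is also just for illustration.

```python
from statistics import mean, pstdev


def f_to_c(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


# Hypothetical readings with mean 104 °F and population SD 9 °F.
temps_f = [95, 113, 95, 113]
temps_c = [f_to_c(t) for t in temps_f]

print(mean(temps_f), pstdev(temps_f))  # statistics in Fahrenheit
print(mean(temps_c), pstdev(temps_c))  # statistics after converting every reading

# The mean goes through the full formula; the SD is only scaled by 5/9.
print(f_to_c(104), 9 * 5 / 9)
```

For this data, converting every reading gives the same mean as applying the full formula to 104, and the same standard deviation as multiplying 9 by 5/9 alone.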

## Want to join the conversation?

- Why doesn't subtracting affect the standard deviation?

Could anyone break it down? (12 votes)
- Shifting all of the data points left or right on a number line doesn't affect the standard deviation of a data set. This is because standard deviation is a measure of how spread out the data points are. Because adding and subtracting don't change the spread of the data, the standard deviation doesn't change. This also holds true for the range and IQR. (43 votes)

- I'm confused about why this is the answer. What is the math/logic behind leaving one part of the conversion formula out when dealing with SD? I see that it works, and I get that we're measuring distances from the mean in degrees, not actual temperatures, so subtracting 32 wouldn't make much sense, and it would make the answer negative, which doesn't work for SD. But why does the multiplying by 5/9 part work? I guess what I'm confused about is: what does the 5/9 part of the temperature conversion formula represent? (2 votes)
- One way to think about this is to compare to distance measurements. If you look at a ruler, you will see that inches are wider than centimeters. If you look at a thermometer, you will see that degrees Celsius are bigger (sort of "wider") than degrees Fahrenheit.

A ruler starts counting both inches and centimeters in the same place, so the conversion formula only involves multiplication (0 inches = 0 cm; 1 inch = 2.54 cm; 2 inches = 5.08 cm; etc.). Nothing is added to or subtracted from the mean or the SD.

But a thermometer starts counting in different places: 0 C = 32 F. That's why we add or subtract 32 when converting temperatures (how much warmer is it than the freezing point of water). Mean is a temperature (how warm it is on an average day), so we have to add or subtract 32 when converting between C and F.

Standard deviation does not measure how hot it is, but rather how **different** a set of days are **from the average day**. SD is a measure of **spread**. (In different places, the same "width" of spread could be around a mean temperature of 0 C, 20 C, or 1000 C.) Because of that, the only thing that matters is that a degree Celsius is 9/5 as "wide" as a degree Fahrenheit, i.e., a degree Fahrenheit is 5/9 as "wide" as a degree Celsius. That's why we only multiply; we don't add or subtract.

Another comparison to reinforce the point: The Kelvin and Celsius scales use the same size ("width") units, but zero Kelvin is "absolute zero", the coldest possible temperature. Zero Celsius is the freezing point of water. Same units, just shifted by hundreds of degrees. What happens to the mean and standard deviation when you convert between Kelvin and Celsius? [For an explanation of absolute zero, take a physics course; here just trust me it's waaaay colder than 0 C. Right now, focus on the math, which is all you need to answer my questions.]

[pause to think, then scroll down for answers and a new question]

Answers: No multiplication involved. The mean shifts but the SD stays the same. Why? (24 votes)

- How do you arrive at 3 for the IQR? Quartile one equals 7 and quartile three equals 9, so wouldn't that make the IQR equal to 2? (4 votes)
- Khan Academy drops the median when computing IQR for an odd number of data points. So, ignore the median 8 and go halfway between 6 and 7, which is 6.5, and halfway between 9 and 10, which is 9.5.

9.5-6.5 = 3

Edit: Khan video explaining:

https://www.khanacademy.org/math/ap-statistics/summarizing-quantitative-data-ap/measuring-spread-quantitative/v/calculating-interquartile-range-iqr (6 votes)
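The method described in the reply above can be sketched in a few lines. This is an illustration, not Khan Academy's own code: the function name `iqr_excluding_median` is made up here, and the scores 6, 7, 8, 9, 10 are the values the reply refers to.

```python
from statistics import median


def iqr_excluding_median(data):
    """IQR where, for an odd-sized data set, the median is left out of both halves."""
    if len(data) < 2:
        raise ValueError("need at least two data points")
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]                  # values below the middle position
    upper = ordered[len(ordered) - half:]   # values above the middle position
    return median(upper) - median(lower)


scores = [6, 7, 8, 9, 10]
print(iqr_excluding_median(scores))  # 9.5 - 6.5
```

Other quartile conventions (for example, ones that include the median in both halves) can give a different IQR for the same data, which is where the value 2 in the question comes from.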

- I don't understand how to convert a standard deviation of 9F to degrees Celsius.

C = (F-32) * 5/9

Why can we replace (F-32) with a standard deviation of 9F?

Why don't we calculate using the formula: C = (9F-32)*5/9? (3 votes)
- Hi!

Is the range affected when a set is shifted?

Also, for the last problem concerning standard deviation, why don't we subtract 32? Doesn't that result in an incorrect conversion?

Thanks! (2 votes)
- So when given a problem where you must find the standard deviation or mean, how would you find that? (1 vote)
- Part 2: Multiplying a constant

If it's a 10-question quiz, and every grade is multiplied by 10, no result should be more than 100. (1 vote)
- I understand that multiplication changes the SD and all, but I don't understand why addition/subtraction in the case of temperatures doesn't matter. 9 degrees Fahrenheit IS -12.78 degrees Celsius, so why is it 5 degrees Celsius? (1 vote)
- Remember that the SD is not changed by a shift, only by scaling.

I made the exact same error during the exercise.

To visualize it: shifting the data doesn't change where the points sit relative to one another, but scaling does. (1 vote)

- For the standard deviation, the original value for the temperature conversion was C = (F - 32) * 5/9. Are the mean and standard deviation different when calculating the temperature conversion? (1 vote)
- When we are changing the scale itself, why is the subtraction not considered in the formula? (1 vote)
- You would get a negative number if you did. Spread measures how far apart the values are, which makes sense if you look at the data with all the values converted. Temperatures in F are more spread out: 100 F is 37.8 C and 60 F is 15.6 C. The difference is 40 in F, and if you multiply by 5/9 you get 22.2, which is also the difference between 37.8 and 15.6 in Celsius. The subtraction appears when you convert a single temperature because the two scales start from different points: water freezes at 32 F but at 0 C. Spread isn't measured from a fixed point; it depends only on the differences between the values. (1 vote)
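The arithmetic in the last reply can be checked with a short sketch. The helper name `f_to_c` is made up for illustration; the two temperatures are the ones used in the reply.

```python
def f_to_c(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


hot_f, cool_f = 100, 60
hot_c, cool_c = f_to_c(hot_f), f_to_c(cool_f)

diff_f = hot_f - cool_f   # gap between the two temperatures in Fahrenheit
diff_c = hot_c - cool_c   # gap between the same two temperatures in Celsius

print(round(hot_c, 1), round(cool_c, 1))  # 37.8 15.6
print(diff_f, round(diff_c, 1))           # 40 22.2
print(round(diff_f * 5 / 9, 1))           # 22.2
```

The 32 cancels when two converted temperatures are subtracted, so only the factor 5/9 affects the gap between them.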