# Correlation and causality

Understanding why correlation does not imply causality (even though many in the press and some researchers often imply otherwise). Created by Sal Khan.

## Want to join the conversation?

• So how do we know, given some data, whether two variables are merely correlated or whether there's some causality between them?
• I'm a statistician and I can categorically state that causality is ideological.

That is, if the data are related (correlated) and you suspect one causes the other, you are making an ideological statement. It might be true, it might not be; there isn't enough information to support or reject that assertion.

Sometimes the statement is very obvious: temperature is correlated with the length of the day. The length of the day determines the amount of sunshine, and therefore we can safely say that the length of the day causes changes in temperature. Sometimes the statement isn't so obvious, as in the example above. What appears to be a perfectly logical assumption has no basis. The same used to happen in history, when people thought bad smells gave you diseases (rather than both bad smells and diseases being related to poor hygiene and microbial action).

So at the very least causation is a hypothesis (hypothetical thesis – unproven theory), and at best an accepted theory (i.e. previous studies have confirmed that one is likely to cause the other).

What does this mean? If you find that data are correlated (related), you should then determine if one causes the other.
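
To make the bad-smells example concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration (the variable names, the coefficients, the noise levels); it is not real data. A hidden variable `hygiene` drives both `smell` and `disease`, and neither of those two affects the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Hypothetical model: poor hygiene raises both smell and disease."""
    rng = random.Random(seed)
    hygiene = [rng.gauss(0, 1) for _ in range(n)]       # hidden common cause
    smell = [-h + rng.gauss(0, 1) for h in hygiene]     # depends only on hygiene
    disease = [-h + rng.gauss(0, 1) for h in hygiene]   # depends only on hygiene
    return hygiene, smell, disease


hygiene, smell, disease = simulate()

# Smell and disease look related even though neither causes the other.
r_raw = pearson(smell, disease)

# Subtract the hygiene effect (known here only because we wrote the model).
smell_resid = [s + h for s, h in zip(smell, hygiene)]
disease_resid = [d + h for d, h in zip(disease, hygiene)]
r_adjusted = pearson(smell_resid, disease_resid)

print(f"correlation(smell, disease)          = {r_raw:.2f}")
print(f"after removing the effect of hygiene = {r_adjusted:.2f}")
```

In this toy model the first correlation comes out clearly positive and the second is close to zero. With real data you usually don't know the hidden variable or its effect, which is why the causal claim stays a hypothesis until it is tested.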
• So what is the perfect definition of causality?
• Causality is the relation between one thing as a cause and another thing as an effect.
So it's not "just" about relation (correlation); there must be cause and effect. To make this clear, we have to distinguish causality from correlation.

Let's say we have two variables: A and B.
A and B are correlated when their values change together; for example, when A's values increase, B's values decrease. However, we cannot yet say that A causes the change in B.

Here are some great examples showing that correlation doesn't equal causation:
http://www.tylervigen.com/spurious-correlations
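
As a small sketch of the A and B case above, the snippet below computes the Pearson correlation coefficient for two made-up lists in which B falls as A rises. The numbers are invented for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: as A increases, B decreases.
a = [1, 2, 3, 4, 5]
b = [10, 8, 7, 4, 1]

r = pearson(a, b)
print(f"r = {r:.3f}")  # strongly negative: A and B move in opposite directions
```

A value of r near -1 or +1 only says that the two lists move together in a straight-line pattern. It says nothing about whether A causes B, B causes A, or something else drives both.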
• What is the difference between causality and causation?
• Hmmm, I think they are pretty close, but used in different contexts. "Causality" is a general, absolute property of the universe, which most scientists believe is an important building block of the real world. They want their theories to respect "causality", meaning that the cause (or causes) of every specific event must happen before the event (say, the decay of a radioactive atom must happen before the click in the Geiger counter). "Causation" is usually used to refer to categories, and often only in a probabilistic sense, such as "smoking causes lung cancer" or "global warming causes floods".
• Maybe a combination of eating healthy meals and exercise can result in a decrease in obesity?
• Yes, of course. Nutrition and exercise, repeated over long periods of time, are the only significant causes of weight loss.
• idk if this is math
• i need help
• I like how the rest is about scatter plots but then this is about obesity
• Are there real world applications of causality?