# Clusters, gaps, peaks & outliers

This lesson explores the features of distributions in data sets, like clusters, gaps, and peaks. We learn how to identify outliers, which are data points far from the rest. We also discuss how to spot peaks and clusters in data.

• What is an outlier?
What is a range?
What is an interquartile range?
What is mean?
What is median?
What is mode?
What is a lower quartile?
What is an upper quartile?

I have them all mixed and am so confused.
• Outlier - a data value that is very different from the rest of the data.
Range - the highest number minus the lowest number.
Interquartile range - Q3 minus Q1.
Mean - the average of the data (add up all the numbers, then divide by how many numbers you added).
Median - the number in the middle of the data once the numbers are in order. If there is an even number of values, it is the average of the two middle numbers.
Mode - whichever number appears the most often.
Lower quartile - Q1 - the middle of the bottom half of the data. If you take the median, it's the middle of the data below the median (basically the number at the 1st quarter).
Upper quartile - Q3 - the middle of the data above the median, the value at the 3rd quarter of the data.
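
To keep the definitions above straight, it can help to compute them all on one small data set. The sketch below uses a made-up sample and a hypothetical helper named `quartiles`, which finds Q1 and Q3 as the medians of the lower and upper halves (leaving the median itself out when the count is odd). Other textbooks and software may use slightly different quartile rules.

```python
from statistics import mean, median, multimode

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values. With an odd count, the middle value
    belongs to neither half.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # bottom half of the sorted data
    upper = data[len(data) - half:]   # top half of the sorted data
    return median(lower), median(upper)

data = [1, 3, 3, 4, 6, 7, 9, 10, 11]  # made-up sample for illustration

q1, q3 = quartiles(data)
summary = {
    "range": max(data) - min(data),   # highest minus lowest
    "iqr": q3 - q1,                   # Q3 minus Q1
    "mean": mean(data),               # sum divided by count
    "median": median(data),           # middle value when sorted
    "mode": multimode(data),          # most frequent value(s)
    "q1": q1,
    "q3": q3,
}
print(summary)
```

`multimode` returns a list because a data set can have more than one mode.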
• What is the exact meaning of an outlier?
• 1) A data point that is distinctly separate from the rest of the data.
2) Any data point more than 1.5 interquartile ranges (IQRs) below the first quartile or above the third quartile.
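
The second definition can be checked mechanically. Here is a minimal sketch, assuming quartiles are taken as the medians of the lower and upper halves of the sorted data; the function name `iqr_outliers` and the sample values are made up for illustration.

```python
from statistics import median

def iqr_outliers(values, k=1.5):
    """Return the values more than k IQRs below Q1 or above Q3."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])               # middle of the bottom half
    q3 = median(data[len(data) - half:])   # middle of the top half
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]

sample = [1, 3, 3, 4, 6, 7, 9, 10, 30]     # made-up sample
flagged = iqr_outliers(sample)
print(flagged)
```

For this sample Q1 is 3 and Q3 is 9.5, so the IQR is 6.5 and the fences sit at -6.75 and 19.25. Only 30 falls outside them.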

• Can you please explain peak?
• A peak, like he said in the video, is the highest point of the distribution: the value (or values) with the most data points stacked on it.
• What is a gap?
• A gap in a distribution refers to an interval where there are no data points present. It represents a break or absence in the continuity of the data.
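
For whole-number data on a dot plot, gaps can be found by looking for jumps between neighboring values. This is only a sketch: the function name `find_gaps` and the sample are made up, and it assumes every data point is an integer.

```python
def find_gaps(values):
    """Return (first_missing, last_missing) pairs for each run of
    whole numbers between the minimum and maximum that has no data."""
    points = sorted(set(values))
    return [
        (a + 1, b - 1)                  # the empty stretch between a and b
        for a, b in zip(points, points[1:])
        if b - a > 1                    # neighbors more than 1 apart
    ]

dots = [1, 2, 2, 3, 7, 8, 8, 9]         # made-up dot plot values
gaps = find_gaps(dots)
print(gaps)
```

Here nothing lands on 4, 5, or 6, so the only gap runs from 4 to 6.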
• What is cluster? explain please.
• A cluster is a group of data points that are bunched close together. For example, if the number line went from 4 to 9 and each of the values 6 through 8 had 3 dots, then the cluster would be 6-8.
• In statistics this is a measure of the variation of the data. Examples include the range (difference between the maximum and minimum values), the mean absolute deviation (average distance between each point and the mean), and the interquartile range (distance between the lower and upper quartiles).
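
As a quick worked example, the range and the mean absolute deviation can be computed in a few lines. The mean absolute deviation is usually measured from the mean, which is what this sketch does; the sample values are made up.

```python
from statistics import fmean

data = [2, 4, 4, 6, 9]                  # made-up sample

center = fmean(data)                    # the mean of the data
distances = [abs(x - center) for x in data]
mad = fmean(distances)                  # average distance from the mean
data_range = max(data) - min(data)      # maximum minus minimum

print(center, mad, data_range)
```

The mean is 5, the distances from it are 3, 1, 1, 1, and 4, and their average is 2. The range is 9 - 2 = 7.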
• Is the peak the mode version of a frequency distribution?
Is the outlier the tail of the distribution?
The mode is the value that occurs most often. The peak doesn't have to be the mode. It can be, but it depends on the data. An outlier is not the tail. An outlier is an outlier. I kind of think of it as being the odd one out of the bunch. For example, if everyone else was wearing pink, purple, yellow, blue, and green t-shirts and you wore a black t-shirt, you would be the outlier. The tail is more about where the data thins out: the side of the distribution where only a few values trail away from the peak. If that thin part stretches toward the left, the distribution is left-tailed.

Hope this helps,
Aliana
• What's an outlier?
• An outlier is a piece of data that is far away from other data.
• So an outlier is a small set of data separated from all the big clusters, right?