What is a box and whisker plot?
A box and whisker plot—also called a box plot—displays the five-number summary of a set of data. The five-number summary is the minimum, first quartile, median, third quartile, and maximum.
In a box plot, we draw a box from the first quartile to the third quartile. A vertical line goes through the box at the median. The whiskers go from each quartile to the minimum or maximum.
Example: Finding the five-number summary
A sample of
boxes of raisins has these weights (in grams):
Make a box plot of the data.
Step 1: Order the data from smallest to largest.
Our data is already in order.
Step 2: Find the median.
The median is the mean of the middle two numbers:
The median is
Step 3: Find the quartiles.
The first quartile is the median of the data points to the left of the median.
The third quartile is the median of the data points to the right of the median.
Step 4: Complete the five-number summary by finding the min and the max.
The min is the smallest data point, which is
The max is the largest data point, which is
The five-number summary is
, , , , .
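The four steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name `five_number_summary` and the sample values are hypothetical (they are not the raisin weights from the example), and it assumes the quartile method described above, where each quartile is the median of the points on one side of the median.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)                 # Step 1: order the data
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")

    med = median(data)                    # Step 2: find the median
    half = n // 2
    lower = data[:half]                   # points to the left of the median
    # With an odd count, the median is a data point and is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]

    q1 = median(lower)                    # Step 3: find the quartiles
    q3 = median(upper)
    return data[0], q1, med, q3, data[-1]  # Step 4: add the min and the max


# Hypothetical data for illustration only.
sample = [10, 12, 14, 15, 18, 20, 22, 25]
print(five_number_summary(sample))
```

Other quartile conventions exist, so a calculator or software package may give slightly different values for Q1 and Q3 than this method.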
Example (continued): Making a box plot
Let's make a box plot for the same dataset from above.
Step 1: Scale and label an axis that fits the five-number summary.
Step 2: Draw a box from
to with a vertical line through the median.
, the median is , and
Step 3: Draw a whisker from
to the min and from to the max.
Recall that the min is
and the max is .
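To see how the drawing steps fit together, here is a rough text-only sketch in Python. The function `ascii_box_plot` is a hypothetical helper written for illustration, and the five-number summary passed to it is made up rather than taken from the raisin example. It marks the whiskers with `-`, the box with `[`, `=`, and `]`, and the median with `|`.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, width=31):
    """Render a one-line text box plot from a five-number summary."""
    span = maximum - minimum
    if span <= 0:
        raise ValueError("maximum must be greater than minimum")

    def col(value):
        # Step 1: scale a data value onto a column from 0 to width - 1.
        return round((value - minimum) / span * (width - 1))

    chars = [" "] * width
    # Step 3: whiskers run from the min to Q1 and from Q3 to the max.
    for i in range(col(minimum), col(q1)):
        chars[i] = "-"
    for i in range(col(q3) + 1, col(maximum) + 1):
        chars[i] = "-"
    # Step 2: the box runs from Q1 to Q3, with a line at the median.
    for i in range(col(q1), col(q3) + 1):
        chars[i] = "="
    chars[col(minimum)] = "|"
    chars[col(maximum)] = "|"
    chars[col(q1)] = "["
    chars[col(q3)] = "]"
    chars[col(median)] = "|"
    return "".join(chars)


# Hypothetical five-number summary: min 10, Q1 13, median 16.5, Q3 21, max 25.
print(ascii_box_plot(10, 13, 16.5, 21, 25))
```

A plotting library would normally be used for a real graph; this sketch only shows where each of the five numbers lands on the axis.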
The five-number summary divides the data into sections that each contain approximately 25% of the data in that set.
Example: Interpreting quartiles
About what percent of the boxes of raisins weighed more than
, about of data is lower than and about is above is .
of the boxes of raisins weighed more than grams.
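We can check this kind of reasoning by counting. The sketch below uses hypothetical data (not the raisin weights) whose quartiles, found with the median-of-halves method, are 13, 16.5, and 21. The helper name `percent_above` is also an assumption made for illustration.

```python
# Hypothetical data set and its quartiles (median-of-halves method).
data = [10, 12, 14, 15, 18, 20, 22, 25]
q1, med, q3 = 13, 16.5, 21


def percent_above(values, cutoff):
    """Return the percent of values strictly greater than cutoff."""
    above = sum(1 for v in values if v > cutoff)
    return 100 * above / len(values)


print(percent_above(data, q1))   # share of the data above the first quartile
print(percent_above(data, med))  # share above the median
print(percent_above(data, q3))   # share above the third quartile
```

For this data the counts come out to exactly 75%, 50%, and 25%. With real data, ties and small samples usually make the shares only approximate.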
Want to join the conversation?
- How do you find the MAD?
- Step 1: Calculate the mean.
Step 2: Calculate how far away each data point is from the mean using positive distances. These are called absolute deviations.
Step 3: Add those deviations together.
Step 4: Divide the sum by the number of data points.
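Those four steps can be sketched in Python. The function name `mean_absolute_deviation` and the sample numbers are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def mean_absolute_deviation(values):
    """Return the mean absolute deviation (MAD) of a list of numbers."""
    if not values:
        raise ValueError("need at least one data point")
    mean = sum(values) / len(values)              # Step 1: the mean
    deviations = [abs(v - mean) for v in values]  # Step 2: absolute deviations
    total = sum(deviations)                       # Step 3: add them together
    return total / len(values)                    # Step 4: divide by the count


# Hypothetical data: the mean is 5 and the deviations are 3, 1, 1, 3.
print(mean_absolute_deviation([2, 4, 6, 8]))  # 2.0
```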
- How do you find the median, mode, mean, and range? Please help me on this, somebody. I'm doomed if I don't get this.
- The median is the middle number in the data set once the numbers are in order.
The mode is the number that shows up the most in the data set.
The mean is the average of the data set (to find it, you add up all of the numbers to get the sum and then divide by how many numbers there are).
The range is the number you get when you subtract the lowest number from the highest number.
Ex. The highest number in the data set is 10. The lowest number in the data set is 5. To find the range, you would do 10-5, so the range of the data set is 5.
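Here is a small Python sketch of those four measures using the standard `statistics` module. The data set is hypothetical, picked so that the highest number is 10 and the lowest is 5, as in the range example above, and `describe` is an illustrative helper name.

```python
from statistics import mean, median, mode


def describe(values):
    """Return the median, mode, mean, and range of a list of numbers."""
    return {
        "median": median(values),            # middle value once sorted
        "mode": mode(values),                # most frequent value
        "mean": mean(values),                # sum divided by the count
        "range": max(values) - min(values),  # highest minus lowest
    }


# Hypothetical data for illustration only.
scores = [5, 7, 7, 8, 10]
print(describe(scores))
```

If several values tie for most frequent, `statistics.multimode` lists all of them.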
- How do you find the mean from the box plot itself?
- You cannot find the mean from the box plot itself. The information that you get from the box plot is the five-number summary, which is the minimum, first quartile, median, third quartile, and maximum.
- If the median is a number from the actual dataset, do you include that number when looking for Q1 and Q3, or do you exclude it and then find the median of the left and right numbers in the set?
- If the median is a number from the data set, it gets excluded when you calculate Q1 and Q3. If the median is not a number from the data set and is instead the average of the two middle numbers, the lower middle number is included when finding Q1 and the upper middle number is included when finding Q3. :)
- How should I draw the box plot? Is there a certain way to draw it?
- How do you find the mean for numbers with a %?
- Hey, I had a question. When you have the box plot but didn't sort out the data, how do you set up the proportion to find the percentage (not percentile)? For example, take this question: "What percent of the students in class 2 scored between a 65 and an 85?"
- Ok, so I'll try to explain it without a diagram.
The space between the lowest value and quartile 1 is 25% or 1/4. Quartile 1 to the median is another 25%, making it 50% so far. The median to the 3rd quartile is another 25%, and the 3rd quartile to the highest value is obviously 25% more. So that's 100%.
If 65 is the lowest value and 85 is quartile 1, then about 25% of the students in the class scored between 65 and 85. If 65 is the lowest value and 85 is the median, then that would be about 50%.
I hope this helps? I would need the diagram to explain it better though. I think the Interpreting Quartiles section of the article will explain it better with the visual.
Note: If you ever come across a question asking for the mean from a box plot, just say it can't be found. It's impossible to calculate the mean since we don't have all the data, only a summary of it. You can estimate the mean, but not calculate it exactly.
Again, hope this helps :-)
- How do you find the best estimate for the mean at least?
- Um, I don't think you really have to estimate. If you have the actual data values, you just add them all together, then divide by the number of values there are.