### Course: Statistics and probability>Unit 4

Lesson 4: Density curves

# Worked example finding area under density curves

## Want to join the conversation?

• I still don't really understand why the area is always 1. If one data set is (1, 3, 4) and another is (321, 4, 21, 4, 51, 3, 321), how can both give the same area of 1? I understand that the curve represents 100% of the data in my set, but why is the area always 1?
• 1 represents 100% of the complete data set. Since all of your data is included under the density curve, the total must be 100%. We write the percentage as a decimal because we want plain numbers to calculate area.
• Could you help me get out of this paradox, please? I know that 1 represents 100% in general, but if we have intervals of width 10 (say 60-70, 70-80, and 80-90), then the area under the relative frequency graph will be 10 times 100%, which is 10, not 1.
For example:
10(20%) + 10(15%) + 10(65%) sums to an area of 10, not 1.
• If we were to make a density curve from this bar graph, we would assume that the 20% in the 60-70 bucket is spread out evenly, so there is 2% between 60 and 61, 2% between 61 and 62, and so on.
The same process would be done for the 70-80 and 80-90 buckets, and we would end up with the sum
10(2%) + 10(1.5%) + 10(6.5%) = 100% = 1
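
The bucket arithmetic in this thread can be checked with a short Python sketch. The variable names are illustrative, and the bucket edges and percentages are the ones quoted in the question. The key step is dividing each bucket's share by its width to get a density height.

```python
# Bucket edges and the share of the data in each bucket (from the thread).
edges = [60, 70, 80, 90]
shares = [0.20, 0.15, 0.65]

# Width of each bucket (10 for all three here).
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

# Density height = share / width, so each rectangle's area equals its share.
heights = [s / w for s, w in zip(shares, widths)]

# Using the shares themselves as heights multiplies the total by the width.
area_using_shares = sum(s * w for s, w in zip(shares, widths))

# Using density heights recovers a total area of (approximately) 1.
area_using_density = sum(h * w for h, w in zip(heights, widths))

print(heights)
print(area_using_shares, area_using_density)
```

Up to floating-point rounding, the first total is 10 and the second is 1, which is the difference between a relative frequency graph and a density curve.
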
• How do you deduce that the height of A is 0.25 and of B is 0.50? It is simply not given.
A better way would have been to divide the area into a rectangle and a triangle and calculate it.
• The y-axis shows that three grid lines up corresponds to 0.75, which means one grid line up corresponds to 0.25. Looking at the diagonal line, we see that the leftmost point is one grid line up, and the midpoint is two grid lines up, meaning those points are at a height of 0.25 and 0.50, respectively.
• How do you know it was 0.25? I don't understand that.
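
Once the two heights are read off the grid, the trapezoid formula and the rectangle-plus-triangle split suggested in this thread give the same area. Here is a minimal Python sketch of that check. The width of 1 between the two points is an assumption (the thread does not state it), and the variable names are illustrative.

```python
# Heights read off the grid, as described in the reply above.
left_height = 0.25
right_height = 0.50
width = 1.0  # ASSUMPTION: horizontal distance between the two points

# Trapezoid formula: average of the two parallel sides times the width.
trapezoid_area = (left_height + right_height) / 2 * width

# Rectangle-plus-triangle split of the same region.
rectangle_area = left_height * width
triangle_area = 0.5 * width * (right_height - left_height)
split_area = rectangle_area + triangle_area

print(trapezoid_area, split_area)  # prints 0.375 0.375
```

Either method needs the heights, so reading them from the grid lines cannot be avoided.
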
• How do you find Q1 and Q3 in this case?
• To find Q1 and Q3 (the first and third quartiles), you would need to integrate the density curve to determine the cumulative distribution function (CDF) and then locate the points where the CDF equals 0.25 and 0.75, respectively. This involves finding the values of x corresponding to these cumulative probabilities.
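
As a rough illustration of the CDF approach described in the answer, the sketch below assumes the density is the straight line f(x) = 0.25x on the interval from 1 to 3. That line is inferred from the heights and boundaries mentioned elsewhere in this thread; it is not stated outright. Integrating it gives the CDF F(x) = (x² − 1)/8, which can be inverted by hand. The function names are hypothetical.

```python
import math

def cdf(x):
    """Area under the ASSUMED density f(t) = 0.25 * t from t = 1 to t = x."""
    return (x**2 - 1) / 8

def quantile(p):
    """Invert the CDF: solve (x**2 - 1) / 8 = p for x in [1, 3]."""
    return math.sqrt(8 * p + 1)

q1 = quantile(0.25)  # first quartile
q3 = quantile(0.75)  # third quartile

print(round(q1, 3), round(q3, 3))  # prints 1.732 2.646
```

For a density without a closed-form inverse, the same idea works with numerical integration and a root finder.
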
• Is there a mathematical reason why the vertical lines are dotted instead of solid? (Perhaps because a density 'curve' must be a function?)
• It indicates the boundary. The dotted lines imply '1' and '3' are not part of the probability density function.

As such, it doesn't matter for finding the area: a single point has zero width, so including '1' and '3' adds no area.

In other words, including 1 and 3 has no impact on the percentage given by the probability density function.
• Right, so in the first problem, when you find the area of the trapezoid, won't you be finding the area under the curve where x is greater than or equal to two, rather than where x is greater than two?
• Yes, the trapezoid includes the boundary at x = 2, but that does not change the answer. The vertical line at x = 2 has zero width, so it contributes zero area. The area where x is greater than two is therefore the same as the area where x is greater than or equal to two.
• Suppose we have the data 10, 10.5, 11, 17, 19 and we group it into two intervals, 10-15 and 15-20. The interval 10-15 will contain 3/5 of the data points and the interval 15-20 will contain 2/5.
Now if we calculate the area (as the sum of the areas of two rectangles), it comes out to be:
area = (3/5)(5) + (2/5)(5) = 5, not 1.
Can anyone explain this to me?
Thanks.
• The total area under a density curve should indeed be 1. The discrepancy comes from using the relative frequencies themselves as the rectangle heights. For a density, the height of each rectangle is the relative frequency divided by the interval width, so the heights are (3/5)/5 = 0.12 and (2/5)/5 = 0.08. The area is then (0.12)(5) + (0.08)(5) = 0.6 + 0.4 = 1.
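
Starting from the raw data in the question, a small Python sketch can bin the points and compare the two choices of rectangle height. The helper name `bin_areas` is hypothetical, and the half-open bins [lo, hi) are an assumption that does not affect these five points.

```python
def bin_areas(data, edges):
    """Return (area using relative frequency as height, area using density height)."""
    freq_area = 0.0
    density_area = 0.0
    for lo, hi in zip(edges, edges[1:]):
        width = hi - lo
        # ASSUMPTION: half-open bins [lo, hi).
        count = sum(lo <= x < hi for x in data)
        rel_freq = count / len(data)
        freq_area += rel_freq * width               # height = relative frequency
        density_area += (rel_freq / width) * width  # height = rel. freq. / width
    return freq_area, density_area

data = [10, 10.5, 11, 17, 19]
edges = [10, 15, 20]

freq_area, density_area = bin_areas(data, edges)
print(freq_area, density_area)
```

Up to floating-point rounding, the first value is 5 (the total in the question) and the second is 1.
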
• Why do you put a line through your 7? Should I be doing that?
• Does anyone know how to find the area for different percentiles if the height is not given?

I'm working on a problem where the density curve is a perfect triangle. The base is given (1.6), and I understand that the total area is equal to 1.0. Using A = (1/2)(b)(h), you can find h.

But now I'm stuck finding the percentages of selected areas within the triangle, such as: what percentage of values are below 0.4?

I see that a new triangle is formed, and if you can find that area you can subtract it from one to find your answer, but in this case we have two unknowns and I'm stuck.