2009/09/25

The formula for the mode and the median of grouped data

Any student at GCSE level in the UK needs to know about the three averages: the mean, median and mode. Of these, the mode is the simplest to find, as it is the value that occurs most often. The median is not much more difficult: it's the middle value once the data is in order. Even at the higher end of the GCSE spectrum, when you need to find averages from tables in which the data has been grouped, finding the mode is easy: just look for the group with the largest associated frequency. Finding the median is, again, not much more involved: work out in which group the middle value falls. Or is it as simple as that?

Don't get me wrong: the above paragraph describes what you, as a higher level GCSE student, need to understand in order to find the mode and median for a grouped frequency table, and if you're feeling confused you probably shouldn't read any further. But strictly speaking, you haven't found the mode or the median. You've found the modal class and the median class of the data, that is, the classes (or groups) in which the mode and median lie.


So how do I find the mode and median?
Let me just stress again that for the purposes of your GCSE, what you've been told to do to find the modal and median classes is exactly what you need to do. This post is purely for those who want to stretch themselves a little further than the bounds of their GCSE course.

It must be noted that, because a grouped frequency table doesn't record the individual data values, it is not possible to calculate an exact average for the data. Any value calculated for the mean, median or mode of grouped data must be referred to as an estimate.

The mode for grouped data
You can calculate an estimate of the mode for a grouped frequency table by using the following formula:

$$\text{Mode} \approx L + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$$

Where:
  • L is the lower class boundary of the modal class.
  • fm is the frequency associated with the modal class.
  • f1 is the frequency of the class before the modal class.
  • f2 is the frequency of the class after the modal class.
  • h is the difference between the upper and lower bounds of the modal class.
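
If you'd like to see the mode formula in action, here is a minimal Python sketch. The function name `grouped_mode`, the way the table is stored (a list of class boundaries plus a list of frequencies) and the data itself are all made up for illustration. It also assumes that a modal class at either end of the table has a missing neighbour with frequency 0, and, as the formula is usually stated, that the classes around the modal class have equal widths.

```python
def grouped_mode(boundaries, frequencies):
    """Estimate the mode of grouped data with the interpolation formula.

    boundaries  -- class boundaries in ascending order (one more than the number of classes)
    frequencies -- frequency of each class, in the same order
    """
    m = frequencies.index(max(frequencies))  # index of the (first) modal class
    fm = frequencies[m]                      # frequency of the modal class
    if fm == 0:
        raise ValueError("the table contains no observations")
    L = boundaries[m]                        # lower class boundary of the modal class
    h = boundaries[m + 1] - boundaries[m]    # width of the modal class
    # Assumption: a missing neighbour (modal class at either end) counts as frequency 0.
    f1 = frequencies[m - 1] if m > 0 else 0
    f2 = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    return L + (fm - f1) / ((fm - f1) + (fm - f2)) * h


# Made-up table: classes 0-10, 10-20, 20-30, 30-40, 40-50
boundaries = [0, 10, 20, 30, 40, 50]
frequencies = [5, 8, 12, 9, 6]
print(round(grouped_mode(boundaries, frequencies), 2))  # 25.71
```

Here the modal class is 20-30, so L = 20, fm = 12, f1 = 8, f2 = 9 and h = 10, giving 20 + (4 ÷ 7) × 10, or about 25.71.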

The median for grouped data
You can calculate an estimate of the median for a grouped frequency table using the following formula:

$$\text{Median} \approx L + \frac{\frac{n}{2} - c}{f} \times h$$

Where:

  • L is the lower class boundary of the median class.
  • f is the frequency associated with the median class.
  • n is the total number of observations (i.e. the total of the Frequency column).
  • c is the cumulative frequency up to the class before the median class.
  • h is the difference between the upper and lower bounds of the median class.
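
The median formula can be sketched in Python in much the same way. Again, the function name `grouped_median`, the list-based layout of the table and the data are hypothetical, chosen just to illustrate the calculation. The sketch takes the median class to be the first class in which the cumulative frequency reaches n/2.

```python
def grouped_median(boundaries, frequencies):
    """Estimate the median of grouped data with the interpolation formula.

    boundaries  -- class boundaries in ascending order (one more than the number of classes)
    frequencies -- frequency of each class, in the same order
    """
    n = sum(frequencies)  # total number of observations
    half = n / 2
    c = 0                 # cumulative frequency up to the class before the current one
    for i, f in enumerate(frequencies):
        # Assumption: the median class is the first non-empty class
        # whose cumulative frequency reaches n/2.
        if f > 0 and c + f >= half:
            L = boundaries[i]                      # lower class boundary
            h = boundaries[i + 1] - boundaries[i]  # class width
            return L + (half - c) / f * h
        c += f
    raise ValueError("the table contains no observations")


# Made-up table: classes 0-10, 10-20, 20-30, 30-40, 40-50
boundaries = [0, 10, 20, 30, 40, 50]
frequencies = [5, 8, 12, 9, 6]
print(round(grouped_median(boundaries, frequencies), 2))  # 25.83
```

With this table n = 40, so n/2 = 20. The cumulative frequencies are 5, 13, 25, 34 and 40, which puts the median class at 20-30 with c = 13 and f = 12, giving 20 + (7 ÷ 12) × 10, or about 25.83.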

Where do these formulae come from?
For the mode, imagine plotting the frequency of each group against its midpoint, and then joining the points up with a smooth curve. With most distributions you would see a single peak in this curve. The formula calculates an estimate for the point on the x-axis that is directly below this peak.

For the median, imagine plotting a cumulative frequency graph of the data in your table. To find an estimate of the median, you would find half of your total frequency on the cumulative frequency axis, draw a line horizontally to your hand-drawn c.f. curve, and then drop vertically to the x-axis. This value would be your estimate for the median. The formula just does this for you, treating the curve as a straight line across the median class.
