Average. It's one of the most commonly used words in data discussions, and also one of the most ambiguous. The three averages in everyday use (mean, median, and mode) can produce dramatically different numbers from exactly the same dataset. Politicians, journalists, and businesses routinely choose whichever average makes their argument look most compelling. Understanding all three, and knowing which to use when, is one of the most practical skills in everyday numeracy.
The Mean: The One Everyone Defaults To
The mean — or arithmetic mean — is what most people mean when they say "average." Add up all the values and divide by the number of values. Ten salaries that sum to £400,000 produce a mean salary of £40,000. Simple, consistent, and widely understood.
The mean works beautifully when data is symmetrically distributed with no extreme outliers. It uses every data point and plays nicely with other statistical measures like standard deviation, making it the foundation of most quantitative analysis. For exam scores in a class where everyone performed reasonably similarly, or for repeated measurements of the same component in a manufacturing process, the mean tells you exactly what you want to know.
The problem is sensitivity to outliers. Add one £1,000,000 salary to those ten salaries and the mean jumps to approximately £127,000 (£1,400,000 divided by 11), dramatically misrepresenting what a typical person in the group earns. The mean is mathematically dragged toward extreme values, which is why it systematically overstates "typical" in right-skewed distributions like income, house prices, or corporate pay.
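The effect is easy to check with a few lines of Python. This is a minimal sketch: the ten individual salaries are hypothetical, chosen only so that they sum to £400,000 as in the example above.

```python
from statistics import mean

# Hypothetical salaries (not from any real dataset), picked to sum to £400,000.
salaries = [28_000, 31_000, 34_000, 36_000, 38_000,
            40_000, 42_000, 45_000, 50_000, 56_000]

# The same group with a single £1,000,000 earner added.
with_outlier = salaries + [1_000_000]

print(f"Mean of ten salaries:   £{mean(salaries):,.0f}")      # £40,000
print(f"Mean with the outlier:  £{mean(with_outlier):,.0f}")  # £127,273
```

One extra value out of eleven more than triples the mean, even though ten of the eleven people earn £56,000 or less.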
The Median: The Middle Ground
The median is the middle value when all values are sorted in order. For an odd number of values, it's the exact middle. For an even number, it's the average of the two middle values. Finding it requires sorting your data first, which is a small extra step but immediately worth it for skewed datasets.
The median is highly resistant to outliers. Add that £1,000,000 salary to the dataset and the median barely moves: it shifts at most to a neighbouring value near the midpoint of the sorted distribution, no matter how large the new salary is. This makes it far more representative of "typical" in any dataset with a long tail of high or low values.
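A short Python sketch shows how little the median reacts. The individual salaries are again hypothetical, chosen only so that ten of them sum to £400,000.

```python
from statistics import median

# Hypothetical salaries (illustrative only), already in sorted order.
salaries = [28_000, 31_000, 34_000, 36_000, 38_000,
            40_000, 42_000, 45_000, 50_000, 56_000]
with_outlier = salaries + [1_000_000]

# Ten values (even count): the median is the average of the 5th and 6th values.
print(f"Median of ten salaries:  £{median(salaries):,.0f}")      # £39,000

# Eleven values (odd count): the median is the 6th value in sorted order.
print(f"Median with the outlier: £{median(with_outlier):,.0f}")  # £40,000
```

The median would be the same £40,000 whether the extra salary were £60,000 or £60 million, because only the position of the new value in the sorted list matters, not its size.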
Median house prices, median household income, median NHS wait times — these use the median deliberately because the distributions are right-skewed. The mean would be pulled upward by the few extremely high values, producing a figure that describes almost nobody's actual experience. When someone quotes "average earnings" and uses a mean in a highly unequal pay distribution, they are technically correct and practically misleading simultaneously.
The Mode: The Most Common Value
The mode is the most frequently occurring value in a dataset. It's the only one of the three averages that works with non-numerical categories, such as the most frequent customer complaint category, and it is also a natural fit for discrete values such as the most common shoe size sold or the most common number of bedrooms in houses on a given street.
For continuous numerical data where every value is slightly different, the mode is often undefined or uninformative, especially in small datasets. It comes into its own for categorical data, discrete counts, and "what's most popular" questions rather than "what's central" questions.
A dataset can have no mode (all values unique), one mode (unimodal), two modes (bimodal), or multiple modes (multimodal). Bimodal distributions — where data clusters around two different central values — are a genuinely useful signal that the dataset may contain two distinct sub-populations that shouldn't be lumped into a single average. A bimodal income distribution might reflect two distinct job types in a company; a bimodal age distribution in a survey might reveal two separate groups of respondents.
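These cases can be sketched in Python with a small counting helper. The function name `modes` and all of the sample data are hypothetical, and treating "every value unique" as "no mode" is a convention assumed here rather than something every library does.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value; an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Assumed convention: if several distinct values each occur once, there is no mode.
    if top == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == top]

# Hypothetical sample data.
complaints = ["delivery", "billing", "delivery", "quality", "delivery", "billing"]
bedrooms = [2, 3, 3, 3, 4, 2, 2, 5]
readings = [1.2, 3.4, 5.6]

print(modes(complaints))  # ['delivery']  (unimodal, non-numerical data)
print(modes(bedrooms))    # [2, 3]        (bimodal)
print(modes(readings))    # []            (all values unique)
```

Returning a list rather than a single value keeps bimodal and multimodal results visible instead of silently picking one of the tied values.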
When Each Type Is Appropriate
Use the mean when: Data is roughly symmetrically distributed with no extreme outliers, or when mathematical combination with other statistics is needed. Examples: test scores in a class; temperature readings over a week; component dimensions in quality control.
Use the median when: Data is skewed or contains outliers that would distort the mean. Examples: incomes, house prices, response times, waiting lists, any dataset where a small number of very high or very low values are present.
Use the mode when: Data is categorical, or when "most common" is the meaningful question. Examples: most popular product configuration, most frequently occurring fault code, most common survey response.
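As a rough illustration, the three rules above could be turned into a rule of thumb in Python. Everything here is an assumption made for the sketch: the function name `suggest_average`, the 0.2 threshold, and the use of the gap between mean and median (scaled by the standard deviation) as a stand-in for "skewed". It is no substitute for looking at the data.

```python
from statistics import mean, median, pstdev

def suggest_average(values, skew_threshold=0.2):
    """Crude rule of thumb; skew_threshold is an arbitrary illustrative cut-off."""
    # Non-numerical (categorical) data: only the mode applies.
    if not all(isinstance(v, (int, float)) for v in values):
        return "mode"
    spread = pstdev(values)
    if spread == 0:
        return "mean"  # all values identical, so every average agrees
    # A large gap between mean and median, relative to spread, hints at skew or outliers.
    gap = abs(mean(values) - median(values)) / spread
    return "median" if gap > skew_threshold else "mean"

# Hypothetical sample data.
test_scores = [68, 70, 71, 72, 73, 74, 76]
salaries = [28_000, 31_000, 34_000, 36_000, 38_000, 40_000,
            42_000, 45_000, 50_000, 56_000, 1_000_000]
fault_codes = ["E12", "E07", "E12", "E31"]

print(suggest_average(test_scores))  # mean
print(suggest_average(salaries))     # median
print(suggest_average(fault_codes))  # mode
```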
Standard Deviation: The Essential Companion
An average alone tells you where the centre of a distribution sits. Standard deviation tells you how spread out the values are around that centre. Use our standard deviation calculator to find this measure of spread for any dataset you're working with.
A mean salary of £40,000 with a standard deviation of £3,000 tells you that, if salaries are roughly normally distributed, about 95% of people earn between £34,000 and £46,000 (within two standard deviations of the mean). A mean salary of £40,000 with a standard deviation of £20,000 tells you the dataset is broadly distributed and the mean describes few individuals accurately.
For normally distributed data, the mean and standard deviation together fully characterise the distribution. For skewed data, the median and interquartile range (the spread of the middle 50% of values) provide a more honest summary. Choosing between these descriptors based on the data's shape, rather than on which number looks more flattering, is the hallmark of honest statistical communication.
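The arithmetic behind those ranges is simply the mean plus or minus two standard deviations, a band that covers about 95% of values when the data is roughly normally distributed. A minimal Python sketch follows; the helper name `two_sd_band` is hypothetical.

```python
def two_sd_band(mean_value, sd):
    """Return (low, high): the mean minus and plus two standard deviations."""
    return mean_value - 2 * sd, mean_value + 2 * sd

narrow = two_sd_band(40_000, 3_000)
wide = two_sd_band(40_000, 20_000)

print(narrow)  # (34000, 46000)
print(wide)    # (0, 80000)
```

With a standard deviation of £20,000 the band stretches from £0 to £80,000, which is another way of seeing how little the £40,000 mean says about any one salary.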
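Both pairs of descriptors are available in Python's standard library. The sketch below borrows the nine street prices from the house-price example in this article (in thousands of pounds) and assumes Python 3.8 or later for `statistics.quantiles`. Quartile conventions vary between tools, so other software may report slightly different quartiles for the same data.

```python
from statistics import mean, median, quantiles, stdev

# Street prices from the house-price example, in £ thousands.
prices_k = [185, 198, 210, 215, 225, 240, 265, 490, 1100]

# Mean and sample standard deviation: both are pulled around by the two expensive houses.
centre_mean = mean(prices_k)
spread_sd = stdev(prices_k)

# Median and interquartile range: based on position in the sorted list.
q1, q2, q3 = quantiles(prices_k, n=4)  # default "exclusive" method
iqr = q3 - q1

print(f"mean = {centre_mean:.1f}k, standard deviation = {spread_sd:.1f}k")
print(f"median = {median(prices_k)}k, Q1 = {q1}k, Q3 = {q3}k, IQR = {iqr}k")
# median = 225k, Q1 = 204.0k, Q3 = 377.5k, IQR = 173.5k
```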
A Practical Example: House Prices
House prices on a street: £185k, £198k, £210k, £215k, £225k, £240k, £265k, £490k, £1.1m
Mean: about £347,600 (£3,128,000 divided by nine), heavily distorted by the two high-end properties. It suggests the "typical" house costs around £350,000, more than seven of the nine houses on the street actually cost.
Median: £225,000 — the fifth value in a sorted list of nine. Much more reflective of what most buyers on this street actually paid.
If you want to calculate what percentage above the median the mean sits, or express the gap as a percentage change, our percentage calculator handles that arithmetic instantly.
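For readers who prefer to check the figures by hand, the whole example can be computed directly from the nine listed prices in a few lines of Python. The variable names are illustrative.

```python
from statistics import mean, median

# The nine street prices listed above, in pounds.
prices = [185_000, 198_000, 210_000, 215_000, 225_000,
          240_000, 265_000, 490_000, 1_100_000]

mean_price = mean(prices)
median_price = median(prices)

# How far the mean sits above the median, as a percentage of the median.
gap_pct = (mean_price - median_price) / median_price * 100

print(f"Mean:   £{mean_price:,.0f}")    # £347,556
print(f"Median: £{median_price:,.0f}")  # £225,000
print(f"Mean is {gap_pct:.1f}% above the median")  # 54.5%
```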
Spotting Misleading Averages in the Wild
When anyone cites an "average" figure, it's worth asking which type they used and whether it's appropriate for the underlying data. Pay particular attention to income statistics, property market reports, and corporate pay disclosures. These are often quoted using means when medians would be more representative, producing a gap between "the average" and most people's lived experience that suits the person quoting the number.
Understanding probability helps build the broader intuition for how distributions work. Our probability calculator explores how often different outcomes occur within distributions — directly connected to understanding why the mean and median diverge in skewed data.
The Office for National Statistics publishes detailed guidance on how averages are calculated and reported in official UK statistics, which is worth reading if you regularly engage with government data or need to cite official figures accurately.
