Using statistics

Last update: 2025-09-07


🎯 In this topic you will

Use mode, median, mean, and range to compare sets of data

🧠 Key Words

bimodal
mean
median
mode
range
statistical measures


bimodal: A data set with two values that occur most frequently (two modes).
mean: The average of a set of numbers, calculated by dividing the total by the number of items.
median: The middle value of a set of ordered numbers.
mode: The value that appears most frequently in a data set.
range: The difference between the largest and smallest values in a data set.
statistical measures: Quantities such as mean, median, mode, and range used to describe data sets.

You already know how to work out some statistical measures, such as the mode, median, mean and range.

The mode is the most common value or number.

If a set of data has two modes, it is called bimodal.

The median is the middle value when the values are listed in order of increasing size.

The mean is the sum of all the values divided by the number of values.

The range is the largest value minus the smallest value.

In a real situation, you must decide which measure to use.

If you want to measure how spread out a set of measurements is, the range is the most useful statistic.

If you want to find a representative measurement, you need an average.

But should the average be the mode, the median or the mean? Which average to use depends on the particular situation.

Here is a summary to help you decide which average to choose:

Choose the mode when you want to know which is the most commonly occurring number or numbers.
The median is the middle value when the data values are put in order of increasing size. Half the numbers are greater than the median and half the numbers are less than the median.
The mean depends on every value. When you change one number, you change the mean.

Worked example

Here are the ages, in years, of the players in a football team:

16, 17, 18, 18, 19, 20, 20, 21, 21, 32, 41

a. Work out the mode, median and mean age.
b. Which average best represents the data? Give a reason.
c. Work out the range in ages of the players.
d. A different football team has a range in ages of 14 years. Which team, the first or the second, has more variation in ages?

Answer:

a. Mode: 18, 20, 21 (each appears twice)
Median: With 11 players, the middle value is the 6th → 20 years
Mean: Sum of ages = 243, divide by 11 → 22.1 years

b. The median is best because the mean is pulled up by two much older players. The mode is not helpful because there are three of them. The median sits in the middle of the ordered list, with five ages on either side of it.

c. Range = 41 − 16 = 25 years

d. First team’s range = 25, second team’s range = 14. Since 25 > 14, the first team has more variation in ages.
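If you want to check these answers on a computer, here is a minimal Python sketch. It assumes Python 3.8 or later (needed for `statistics.multimode`), and the variable names are only illustrative.

```python
from statistics import mean, median, multimode

ages = [16, 17, 18, 18, 19, 20, 20, 21, 21, 32, 41]

modes = multimode(ages)            # every value tied for most frequent
middle = median(ages)              # sorts the data, then takes the middle value
average = round(mean(ages), 1)     # 243 / 11, rounded to 1 d.p.
age_range = max(ages) - min(ages)  # largest value minus smallest value

print(modes, middle, average, age_range)  # [18, 20, 21] 20 22.1 25
```

`multimode` returns a list, so a data set with several modes (as here) is reported in full instead of forcing a single answer.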

Summary of averages:

The mode is the most common value(s).
The median is the middle value when data are ordered.
The mean is the sum of values divided by the number of items.

The median is often best for skewed data, while the mean is useful when values are evenly spread. The range shows variation by subtracting the smallest from the largest value.

🧠 PROBLEM-SOLVING Strategy

Using Mode, Median, Mean & Range to Compare Data

Choose the right statistic for the question you’re answering, and justify why it’s appropriate.

Mode: most common value(s). If two modes → bimodal.
Median: middle value when ordered (half above, half below).
Mean: sum of all values ÷ number of values.
Range: largest − smallest (a simple measure of spread).

List the question goal. Typical goals: “representative value” or “compare variation.”
Scan for outliers/skew. Extreme values pull the mean; median resists them.
Check data type. Categories → mode; numerical (ordered) → median/mean.
Compute cleanly. Order data for median; tally frequencies for mode; use totals for mean; identify min/max for range.
Compare sets. Use the same statistic across sets (e.g., median vs. median) and comment on differences.
Justify. Link a feature of the data (skew, clusters, repeats) to your chosen measure.
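The "scan for outliers" step can be illustrated with a short Python sketch. It reuses the football ages from the worked example and, as an assumption made only for this illustration, treats the two players aged over 30 as the extreme values.

```python
from statistics import mean, median

ages = [16, 17, 18, 18, 19, 20, 20, 21, 21, 32, 41]

# Illustrative assumption: the cut-off of 30 picks out the players aged 32 and 41.
without_oldest = [age for age in ages if age < 30]

# How far does each average move when the two extreme values are included?
mean_shift = mean(ages) - mean(without_oldest)
median_shift = median(ages) - median(without_oldest)

print(round(mean_shift, 1), median_shift)  # 3.2 1
```

The two oldest players move the mean by about 3.2 years but the median by only 1 year, which is the sense in which the median resists extreme values.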

When to use	Statistic	Why it fits	Watch out for…
Most common category/score	Mode	Identifies the typical or most frequent value.	Can be non-unique (bimodal) or unhelpful if flat.
Skewed data / outliers present	Median	Unaffected by extremes; splits data in half.	Requires ordering; doesn’t use all magnitudes.
Balanced, no strong outliers; need arithmetic average	Mean	Uses every value; good for further calculations.	Sensitive to extremes; can be misleading if skewed.
Compare variability (spread)	Range	Quick sense of spread (max − min).	Only two values; very sensitive to outliers.

Worked example — football team ages

Data: 16, 17, 18, 18, 19, 20, 20, 21, 21, 32, 41

Mode: 18, 20, 21 (each appears twice).
Median: 20 (6th of 11 after ordering).
Mean: total 243 ÷ 11 = 22.1 (yrs).
Range: 41 − 16 = 25 (yrs).
Best average: Median (mean pulled up by two older players; multiple modes are less informative).
Variation comparison: Team 1 range 25 vs. Team 2 range 14 → Team 1 has more variation.

Justification starters

“The data are skewed with outliers, so the median best represents a typical value.”
“We need a single number that uses every value, so the mean is appropriate.”
“We’re comparing how spread out the sets are, so I’ll compare the ranges.”
“These are categories and we want what happens most often, so the mode fits.”

Common pitfalls

Quoting the mean when a few extreme values dominate.
Using the range as the only measure of spread (it ignores most data).
Reporting the mode when several modes exist and none is clearly typical.

Apply this strategy to the exercises

1 (waiting times): Order values; find mode(s), median (10th/11th avg), mean (total ÷ 20); compare range (May) with 3 (June).
2 (fitness class ages): Use frequency tallies for mode; order for median; mean from total ÷ count; discuss which average best represents the group and compare ranges.
Think like a Mathematician (rain days): Use frequencies to compute mode, median (cumulative frequency), and mean (Σd·f ÷ Σf); justify the best average.
4 (belt sizes): Choose a stock size using the mode or a median of the demand distribution; justify.
5 (people per car): Correct the misunderstanding: the modal value is the number of people with the largest frequency, not the largest frequency itself.
6 (test scores): Use the distribution to count who is above median/mode/mean and decide which average best summarizes performance.
7 (two dice): After 40 trials, compute mode/median/mean of your scores and justify which best represents your class data.

❓ EXERCISES

1. Zaralia works for $20$ days each month. She records the time she waits in line for lunch each day in May. Here are the times, in minutes.

$2$	$5$	$3$	$8$	$5$	$2$	$10$	$7$	$8$	$8$
$4$	$7$	$2$	$2$	$3$	$6$	$10$	$3$	$4$	$7$

🧠 Tip: Start by writing the list of times in order from smallest to largest. Range $=\ \text{largest value} - \text{smallest value}$.

🧠 Tip: The month with the larger range has more variation in the waiting times.

a. Work out the:

i. $\text{mode}$ ii. $\text{median}$ iii. $\text{mean}$ time

b. Which average best represents the data? Give a reason for your choice of average.

c. Work out the range in Zaralia’s waiting times.

d. In June, Zaralia’s range in waiting times is $3$ minutes. In which month, May or June, is there more variation in her waiting times?

👀 Show answer

Ordered data: $2,2,2,2,3,3,3,4,4,5,5,6,7,7,7,8,8,8,10,10$.

i. Mode $=2$ (appears $4$ times).

ii. Median $=\dfrac{5+5}{2}=5$.

iii. Mean $=\dfrac{106}{20}=5.3\ \text{minutes}$.

b. The median ($5$) best represents the data because the distribution is slightly skewed by larger values ($8$–$10$), so the mean is pulled upward.

c. Range $=10-2=8\ \text{minutes}$.

d. May has more variation (range $8$) than June (range $3$).

2. These are the ages, in years, of the members of a fitness class.

$57,\ 56,\ 51,\ 59,\ 51,\ 56,\ 58,\ 58,\ 51,\ 53,\ 50,\ 51,\ 54,\ 51$

a. Work out the:

i. $\text{mode}$ ii. $\text{median}$ iii. $\text{mean}$ age

b. Marcus and Arun discuss which average best represents the data. What do you think? Which average would you use? Give a reason for your choice.

c. Work out the range in ages of the members of the fitness class.

d. A different fitness class has a range in ages of $16$ years. Which fitness class, the first or the second, has less variation in ages of the members?

👀 Show answer

Ordered data: $50,51,51,51,51,51,53,54,56,56,57,58,58,59$.

i. Mode $=51$ (most frequent).

ii. Median $=\dfrac{53+54}{2}=53.5$.

iii. Mean $=\dfrac{756}{14}=54$.

b. Use the mode ($51$). Five of the $14$ members (more than a third) are $51$, so the mode best describes the “typical” age; the mean ($54$) is higher because of a few older members.

c. Range $=59-50=9$ years.

d. The first class (range $9$) has less variation than the second class (range $16$).

🧠 Think like a Mathematician

Days of rain in first week of May (over 35 years)

Days of rain (d)	0	1	2	3	4	5	6	7
Frequency (f)	13	7	5	2	0	3	2	3

a) Copy and complete the working to find the mode, median and mean number of days of rain.
b) Petra says the mean best represents the data. Which average would you choose? Why?
c) Describe how algebra helped in part a.

👀 show answers

Mode: Greatest frequency is 13 at 0 days → mode = 0 days.

Median: There are 35 years, so the median position is $(35+1)/2 = 18$. Cumulative frequencies up to 0 days = 13; up to 1 day = 20. The 18th value lies in the 1-day category → median = 1 day.

Mean: Compute $\sum d\times f$: $0\cdot13=0,\;1\cdot7=7,\;2\cdot5=10,\;3\cdot2=6,\;4\cdot0=0,\;5\cdot3=15,\;6\cdot2=12,\;7\cdot3=21$. Total days $= 0+7+10+6+0+15+12+21 = \mathbf{71}$. Mean $= 71/35 = \mathbf{2.03}$ days (2 d.p.).

b) The data are skewed towards small numbers of rainy days (many 0s and 1s; a few large values). The median (1 day) is a better “typical” value because the mean (2.03) is pulled upward by the few wet years. So I’d choose the median to represent this distribution.

c) Algebra used: $ \sum f = 35$ to confirm the total number of years; median position $= (n+1)/2$; mean $= \dfrac{\sum d f}{\sum f} = \dfrac{71}{35}$. Forming and evaluating the products $d f$ is the key algebraic step.
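The same working can be sketched in Python straight from the frequency table. This is one possible layout, not the only one; the names `days`, `freq` and `position` are chosen here for illustration.

```python
from itertools import accumulate

days = [0, 1, 2, 3, 4, 5, 6, 7]
freq = [13, 7, 5, 2, 0, 3, 2, 3]

n = sum(freq)                                        # total number of years
mode = days[freq.index(max(freq))]                   # value with the greatest frequency
                                                     # (reports only the first if tied)
total_days = sum(d * f for d, f in zip(days, freq))  # sum of d x f
mean = total_days / n

# Median: the first category whose cumulative frequency reaches position (n + 1) / 2.
position = (n + 1) // 2                              # exact because n is odd here
cumulative = list(accumulate(freq))
median = next(d for d, c in zip(days, cumulative) if c >= position)

print(n, mode, median, total_days, round(mean, 2))   # 35 0 1 71 2.03
```

Note that the mode is read from `days`, not from `freq`: the answer is the number of days with the greatest frequency, not the frequency itself.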

❓ EXERCISES

4. This table shows the number of men’s belts sold in a store during one month.

$Length\ (cm)$	$80$	$85$	$90$	$95$	$100$	$105$	$110$	$115$
$Frequency$	$6$	$16$	$28$	$41$	$17$	$18$	$10$	$13$

Use an appropriate average to decide which size of belt the store owner should always try to keep in stock.

👀 Show answer

The appropriate average here is the mode (most sold size). The highest frequency is $41$ at $95\ \text{cm}$, so the store should always stock belts of length $95\ \text{cm}$.

5. Arun records the number of people in $60$ passing cars. Here are his results.

$Number\ of\ people$	$1$	$2$	$3$	$4$	$5$	$6$
$Frequency$	$28$		$3$	$6$	$2$	$1$

🧠 Tip

Remember: You can say “the mode is $28$” or “the modal value is $28$”. They mean the same thing.

a. Find the missing frequency.

b. Arun says: I think the modal number of people per car is $28$ because $28$ is the largest frequency.

Explain the mistake that Arun has made.

c. How can you tell, by looking at the table, that the median is $2$ people per car?

d. Arun works out that the mean is $1.95$ people per car. Show that Arun is correct.

e. Which average best represents the data? Give a reason for your choice of average.

👀 Show answer

a. Total cars $=60$. Missing frequency for $2$ people $=60-(28+3+6+2+1)=60-40=20$.

b. The mode is the value with the greatest frequency, not the frequency itself. Since the greatest frequency is $28$ for $1$ person, the modal number of people per car is $1$ (not $28$).

c. Cumulative frequencies: $1$ person $\to 28$ cars; $2$ people $\to 28+20=48$. The middle positions (the $30$th and $31$st of $60$) fall within the $2$-people group, so the median is $2$.

d. $\dfrac{1\cdot28+2\cdot20+3\cdot3+4\cdot6+5\cdot2+6\cdot1}{60}=\dfrac{28+40+9+24+10+6}{60}=\dfrac{117}{60}=1.95$.

e. The median ($2$) best represents the data: the distribution is skewed with a long tail to higher counts, so the median is a stable central value. (The mode is $1$, but median better reflects typical occupancy considering the tail.)

6. A test has ten questions. A total of $120$ students take the test. The table shows the students’ test scores.

$Questions\ answered\ correctly$	$4$	$5$	$6$	$7$	$8$	$9$	$10$
$Frequency$	$3$	$5$	$12$	$13$	$17$	$30$	$40$

a. How many students scored:

i. more than the median?

ii. more than the mode?

iii. more than the mean?

b. Which average best represents the data? Give a reason for your choice of average.

👀 Show answer

Median: with $n=120$, the $60^{\text{th}}$ and $61^{\text{st}}$ values lie in score $9$ (cumulatives: up to $8$ is $50$, up to $9$ is $80$), so median $=9$.

i. More than the median $\Rightarrow$ scores $>9$: only $10 \Rightarrow 40$ students.

Mode: the largest frequency is $40$ at score $10$, so mode $=10$.

ii. More than the mode $\Rightarrow$ scores $>10$: $0$ students.

Mean: $\dfrac{4\cdot3+5\cdot5+6\cdot12+7\cdot13+8\cdot17+9\cdot30+10\cdot40}{120}=\dfrac{1006}{120}\approx 8.383\ldots$

iii. More than the mean $\Rightarrow$ scores $\ge 9$: $30+40=70$ students.

b. The median ($9$) best represents the data: it indicates a central score without being affected by the ceiling at $10$ or by the lower outliers, while a modal score of $10$ would overstate typical performance.
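The three counts in part a can be checked with a short Python sketch. The helper `count_above` is a hypothetical name introduced here; the median ($9$) and mode ($10$) are taken from the working above.

```python
scores = [4, 5, 6, 7, 8, 9, 10]
freq = [3, 5, 12, 13, 17, 30, 40]

def count_above(threshold):
    """Number of students whose score is strictly greater than threshold."""
    return sum(f for s, f in zip(scores, freq) if s > threshold)

n = sum(freq)
mean = sum(s * f for s, f in zip(scores, freq)) / n  # 1006 / 120

above_median = count_above(9)    # median is 9
above_mode = count_above(10)     # mode is 10
above_mean = count_above(mean)   # mean is about 8.38

print(above_median, above_mode, above_mean)  # 40 0 70
```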

Think like a Mathematician

7 Work with a partner or in a small group to answer this question.

You are going to roll two dice and add the numbers on the dice to give the score.
For example, if you roll these numbers, you get a score of 7.

a) What is the smallest score you can get?

👀 show answer

The smallest score is 2 (when both dice show 1).

b) What is the largest score you can get?

👀 show answer

The largest score is 12 (when both dice show 6).

You are going to roll the dice 40 times.

c) Draw a table ready to record the scores you will get.
Your table needs to have a ‘Tally’ column and a ‘Frequency’ column.

👀 show answer

A table with columns: Score | Tally | Frequency.
Example (structure only):

Score	Tally	Frequency
2
3
4
…
12

d) Now roll the dice 40 times and record all your scores. When you have finished, make sure your frequency column adds up to 40.

👀 show answer

Students’ results will vary. The frequency column must sum to 40.

e) For your set of data, work out the:

i) mode
ii) median
iii) mean score

👀 show answer

Answers depend on collected data.

Mode: the most frequent score.
Median: the middle score when data are ordered.
Mean: add all scores and divide by 40.

f) Which average best represents your data? Give a reason for your choice of average.

👀 show answer

Usually, the mean represents the data best, since it uses all values. But the mode may also be useful to highlight the most likely score.

g) Compare your data and averages with those of other learners in your class.
Do you have different averages? Do you have the same averages? Discuss why.

👀 show answer

Averages may differ due to chance variation in dice rolls, but overall they should be similar. With larger numbers of trials, results become closer to expected probabilities.
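If you do not have dice to hand, the experiment can be simulated. The sketch below is one way to do it in Python; the fixed seed is an arbitrary choice made here so that a run can be repeated, and your own results will differ if you change or remove it.

```python
import random
from collections import Counter
from statistics import mean, median, multimode

rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
scores = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(40)]

# Frequency table: a Counter gives 0 for any score that never occurred.
frequency = Counter(scores)
for score in range(2, 13):
    print(score, frequency[score])

print("total frequency:", sum(frequency.values()))
print("mode(s):", multimode(scores))
print("median:", median(scores))
print("mean:", mean(scores))
```

As with real dice, check that the frequencies add up to 40 before working out the averages.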

🔗 Learning Bridge · Averages & Range

You’ve just refreshed the definitions of mode, median, mean and range. Next, you’ll use them to summarise and compare real data sets — and justify which statistic is most appropriate.

Match goal → statistic:
- Most common → Mode (can be bimodal).
- Typical value with outliers → Median (resistant to extremes).
- Arithmetic average for balanced data → Mean (uses every value).
- Spread/consistency → Range (max − min).
Scan for skew/outliers — they pull the mean but not the median.
Compare like with like — median vs median, mean vs mean, range vs range.
Justify — link a data feature (e.g. “two very large values”) to your choice (“so median is better”).

Quick checks

Median (odd n): order data → take middle. (even n): average the two middles.
Mean: total ÷ count. A single extreme changes it.
Range: largest − smallest; larger = more variation.

Mini example

Ages: 16, 17, 18, 18, 19, 20, 20, 21, 21, 32, 41 → mode = 18, 20, 21 · median = 20 · mean = 243÷11 ≈ 22.1 · range = 41−16 = 25.
Best average? Median (mean is skewed by two older players).

Tip: When comparing two groups, pair an average (mean/median) with the range for a fuller story: “Group A is higher on average, Group B is more consistent.”

You can use an average to summarise a set of data. This could be the mode, median or mean.

You can use the range to measure the spread of the data. The larger the range, the more varied the data.

You already know how to work out the mode, median, mean and range. Here is a reminder:

The mode is the most common value or number.
The median is the middle value when the values are listed in order.
The mean is the sum of all the values divided by the number of values.
The range is the largest value minus the smallest value.

You can use these statistics to compare two or more sets of data.

Worked example

A health club recorded the masses (in kilograms) of eight men and six women.

Men: 65, 79, 68, 72, 77, 77, 81, 67
Women: 68, 52, 47, 49, 50, 58

Calculate the mean and the range of each set of data and use these values to compare the two sets.

Answer:

Men’s mean: $(65 + 79 + 68 + 72 + 77 + 77 + 81 + 67) \div 8 = 586 \div 8 = 73.25 \,\text{kg}$

Women’s mean: $(68 + 52 + 47 + 49 + 50 + 58) \div 6 = 324 \div 6 = 54 \,\text{kg}$

On average, the men are $73.25 - 54 = \mathbf{19.25 \,\text{kg}}$ heavier than the women.

Range (men): $81 - 65 = 16 \,\text{kg}$

Range (women): $68 - 47 = 21 \,\text{kg}$

The women’s masses are more varied than the men’s (21 kg compared with 16 kg).

Interpreting the results:

The mean gives the average value of each group. Men are about 19 kg heavier on average.
The range shows variation. Women’s masses are spread across 21 kg, while men’s span 16 kg.
This means women’s masses are less consistent (more variation) than men’s.
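The comparison above can be reproduced with a minimal Python sketch. The helper name `data_range` is an assumption made here for readability.

```python
from statistics import mean

men = [65, 79, 68, 72, 77, 77, 81, 67]
women = [68, 52, 47, 49, 50, 58]

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

mean_gap = mean(men) - mean(women)                     # difference between the means
women_more_varied = data_range(women) > data_range(men)

print(mean(men), mean(women), mean_gap)                # 73.25 54 19.25
print(data_range(men), data_range(women), women_more_varied)  # 16 21 True
```

Using the same two statistics for both groups keeps the comparison fair: mean against mean, range against range.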

🧠 PROBLEM-SOLVING Strategy

Summarising & Comparing Data with Averages and Range

Use an average (mode, median, or mean) to summarise a set of data, and the range to compare how varied the data are.

Mode: most common value(s). (Two modes → bimodal.)
Median: middle value when ordered (half above, half below).
Mean: sum of values ÷ number of values.
Range: largest − smallest. Larger range → more variation.

Clarify the goal. Do you need a representative value (use an average) or to compare spread (use range)?
Scan the data. Are there outliers or skew? (If yes, prefer the median over the mean.)
Choose the average.
- Mode for the most common category/score.
- Median for skewed data or when extremes would distort the mean.
- Mean when values are fairly balanced and you want an arithmetic average.
Compute cleanly. Order for median; tally for mode; total ÷ count for mean; max − min for range.
Compare sets consistently. Use the same measure(s) for each set and interpret: who is larger “on average”? who is more/less varied?
Justify your choice. Link a feature (skew, repeats, extremes) to why your statistic is appropriate.

Question type	Best statistic	Reason	Caution
Most popular category/score	Mode	Identifies what occurs most often.	May be multiple modes or none useful.
Skewed data / outliers present	Median	Resistant to extremes; typical middle.	Doesn’t reflect all magnitudes.
Balanced data; arithmetic summary	Mean	Uses every value; good for further maths.	Pulled by outliers.
Compare spread/consistency	Range	Quick measure of variation.	Only uses extremes; sensitive to outliers.

Worked example — Health club masses (kg)

Men: 65, 79, 68, 72, 77, 77, 81, 67 Women: 68, 52, 47, 49, 50, 58

Mean (men): $(65+79+68+72+77+77+81+67)\div 8 = 586\div 8 = 73.25\ \text{kg}$
Mean (women): $(68+52+47+49+50+58)\div 6 = 324\div 6 = 54\ \text{kg}$
Comparison of means: men are $73.25-54=\mathbf{19.25\ \text{kg}}$ heavier on average.
Range (men): $81-65=16\ \text{kg}$ | Range (women): $68-47=21\ \text{kg}$
Interpretation: women’s masses are more varied (21 kg vs 16 kg).

Justification starters

“We are comparing averages, so I computed the means for each set and compared them.”
“Because the data seem skewed/with outliers, the median better represents a typical value.”
“To compare consistency, I compared the ranges; a smaller range means more consistent.”

Common pitfalls

Using the mean when a few extremes distort the result.
Claiming “more varied” without computing or citing the range.
Confusing the mode (most common value) with the largest frequency number itself.

Apply this strategy to the exercises

1 (World Cup goals): Compute each team’s mean and range; compare average goals and variability.
2 (Heights, two groups): Order values → median and range; compare who is taller “on average” (median) and who is more consistent (range).
3 (City temperatures): Order values → find mode and range; compare typical temperature (mode) and variation (range).
4 (Babies’ masses): Use totals ÷ counts → compare means to decide who is heavier on average.
5 (Test marks): Build a small summary (mode/median/mean, range) for each subject; choose which average best compares performance and use range for consistency.
6 (Two experiments): For each: compute mean, median, range; judge statements about “higher on average” and “more varied”; discuss if a mode exists.
7 (Two-dice difference): After collecting 40 scores, compute mode, median, mean; choose the best representative and justify.
8 (Hockey teams, frequency tables): From frequencies, find mode, mean (use Σx·f ÷ Σf), and range; explain which average best represents the data and why different students could all have reasonable choices.

❓ EXERCISES

1. In the 2010 football World Cup, Spain won and Brazil was knocked out in the quarter finals. The numbers of goals they scored in their matches are shown.

Spain: $0,2,2,1,1,1,1$

Brazil: $2,3,0,3,1$

a. Work out the mean score for each team.

b. Use the means to state which team scored more goals, on average, per match.

c. Work out the range for each team.

d. Use the ranges to state which team’s scores were more varied.

👀 Show answer

a. Spain: $\dfrac{0+2+2+1+1+1+1}{7}=\dfrac{8}{7}\approx1.14$ goals. Brazil: $\dfrac{2+3+0+3+1}{5}=\dfrac{9}{5}=1.8$ goals.

b. Brazil scored more goals on average.

c. Spain: $2-0=2$. Brazil: $3-0=3$.

d. Brazil’s scores were more varied (larger range).

2. A teacher measured the heights of two groups of children.

Group A: $84,73,89,80,77$ cm

Group B: $77,85,75,69,82,67,72$ cm

a. For each group:

i. write the heights in order of size

ii. write the median height

iii. work out the range in heights.

b. Use the medians to state which group is taller, on average.

c. Use the ranges to state which group’s heights are less varied.

👀 Show answer

Group A ordered: $73,77,80,84,89$. Median $=80$. Range $=89-73=16$.

Group B ordered: $67,69,72,75,77,82,85$. Median $=75$. Range $=85-67=18$.

b. Group A is taller on average (median $80$ vs $75$).

c. Group A’s heights are less varied (range $16$ vs $18$).

3. The maximum daytime temperature $(^\circ C)$ was recorded in Madrid and Cartagena during one week in August.

Madrid: $38,34,36,32,35,37,36$

Cartagena: $30,32,29,30,28,30,33$

a. For each city:

i. write the temperatures in order of size

ii. write the modal temperature

iii. work out the range in temperatures.

b. Use the modes to state which city is hotter, on average.

c. Use the ranges to state which city’s temperatures are more varied.

👀 Show answer

Madrid ordered: $32,34,35,36,36,37,38$. Mode $=36$. Range $=38-32=6$.

Cartagena ordered: $28,29,30,30,30,32,33$. Mode $=30$. Range $=33-28=5$.

b. Madrid is hotter on average (mode $36$ vs $30$).

c. Madrid’s temperatures are more varied (range $6$ vs $5$).

4. A nurse measured the total mass of $20$ baby boys as $64$ kg. The total mass of $15$ baby girls was $51$ kg. Which babies were heavier on average, the boys or the girls? Give a reason for your answer.

👀 Show answer

Boys’ mean mass $=\dfrac{64}{20}=3.2\ \text{kg}$. Girls’ mean mass $=\dfrac{51}{15}=3.4\ \text{kg}$. Girls were heavier on average.

Think like a Mathematician

5. The test marks of two groups of students are shown.

Maths: 77, 89, 75, 80, 80, 91, 78, 76, 76, 76
Science: 72, 79, 77, 87, 81, 62, 75, 87

a) Copy and complete this table.

👀 show answer

	Mean	Median	Mode	Range
Maths	79.8	77.5	76	16
Science	77.5	78	87	25

(Mean values rounded to 1 d.p.)

b) In which group, Maths or Science, do you think the students did better on average?

👀 show answer

Maths. Using the mean, Maths ≈ 79.8 vs Science ≈ 77.5.

c) In which group, Maths or Science, do you think the students had more consistent scores?

👀 show answer

Maths is more consistent. Its range is 16 compared with Science’s 25 (smaller spread).

d) Compare your answers to parts b and c with those of other learners in the class. Discuss these questions.
i) Which average did you use to compare the scores? Why did you use this average? Why did you not use the other averages?
ii) What does ‘more consistent’ mean? What statistic did you use to decide which group had more consistent scores?

👀 show answer

i) The mean is a good choice because it uses all the data. The median is also reasonable and is less affected by the low outlier 62 in Science. The mode is less helpful here because each list is fairly spread and a single most common value doesn’t summarise overall performance well.

ii) ‘More consistent’ means the scores are closer together (smaller spread). A simple measure is the range; Maths has the smaller range (16 vs 25), so it is more consistent.

e) Now you have discussed the answers of other learners in your class, which average do you think is the best to use to compare these scores? Explain why.

👀 show answer

The median is the best single comparison here because the Science data include a low outlier (62) that pulls the mean down. The median resists outliers and reflects the typical score more fairly. (Using the median: Maths 77.5 vs Science 78 — very close, so any conclusion should be cautious.)
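To rebuild the summary table for both subjects in one go, a Python sketch along these lines could be used (Python 3.8 or later is assumed for `multimode`; the dictionary layout is just one convenient choice).

```python
from statistics import mean, median, multimode

marks = {
    "Maths": [77, 89, 75, 80, 80, 91, 78, 76, 76, 76],
    "Science": [72, 79, 77, 87, 81, 62, 75, 87],
}

# One row of statistics per subject, mirroring the table in part a.
summary = {
    subject: {
        "mean": mean(values),
        "median": median(values),
        "mode": multimode(values),
        "range": max(values) - min(values),
    }
    for subject, values in marks.items()
}

for subject, row in summary.items():
    print(subject, row)
```

Seeing all four statistics side by side makes it easier to notice that the means favour Maths while the medians are almost equal.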

❓ EXERCISES

6. Nialls recorded the temperatures in two experiments.

Experiment	Temperatures $(^\circ C)$
First experiment	$29, 28, 21, 33, 30$
Second experiment	$28, 29, 28, 33, 32, 31, 32, 29$

a. Work out the mean, median and range for each experiment.

b. State whether each of these statements is True (T) or False (F). Justify your answers.

i. The temperatures in the first experiment are higher, on average, than the temperatures in the second experiment.

ii. The temperatures in the first experiment are more varied than the temperatures in the second experiment.

c. Is it possible to work out the modal temperature for each experiment? Explain your answer.

👀 Show answer

First experiment: Mean $=\dfrac{29+28+21+33+30}{5}=\dfrac{141}{5}=28.2$. Median $=29$. Range $=33-21=12$.

Second experiment: Mean $=\dfrac{28+29+28+33+32+31+32+29}{8}=\dfrac{242}{8}=30.25$. Median $=(29+31)/2=30$. Range $=33-28=5$.

i. False. First experiment mean $=28.2$, second experiment mean $=30.25$.

ii. True. First experiment range $=12$, second experiment range $=5$.

c. Not as a single value. The first experiment has no repeated values, so it has no mode. In the second experiment $28$, $29$ and $32$ each appear twice, so there are three modes and no single modal temperature.

Think like a Mathematician

7. Work with a partner or in a small group to answer this question.

You are going to roll two dice and subtract the numbers on the dice to give a score. Always subtract to give a positive, or zero, score (use the difference).

Tip: For example, if you roll a 6 and a 1, the score is 5.

a) What is the smallest score you can get?

👀 show answer

The smallest possible score is 0 (when both dice show the same number).

b) What is the largest score you can get?

👀 show answer

The largest possible score is 5 (difference between 6 and 1).

You are going to roll the dice 40 times.

c) Draw a table to record the scores you get. Your table needs a ‘Tally’ column and a ‘Frequency’ column.

👀 show answer

Score	Tally	Frequency
0
1
2
3
4
5

d) Now roll the dice 40 times and record all your scores. When you have finished, make sure the frequency column adds up to 40.

👀 show answer

Student results will vary. Check that the total frequency is 40.

e) Work out for your data:

i) the mode
ii) the median
iii) the mean score for your data

👀 show answer

Results depend on your table.

Mode: the score with the greatest frequency.
Median: the middle value when the 40 scores are ordered (average of the 20th and 21st).
Mean: $\displaystyle \text{mean}=\frac{\text{sum of all 40 scores}}{40}$.

Theoretical (fair dice) expectations: mode $=1$, median $=2$, mean $\approx 1.94$.

f) Which average best represents your data? Give a reason for your choice of average.

👀 show answer

The mean often best represents the data because it uses all values. The median is also reasonable if your sample is small or uneven; it is not affected by occasional unusual results. Either choice is acceptable with a clear justification.

g) Compare your data and averages with other learners in the class. Do you have different averages? Do you have the same averages? Have you chosen the same average to represent your data? Discuss your answers.

👀 show answer

Small differences are expected due to chance. With more trials, everyone’s distributions should approach the theoretical pattern for differences (0–5), where 1 is most common, then 2, then 0 and 3 (equally likely), then 4, then 5. Means should cluster near $1.94$ and medians near $2$.
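The theoretical values quoted for fair dice can be checked by listing all 36 equally likely outcomes. This Python sketch is one way to do that.

```python
from collections import Counter
from itertools import product
from statistics import mean, median, multimode

# Every (first die, second die) pair, scored by the difference between the two numbers.
differences = [abs(a - b) for a, b in product(range(1, 7), repeat=2)]

counts = Counter(differences)
print(sorted(counts.items()))
# [(0, 6), (1, 10), (2, 8), (3, 6), (4, 4), (5, 2)]

print(multimode(differences))          # [1]
print(median(differences))             # 2.0
print(round(mean(differences), 2))     # 1.94
```

The counts show that a difference of 1 can happen in 10 of the 36 ways, more than any other score, while differences of 0 and 3 are equally likely.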

❓ EXERCISES

8. The frequency tables show the number of goals scored in each match by two hockey teams in 20 matches.

Team A Number of goals	$0$	$1$	$2$	$3$	$4$	$5$
Frequency	$4$	$1$	$4$	$2$	$4$	$5$

Team B Number of goals	$0$	$1$	$2$	$3$	$4$	$5$
Frequency	$0$	$6$	$1$	$5$	$4$	$4$

a. Show that what Marcus, Zara and Arun say could all be correct.

b. Which average do you think best represents the data in the tables? Explain why. Who do you agree with, Marcus, Zara or Arun?

👀 Show answer

Team A: Mean $=\dfrac{0\cdot4+1\cdot1+2\cdot4+3\cdot2+4\cdot4+5\cdot5}{20}=\dfrac{56}{20}=2.8$ goals.

Team B: Mean $=\dfrac{0\cdot0+1\cdot6+2\cdot1+3\cdot5+4\cdot4+5\cdot4}{20}=\dfrac{59}{20}=2.95$ goals.

So the mean is higher for Team B, which supports Zara.

Median Team A: The cumulative frequencies are $4,5,9,11,15,20$, so the $10$th and $11$th scores are both $3$. Median $=3$.

Median Team B: The cumulative frequencies are $0,6,7,12,16,20$, so the $10$th and $11$th scores are both $3$. Median $=3$. So Arun is correct: the teams have the same median.

Mode Team A: $5$ goals (highest frequency $5$). Mode Team B: $1$ goal (highest frequency $6$). The mode is higher for Team A, which supports Marcus.

All three could be correct because each is using a different average.

b. The mean best represents performance over many matches because it uses every score. The modes ($5$ and $1$) sit at the ends of fairly flat distributions, and the medians are equal, so neither separates the teams well. On the mean, Team B ($2.95$ goals per match) did slightly better than Team A ($2.8$ goals per match). I would agree with Zara.
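To check the three averages for each team directly from the frequency tables, one option is to expand each table back into a full list of scores. The sketch below does this in Python; the helper `expand` is a hypothetical name introduced here.

```python
from statistics import mean, median, multimode

goals = [0, 1, 2, 3, 4, 5]
team_a = [4, 1, 4, 2, 4, 5]  # frequencies for Team A
team_b = [0, 6, 1, 5, 4, 4]  # frequencies for Team B

def expand(values, frequencies):
    """Turn a frequency table back into the full list of values."""
    return [v for v, f in zip(values, frequencies) for _ in range(f)]

results = {}
for name, freq in (("Team A", team_a), ("Team B", team_b)):
    scores = expand(goals, freq)
    results[name] = (mean(scores), median(scores), multimode(scores))
    print(name, results[name])
```

Expanding the table is simple but only practical for small frequencies; for larger tables, use $\sum x f \div \sum f$ and cumulative frequencies instead.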

⚠️ Be careful!

Mode ≠ frequency: the mode is the value with the greatest frequency, not the frequency itself.
Order for the median: always sort the data. For even n, median = average of the two middle values.
Mean is sensitive to outliers: a few extreme values can pull the mean; consider using the median when data are skewed.
Range uses extremes only: range = max − min; double-check you used the true smallest and largest values.
Multiple/No mode: datasets can be bimodal (two modes) or have no mode—don’t force a single answer.
Choose the right “average” for the task: mode for most common category/score, median for typical value with outliers, mean for balanced data and further calculation.
Frequency tables: mean = Σ(x·f)/Σf, median by cumulative frequency; don’t forget to use all classes.
Don’t mix units or groups: compare like with like (same units/populations). When comparing two sets, use the same statistic (mean vs mean, median vs median) and comment on range for spread.
Round at the end: keep exact arithmetic until the final step to avoid rounding drift.

📘 What we've learned — Using Mode, Median, Mean & Range to Compare Data

Mode = most frequent value(s). If there are two, the data are bimodal.
Median = middle value when the data are ordered (half above, half below).
Mean = (sum of all values) ÷ (number of values); uses every data point.
Range = largest − smallest; quick measure of spread/variation.

Pick the right statistic

Goal	Best choice	Why
Most common category/score	Mode	Identifies what happens most often (can be bimodal).
Skewed data / outliers	Median	Resistant to extremes; good "typical" value.
Balanced data; arithmetic summary	Mean	Uses all values; supports further calculations.
Compare spread/consistency	Range	Bigger range ⇒ more variation.

How to compare two data sets

Choose an average (mean or median) appropriate to the data shape; compute it for both sets.
Compare ranges to discuss consistency/variation.
Justify your choices (e.g., “outliers present → median”).

Quick checks

Median (n odd): middle value. (n even): average of the two middles.
Mean: a single extreme changes it; re-check totals carefully.
Range: max − min; sensitive to outliers.
Mode: don’t confuse the modal value with the largest frequency.

Mini example (team ages)

Data: 16, 17, 18, 18, 19, 20, 20, 21, 21, 32, 41
Mode = 18, 20, 21 · Median = 20 · Mean = 243 ÷ 11 ≈ 22.1 · Range = 41 − 16 = 25.
Best average: Median (mean pulled up by two older players; multiple modes less helpful).

Common pitfalls

Quoting the mean for heavily skewed data.
Using only the range to judge spread.
Reporting "mode = 28" when 28 is actually a frequency, not a data value.