Look at these examples of statistical questions:
To answer a statistical question you need to collect data. The number of brothers you have, the mass of a baby, and a sport you watch are all examples of different types of data.
The type of data needed to answer Question 1 is discrete data. The values can be only 0, 1, 2, … Discrete data can take particular values only.
The type of data needed to answer Question 2 is continuous data. Masses, lengths and times are all examples of continuous data. They are measurements. They are numbers that can take any value.
The type of data needed to answer Question 3 is categorical data. The data are words, not numbers.
There are several ways to collect data. You can:
1. Choose the correct word (categorical, discrete or continuous) to describe each of the following.
a. The mass of a book
b. The colour of a book
c. The number of pages in a book
2. Here are some facts about a person. Write down the type of data for each fact.
a. Age, in years
b. Shoe size
c. Height
d. Time taken to travel to school
e. Favourite subject
3. Liling is comparing different models of cars. She is collecting data about cars. Give some examples of data about cars that are:
a. categorical data
b. discrete data
c. continuous data
a. Examples (categorical): body type (sedan/SUV/hatchback), fuel type (petrol/diesel/electric), colour.
b. Examples (discrete): number of doors ($2$, $4$ or $5$), number of seats, number of cylinders.
c. Examples (continuous): fuel economy in $\text{L}/100\text{ km}$, mass in $\text{kg}$, top speed in $\text{km h}^{-1}$.
4. Here is a question from a questionnaire. The questionnaire is given to people who stayed at a hotel.
a. What is missing from the question?
This table shows some people’s replies to this question.
| Score | $1$ | $2$ | $3$ | $4$ | $5$ |
|---|---|---|---|---|---|
| Frequency | $2$ | $4$ | $9$ | $17$ | $21$ |
b. How many people replied?
c. What was the modal score?
a. Missing a labelled scale (e.g., $1=$ “very dirty”, $5=$ “very clean”) and a time frame (e.g., “during your stay / last night”).
b. Total replies $=2+4+9+17+21=53$.
c. Modal score $=5$ (highest frequency $21$).
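The arithmetic in parts b and c can be checked with a short Python sketch. The dictionary name `frequencies` is only an illustrative choice; the numbers are copied from the table above.

```python
# Frequency table from the hotel question: score -> number of replies
frequencies = {1: 2, 2: 4, 3: 9, 4: 17, 5: 21}

# Part b: the number of people who replied is the sum of the frequencies
total_replies = sum(frequencies.values())

# Part c: the modal score is the score with the highest frequency
modal_score = max(frequencies, key=frequencies.get)

print("Total replies:", total_replies)
print("Modal score:", modal_score)
```

Note that the mode is the score itself, not its frequency.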
5. Here is a question from a questionnaire.
How many hours of homework do you do? Tick one box.
Between $1$ and $2$ hours ☐
Between $2$ and $3$ hours ☐
More than $3$ hours ☐
a. Write down two things that are wrong with this question.
b. Write a better question.
a. Issues: no time period is stated (per day or per week?); the options overlap at $2$ hours, so someone who does exactly $2$ hours could tick two boxes; there is no option for less than $1$ hour.
b. Example improved item: “In a typical week, how many hours of homework do you do? Tick one box.”
Options (non-overlapping, in hours): less than $1$ ☐, $1$ to less than $2$ ☐, $2$ to less than $3$ ☐, $3$ to less than $4$ ☐, $4$ or more ☐.
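One way to see that these options do not overlap and leave no gaps is to write them as a rule that puts every answer in exactly one box. The sketch below is only an illustration; the function name `homework_box` and the box labels are made up for this example.

```python
def homework_box(hours):
    """Return the label of the single box that matches a number of hours."""
    if hours < 0:
        raise ValueError("hours cannot be negative")
    if hours < 1:
        return "0 to less than 1"
    if hours < 2:
        return "1 to less than 2"
    if hours < 3:
        return "2 to less than 3"
    if hours < 4:
        return "3 to less than 4"
    return "4 or more"

# Boundary values such as 2 and 3 now belong to one box only
for h in [0, 0.5, 2, 3, 7.5]:
    print(h, "->", homework_box(h))
```

Because each test is tried in order and the function returns at the first match, an answer of exactly $2$ hours can only go in the "2 to less than 3" box.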
6. You are investigating what people of your age do in their leisure time.
a. List some activities that you think should be included.
b. Write four questions you would ask in your investigation.
Each question should have several tick boxes to choose from that show the possible answers.
c. Ask your questions to a partner. Use their replies to help you decide whether you can improve your questions.
a. Examples: sport, gaming, reading, social media, music practice, volunteering.
b. Sample questions (with tick boxes):
• “How often do you play sport in a typical week?” ☐ $0$ times ☐ $1$ or $2$ ☐ $3$ or $4$ ☐ $5$ or more
• “About how many hours do you spend gaming per day?” ☐ less than $1$ ☐ $1$ to less than $2$ ☐ $2$ to less than $3$ ☐ $3$ or more
• “Which activities do you do most weekends? (tick all that apply)” ☐ sport ☐ homework ☐ social media ☐ meet friends ☐ other
• “How do you usually get to leisure activities?” ☐ walk ☐ cycle ☐ bus ☐ car ☐ other
c. Use feedback to refine wording, ensure options are exhaustive/non-overlapping, and add time frames/units where needed.
7. Work in pairs for this question. A teacher asks learners to estimate the number of sweets in a jar. She makes two predictions:
• The estimates of the boys will be too big.
• The estimates of the girls will be too small.
a. Explain how the teacher can test her predictions.
i. What type of data will the teacher need to collect?
ii. How can she collect the data?
iii. How can she analyse the data?
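One possible approach to part iii, sketched in Python: record each learner's group and estimate, subtract the actual number of sweets, and compare the average error for each group. Everything in this sketch is hypothetical, including the actual count, the six estimates and the name `mean_error`. It shows a method, not real results.

```python
from statistics import mean

# HYPOTHETICAL value: the true number of sweets in the jar
actual = 120

# HYPOTHETICAL records: (group, estimate of the number of sweets)
records = [
    ("boy", 150), ("girl", 100), ("boy", 110),
    ("girl", 125), ("boy", 140), ("girl", 90),
]

def mean_error(group):
    """Average of (estimate - actual) for one group.

    A positive result means the group's estimates were too big on average;
    a negative result means they were too small on average.
    """
    errors = [estimate - actual for g, estimate in records if g == group]
    return mean(errors)

boys_error = mean_error("boy")
girls_error = mean_error("girl")

print("Boys' mean error:", boys_error)
print("Girls' mean error:", girls_error)
```

With real data, the teacher's predictions would be supported if the boys' mean error is positive and the girls' mean error is negative. A larger sample gives more reliable evidence than six estimates.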
8. Compare your answers to Question 7 part a with the answers of another pair in your class. Can your answers be improved?
9. Adekunle is investigating the number of emails people receive at work. He makes the prediction:
• People get more emails on Mondays than on Fridays.
a. How can Adekunle collect data to test his prediction?
b. How can he analyse the results?
10. Sofia and Zara throw two dice and add the scores to get the total. Sofia makes this prediction: “7 is the most likely total.” Zara makes this prediction: “All totals are equally likely.” They throw the two dice 100 times. Their results are shown in the table.

a. Explain why this is not a good way to record the results.
b. Show the frequencies for each number in a suitable table.
c. Show the results in a bar chart.
d. Is Sofia’s prediction correct? Give a reason for your answer.
e. Is Zara’s prediction correct? Give a reason for your answer.
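For parts d and e it can help to compare the experimental results with what theory predicts. The sketch below assumes two fair six-sided dice and counts how many of the $36$ equally likely outcomes give each total. It does not use the results from the $100$ throws, which will vary from one experiment to another.

```python
from collections import Counter
from fractions import Fraction

# Count the number of ways to make each total with two fair dice
ways = Counter(a + b for a in range(1, 7) for b in range(1, 7))

# Show each total, the number of ways to make it, and its probability
for total in range(2, 13):
    print(total, ways[total], Fraction(ways[total], 36))

# The total that can be made in the most ways
most_likely_total = max(ways, key=ways.get)
print("Most likely total:", most_likely_total)
```

In theory a total of $7$ can be made in $6$ ways, more than any other total, while $2$ and $12$ can each be made in only $1$ way, so the totals are not equally likely.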
Question: A healthy diet includes fruit and vegetables. Do people your age eat enough fruit and vegetables? You are going to collect data to investigate this question.
Instructions (solo): Work individually for this investigation.
Tasks:
Sample answers:
a. Example predictions to test:
b. Data collection plan:
c. Analysis ideas:
d. Self-review to improve answers: