Scatter graphs
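A scatter graph is a useful way to compare two sets of data. You can use a scatter graph to find out whether there is a correlation, or relationship, between the two sets of data. Two sets of data could have positive correlation (as one value increases, so does the other), negative correlation (as one value increases, the other decreases), or no correlation (no relationship between the values).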
When two sets of data have positive or negative correlation, you can draw a line of best fit on the scatter graph. The line of best fit shows the relationship between the two sets of data. You can use it to estimate other values.
If two sets of data have a strong correlation, most of the points will be close to the line of best fit. If the data points are not close to the line of best fit, the sets of data have a weak correlation.
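The idea of strong and weak correlation can also be checked with a number. The sketch below is not part of the original exercise: it computes the Pearson correlation coefficient, a standard measure that is close to 1 or -1 when points lie near a straight line and close to 0 when they do not. The data points and the cut-off values 0.7 and 0.3 are illustrative assumptions, not fixed rules.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, strong=0.7, weak=0.3):
    """Turn r into words. The cut-offs are assumptions chosen for illustration."""
    if abs(r) < weak:
        return "no clear correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

# Hypothetical data: the first set of y-values lies close to a straight line,
# the second set is much more scattered.
xs = [1, 2, 3, 4, 5, 6]
close_ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
spread_ys = [5, 1, 9, 4, 12, 6]

print(describe(pearson_r(xs, close_ys)))
print(describe(pearson_r(xs, spread_ys)))
```

With these made-up points, the first set is described as a strong positive correlation and the second as a weak positive correlation.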
Examples of correlation strength:
1. Hassan carried out a survey on 15 students in his class. He asked them how many hours a week they spend doing homework, and how many hours a week they spend watching TV. The table shows the results of his survey.
Hours doing homework | 14 | 11 | 19 | 6 | 10 | 3 | 9 | 4 | 12 | 8 | 6 | 15 | 18 | 7 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Hours watching TV | 7 | 12 | 4 | 15 | 11 | 18 | 15 | 17 | 8 | 14 | 16 | 7 | 5 | 16 | 10 |
a) Draw a scatter graph to show this data. Mark each axis with a scale from 0 to 20. Show ‘Hours doing homework’ on the horizontal axis and ‘Hours watching TV’ on the vertical axis.
b) Does the scatter graph show positive or negative correlation? Explain your answer.
c) Draw a line of best fit on your graph and describe the strength of the correlation.
d) Hassan spends 6 hours watching TV one week. Use your line of best fit to estimate how many hours he spends doing homework that week.
a) Scatter graph required (homework hours on the horizontal axis, TV hours on the vertical axis).
b) The scatter graph shows a negative correlation – as hours of homework increase, hours of TV decrease.
c) The correlation is fairly strong and negative. A line of best fit slopes downward from left to right.
d) From the line of best fit, if Hassan spends $6$ hours watching TV, he spends about $15$ hours doing homework.
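A line of best fit drawn by eye can be compared with a calculated one. The sketch below assumes the least-squares method, which the exercise itself does not require, and fits a line to Hassan's table. It then reads the line "backwards" to estimate homework hours from 6 hours of TV. The function name `fit_line` is a hypothetical helper.

```python
homework = [14, 11, 19, 6, 10, 3, 9, 4, 12, 8, 6, 15, 18, 7, 12]
tv = [7, 12, 4, 15, 11, 18, 15, 17, 8, 14, 16, 7, 5, 16, 10]

def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (an assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(homework, tv)

# The TV hours are known and the homework hours are wanted,
# so rearrange tv = slope * homework + intercept for homework.
tv_hours = 6
homework_estimate = (tv_hours - intercept) / slope

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"estimated homework hours for {tv_hours} h of TV: {homework_estimate:.1f}")
```

The negative slope matches the negative correlation in part b. A calculated estimate will usually be close to, but not exactly the same as, a value read from a hand-drawn line.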
Task: Explore the relationship between maximum daytime temperature and the number of cold drinks sold, using correlation and scatter graphs.
Data (14-day period):
Daytime temperature (°C) | 28 | 26 | 30 | 31 | 34 | 32 | 27 | 25 | 26 | 28 | 29 | 30 | 33 | 27 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cold drinks sold | 25 | 22 | 26 | 28 | 29 | 27 | 24 | 23 | 24 | 27 | 26 | 29 | 31 | 23 |
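One way to explore this data before drawing the graph is sketched below. It finds the mean point, which a least-squares line of best fit always passes through, together with the slope of that line and the Pearson correlation coefficient. Both the least-squares method and the coefficient are assumptions added here for illustration. They are not required by the task.

```python
from math import sqrt

temperature = [28, 26, 30, 31, 34, 32, 27, 25, 26, 28, 29, 30, 33, 27]
drinks = [25, 22, 26, 28, 29, 27, 24, 23, 24, 27, 26, 29, 31, 23]

n = len(temperature)
mean_temp = sum(temperature) / n
mean_drinks = sum(drinks) / n

# Sums of products of deviations from the means
sxy = sum((t - mean_temp) * (d - mean_drinks) for t, d in zip(temperature, drinks))
sxx = sum((t - mean_temp) ** 2 for t in temperature)
syy = sum((d - mean_drinks) ** 2 for d in drinks)

slope = sxy / sxx                            # extra drinks sold per extra degree
intercept = mean_drinks - slope * mean_temp  # line passes through the mean point
r = sxy / sqrt(sxx * syy)                    # Pearson correlation coefficient

print(f"mean point: ({mean_temp:.0f} °C, {mean_drinks:.0f} drinks)")
print(f"slope: {slope:.2f} drinks per °C, r = {r:.2f}")
```

For this table the mean point is (29 °C, 26 drinks), so a line of best fit drawn by eye should pass through or very near that point and slope upward.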
Questions:
3. The table shows the history and music exam results of 15 students. The results for both subjects are given as percentages.
History result | 12 | 15 | 22 | 25 | 32 | 36 | 45 | 52 | 58 | 68 | 75 | 77 | 80 | 82 | 85 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Music result | 25 | 64 | 18 | 42 | 65 | 23 | 48 | 24 | 60 | 45 | 68 | 55 | 42 | 32 | 76 |
a) Without looking at the percentages or drawing a graph, do you think there will be positive, negative, or no correlation between the history and music exam results of the students? Explain your answer.
b) Draw a scatter graph to show the data. Mark a scale from 0 to 100 on each axis. Show ‘History result’ on the horizontal axis and ‘Music result’ on the vertical axis.
c) What type of correlation does the scatter graph show? Explain your answer.
d) Was your conjecture in part a correct? Explain your answer.
a) Likely there is a positive correlation, since students who do well in history may also do well in music.
b) Scatter graph required (history results on the horizontal axis, music results on the vertical axis).
c) The scatter graph shows a weak positive correlation – as history marks increase, music marks tend to increase, though not perfectly.
d) Yes, the conjecture in part a is broadly correct. The graph confirms a positive correlation, though it is not a strong one.
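A rough numerical check of how weak this correlation is can be sketched as follows. It is an illustrative technique, not part of the original exercise: split the graph into four quadrants at the mean of each subject, then count how many students are on the same side of both means (supporting a positive correlation) and how many are on opposite sides (against it).

```python
history = [12, 15, 22, 25, 32, 36, 45, 52, 58, 68, 75, 77, 80, 82, 85]
music = [25, 64, 18, 42, 65, 23, 48, 24, 60, 45, 68, 55, 42, 32, 76]

mean_history = sum(history) / len(history)
mean_music = sum(music) / len(music)

# A student "agrees" with a positive correlation if both marks are above
# their means, or both are below their means.
agree = sum(
    1
    for h, m in zip(history, music)
    if (h > mean_history) == (m > mean_music)
)
disagree = len(history) - agree

print(f"same side of both means: {agree}")
print(f"opposite sides: {disagree}")
```

For this table the count is 8 students on the same side and 7 on opposite sides. With a strong positive correlation, nearly all students would be on the same side, so an almost even split supports describing the correlation as weak at best.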
4. The scatter graph shows the distance travelled and the time taken by a taxi driver for the 12 journeys he made on one day.
a) What type and strength of correlation does the scatter graph show? Explain your answer.
b) One of the journeys doesn’t seem to fit the correlation. Which journey is this?
Explain why you think this journey might be different from the other journeys.
a) The scatter graph shows a strong positive correlation – as the distance travelled increases, the time taken also increases in a consistent pattern.
b) The journey at about $20$ km and $12$ minutes does not fit the pattern. It might be different because the driver could have taken a faster route (e.g., motorway), encountered less traffic, or recorded the time incorrectly.
Task: Critique two lines of best fit, describe how to draw a good one, and discuss using it for predictions.
Scenario: A scatter graph shows body length (cm) vs wingspan (cm) for 10 birds. Marcus has drawn a red line of best fit. Arun has drawn a black line of best fit.
Questions:
a) Critique
b) How to draw a good line of best fit (by eye)
d) Using the line beyond the data?
6. The table shows the number of fish recorded at 10 different points in the Red Sea. It also shows the temperature of the sea at each point.
Sea temperature (°C) | 25 | 26 | 21 | 20 | 22 | 24 | 28 | 23 | 21 | 19 |
---|---|---|---|---|---|---|---|---|---|---|
Number of fish | 102 | 75 | 122 | 129 | 120 | 92 | 75 | 95 | 138 | 146 |
a) Draw a scatter graph to show this data.
b) Describe the type and strength of the correlation between the number of fish and the temperature of the sea.
c) Draw a line of best fit on your scatter graph. Use your line of best fit to estimate the number of fish at a point where the temperature is 27°C.
d) Do you think it is a good idea to use your line of best fit to predict the number of fish in the Red Sea when the temperature of the sea is 30°C, 35°C or even higher? Explain your answer.
e) Scientists estimate that the sea temperature in the world is increasing every year. Use your graph to predict what will happen to the fish population in the sea as temperatures increase.
a) Scatter graph required (temperature on the x-axis, number of fish on the y-axis).
b) The correlation is negative and fairly strong: as the temperature increases, the number of fish decreases.
c) Using a line of best fit, at $27^\circ$C the number of fish is about $85$.
d) No. Extrapolating beyond the observed data (above $28^\circ$C) is unreliable, since the relationship may not continue the same way.
e) As sea temperatures rise, the fish population is predicted to fall, leading to fewer fish in the Red Sea.
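The warning about extrapolation in part d can be built into a calculation. The sketch below assumes a least-squares line of best fit (the exercise only asks for a line drawn by eye) and uses a hypothetical helper, `estimate_fish`, that refuses to give an estimate outside the range of temperatures in the table.

```python
sea_temp = [25, 26, 21, 20, 22, 24, 28, 23, 21, 19]
fish = [102, 75, 122, 129, 120, 92, 75, 95, 138, 146]

n = len(sea_temp)
mean_temp = sum(sea_temp) / n
mean_fish = sum(fish) / n

# Least-squares slope and intercept (an assumed method for the line of best fit)
sxy = sum((t - mean_temp) * (f - mean_fish) for t, f in zip(sea_temp, fish))
sxx = sum((t - mean_temp) ** 2 for t in sea_temp)
slope = sxy / sxx
intercept = mean_fish - slope * mean_temp

def estimate_fish(temp):
    """Estimate the number of fish, but only inside the observed temperature range."""
    low, high = min(sea_temp), max(sea_temp)
    if not low <= temp <= high:
        raise ValueError(
            f"{temp} °C is outside the observed range {low}-{high} °C; "
            "the line of best fit is not reliable there"
        )
    return slope * temp + intercept

print(f"estimate at 27 °C: {estimate_fish(27):.0f} fish")

try:
    estimate_fish(35)
except ValueError as err:
    print("no estimate:", err)
```

A calculated estimate at 27 °C need not match a reading taken from a hand-drawn line exactly, because a line drawn by eye can sit slightly higher or lower than the least-squares line.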
7. Twenty learners in a school completed the same maths test. The length of their right foot was also measured. This scatter graph shows the results:
Sofia says: “The scatter graph shows a positive correlation. This means that the longer your foot, the better you are at maths.”
Zara says: “That can’t be true! Being good at maths is not related to your foot length.”
a) Explain why Zara is correct.
b) Discuss your answer to part a with other learners in your class.
a) Zara is correct because correlation does not mean causation. The scatter graph shows a positive correlation, but this is likely due to age: older students have longer feet and also tend to do better at maths. Foot length itself does not cause better maths results.
b) (Discussion) Learners should note that other factors, such as age or experience, explain the pattern. It is important to understand that two variables being correlated does not mean one causes the other.