
Stem-and-leaf Diagram
Anna Kowalski
2025-10-13

Understanding Stem-and-Leaf Diagrams

A simple yet powerful way to organize and visualize numerical data.
This article explores the stem-and-leaf diagram, a clever method for organizing and displaying numerical data by splitting each number into a "stem" (the leading digit(s)) and a "leaf" (the trailing digit). We will learn how to construct these diagrams step-by-step, interpret the patterns they reveal, and understand why they are such valuable tools in exploratory data analysis. Key topics include the basic components of stem-and-leaf plots, their advantages over simple lists, how they preserve the original data, and their practical applications in real-world scenarios from classroom tests to scientific measurements.

What is a Stem-and-Leaf Diagram?

A stem-and-leaf diagram, often called a stem-and-leaf plot, is a way to display data that shows the shape of a distribution while keeping the raw, original numbers intact. It is a useful alternative to a histogram[1] and is particularly effective for small to moderately sized datasets.

Think of it like a hybrid between a list of numbers and a graph. Each number in your dataset is split into two parts:

  • The Stem: This is the first part of the number, typically all digits except the last one. It represents the larger, common value shared by a group of numbers.
  • The Leaf: This is the last digit of the number. It provides the specific, individual value that differentiates numbers within the same stem group.

For example, the number 57 would be split with 5 as the stem and 7 as the leaf. The number 123 could be split with 12 as the stem and 3 as the leaf. When you list all the leaves for a given stem, you create a visual representation of how the data is clustered.
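The split described above can be sketched in a couple of lines of Python. This is a minimal illustration, assuming non-negative whole numbers; the function name `split_stem_leaf` is made up for this example.

```python
def split_stem_leaf(value: int) -> tuple[int, int]:
    """Split a non-negative whole number into (stem, leaf).

    The leaf is the last digit; the stem is everything before it.
    """
    stem, leaf = divmod(value, 10)
    return stem, leaf


print(split_stem_leaf(57))   # (5, 7)
print(split_stem_leaf(123))  # (12, 3)
```

A single-digit number such as 7 gets a stem of 0 and a leaf of 7 under this rule.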

Key Idea: A stem-and-leaf plot organizes data to show its distribution, just like a histogram, but with a major advantage: you can recover the original data values from the plot itself.
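To see why nothing is lost, here is a small sketch that rebuilds the original values from a plot. It assumes the plot is stored as a dictionary mapping each stem to its list of leaves; that layout and the name `recover_values` are choices made for this illustration only.

```python
def recover_values(plot: dict[int, list[int]]) -> list[int]:
    """Rebuild every original value as stem * 10 + leaf."""
    return [stem * 10 + leaf for stem in sorted(plot) for leaf in plot[stem]]


# The two numbers from the example above: 57 and 123.
plot = {5: [7], 12: [3]}
print(recover_values(plot))  # [57, 123]
```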

Building Your First Stem-and-Leaf Diagram

Let's create a diagram from start to finish. Imagine you recorded the following test scores from a class of students: 65, 72, 78, 81, 84, 84, 87, 89, 91, 93, 95, 97.

Step 1: Identify the Stems and Leaves.
For these two-digit numbers, the "tens" digit will be the stem, and the "ones" digit will be the leaf. So, for the score 65, the stem is 6 and the leaf is 5.

Step 2: List the Stems in a Column.
The smallest stem is 6 (from 65) and the largest is 9 (from 97). We list all stems from 6 to 9 in a vertical column.

Step 3: Add the Leaves.
Go through each data point and write its leaf next to the corresponding stem. For the score 65, you write a 5 next to the stem 6.

Step 4: Order the Leaves.
For the diagram to be useful, the leaves on each stem must be arranged in ascending order from left to right.

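Summary of the steps and their visual output:

Steps 1 & 2: List the stems from minimum to maximum.

Stem: 6
Stem: 7
Stem: 8
Stem: 9

Step 3: Add all leaves next to their corresponding stems.

6 | 5
7 | 2 8
8 | 1 4 4 7 9
9 | 1 3 5 7

Step 4: Order the leaves on each stem. (The scores were listed in ascending order, so the leaves are already sorted and the diagram does not change.)

6 | 5
7 | 2 8
8 | 1 4 4 7 9
9 | 1 3 5 7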

Our final diagram tells us a story. The 80s stem holds the most scores (five of the twelve), including two students who scored 84. We can also see the lowest score (65) and the highest score (97). The "shape" of the data is clear: it clusters towards the higher end.
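The four steps can also be sketched in Python. This is one possible implementation, assuming non-negative whole numbers with the last digit as the leaf; the function name `stem_and_leaf` is chosen for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf diagram for non-negative integers."""
    leaves = defaultdict(list)
    # Steps 1 and 3: split each value and file its leaf under its stem.
    for value in values:
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    lines = []
    # Step 2: list every stem from the smallest to the largest, even empty ones.
    for stem in range(min(leaves), max(leaves) + 1):
        # Step 4: order the leaves on each stem.
        ordered = " ".join(str(leaf) for leaf in sorted(leaves[stem]))
        lines.append(f"{stem} | {ordered}".rstrip())
    return lines


scores = [65, 72, 78, 81, 84, 84, 87, 89, 91, 93, 95, 97]
print("\n".join(stem_and_leaf(scores)))
# 6 | 5
# 7 | 2 8
# 8 | 1 4 4 7 9
# 9 | 1 3 5 7
```

Listing empty stems is a deliberate choice in this sketch: a stem with no leaves shows up as a visible gap in the data.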

Advanced Techniques: Splitting Stems and Handling Decimals

Sometimes, data is bunched up on just a few stems, making the diagram less informative. To solve this, we can split the stems. A common method is to split each stem into two lines: the first for leaves 0-4 and the second for leaves 5-9.

Let's use a new dataset: 41, 44, 46, 47, 48, 49, 50, 51, 52, 53, 55, 57, 59. A regular stem-and-leaf plot would have only two stems (4 and 5), which doesn't show much detail. By splitting the stems, we get a more detailed view.

Regular:

4 | 1 4 6 7 8 9
5 | 0 1 2 3 5 7 9

Interpretation: Data is clustered, but the distribution within each cluster is not clear.

Split stems:

4 | 1 4
4 | 6 7 8 9
5 | 0 1 2 3
5 | 5 7 9

Interpretation: Shows a more nuanced distribution. The low 40s are sparse (only 41 and 44), and the values are densest in the high 40s and low 50s, with four values on each of those lines.
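Splitting stems is easy to sketch in code as well. The version below relies on the fact that each half-stem covers five consecutive whole numbers (0-4 or 5-9), so `value // 5` identifies the row. The function name `split_stem_plot` is an assumption for this example, and it handles non-negative whole numbers only.

```python
from collections import defaultdict


def split_stem_plot(values):
    """Stem-and-leaf lines with each stem split into leaves 0-4 and 5-9."""
    rows = defaultdict(list)
    for value in values:
        # Each row covers five consecutive values, so value // 5 picks the row.
        rows[value // 5].append(value % 10)
    lines = []
    for row in range(min(rows), max(rows) + 1):
        stem = row // 2  # two rows per stem
        ordered = " ".join(str(leaf) for leaf in sorted(rows[row]))
        lines.append(f"{stem} | {ordered}".rstrip())
    return lines


data = [41, 44, 46, 47, 48, 49, 50, 51, 52, 53, 55, 57, 59]
print("\n".join(split_stem_plot(data)))
# 4 | 1 4
# 4 | 6 7 8 9
# 5 | 0 1 2 3
# 5 | 5 7 9
```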

You can also handle decimal numbers. For data like 1.3, 1.5, 1.7, 2.1, 2.2, you can define the stem as the units digit and the leaf as the tenths digit. Just remember to include a key! The plot would look like this:

1 | 3 5 7
2 | 1 2
Key: 1|3 = 1.3
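For decimals, one way to sketch this in Python is to convert every value to whole tenths first and then split as usual. The example assumes non-negative data recorded to one decimal place; the function name `decimal_stem_and_leaf` is made up for this illustration.

```python
def decimal_stem_and_leaf(values):
    """Lines of a plot with units as the stem and tenths as the leaf."""
    # Convert to whole tenths; round() absorbs floating-point noise.
    tenths = sorted(round(v * 10) for v in values)
    rows = {}
    for t in tenths:
        stem, leaf = divmod(t, 10)
        rows.setdefault(stem, []).append(leaf)  # leaves arrive already sorted
    lines = []
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    # Use the smallest value as the worked example in the key.
    stem, leaf = divmod(tenths[0], 10)
    lines.append(f"Key: {stem}|{leaf} = {stem}.{leaf}")
    return lines


print("\n".join(decimal_stem_and_leaf([1.3, 1.5, 1.7, 2.1, 2.2])))
# 1 | 3 5 7
# 2 | 1 2
# Key: 1|3 = 1.3
```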

Why Use a Stem-and-Leaf Diagram? Advantages and Limitations

Stem-and-leaf diagrams offer several unique benefits, especially for students and anyone new to data analysis.

Advantages:

  • Data Preservation: The original data values are not lost. You can always reconstruct the full dataset from the plot.
  • Shows Distribution: It clearly shows the shape, center, and spread of the data, including gaps and clusters.
  • Easy to Construct: It requires no complex tools or software; just pen and paper.

Limitations:

  • Not for Large Datasets: They become messy and impractical with more than about 50-100 data points.
  • Manual Construction: For very large datasets, it's time-consuming to create by hand compared to using software.
  • Discrete Data Focus: They work best with data that has a finite number of trailing digits. Truly continuous data must be rounded.

Putting It Into Practice: A Science Experiment

Let's apply this to a real scenario. A biology class measures the lengths (in cm) of 20 leaves from a specific plant. They record the following data:

5.1, 5.3, 5.4, 5.6, 5.7, 5.8, 5.8, 5.9, 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.9, 7.0, 7.2, 7.5

We'll use the units digit as the stem and the tenths digit as the leaf. The key will be 5|1 = 5.1 cm.

Stem-and-Leaf Plot for Leaf Lengths:

5 | 1 3 4 6 7 8 8 9
6 | 0 1 2 3 4 5 6 7 9
7 | 0 2 5
Key: 5|1 = 5.1 cm

What can we learn? The diagram instantly shows where the leaf lengths concentrate. Seventeen of the 20 leaves fall between 5.1 cm and 6.9 cm, split almost evenly between the 5.x stem (eight leaves) and the 6.x stem (nine leaves). Only three leaves are 7.0 cm or longer. This visual summary is much more powerful than staring at a list of 20 numbers.
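The number of leaves on each stem can be checked with a short Python sketch. It works in whole tenths so that the stem comes from exact integer division; the variable names are chosen for this example.

```python
from collections import Counter

lengths = [5.1, 5.3, 5.4, 5.6, 5.7, 5.8, 5.8, 5.9, 6.0, 6.1,
           6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.9, 7.0, 7.2, 7.5]

# Convert each length to whole tenths, then drop the last digit to get the stem.
leaves_per_stem = Counter(round(x * 10) // 10 for x in lengths)

for stem in sorted(leaves_per_stem):
    print(stem, leaves_per_stem[stem])
# 5 8
# 6 9
# 7 3
```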

Common Mistakes and Important Questions

Q: What is the most common error when making a stem-and-leaf plot?

The most frequent error is forgetting to order the leaves. An unordered plot still shows how many values fall on each stem, but it makes the median, repeated values, and the pattern within each stem much harder to read. Another common mistake is misplacing the decimal point or forgetting to include a key, which makes the plot impossible to interpret correctly.

Q: When should I split the stems?

You should consider splitting stems when your plot has very few stems (like only 2 or 3) and the leaves are crowded, making it hard to see the pattern. If one stem has more than 8-10 leaves, splitting it will provide a more detailed and informative picture of how the data is distributed within that range.
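The two things this answer asks you to look at, how many stems the plot has and how crowded the fullest stem is, can be computed with a small helper. This is only a sketch: the name `stem_summary` is hypothetical, it assumes non-negative whole numbers, and the decision to split is still yours.

```python
from collections import Counter


def stem_summary(values):
    """Return (number of stems listed, most leaves on any one stem)."""
    counts = Counter(value // 10 for value in values)
    stems_listed = max(counts) - min(counts) + 1
    return stems_listed, max(counts.values())


print(stem_summary([41, 44, 46, 47, 48, 49, 50, 51, 52, 53, 55, 57, 59]))  # (2, 7)
print(stem_summary([65, 72, 78, 81, 84, 84, 87, 89, 91, 93, 95, 97]))      # (4, 5)
```

The first dataset has only two stems, which is the situation where splitting helped earlier in this article.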

Q: How is a stem-and-leaf plot different from a histogram?

Both show the shape of a distribution, but a stem-and-leaf plot preserves the actual data values, while a histogram does not. In a histogram, data is grouped into bins, and you lose the individual data points. A stem-and-leaf plot is like a histogram turned on its side, but with the original numbers written in. Histograms are generally better for very large datasets.
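A short Python sketch makes the difference concrete. It bins the test scores from earlier into ranges of ten, the way a histogram would, and then shows that a different dataset produces exactly the same counts. The second list, `other_scores`, is invented purely for this comparison.

```python
from collections import Counter

scores = [65, 72, 78, 81, 84, 84, 87, 89, 91, 93, 95, 97]
# Hypothetical second class: the two 84s are replaced by 82 and 86.
other_scores = [65, 72, 78, 81, 82, 86, 87, 89, 91, 93, 95, 97]


def bin_counts(values, width=10):
    """Count how many values fall in each bin, keyed by the bin's lower edge."""
    return Counter(value // width * width for value in values)


print(dict(sorted(bin_counts(scores).items())))
# {60: 1, 70: 2, 80: 5, 90: 4}
print(bin_counts(scores) == bin_counts(other_scores))  # True
```

The bin counts cannot tell the two classes apart, while their stem-and-leaf plots differ on the 8 stem (1 4 4 7 9 versus 1 2 6 7 9).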

Conclusion
The stem-and-leaf diagram is a deceptively simple yet incredibly powerful tool for anyone beginning their journey into data analysis. It provides a clear, visual snapshot of a dataset's distribution, revealing its center, spread, and shape, all while preserving the original data. By mastering the skill of splitting numbers into stems and leaves, you gain the ability to quickly organize, summarize, and interpret numerical information. Whether you're analyzing test scores, scientific measurements, or any other set of numbers, the stem-and-leaf plot is a fundamental technique that builds a strong foundation for understanding more complex statistical methods.

Footnote

[1] Histogram: A type of bar graph used for numerical data where the data is grouped into ranges (called "bins") and the height of each bar represents the frequency of data points within that range. Unlike a bar chart, the bars in a histogram touch each other to indicate that the data is continuous.
