How to Create a Stem Plot in Matplotlib
Key Insights
- Stem plots excel at visualizing discrete data points where the vertical position matters more than continuity between points—think impulse responses, event occurrences, or probability mass functions
- The plt.stem() function returns three separate artist objects (markerline, stemlines, baseline) that you can individually customize for complete control over appearance
- Stem plots become cluttered above 50-100 data points; for larger datasets, consider switching to scatter plots or aggregating your data first
Introduction to Stem Plots
Stem plots display discrete data as vertical lines extending from a baseline to markers representing data values. Unlike line plots that suggest continuity between points, stem plots emphasize that each data point is independent—there’s no implied relationship between consecutive values.
Use stem plots when your data represents discrete events, measurements at specific intervals, or probability distributions where each outcome is distinct. They’re particularly valuable in signal processing for impulse responses, in statistics for discrete probability mass functions, and in business analytics for event-based metrics like daily transaction counts or weekly bug reports.
The visual separation between points makes stem plots superior to scatter plots when you need to emphasize both the magnitude and the discrete nature of measurements. The vertical lines provide clear visual anchors that make it easy to trace each value back to its x-axis position.
Basic Stem Plot Syntax
The plt.stem() function creates stem plots with minimal code. At its simplest, you pass x and y coordinates, and Matplotlib handles the rest. The function returns three components: the marker line (data points), stem lines (vertical connectors), and the baseline (horizontal reference line).
import matplotlib.pyplot as plt
import numpy as np
# Generate sample data
x = np.arange(0, 10, 1)
y = np.array([2, 5, 3, 8, 4, 7, 6, 3, 5, 4])
# Create basic stem plot
markerline, stemlines, baseline = plt.stem(x, y)
plt.xlabel('Sample Index')
plt.ylabel('Value')
plt.title('Basic Stem Plot')
plt.grid(True, alpha=0.3)
plt.show()
This creates a standard stem plot with Matplotlib’s default styling: blue markers and stems, plus a red baseline that sits at y=0. Notice that we capture the three returned objects. This becomes crucial for customization.
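To make the three returned objects concrete, here is a small sketch that inspects their types. It assumes a reasonably recent Matplotlib (3.3 or later), where the stems come back as one LineCollection rather than a list of separate lines. The Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.lines import Line2D

fig, ax = plt.subplots()
container = ax.stem([0, 1, 2, 3], [2, 5, 3, 8])

# The returned container unpacks into the three artists described above
markerline, stemlines, baseline = container

print(type(markerline).__name__)   # one Line2D holding all markers
print(type(stemlines).__name__)    # one LineCollection holding all stems
print(type(baseline).__name__)     # one Line2D for the baseline
print(len(stemlines.get_segments()), "stem segments")
```

Because the markers share one Line2D and the stems share one LineCollection, styling calls on these objects apply to every point at once unless you pass per-stem values.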
If you only provide y values, Matplotlib automatically generates x values starting from 0:
# Simplified syntax with auto-generated x values
y = [2, 5, 3, 8, 4, 7, 6, 3, 5, 4]
plt.stem(y)
plt.show()
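As a quick check of the auto-generated positions, the sketch below reads the x data back from the marker artist. The variable name auto_x is only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt

y = [2, 5, 3, 8, 4]

fig, ax = plt.subplots()
markerline, stemlines, baseline = ax.stem(y)

# Read back the positions Matplotlib generated for the y-only call
auto_x = [int(v) for v in markerline.get_xdata()]
print(auto_x)
```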
Customizing Stem Plot Appearance
The real power of stem plots emerges when you customize their appearance. Matplotlib provides three key parameters for styling: linefmt for stem lines, markerfmt for markers, and basefmt for the baseline.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2*np.pi, 20)
y = np.sin(x)
# Create customized stem plot
markerline, stemlines, baseline = plt.stem(
    x, y,
    linefmt='C3-',    # Red stem lines
    markerfmt='C3o',  # Red circular markers
    basefmt='C2--'    # Green dashed baseline
)
# Further customize individual components
markerline.set_markersize(8)
markerline.set_markerfacecolor('red')
markerline.set_markeredgewidth(2)
markerline.set_markeredgecolor('darkred')
stemlines.set_linewidth(1.5)
baseline.set_linewidth(2)
plt.xlabel('Radians')
plt.ylabel('Amplitude')
plt.title('Customized Stem Plot of Sine Wave')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
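The format strings follow Matplotlib’s standard notation: color codes (like 'C3' for the fourth default color) combined with line styles ('-', '--', ':') and marker shapes ('o', 's', '^'). Named colors like 'red' or 'blue' and hex codes like '#FF5733' also work, but only when the color is the entire format string; to pair one with a particular line style or marker, set those properties on the returned objects.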
After creation, you can modify properties of the returned objects directly. This two-stage approach—initial formatting through parameters, then fine-tuning through object methods—gives you complete control.
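The two-stage approach can be sketched as follows. The format strings pick the line style and marker shape, and the artist methods then apply a hex color. The accent color and data values are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

x = np.arange(8)
y = np.array([1, 4, 2, 6, 3, 5, 2, 4])

fig, ax = plt.subplots()

# Stage 1: format strings choose dotted stems, square markers, hidden baseline
markerline, stemlines, baseline = ax.stem(
    x, y, linefmt='C0:', markerfmt='C0s', basefmt=' '
)

# Stage 2: artist methods apply a hex color on top of that styling
accent = '#FF5733'
stemlines.set_color(accent)
markerline.set_markerfacecolor(accent)
markerline.set_markeredgecolor(accent)

# Read the stem color back as RGBA for inspection
stem_rgba = stemlines.get_color()[0]
print(stem_rgba)
```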
Advanced Formatting Techniques
For complex visualizations, you’ll often need multiple stem series, custom baselines, or horizontal orientations. Here’s how to implement these advanced techniques:
import matplotlib.pyplot as plt
import numpy as np
# Generate two data series
x = np.arange(0, 10)
y1 = np.random.randint(3, 10, size=10)
y2 = np.random.randint(1, 7, size=10)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Multi-series stem plot with legend
markerline1, stemlines1, baseline1 = ax1.stem(
    x, y1,
    linefmt='C0-',
    markerfmt='C0o',
    label='Series A'
)
markerline2, stemlines2, baseline2 = ax1.stem(
    x + 0.2, y2,  # Offset x values for visibility
    linefmt='C1-',
    markerfmt='C1s',
    label='Series B'
)
ax1.set_xlabel('Category')
ax1.set_ylabel('Count')
ax1.set_title('Multi-Series Stem Plot')
ax1.legend()
ax1.grid(True, alpha=0.3)
# Horizontal stem plot with custom baseline
# The first argument still holds the positions (now along the y-axis)
# and the second holds the values (now along the x-axis)
markerline3, stemlines3, baseline3 = ax2.stem(
    x, y1,
    orientation='horizontal',
    bottom=5,  # Stems start from x=5 instead of x=0
    linefmt='C4-',
    markerfmt='C4D',
    basefmt='C7--'
)
ax2.set_ylabel('Category')
ax2.set_xlabel('Count')
ax2.set_title('Horizontal Stem Plot with Custom Baseline')
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
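For horizontal stem plots, use orientation='horizontal'. The first argument still gives the positions and the second the values, but the positions now run along the y-axis and the stems extend horizontally. This is useful for categorical data where you want categories listed vertically.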
The bottom parameter shifts the baseline position:
# Stem plot with raised baseline
plt.stem(x, y, bottom=3) # Baseline at y=3 instead of y=0
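To see what bottom does in both orientations, this sketch reads the geometry back from the returned artists. It assumes Matplotlib 3.4 or later for the orientation parameter; the data values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.array([4, 6, 2, 7, 5])

fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(10, 4))

# Vertical: baseline is the horizontal line y=3
_, stems_v, base_v = ax_v.stem(x, y, bottom=3)

# Horizontal: baseline is the vertical line x=3
_, stems_h, base_h = ax_h.stem(x, y, bottom=3, orientation='horizontal')

# Each segment is [[x_start, y_start], [x_end, y_end]]
v_starts = [float(seg[0][1]) for seg in stems_v.get_segments()]
h_starts = [float(seg[0][0]) for seg in stems_h.get_segments()]
print(v_starts, h_starts)
```

Values below the baseline (the 2 in this data) simply extend in the opposite direction from it.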
Practical Applications
Stem plots shine in real-world scenarios involving discrete events or measurements. Here’s a practical example visualizing daily website signups over a month:
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime, timedelta
# Simulate daily signup data for 30 days
np.random.seed(42)
days = 30
start_date = datetime(2024, 1, 1)
dates = [start_date + timedelta(days=i) for i in range(days)]
signups = np.random.poisson(25, days) # Poisson distribution, average 25/day
# Highlight days with exceptional performance (>35 signups)
colors = ['red' if s > 35 else 'steelblue' for s in signups]
fig, ax = plt.subplots(figsize=(14, 6))
# Create stem plot
markerline, stemlines, baseline = ax.stem(
    dates, signups,
    linefmt='steelblue',
    markerfmt='o',
    basefmt='gray'
)
# Color exceptional days differently
# stemlines is a single LineCollection, so pass one color per stem
stemlines.set_color(colors)
# markerline is a single Line2D with one color for all points,
# so overlay larger red markers on the exceptional days
high = signups > 35
if high.any():
    ax.plot(np.array(dates)[high], signups[high], 'o',
            color='red', markersize=10, zorder=3)
ax.set_xlabel('Date')
ax.set_ylabel('Daily Signups')
ax.set_title('Daily Website Signups - January 2024')
ax.grid(True, alpha=0.3, axis='y')
# Format x-axis dates
fig.autofmt_xdate()
# Add reference line for target
ax.axhline(y=30, color='green', linestyle='--', alpha=0.5, label='Target: 30/day')
ax.legend()
plt.tight_layout()
plt.show()
This example demonstrates several best practices: using appropriate statistical distributions for realistic data, highlighting outliers with color, adding reference lines for context, and formatting date axes properly.
Another common application is comparing discrete probability distributions:
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import poisson, binom
x = np.arange(0, 15)
poisson_pmf = poisson.pmf(x, mu=5)
binom_pmf = binom.pmf(x, n=14, p=0.36)
fig, ax = plt.subplots(figsize=(10, 6))
ax.stem(x - 0.15, poisson_pmf, linefmt='C0-', markerfmt='C0o',
        basefmt=' ', label='Poisson(λ=5)')
ax.stem(x + 0.15, binom_pmf, linefmt='C1-', markerfmt='C1s',
        basefmt=' ', label='Binomial(n=14, p=0.36)')
ax.set_xlabel('Number of Events')
ax.set_ylabel('Probability')
ax.set_title('Comparing Discrete Probability Distributions')
ax.legend()
ax.grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()
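If SciPy is not available, the same two PMFs can be computed from their textbook formulas with the standard library. The helper names below are hypothetical, and the sketch is meant as a sanity check rather than a replacement for scipy.stats.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt


def binom_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) variable (hypothetical helper)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson(mu) variable (hypothetical helper)."""
    return math.exp(-mu) * mu**k / math.factorial(k)


n, p, mu = 14, 0.36, 5
ks = list(range(n + 1))
binom_vals = [binom_pmf(k, n, p) for k in ks]
poisson_vals = [poisson_pmf(k, mu) for k in ks]

fig, ax = plt.subplots()
ax.stem([k - 0.15 for k in ks], poisson_vals,
        linefmt='C0-', markerfmt='C0o', basefmt=' ', label='Poisson')
ax.stem([k + 0.15 for k in ks], binom_vals,
        linefmt='C1-', markerfmt='C1s', basefmt=' ', label='Binomial')
ax.legend()

# The binomial PMF covers its whole support here; the Poisson one is truncated at k=14
print(sum(binom_vals), sum(poisson_vals))
```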
Common Pitfalls and Best Practices
Don’t use stem plots for continuous data or large datasets. Above 50-100 points, stem plots become visually cluttered and lose their clarity advantage. For dense data, switch to line plots, scatter plots, or aggregate your data into bins.
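One way to aggregate before plotting is sketched below: average a long series in fixed-size bins so the stem count stays small. The bin size of 50 and the simulated data are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
raw = rng.poisson(25, size=1000)  # far too many points for a readable stem plot

bin_size = 50  # assumption: the series length divides evenly by this
binned = raw.reshape(-1, bin_size).mean(axis=1)
bin_starts = np.arange(binned.size) * bin_size

fig, ax = plt.subplots()
markerline, stemlines, baseline = ax.stem(bin_starts, binned)
ax.set_xlabel('Sample index (start of bin)')
ax.set_ylabel('Mean value per bin')
print(binned.size, "stems instead of", raw.size)
```

Whether a mean, sum, or maximum is the right aggregate depends on what each stem is supposed to represent.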
Avoid stem plots when showing trends over time with continuous measurements. A temperature reading every hour throughout a day works better as a line plot because temperature changes continuously. Use stem plots when each measurement is a distinct event—like the number of customers entering a store each hour.
Always set appropriate axis limits. Matplotlib’s auto-scaling sometimes creates excessive whitespace above or below your data:
plt.stem(x, y)
plt.ylim(0, max(y) * 1.1) # Add 10% headroom above max value
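The one-liner above assumes all values are positive. A slightly more general sketch, using a hypothetical padded_limits helper, pads whichever side of the baseline the data actually extends to:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def padded_limits(values, bottom=0.0, pad=0.1):
    """Return (lower, upper) limits covering the data and the baseline.

    Hypothetical helper: padding is added only on sides where data
    extends past the baseline, as a fraction of the total span.
    """
    lo = min(min(values), bottom)
    hi = max(max(values), bottom)
    span = hi - lo
    lower = lo - pad * span if lo < bottom else lo
    upper = hi + pad * span if hi > bottom else hi
    return lower, upper


positive = padded_limits([2, 5, 8])        # data only above the baseline
signed = padded_limits([-1.0, 0.5, 1.0])   # data on both sides

x = np.linspace(0, 2 * np.pi, 20)
y = np.sin(x)
fig, ax = plt.subplots()
ax.stem(x, y)
ax.set_ylim(*padded_limits(y))
print(positive, signed)
```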
When comparing multiple series, offset the x-coordinates slightly (as shown earlier) to prevent overlap. Alternatively, use different subplots for each series if they share the same x-values.
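For more than two series, the offsets can be computed rather than typed by hand. The sketch below uses a hypothetical series_offsets helper that centers the offsets on zero; the 0.2 spacing is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def series_offsets(n_series, width=0.2):
    """Return x-offsets centered on zero, one per series (hypothetical helper)."""
    return (np.arange(n_series) - (n_series - 1) / 2) * width


x = np.arange(6)
series = {
    'A': np.array([3, 5, 2, 6, 4, 5]),
    'B': np.array([2, 4, 4, 3, 5, 2]),
    'C': np.array([1, 2, 3, 2, 1, 3]),
}
offsets = series_offsets(len(series))

fig, ax = plt.subplots()
for i, ((name, values), dx) in enumerate(zip(series.items(), offsets)):
    # One color from the default cycle per series, baseline hidden
    ax.stem(x + dx, values, linefmt=f'C{i}-', markerfmt=f'C{i}o',
            basefmt=' ', label=f'Series {name}')
ax.legend()
print(offsets)
```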
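For publication-quality figures, increase the DPI and adjust font sizes. Note that figure.dpi controls on-screen rendering; saved files use savefig.dpi, which follows figure.dpi unless you set it separately: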
plt.rcParams['figure.dpi'] = 150
plt.rcParams['font.size'] = 11
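Assigning to plt.rcParams changes the settings for every later figure in the session. If that is not what you want, a scoped alternative is plt.rc_context, sketched below with the same two settings.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt

default_dpi = plt.rcParams['figure.dpi']

# Settings apply only to figures created inside the with-block
with plt.rc_context({'figure.dpi': 150, 'font.size': 11}):
    fig, ax = plt.subplots()
    ax.stem([0, 1, 2, 3], [2, 5, 3, 8])
    ax.set_title('Scoped styling')
    dpi_inside = fig.dpi

# Outside the block, the previous defaults are back in effect
dpi_after = plt.rcParams['figure.dpi']
print(dpi_inside, dpi_after)
```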
Finally, remember that stem plots emphasize individual values. If your goal is to show overall patterns or distributions, histograms or density plots might serve you better. Choose stem plots when the discrete nature of your data is itself meaningful to your analysis.