How to Create a Stacked Area Chart in Matplotlib

Key Insights

Stacked area charts excel at showing cumulative totals and part-to-whole relationships over time, but become unreadable with more than 5-7 categories or when comparing non-adjacent layers
Matplotlib’s stackplot() function automatically handles the cumulative stacking calculation, requiring only your raw data series as separate arrays
Always place your most important or stable data series at the bottom of the stack where changes are easiest to perceive against the baseline

Understanding Stacked Area Charts

Stacked area charts visualize multiple quantitative variables over a continuous interval, stacking each series on top of the previous one. Unlike line charts that show individual trends independently, stacked area charts emphasize the cumulative total while showing how each component contributes to that whole.
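To make "stacking each series on top of the previous one" concrete, here is a minimal sketch of the arithmetic behind a stack with a zero baseline. The numbers are illustrative and the names upper and lower are assumptions made for this example, not part of Matplotlib's API.

```python
import numpy as np

# Three illustrative series (made-up numbers), one value per time step
series = np.array([
    [120, 135, 145],   # bottom layer
    [80, 85, 90],      # middle layer
    [40, 45, 50],      # top layer
])

# Upper boundary of each layer: running sum up the stack
upper = np.cumsum(series, axis=0)

# Lower boundary of each layer: the upper boundary of the layer beneath it,
# with the bottom layer starting at zero
lower = np.vstack([np.zeros(series.shape[1], dtype=series.dtype), upper[:-1]])

print(upper)
print(lower)
```

Each layer's thickness (upper minus lower) equals the original series, and the last row of upper is the overall total. This is why only the bottom layer and the total sit on a flat or shared reference line.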

Use stacked area charts when you need to answer two questions simultaneously: “What’s the overall trend?” and “How do the parts contribute to that trend?” They’re ideal for budget allocations over time, market share evolution, or resource utilization across categories. However, they fail when you need precise comparisons between non-adjacent series or when dealing with negative values.

Building Your First Stacked Area Chart

The stackplot() function does the heavy lifting. You provide the x-axis values and multiple y-axis series, and Matplotlib handles the cumulative calculations.

import matplotlib.pyplot as plt
import numpy as np

# Generate time series data
months = np.arange(1, 13)
direct_traffic = np.array([120, 135, 145, 160, 155, 170, 185, 190, 200, 210, 225, 240])
referral_traffic = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135])
social_traffic = np.array([40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95])

# Create the stacked area chart
fig, ax = plt.subplots(figsize=(10, 6))
ax.stackplot(months, direct_traffic, referral_traffic, social_traffic)

ax.set_xlabel('Month')
ax.set_ylabel('Visitors')
ax.set_title('Website Traffic Sources Over Time')

plt.tight_layout()
plt.show()

This creates a functional stacked area chart, but it’s visually bland. The default colors don’t convey meaning, and without labels, viewers can’t identify which series is which.

Customizing Visual Appearance

Color choice matters significantly in stacked area charts. Use colors that align with your brand or data semantics. Add transparency to soften the visual impact and include a legend for clarity.

import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
direct_traffic = np.array([120, 135, 145, 160, 155, 170, 185, 190, 200, 210, 225, 240])
referral_traffic = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135])
social_traffic = np.array([40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95])

fig, ax = plt.subplots(figsize=(10, 6))

# Define custom colors and labels
colors = ['#2E86AB', '#A23B72', '#F18F01']
labels = ['Direct', 'Referral', 'Social']

# Create stacked area chart with customization
ax.stackplot(months, direct_traffic, referral_traffic, social_traffic,
             labels=labels, colors=colors, alpha=0.8, edgecolor='white', linewidth=0.5)

ax.set_xlabel('Month', fontsize=12)
ax.set_ylabel('Visitors', fontsize=12)
ax.set_title('Website Traffic Sources Over Time', fontsize=14, fontweight='bold')
ax.legend(loc='upper left', frameon=True, shadow=True)

plt.tight_layout()
plt.show()

The alpha parameter controls transparency (0.8 gives a subtle see-through effect), while edgecolor and linewidth add thin white borders between series, improving visual separation. Always position the legend where it doesn’t obscure data—typically upper left or upper right.
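When no corner of the plot is free, one option is to move the legend outside the axes. The sketch below reuses the traffic data and also reverses the legend entries so they read top to bottom in the same order as the stacked layers. The placement values passed to bbox_to_anchor are an assumption to tune for your figure size, and the Agg backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
direct = np.array([120, 135, 145, 160, 155, 170, 185, 190, 200, 210, 225, 240])
referral = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135])
social = np.array([40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95])

fig, ax = plt.subplots(figsize=(10, 6))
ax.stackplot(months, direct, referral, social,
             labels=['Direct', 'Referral', 'Social'],
             colors=['#2E86AB', '#A23B72', '#F18F01'], alpha=0.8)

# Reverse the entries so the legend order matches the visual stack (top layer first)
handles, labels = ax.get_legend_handles_labels()
legend = ax.legend(handles[::-1], labels[::-1],
                   loc='upper left', bbox_to_anchor=(1.02, 1.0), frameon=False)

fig.tight_layout()
# Call plt.show() or fig.savefig(...) here to view or export the figure
```

Matching the legend order to the stack order saves readers from mentally flipping the list while they look up a layer.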

Processing Real-World Data

Real data rarely arrives in the perfect format for stackplot(). You’ll typically work with pandas DataFrames containing dates and multiple columns.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Simulate loading CSV data
data = {
    'date': pd.date_range('2023-01-01', periods=12, freq='MS'),  # 'MS' = month start; 'M' (month end) is deprecated in pandas 2.2+
    'electronics': [45000, 47000, 49000, 52000, 51000, 54000, 57000, 59000, 62000, 65000, 68000, 71000],
    'clothing': [32000, 33000, 35000, 36000, 37000, 38000, 39000, 40000, 41000, 42000, 43000, 44000],
    'home_goods': [28000, 29000, 30000, 31000, 32000, 33000, 34000, 35000, 36000, 37000, 38000, 39000],
    'books': [15000, 15500, 16000, 16500, 17000, 17500, 18000, 18500, 19000, 19500, 20000, 20500]
}

df = pd.DataFrame(data)

# Prepare data for stackplot
x = df['date']
y_data = [df['electronics'], df['clothing'], df['home_goods'], df['books']]
labels = ['Electronics', 'Clothing', 'Home Goods', 'Books']
colors = ['#264653', '#2A9D8F', '#E9C46A', '#F4A261']

fig, ax = plt.subplots(figsize=(12, 6))
ax.stackplot(x, *y_data, labels=labels, colors=colors, alpha=0.85)

ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Sales ($)', fontsize=12)
ax.set_title('Sales by Product Category', fontsize=14, fontweight='bold')
ax.legend(loc='upper left')

# Format y-axis to show currency
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

plt.tight_layout()
plt.show()

Notice the unpacking operator *y_data which passes each series as a separate argument to stackplot(). The custom y-axis formatter converts raw numbers to readable currency format. When working with dates, pandas’ datetime objects integrate seamlessly with Matplotlib’s date handling.
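Data exported from a database or reporting tool often arrives in long format, with one row per date and category, rather than one column per series. The following sketch shows one way to reshape it for stackplot(), assuming hypothetical column names date, category, and sales.

```python
import pandas as pd

# Hypothetical long-format records: one row per (date, category) pair
long_df = pd.DataFrame({
    'date': pd.to_datetime(['2023-01-01', '2023-01-01', '2023-02-01',
                            '2023-02-01', '2023-03-01', '2023-03-01']),
    'category': ['electronics', 'clothing'] * 3,
    'sales': [45000, 32000, 47000, 33000, 49000, 35000],
})

# Pivot to wide format: dates as the index, one column per category.
# Missing (date, category) combinations become 0 so every series has equal length.
wide = long_df.pivot_table(index='date', columns='category',
                           values='sales', aggfunc='sum').fillna(0)

# Set the stacking order explicitly instead of relying on alphabetical columns
wide = wide[['electronics', 'clothing']]

x = wide.index
y = wide.to_numpy().T  # shape (n_series, n_dates)
print(y.shape)
```

Because stackplot() also accepts a single 2-D array with one row per series, the result can be passed as ax.stackplot(x, y, labels=list(wide.columns)) without unpacking.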

Polishing for Publication

Professional charts require attention to detail: formatted axes, strategic gridlines, and contextual annotations.

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

data = {
    'date': pd.date_range('2023-01-01', periods=12, freq='MS'),  # month-start dates line up with the MonthLocator ticks below
    'electronics': [45000, 47000, 49000, 52000, 51000, 54000, 57000, 59000, 62000, 65000, 68000, 71000],
    'clothing': [32000, 33000, 35000, 36000, 37000, 38000, 39000, 40000, 41000, 42000, 43000, 44000],
    'home_goods': [28000, 29000, 30000, 31000, 32000, 33000, 34000, 35000, 36000, 37000, 38000, 39000],
    'books': [15000, 15500, 16000, 16500, 17000, 17500, 18000, 18500, 19000, 19500, 20000, 20500]
}

df = pd.DataFrame(data)
df['total'] = df[['electronics', 'clothing', 'home_goods', 'books']].sum(axis=1)

x = df['date']
y_data = [df['electronics'], df['clothing'], df['home_goods'], df['books']]
labels = ['Electronics', 'Clothing', 'Home Goods', 'Books']
colors = ['#264653', '#2A9D8F', '#E9C46A', '#F4A261']

fig, ax = plt.subplots(figsize=(12, 7))
ax.stackplot(x, *y_data, labels=labels, colors=colors, alpha=0.85)

# Advanced formatting
ax.set_xlabel('Month', fontsize=12, fontweight='bold')
ax.set_ylabel('Revenue', fontsize=12, fontweight='bold')
ax.set_title('Monthly Sales Performance by Category', fontsize=16, fontweight='bold', pad=20)

# Format x-axis dates
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
ax.xaxis.set_major_locator(mdates.MonthLocator())

# Format y-axis
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# Add gridlines
ax.grid(True, alpha=0.3, linestyle='--', linewidth=0.5)
ax.set_axisbelow(True)

# Add legend with better styling
ax.legend(loc='upper left', frameon=True, shadow=True, fancybox=True, fontsize=10)

# Annotate peak total
peak_idx = df['total'].idxmax()
peak_date = df.loc[peak_idx, 'date']
peak_value = df.loc[peak_idx, 'total']

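# Leave headroom above the stack so the annotation text stays inside the axes
ax.set_ylim(0, peak_value * 1.2)

ax.annotate(f'Peak: ${peak_value/1000:.0f}K',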
            xy=(peak_date, peak_value),
            xytext=(peak_date, peak_value + 15000),
            ha='center',
            fontsize=10,
            fontweight='bold',
            arrowprops=dict(arrowstyle='->', color='black', lw=1.5))

plt.tight_layout()
plt.show()

The set_axisbelow(True) call ensures gridlines render behind your data. Date formatting with mdates provides clean, readable month labels. Strategic annotations draw attention to significant data points without cluttering the visualization.

Avoiding Common Mistakes

Stacked area charts fail in specific scenarios. With more than seven categories, the chart becomes illegible—readers can’t distinguish between thin layers or track changes in middle series. Negative values break the stacking logic entirely since areas can’t meaningfully overlap in negative space.

import matplotlib.pyplot as plt
import numpy as np

# Data with too many categories
months = np.arange(1, 13)
categories = 10
data = [np.random.randint(20, 50, 12) for _ in range(categories)]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Bad: Too many stacked categories
ax1.stackplot(months, *data, alpha=0.8)
ax1.set_title('Bad: 10 Categories - Unreadable')
ax1.set_xlabel('Month')
ax1.set_ylabel('Value')

# Better: stacked bar chart with fewer categories
width = 0.8
bottom = np.zeros(len(months))

for i, series in enumerate(data[:5]):  # Keep only the first 5 series for illustration
    ax2.bar(months, series, width, bottom=bottom, label=f'Category {i+1}', alpha=0.8)
    bottom += series  # Next series starts where this one ends

ax2.set_title('Better: 5 Categories as Stacked Bars')
ax2.set_xlabel('Month')
ax2.set_ylabel('Value')
ax2.legend()

plt.tight_layout()
plt.show()

When you have many categories, aggregate smaller ones into an “Other” category or switch to a different visualization. For data requiring precise value comparisons, use line charts or small multiples instead. Stacked area charts prioritize showing the cumulative trend and relative proportions, not exact values for individual series.
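Folding small categories into "Other" can be sketched with pandas as follows. The data is random and seeded, the column names are hypothetical, and the cutoff top_n = 4 is an assumption; choose whatever value keeps your chart readable.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 10 categories over 12 months, seeded for reproducibility
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.integers(20, 50, size=(12, 10)),
                  columns=[f'cat_{i + 1}' for i in range(10)])

top_n = 4  # assumed cutoff for how many categories to show individually

# Rank categories by their total over the whole period
totals = df.sum().sort_values(ascending=False)
keep = list(totals.index[:top_n])

# Keep the largest categories and sum everything else into one "Other" column
grouped = df[keep].copy()
grouped['Other'] = df.drop(columns=keep).sum(axis=1)

print(grouped.columns.tolist())
```

The row totals are unchanged by the grouping, so the top edge of the stacked chart still shows the same overall trend with five layers instead of ten.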

Moving Forward

Stacked area charts serve a specific purpose: showing how parts contribute to a changing whole over time. Master the basics with stackplot(), customize thoughtfully with colors and transparency, and know when to choose alternative visualizations.

For interactive exploration, consider Plotly’s stacked area charts, which support hover tooltips showing exact values. Combine stacked area charts with other plot types using plt.subplots() to provide multiple perspectives on the same dataset. The key is matching the visualization to your analytical question: stacked area charts answer “How do contributions change over time?” exceptionally well, but they are not a universal solution.
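As a rough sketch of the plt.subplots() suggestion, the example below pairs a stacked area chart with a line chart of the same traffic data. The two-row layout and the axis names ax_area and ax_line are choices made for this illustration. The Agg backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
series = {
    'Direct': np.array([120, 135, 145, 160, 155, 170, 185, 190, 200, 210, 225, 240]),
    'Referral': np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135]),
    'Social': np.array([40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]),
}

# Two rows sharing the x-axis so months line up vertically
fig, (ax_area, ax_line) = plt.subplots(2, 1, figsize=(10, 8), sharex=True)

# Top panel: stacked areas emphasize the cumulative total
ax_area.stackplot(months, *series.values(), labels=list(series.keys()), alpha=0.8)
ax_area.set_ylabel('Visitors (stacked)')
ax_area.legend(loc='upper left')

# Bottom panel: every series drawn from the same zero baseline for direct comparison
for name, values in series.items():
    ax_line.plot(months, values, marker='o', label=name)
ax_line.set_xlabel('Month')
ax_line.set_ylabel('Visitors (per source)')
ax_line.legend(loc='upper left')

fig.tight_layout()
# Call plt.show() or fig.savefig(...) here to view or export the figure
```

The top panel answers the part-to-whole question, while the bottom panel makes it easier to compare individual sources that are not adjacent in the stack.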