How to Create a Waterfall Chart in Matplotlib
Key Insights
- Waterfall charts visualize cumulative effects by stacking bars that represent sequential positive and negative changes, making them ideal for financial analysis and variance reporting
- Matplotlib doesn’t have native waterfall support, but you can build them using bar charts with calculated positions and strategic color coding for increases versus decreases
- The key technical challenge is maintaining accurate running totals while positioning bars correctly—each bar’s bottom position must equal the previous cumulative value
Introduction to Waterfall Charts
Waterfall charts show how an initial value increases and decreases through a series of intermediate steps to reach a final value. Unlike standard bar charts that start each bar from zero, waterfall charts stack changes sequentially, creating a “cascading” visual effect that makes it easy to trace the path from start to finish.
These charts excel at financial analysis scenarios: tracking how revenue flows to net profit after deductions, explaining budget variances, analyzing inventory changes, or showing how individual components contribute to year-over-year growth. The visual format immediately highlights which factors drive value up or down.
Matplotlib doesn’t include a built-in waterfall chart function, but its flexible bar chart capabilities make it straightforward to build custom implementations. This gives you complete control over styling and behavior—crucial when presenting financial data to stakeholders who need specific formatting.
Basic Waterfall Chart Implementation
The fundamental technique uses Matplotlib’s bar() function with calculated positions. Each bar needs three values: its position on the x-axis, its height (the change amount), and its bottom position (where the bar starts on the y-axis).
The logic works like this: start with an initial value, then for each subsequent change, position the bar’s bottom at the running total before that change. Track the cumulative sum as you go, using it to position the next bar.
import matplotlib.pyplot as plt
import numpy as np

# Quarterly revenue changes
categories = ['Q1', 'Q2 Change', 'Q3 Change', 'Q4 Change', 'Year End']
changes = [100, 15, -8, 22, 0]  # Q1 is starting value, Year End is total

# Calculate cumulative values and bar positions
cumulative = [0]
for change in changes[:-1]:
    cumulative.append(cumulative[-1] + change)

# For the final total bar, calculate the actual total
total = sum(changes[:-1])
changes[-1] = total

# Create the plot
fig, ax = plt.subplots(figsize=(10, 6))

# Plot bars with calculated bottom positions
colors = ['#2E86AB', '#A23B72', '#F18F01', '#C73E1D', '#2E86AB']
bar_positions = np.arange(len(categories))

# Starting value and total are full bars from zero
bottom_values = [0] + cumulative[1:-1] + [0]
bar_heights = [changes[0]] + changes[1:-1] + [changes[-1]]

for i, (pos, height, bottom, color) in enumerate(zip(bar_positions, bar_heights, bottom_values, colors)):
    if i == 0 or i == len(categories) - 1:  # Start and end totals
        ax.bar(pos, height, bottom=0, color=color, edgecolor='black', linewidth=1.2)
    else:  # Changes
        color = '#06A77D' if height > 0 else '#D62828'
        ax.bar(pos, height, bottom=bottom, color=color, edgecolor='black', linewidth=1.2)

ax.set_xticks(bar_positions)
ax.set_xticklabels(categories)
ax.set_ylabel('Revenue ($K)')
ax.set_title('Quarterly Revenue Waterfall')
ax.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()
This creates a basic waterfall showing how Q1’s starting revenue of $100K changes through the year. The color coding (green for positive, red for negative) makes trends immediately obvious.
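The bookkeeping in the script above can be factored into a small helper. The following is a minimal sketch rather than part of the original script: the function name `waterfall_layout` and its return shape are assumptions chosen for illustration. It takes a starting value and a list of signed changes and returns the bottom and height of every bar, including the opening and closing total bars.

```python
from itertools import accumulate


def waterfall_layout(start, deltas):
    """Return (bottoms, heights) for a start bar, one bar per delta, and a total bar.

    Hypothetical helper for illustration. Heights are signed, so a negative
    change hangs downward from the running total that precedes it.
    """
    # Running total after the start bar and after each change.
    running = list(accumulate(deltas, initial=start))
    # Each change bar starts at the running total before it;
    # the start and total bars are anchored at zero.
    bottoms = [0] + running[:-1] + [0]
    heights = [start] + list(deltas) + [running[-1]]
    return bottoms, heights


bottoms, heights = waterfall_layout(100, [15, -8, 22])
print(bottoms)
print(heights)
```

The two lists can be passed to `ax.bar(range(len(heights)), heights, bottom=bottoms)`. Matplotlib draws a bar with a negative height downward from its `bottom`, which is why signed heights are enough here.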
Styling and Customization
Waterfall charts are easier to read with connector lines between bars to emphasize the flow. Each connector is a horizontal line drawn at the running total, from the edge of one bar to the edge of the next, which creates the “waterfall” visual.
import matplotlib.pyplot as plt

categories = ['Starting\nBalance', 'Sales', 'Returns', 'Discounts', 'Ending\nBalance']
changes = [1000, 450, -120, -80, 0]

# Calculate positions: cumulative[i] is the running total after bar i
cumulative = [changes[0]]
for change in changes[1:-1]:
    cumulative.append(cumulative[-1] + change)
changes[-1] = cumulative[-1]

fig, ax = plt.subplots(figsize=(12, 7))

# Plot bars with enhanced styling
for i in range(len(categories)):
    if i == 0 or i == len(categories) - 1:  # Totals start at zero
        bottom = 0
        height = changes[i]
        ax.bar(i, height, color='#34495E', edgecolor='black', linewidth=1.5, width=0.6)
    else:  # Changes start at the previous running total
        bottom = cumulative[i-1]
        height = changes[i]
        color = '#27AE60' if height > 0 else '#E74C3C'
        ax.bar(i, height, bottom=bottom, color=color, edgecolor='black', linewidth=1.5, width=0.6)

    # Add value labels at the vertical midpoint of each bar
    label_y = bottom + height / 2
    ax.text(i, label_y, f'${abs(height):.0f}', ha='center', va='center',
            fontweight='bold', fontsize=10, color='white')

    # Add connector lines at the running total after this bar
    # (this is also the top of the final total bar, so the last connector stays level)
    if i < len(categories) - 1:
        level = cumulative[i]
        ax.plot([i + 0.3, i + 0.7], [level, level],
                'k--', linewidth=1.5, alpha=0.6)

ax.set_xticks(range(len(categories)))
ax.set_xticklabels(categories, fontsize=11)
ax.set_ylabel('Amount ($)', fontsize=12, fontweight='bold')
ax.set_title('Revenue Waterfall with Connectors', fontsize=14, fontweight='bold', pad=20)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', alpha=0.3, linestyle='--')
plt.tight_layout()
plt.show()
The connector lines make the flow explicit. Value labels on each bar eliminate the need to reference the y-axis constantly. This styling approach works well for executive presentations where clarity trumps minimalism.
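Because each connector sits at the running total after a bar, the segment endpoints can be computed separately from the drawing code. The sketch below assumes the same `changes` layout as the script above (start value first, placeholder for the total last); the helper name `connector_segments` is hypothetical.

```python
def connector_segments(changes, width=0.6):
    """Return [((x_start, x_end), level), ...] for connectors between adjacent bars.

    Hypothetical helper: `changes` holds the start value, the signed changes,
    and a final placeholder for the total bar (its value is ignored).
    """
    half = width / 2
    segments = []
    running = 0
    for i, change in enumerate(changes[:-1]):
        running += change  # running total after bar i
        # Horizontal line from the right edge of bar i to the left edge of bar i + 1.
        segments.append(((i + half, i + 1 - half), running))
    return segments


for (x0, x1), level in connector_segments([1000, 450, -120, -80, 0]):
    print(f"x {x0:.1f} to {x1:.1f} at y = {level}")
```

Each segment can then be drawn with `ax.plot([x0, x1], [level, level], 'k--')`. Keeping both endpoints at the same level is what prevents a connector from slanting toward zero before the final total bar.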
Advanced Features
Real financial analysis often requires subtotals—intermediate checkpoints that show cumulative progress. A profit and loss statement might show gross profit, operating income, and net income as distinct totals within the same waterfall.
import matplotlib.pyplot as plt

# P&L components with subtotals
categories = ['Revenue', 'COGS', 'Gross\nProfit', 'OpEx', 'EBIT', 'Interest', 'Taxes', 'Net\nIncome']
values = [5000, -2000, 0, -1500, 0, -200, -390, 0]  # 0 is a placeholder for subtotal bars
is_subtotal = [False, False, True, False, True, False, False, True]

# Calculate running totals
running_total = 0
bar_heights = []
bar_bottoms = []
bar_colors = []
levels = []  # running total after each bar, used for connector lines
for value, subtotal in zip(values, is_subtotal):
    if subtotal:
        # Subtotal bar shows cumulative value
        bar_heights.append(running_total)
        bar_bottoms.append(0)
        bar_colors.append('#34495E')
    else:
        # Change bar spans from the old running total to the new one,
        # so its bottom is the lower of the two
        bar_heights.append(abs(value))
        bar_bottoms.append(min(running_total, running_total + value))
        bar_colors.append('#E74C3C' if value < 0 else '#27AE60')
        running_total += value
    levels.append(running_total)

fig, ax = plt.subplots(figsize=(14, 8))

# Plot bars
for i in range(len(categories)):
    ax.bar(i, bar_heights[i], bottom=bar_bottoms[i],
           color=bar_colors[i], edgecolor='black', linewidth=1.5, width=0.65)

    # Value labels
    if is_subtotal[i]:
        label_text = f'${bar_heights[i]:.0f}'
    else:
        sign = '-' if values[i] < 0 else '+'
        label_text = f'{sign}${abs(values[i]):.0f}'
    label_y = bar_bottoms[i] + bar_heights[i] / 2
    ax.text(i, label_y, label_text, ha='center', va='center',
            fontweight='bold', fontsize=9, color='white')

    # Connector lines at the running total (skip before subtotals)
    if i < len(categories) - 1 and not is_subtotal[i+1]:
        ax.plot([i + 0.325, i + 0.675], [levels[i], levels[i]],
                'k--', linewidth=1.2, alpha=0.5)

ax.set_xticks(range(len(categories)))
ax.set_xticklabels(categories, fontsize=11)
ax.set_ylabel('Amount ($K)', fontsize=12, fontweight='bold')
ax.set_title('Income Statement Waterfall with Subtotals', fontsize=14, fontweight='bold', pad=20)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', alpha=0.3, linestyle='--')
ax.axhline(y=0, color='black', linewidth=0.8)
plt.tight_layout()
plt.show()
This pattern handles complex financial statements where you need both incremental changes and cumulative checkpoints. The subtotal bars (in dark gray) provide visual anchors that break the chart into logical sections.
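The positioning rule for a chart with subtotals can be checked without plotting anything. The sketch below is an illustration under assumed names (`layout_with_subtotals` is hypothetical): a change bar spans from the old running total to the new one, so its bottom is the smaller of the two, while a subtotal bar spans from zero to the current running total.

```python
def layout_with_subtotals(values, is_subtotal):
    """Return (bottoms, heights) with non-negative heights for every bar.

    Hypothetical helper. Entries of `values` at subtotal positions are ignored.
    """
    bottoms, heights = [], []
    running = 0
    for value, subtotal in zip(values, is_subtotal):
        if subtotal:
            # Subtotal: from zero to the running total (below zero for a loss).
            bottoms.append(min(0, running))
            heights.append(abs(running))
        else:
            new_total = running + value
            # Change: from the lower of the old and new totals, |value| tall.
            bottoms.append(min(running, new_total))
            heights.append(abs(value))
            running = new_total
    return bottoms, heights


values = [5000, -2000, 0, -1500, 0, -200, -390, 0]
is_subtotal = [False, False, True, False, True, False, False, True]
bottoms, heights = layout_with_subtotals(values, is_subtotal)
print(bottoms)
print(heights)
```

For the income statement data, the COGS bar gets a bottom of 3000 and a height of 2000, so it covers the range from 3000 to 5000 and lines up with the Gross Profit subtotal next to it.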
Real-World Application: Financial Statement Analysis
Here’s a complete example analyzing year-over-year profit changes, a common executive reporting requirement. It combines the techniques above into a single script.
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# YoY profit bridge
categories = ['2023\nNet Income', 'Revenue\nGrowth', 'COGS\nIncrease', 'SG&A\nReduction',
              'R&D\nInvestment', 'Tax\nBenefit', '2024\nNet Income']
changes = [850, 320, -180, 95, -125, 60, 0]

# Calculate cumulative: cumulative[i] is the running total after bar i
cumulative = [changes[0]]
for change in changes[1:-1]:
    cumulative.append(cumulative[-1] + change)
changes[-1] = cumulative[-1]

fig, ax = plt.subplots(figsize=(14, 8))

# Enhanced plotting with annotations
for i in range(len(categories)):
    if i == 0 or i == len(categories) - 1:
        bottom, height = 0, changes[i]
        ax.bar(i, height, color='#2C3E50', edgecolor='black',
               linewidth=2, width=0.7, zorder=3)
        # Highlight final value
        if i == len(categories) - 1:
            delta = changes[i] - changes[0]
            pct_change = (delta / changes[0]) * 100
            pct_color = '#27AE60' if delta >= 0 else '#E74C3C'
            ax.annotate(f'{pct_change:+.1f}%', xy=(i, changes[i]),
                        xytext=(10, 10), textcoords='offset points',
                        fontsize=11, fontweight='bold', color=pct_color,
                        bbox=dict(boxstyle='round,pad=0.5', facecolor='white', edgecolor=pct_color))
    else:
        bottom = cumulative[i-1]
        height = changes[i]
        color = '#27AE60' if height > 0 else '#E74C3C'
        # Signed height: negative bars extend downward from the previous running total
        ax.bar(i, height, bottom=bottom,
               color=color, edgecolor='black', linewidth=2, width=0.7, zorder=3)

    # Value labels at the vertical midpoint of each bar
    label_y = bottom + height / 2
    label_text = f'${abs(changes[i]):.0f}M'
    ax.text(i, label_y, label_text, ha='center', va='center',
            fontweight='bold', fontsize=10, color='white', zorder=4)

    # Connector lines at the running total after this bar
    if i < len(categories) - 1:
        level = cumulative[i]
        ax.plot([i + 0.35, i + 0.65], [level, level],
                'k-', linewidth=2, alpha=0.4, zorder=2)

ax.set_xticks(range(len(categories)))
ax.set_xticklabels(categories, fontsize=11, fontweight='bold')
ax.set_ylabel('Net Income ($M)', fontsize=13, fontweight='bold')
ax.set_title('Year-over-Year Net Income Bridge Analysis',
             fontsize=15, fontweight='bold', pad=20)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', alpha=0.25, linestyle='--', zorder=1)
ax.axhline(y=0, color='black', linewidth=1, zorder=2)

# Add legend
green_patch = mpatches.Patch(color='#27AE60', label='Positive Impact')
red_patch = mpatches.Patch(color='#E74C3C', label='Negative Impact')
ax.legend(handles=[green_patch, red_patch], loc='upper left', frameon=True)
plt.tight_layout()
plt.show()
This visualization tells a clear story: net income grew from $850M to $1,020M, driven primarily by revenue growth, partially offset by COGS increases and R&D investment. The percentage annotation on the final bar quantifies the overall improvement.
Best Practices and Troubleshooting
Use waterfall charts when you need to explain how a value changes through sequential steps. They’re perfect for variance analysis, budget-to-actual comparisons, and profit bridges. Don’t use them for unrelated categories—that’s what standard bar charts are for.
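The most common error is miscalculating cumulative positions. Always validate that your final total matches the sum of all changes. For integer data, an assertion is enough: assert changes[-1] == sum(changes[:-1]). For floating-point amounts, compare with a tolerance (for example, math.isclose) rather than ==, because rounding error can make an exact comparison fail even when the data is correct.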
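One way to make that validation reusable is a small check that tolerates floating-point rounding. This is a sketch under assumptions: the function name `check_waterfall_total` and the default tolerances are illustrative, and it expects the same list layout as the examples above (all steps first, total last).

```python
import math


def check_waterfall_total(changes, rel_tol=1e-9, abs_tol=1e-9):
    """Raise ValueError if the last entry is not the sum of the preceding ones.

    Hypothetical helper; returns the expected total when the check passes.
    """
    expected = math.fsum(changes[:-1])
    if not math.isclose(changes[-1], expected, rel_tol=rel_tol, abs_tol=abs_tol):
        raise ValueError(
            f"total bar is {changes[-1]}, but the steps sum to {expected}"
        )
    return expected


check_waterfall_total([850, 320, -180, 95, -125, 60, 1020])  # passes
check_waterfall_total([0.1, 0.2, 0.3])  # passes despite float rounding
```

Raising an exception instead of using `assert` keeps the check active when Python runs with the `-O` flag, which strips assert statements.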
For large datasets (50+ categories), waterfalls become cluttered. Consider grouping minor items into an “Other” category or switching to a different visualization. Horizontal waterfalls can help with long category names, but they’re less intuitive for most audiences.
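Grouping minor items can be done before any layout is computed. The sketch below is one possible approach, not a standard routine: the helper name `group_small_changes`, the label "Other", and the absolute-value threshold are all illustrative assumptions.

```python
def group_small_changes(labels, deltas, threshold):
    """Merge changes with |delta| < threshold into a single trailing 'Other' step.

    Hypothetical helper. The returned deltas sum to the same value as the
    inputs, so the final total bar is unaffected.
    """
    kept_labels, kept_deltas = [], []
    other_sum = 0
    merged_count = 0
    for label, delta in zip(labels, deltas):
        if abs(delta) < threshold:
            other_sum += delta
            merged_count += 1
        else:
            kept_labels.append(label)
            kept_deltas.append(delta)
    # Only add the combined step if something was actually merged.
    if merged_count:
        kept_labels.append("Other")
        kept_deltas.append(other_sum)
    return kept_labels, kept_deltas


labels = ["Revenue Growth", "COGS Increase", "SG&A Reduction", "R&D Investment", "Tax Benefit"]
deltas = [320, -180, 95, -125, 60]
print(group_small_changes(labels, deltas, threshold=100))
```

Because the merged step is moved to the end, the intermediate running totals differ from the ungrouped chart, but the closing total is the same. Merged items with opposite signs also partly cancel inside "Other", so the threshold is worth choosing with that in mind.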
If you need interactivity (hover tooltips, drill-downs), consider Plotly’s go.Waterfall trace (from plotly.graph_objects) instead. It handles the positioning logic automatically and generates interactive HTML. The tradeoff is less styling control and a steeper learning curve.
Performance is rarely an issue with Matplotlib waterfalls since you’re typically visualizing summary data, not raw transactions. If rendering is slow, the bottleneck is probably elsewhere in your data pipeline, not the plotting code.
Master these techniques and you’ll be able to create publication-ready financial visualizations that clearly communicate complex variance analysis to any audience.