How to Create a Grouped Bar Chart in Matplotlib
Key Insights
- Grouped bar charts require manual calculation of x-positions using bar width and offsets to position bars side-by-side for each category
- The fundamental pattern is x + (bar_width * group_index) for positioning each group’s bars, with proper spacing controlled by the width parameter
- Pandas DataFrames simplify grouped bar chart creation, but understanding the underlying matplotlib mechanics gives you full control over customization
Introduction & Use Cases
Grouped bar charts excel at comparing multiple series across the same categories. Unlike stacked bars that show composition, grouped bars let viewers directly compare values between groups without mental arithmetic.
Use grouped bar charts when you need to compare performance metrics across time periods (Q1 vs Q2 sales), test variants (A/B testing conversion rates), or demographic segments (survey responses by age group). They’re particularly effective when you have 2-4 groups and 3-8 categories. Beyond that, consider small multiples or alternative visualizations.
Common applications include comparing product sales across quarters, analyzing website metrics across different user segments, displaying survey results broken down by demographics, or showing before/after comparisons across multiple metrics.
Basic Grouped Bar Chart
The core technique involves calculating x-positions for each bar group manually. You’ll position bars side-by-side by offsetting their x-coordinates based on bar width and group index.
import matplotlib.pyplot as plt
import numpy as np
# Data setup
categories = ['Electronics', 'Clothing', 'Home & Garden', 'Sports']
q1_sales = [45000, 32000, 28000, 19000]
q2_sales = [52000, 29000, 31000, 24000]
# Bar positioning
x = np.arange(len(categories)) # Label positions
width = 0.35 # Bar width
fig, ax = plt.subplots(figsize=(10, 6))
# Create bars with offset positions
bars1 = ax.bar(x - width/2, q1_sales, width, label='Q1 2024')
bars2 = ax.bar(x + width/2, q2_sales, width, label='Q2 2024')
# Labels and formatting
ax.set_xlabel('Product Category')
ax.set_ylabel('Sales ($)')
ax.set_title('Quarterly Sales Comparison by Category')
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend()
plt.tight_layout()
plt.show()
The critical expressions are x - width/2 and x + width/2. For the first group, we shift left by half the bar width; for the second, we shift right. This centers the pair of bars on each category position. The width parameter sets each bar's thickness and, because the category positions are one unit apart, also determines how much empty space remains between categories.
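The link between width and spacing can be checked with a little arithmetic. The sketch below assumes categories sit one unit apart, as they do with np.arange(len(categories)); the helper name category_gap is hypothetical and not part of matplotlib.

```python
def category_gap(n_groups, width, spacing=1.0):
    """Return the empty space left between neighboring category clusters.

    Assumes ticks are `spacing` apart (1.0 for np.arange positions) and
    that the bars of one category sit side by side with no gap between them.
    A result of 0 means adjacent clusters touch; negative means they overlap.
    """
    return spacing - n_groups * width


# Two bars of width 0.35 fill 0.7 of each slot, so some space is left over
print(round(category_gap(2, 0.35), 2))
# Four bars of width 0.25 fill the whole slot, so clusters touch
print(round(category_gap(4, 0.25), 2))
```

A positive result means neighboring categories stay visually separated; shrinking width or reducing the number of groups increases it.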
Customizing Bar Appearance
Visual distinction between groups is crucial. Use color, edge styling, and patterns to make groups immediately recognizable.
import matplotlib.pyplot as plt
import numpy as np
categories = ['Electronics', 'Clothing', 'Home & Garden', 'Sports']
q1_sales = [45000, 32000, 28000, 19000]
q2_sales = [52000, 29000, 31000, 24000]
x = np.arange(len(categories))
width = 0.35
fig, ax = plt.subplots(figsize=(10, 6))
# Customized bars with colors and edges
bars1 = ax.bar(x - width/2, q1_sales, width,
               label='Q1 2024',
               color='#3498db',
               edgecolor='#2c3e50',
               linewidth=1.5,
               alpha=0.8)
bars2 = ax.bar(x + width/2, q2_sales, width,
               label='Q2 2024',
               color='#e74c3c',
               edgecolor='#c0392b',
               linewidth=1.5,
               alpha=0.8)
ax.set_xlabel('Product Category', fontsize=12, fontweight='bold')
ax.set_ylabel('Sales ($)', fontsize=12, fontweight='bold')
ax.set_title('Quarterly Sales Comparison by Category', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend(frameon=True, shadow=True)
ax.grid(axis='y', alpha=0.3, linestyle='--')
plt.tight_layout()
plt.show()
The edgecolor parameter adds definition to bars, preventing them from blending together. The alpha parameter controls transparency—useful when bars might overlap or when you want a softer appearance. Grid lines on the y-axis help viewers estimate values without cluttering the chart.
Adding Labels and Legends
Value labels on bars eliminate guesswork and make your chart self-sufficient for presentations and reports.
import matplotlib.pyplot as plt
import numpy as np
categories = ['Electronics', 'Clothing', 'Home & Garden', 'Sports']
q1_sales = [45000, 32000, 28000, 19000]
q2_sales = [52000, 29000, 31000, 24000]
x = np.arange(len(categories))
width = 0.35
fig, ax = plt.subplots(figsize=(10, 6))
bars1 = ax.bar(x - width/2, q1_sales, width, label='Q1 2024', color='#3498db')
bars2 = ax.bar(x + width/2, q2_sales, width, label='Q2 2024', color='#e74c3c')
# Add value labels on bars
def add_value_labels(bars):
    for bar in bars:
        height = bar.get_height()
        # Center the text horizontally on the bar, anchored at its top edge
        ax.text(bar.get_x() + bar.get_width()/2., height,
                f'${height:,.0f}',
                ha='center', va='bottom', fontsize=9, fontweight='bold')
add_value_labels(bars1)
add_value_labels(bars2)
ax.set_xlabel('Product Category', fontsize=12)
ax.set_ylabel('Sales ($)', fontsize=12)
ax.set_title('Quarterly Sales Comparison by Category', fontsize=14, pad=20)
ax.set_xticks(x)
ax.set_xticklabels(categories, rotation=45, ha='right')
ax.legend(loc='upper left', frameon=True)
# Format y-axis with thousands separator
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda value, pos: f'${value:,.0f}'))
plt.tight_layout()
plt.show()
The add_value_labels() function iterates through bars, positioning text at the center-top of each bar. The ha='center' and va='bottom' parameters ensure proper alignment. Rotating x-axis labels with rotation=45 prevents overlap with longer category names.
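Matplotlib 3.4 and later also provide Axes.bar_label, which handles the positioning that add_value_labels() does by hand. The sketch below is an alternative, not the approach used above. It assumes Matplotlib 3.4+ and selects the non-interactive Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

categories = ['Electronics', 'Clothing', 'Home & Garden', 'Sports']
q1_sales = [45000, 32000, 28000, 19000]

x = np.arange(len(categories))
fig, ax = plt.subplots(figsize=(10, 6))
bars1 = ax.bar(x, q1_sales, 0.35, label='Q1 2024')

# bar_label takes the container returned by ax.bar; passing explicit label
# strings avoids relying on fmt styles that were added in later releases
texts = ax.bar_label(bars1,
                     labels=[f'${v:,.0f}' for v in q1_sales],
                     padding=2, fontsize=9)

ax.set_xticks(x)
ax.set_xticklabels(categories)
```

The call returns the created text objects, so they can still be restyled afterwards. The manual ax.text loop remains useful when labels need custom placement, such as inside the bars.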
Working with Pandas DataFrames
Real-world data typically lives in DataFrames. Here’s how to bridge pandas and matplotlib for grouped bar charts.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Create DataFrame
data = {
    'Category': ['Electronics', 'Clothing', 'Home & Garden', 'Sports'],
    'Q1': [45000, 32000, 28000, 19000],
    'Q2': [52000, 29000, 31000, 24000],
    'Q3': [48000, 35000, 33000, 22000]
}
df = pd.DataFrame(data)
# Method 1: Using pandas plot method
ax = df.set_index('Category')[['Q1', 'Q2', 'Q3']].plot(
    kind='bar',
    figsize=(10, 6),
    color=['#3498db', '#e74c3c', '#2ecc71'],
    edgecolor='black',
    linewidth=1
)
ax.set_ylabel('Sales ($)')
ax.set_title('Quarterly Sales by Category')
ax.legend(title='Quarter')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
# Method 2: Manual with full control
fig, ax = plt.subplots(figsize=(10, 6))
x = np.arange(len(df))
width = 0.25
ax.bar(x - width, df['Q1'], width, label='Q1', color='#3498db')
ax.bar(x, df['Q2'], width, label='Q2', color='#e74c3c')
ax.bar(x + width, df['Q3'], width, label='Q3', color='#2ecc71')
ax.set_xticks(x)
ax.set_xticklabels(df['Category'])
ax.set_ylabel('Sales ($)')
ax.set_title('Quarterly Sales by Category')
ax.legend()
plt.tight_layout()
plt.show()
The pandas .plot() method is convenient for quick visualizations, but manual matplotlib construction gives you direct control over positioning, colors, and annotations. With three groups, the manual offsets are -width, 0, and +width, so the middle bar sits exactly on the tick. Choose based on your needs: speed versus customization.
Advanced Techniques
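More complex scenarios call for error bars, additional groups, or horizontal orientation. The example here covers four groups with error bars. For a horizontal layout, the same offsets apply to the y-positions with ax.barh, where the third argument is the bar height and xerr replaces yerr.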
import matplotlib.pyplot as plt
import numpy as np
# Four groups with error margins
categories = ['Product A', 'Product B', 'Product C', 'Product D']
region1 = [45, 52, 38, 61]
region2 = [42, 48, 41, 58]
region3 = [50, 55, 35, 63]
region4 = [38, 45, 43, 55]
# Error margins (confidence intervals)
errors1 = [3, 4, 2, 5]
errors2 = [2, 3, 3, 4]
errors3 = [4, 5, 2, 6]
errors4 = [3, 3, 4, 4]
x = np.arange(len(categories))
width = 0.2
fig, ax = plt.subplots(figsize=(12, 6))
# Four groups with error bars
ax.bar(x - 1.5*width, region1, width, label='North', yerr=errors1,
       capsize=3, color='#3498db', alpha=0.8)
ax.bar(x - 0.5*width, region2, width, label='South', yerr=errors2,
       capsize=3, color='#e74c3c', alpha=0.8)
ax.bar(x + 0.5*width, region3, width, label='East', yerr=errors3,
       capsize=3, color='#2ecc71', alpha=0.8)
ax.bar(x + 1.5*width, region4, width, label='West', yerr=errors4,
       capsize=3, color='#f39c12', alpha=0.8)
ax.set_xlabel('Product')
ax.set_ylabel('Sales Performance Score')
ax.set_title('Regional Sales Performance with Confidence Intervals')
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend(ncol=4, loc='upper left')
ax.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()
With four groups, the offset calculation becomes x + (width * (group_index - 1.5)), with group_index running from 0 to 3, which centers the four bars around each category position. The yerr parameter adds error bars, and capsize controls the cap width. This layout suits measurements that carry uncertainty, such as confidence intervals from a statistical analysis.
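The 1.5 in that formula is (n - 1) / 2 for n = 4 groups, which suggests a general rule. The sketch below computes centered offsets for any number of groups. The name group_offsets is a hypothetical helper, and the code uses only the standard library so the arithmetic is easy to check.

```python
def group_offsets(n_groups, width):
    """Return one x-offset per group so the cluster is centered on the tick.

    Group i (counting from 0) is shifted by width * (i - (n_groups - 1) / 2).
    """
    center = (n_groups - 1) / 2
    return [width * (i - center) for i in range(n_groups)]


# Two groups reproduce the -width/2 and +width/2 shifts
print(group_offsets(2, 0.35))
# Four groups reproduce the -1.5, -0.5, +0.5, +1.5 multiples of width
print(group_offsets(4, 0.2))
```

Each offset can then be added to x inside a loop, for example ax.bar(x + offset, values, width, label=name) for each series, rather than writing one ax.bar call per group by hand.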
Best Practices & Common Pitfalls
Limit your groups and categories. More than four groups or eight categories creates visual chaos. If you have more data, consider faceting into multiple charts or using a heatmap instead.
Choose distinguishable colors. Use a proper color palette that works for colorblind viewers. Tools like ColorBrewer or matplotlib’s built-in palettes help. Avoid red-green combinations.
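Mind your bar width. Too wide and bars touch or overlap; too narrow and they look like sticks. Keep the number of groups times the bar width below 1 so neighboring categories stay separated: a width of 0.3-0.4 works well for two groups, about 0.25 for three, and about 0.2 for four.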
Label strategically. Don’t label every bar if values are similar or space is tight. Consider showing only significant differences or using a data table alongside the chart.
Consider alternatives. If groups have very different scales, grouped bars mislead. Use separate subplots or normalized values. If showing parts of a whole, stacked bars or a grouped stacked combination might work better.
Watch for overlap. When bars have similar heights, value labels can overlap. Adjust font size, use alternating positions, or skip labels for less important values.
The key to effective grouped bar charts is restraint. Show enough to make your point, but not so much that viewers get lost in the details. When in doubt, simplify.