How to Create a Bar Chart in Matplotlib

Key Insights

  • Matplotlib’s plt.bar() and plt.barh() functions provide complete control over bar chart creation, from basic plots to complex grouped and stacked visualizations
  • Always start your y-axis at zero for bar charts to avoid misleading visual comparisons—bar length must accurately represent magnitude
  • Use plt.text() for value labels, adjust bar width for visual clarity, and leverage color arrays to highlight specific categories in your data

Bar charts are the workhorse of data visualization. They excel at comparing discrete categories and showing magnitude differences at a glance. Matplotlib gives you granular control over every aspect of bar chart creation, from basic plots to sophisticated multi-series comparisons.

Installation and Setup

Install Matplotlib and NumPy using pip if you haven’t already:

pip install matplotlib numpy

Import the necessary modules:

import matplotlib.pyplot as plt
import numpy as np

That’s all you need to get started. Matplotlib’s pyplot interface provides a MATLAB-like experience that’s intuitive for quick visualizations.

Creating a Basic Bar Chart

The fundamental bar chart requires just two arguments: x-positions and heights. Here’s the simplest implementation:

import matplotlib.pyplot as plt

categories = ['Q1', 'Q2', 'Q3', 'Q4']
revenue = [45000, 52000, 48000, 61000]

plt.bar(categories, revenue)
plt.show()

This creates a vertical bar chart with default styling. The plt.bar() function automatically spaces bars evenly and applies Matplotlib’s default color scheme. The x-axis takes your category labels, and the y-axis scales automatically to fit your data range.

For numerical x-values, you have more control:

x_positions = [0, 1, 2, 3]
values = [23, 45, 56, 78]

plt.bar(x_positions, values)
plt.xticks(x_positions, ['Product A', 'Product B', 'Product C', 'Product D'])
plt.show()

Using numerical positions lets you manipulate spacing and alignment precisely—critical for grouped bar charts later.

Customizing Bar Appearance

Default bars are functional but bland. Customize colors, widths, edges, and transparency to match your brand or improve readability:

categories = ['Marketing', 'Sales', 'Engineering', 'Operations', 'Support']
expenses = [45000, 62000, 98000, 34000, 28000]

# Single color with custom styling
plt.bar(categories, expenses, 
        color='steelblue',
        width=0.6,
        edgecolor='navy',
        linewidth=2,
        alpha=0.8)

plt.show()

The width parameter controls bar thickness (default is 0.8). Reduce it for narrower bars with more whitespace. The edgecolor and linewidth parameters add borders, while alpha controls transparency.

For more visual impact, assign different colors to each bar:

colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#FFA07A', '#98D8C8']

plt.bar(categories, expenses,
        color=colors,
        width=0.7,
        edgecolor='black',
        linewidth=1.5)

plt.show()

Color arrays let you highlight specific categories—use red for underperforming segments or green for targets exceeded. This targeted use of color guides viewer attention effectively.
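One way to build such a color array is a list comprehension over the values. The following is a minimal sketch that reuses the expense figures above; the 60,000 budget limit and the two colors are assumptions made purely for illustration.

```python
import matplotlib.pyplot as plt

categories = ['Marketing', 'Sales', 'Engineering', 'Operations', 'Support']
expenses = [45000, 62000, 98000, 34000, 28000]

# Hypothetical threshold: departments spending above it get highlighted
budget_limit = 60000

# One color per bar: crimson above the limit, light gray otherwise
colors = ['crimson' if value > budget_limit else 'lightgray' for value in expenses]

plt.bar(categories, expenses, color=colors)
# Dashed reference line marking the assumed limit
plt.axhline(budget_limit, color='black', linestyle='--', linewidth=1)
plt.show()
```

Keeping most bars neutral and coloring only the exceptions tends to draw the eye faster than giving every bar its own hue.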

Horizontal Bars and Grouped/Stacked Charts

Horizontal bars work better for long category names or when you have many categories:

departments = ['Customer Success', 'Product Management', 'Business Development', 'Human Resources']
headcount = [12, 8, 15, 6]

plt.barh(departments, headcount, color='coral')
plt.xlabel('Number of Employees')
plt.show()

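The plt.barh() function flips the orientation: categories run along the y-axis and bar length extends along the x-axis. Styling options such as color and edgecolor work the same way, but bar thickness is set with height instead of width, and stacking uses left instead of bottom.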
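Here is a short sketch of the horizontal variant with custom styling. Note that plt.barh() takes bar thickness through its height parameter. The 0.5 value, the edge color, and the call to invert_yaxis() (which places the first category at the top) are illustrative choices, not part of the original example.

```python
import matplotlib.pyplot as plt

departments = ['Customer Success', 'Product Management', 'Business Development', 'Human Resources']
headcount = [12, 8, 15, 6]

# For horizontal bars, thickness is controlled by `height`, not `width`
bars = plt.barh(departments, headcount, height=0.5, color='coral', edgecolor='black')

# By default the first category is drawn at the bottom; flip so it reads top-down
plt.gca().invert_yaxis()

plt.xlabel('Number of Employees')
plt.tight_layout()
plt.show()
```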

Grouped bars compare multiple datasets side-by-side:

categories = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
product_a = [45, 52, 48, 61, 58]
product_b = [38, 42, 51, 49, 55]

x = np.arange(len(categories))
width = 0.35

plt.bar(x - width/2, product_a, width, label='Product A', color='skyblue')
plt.bar(x + width/2, product_b, width, label='Product B', color='lightcoral')

plt.xlabel('Month')
plt.ylabel('Sales (thousands)')
plt.xticks(x, categories)
plt.legend()
plt.show()

The key technique: offset each series by half the bar width. This positions bars side-by-side without overlap. Use np.arange() to create evenly-spaced numerical positions.
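The same offset idea extends to more than two series. The sketch below assumes a hypothetical third series (Product C, with invented values) and a total group width of 0.8. Each bar is shifted relative to the midpoint of its group so the group stays centered on its tick.

```python
import matplotlib.pyplot as plt
import numpy as np

categories = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
series = {
    'Product A': [45, 52, 48, 61, 58],
    'Product B': [38, 42, 51, 49, 55],
    'Product C': [30, 35, 33, 40, 44],  # hypothetical values for illustration
}

x = np.arange(len(categories))
group_width = 0.8                      # total width taken up by one group
bar_width = group_width / len(series)  # width of each individual bar

offsets = []
for i, (label, values) in enumerate(series.items()):
    # Shift bar i relative to the group's midpoint so the group is centered on x
    offset = (i - (len(series) - 1) / 2) * bar_width
    offsets.append(offset)
    plt.bar(x + offset, values, bar_width, label=label)

plt.xticks(x, categories)
plt.legend()
plt.show()
```

With two series this formula reduces to the plus and minus half-width offsets used above.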

Stacked bars show composition:

categories = ['Q1', 'Q2', 'Q3', 'Q4']
online_sales = [120, 135, 142, 158]
retail_sales = [89, 95, 87, 102]

plt.bar(categories, online_sales, label='Online', color='steelblue')
plt.bar(categories, retail_sales, bottom=online_sales, label='Retail', color='lightgreen')

plt.ylabel('Revenue ($K)')
plt.legend()
plt.show()

The bottom parameter stacks the second series on top of the first. For more than two series, each series’ bottom is the element-wise running total of all the series beneath it.
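For three or more series, a running total can serve as the bottom of each new layer. This is a minimal sketch; the Wholesale series and its values are hypothetical and added only to show a third layer.

```python
import matplotlib.pyplot as plt
import numpy as np

categories = ['Q1', 'Q2', 'Q3', 'Q4']
series = {
    'Online': [120, 135, 142, 158],
    'Retail': [89, 95, 87, 102],
    'Wholesale': [40, 38, 45, 50],  # hypothetical third series
}

# Running total of everything already drawn; the first layer starts at zero
bottom = np.zeros(len(categories))
for label, values in series.items():
    plt.bar(categories, values, bottom=bottom, label=label)
    # The next layer starts where this one ends
    bottom = bottom + np.array(values)

plt.ylabel('Revenue ($K)')
plt.legend()
plt.show()
```

After the loop, bottom holds the total height of each stack, which is handy if you want to label totals above the bars.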

Adding Labels, Titles, and Annotations

Raw charts lack context. Add descriptive labels, titles, and value annotations:

categories = ['Mobile', 'Desktop', 'Tablet', 'Other']
users = [4523, 3241, 892, 334]

bars = plt.bar(categories, users, color='mediumseagreen', edgecolor='darkgreen', linewidth=2)

# Add value labels on top of bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
             f'{int(height):,}',
             ha='center', va='bottom', fontweight='bold')

plt.title('Active Users by Device Type', fontsize=16, fontweight='bold', pad=20)
plt.xlabel('Device Category', fontsize=12)
plt.ylabel('Number of Users', fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

The plt.text() function places annotations. Calculate x-position as bar center and y-position as bar height. The ha and va parameters control alignment.

Rotate x-tick labels with the rotation argument of plt.xticks() when category names are long. Call plt.tight_layout() before showing or saving the figure so that rotated labels are not clipped.
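On Matplotlib 3.4 or newer, bar_label() offers a shorter route to the same value labels than a manual plt.text() loop. The sketch below assumes that version is installed; the padding value is an arbitrary choice.

```python
import matplotlib.pyplot as plt

categories = ['Mobile', 'Desktop', 'Tablet', 'Other']
users = [4523, 3241, 892, 334]

bars = plt.bar(categories, users, color='mediumseagreen')

# Pre-format the labels so they carry thousands separators
labels = [f'{value:,}' for value in users]

# Requires Matplotlib >= 3.4; padding is the gap (in points) above each bar
plt.bar_label(bars, labels=labels, padding=3, fontweight='bold')

plt.ylabel('Number of Users')
plt.tight_layout()
plt.show()
```

The manual loop remains useful when you need full control over label placement or must support older Matplotlib versions.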

Real-World Example

Here’s a complete example analyzing quarterly performance across regions:

import matplotlib.pyplot as plt
import numpy as np

# Data preparation
regions = ['North America', 'Europe', 'Asia Pacific', 'Latin America']
q1_revenue = [245, 198, 312, 87]
q2_revenue = [267, 215, 334, 92]
q3_revenue = [289, 223, 356, 98]
q4_revenue = [312, 241, 389, 105]

# Calculate annual totals and Q1-to-Q4 growth rates (percent)
total_revenue = [sum(x) for x in zip(q1_revenue, q2_revenue, q3_revenue, q4_revenue)]
growth_rate = [(q4_revenue[i] - q1_revenue[i]) / q1_revenue[i] * 100 for i in range(len(regions))]

# Create visualization
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))

# Total revenue by region
# Every region here grows by more than 20%, so use a 25% cutoff to get a visible contrast
colors = ['#2E86AB' if rate > 25 else '#A23B72' for rate in growth_rate]
bars = ax1.bar(regions, total_revenue, color=colors, edgecolor='black', linewidth=1.5)

for bar, total in zip(bars, total_revenue):
    ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height(),
             f'${total}M', ha='center', va='bottom', fontweight='bold')

ax1.set_title('Total Annual Revenue by Region', fontsize=14, fontweight='bold')
ax1.set_ylabel('Revenue (Millions USD)', fontsize=11)
ax1.tick_params(axis='x', rotation=45)

# Quarterly breakdown for top region
quarters = ['Q1', 'Q2', 'Q3', 'Q4']
asia_revenue = [312, 334, 356, 389]
ax2.bar(quarters, asia_revenue, color='#06A77D', edgecolor='darkgreen', linewidth=2)

for i, rev in enumerate(asia_revenue):
    ax2.text(i, rev, f'${rev}M', ha='center', va='bottom', fontweight='bold')

ax2.set_title('Asia Pacific Quarterly Performance', fontsize=14, fontweight='bold')
ax2.set_ylabel('Revenue (Millions USD)', fontsize=11)

plt.tight_layout()
plt.savefig('revenue_analysis.png', dpi=300, bbox_inches='tight')
plt.show()

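This example demonstrates data preparation, conditional coloring based on growth rates, multi-panel layouts, and exporting high-resolution images with plt.savefig(). Call plt.savefig() before plt.show(), because with an interactive backend the figure may be empty once its window is closed. A setting of 300 DPI is a common choice for print-quality output.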

Common Pitfalls and Best Practices

Always start bar charts at zero. Truncated y-axes distort magnitude comparisons:

# Side-by-side comparison: truncated y-axis (misleading) vs. zero-based y-axis (correct)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

categories = ['Product A', 'Product B', 'Product C']
sales = [87, 91, 89]

# Misleading version
ax1.bar(categories, sales, color='crimson')
ax1.set_ylim(85, 92)
ax1.set_title('Misleading: Truncated Y-Axis', fontweight='bold')
ax1.set_ylabel('Sales')

# Correct version
ax2.bar(categories, sales, color='steelblue')
ax2.set_ylim(0, 100)
ax2.set_title('Correct: Y-Axis Starts at Zero', fontweight='bold')
ax2.set_ylabel('Sales')

plt.tight_layout()
plt.show()

The misleading chart exaggerates small differences. The correct version shows the true proportional relationship.
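The size of the distortion can be worked out directly. The sketch below compares the ratio of drawn bar lengths for Product B and Product A under both baselines. The helper name drawn_ratio is a hypothetical one introduced only for this illustration.

```python
def drawn_ratio(smaller, larger, baseline):
    """Ratio of the drawn bar lengths when the axis starts at `baseline`."""
    return (larger - baseline) / (smaller - baseline)

product_a, product_b = 87, 91

# Axis starting at zero: bar lengths match the real values
true_ratio = drawn_ratio(product_a, product_b, baseline=0)

# Axis starting at 85: only the part of each bar above 85 is drawn
truncated_ratio = drawn_ratio(product_a, product_b, baseline=85)

print(f'Zero-based axis: B looks {true_ratio:.2f}x as tall as A')      # 1.05x
print(f'Truncated axis:  B looks {truncated_ratio:.2f}x as tall as A')  # 3.00x
```

A difference of under 5% is drawn as a threefold difference once the axis starts at 85.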

Choose the right chart type. Bar charts work for discrete categories. Use line charts for continuous time series and scatter plots for correlation analysis. Don’t force bar charts where they don’t fit.

Limit categories. More than 10-12 bars become cluttered. Group smaller categories into “Other” or use horizontal bars for better readability.
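One way to fold small categories into “Other” is to rank the values, keep the top few, and sum the rest. This is a sketch only: the traffic data is hypothetical, and keeping the top four is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# Hypothetical data for illustration
traffic = {'Search': 420, 'Direct': 310, 'Social': 150, 'Email': 60,
           'Referral': 35, 'Affiliates': 15, 'Print': 10}
top_n = 4  # number of categories to keep as separate bars

# Rank categories from largest to smallest
ranked = sorted(traffic.items(), key=lambda item: item[1], reverse=True)
top = ranked[:top_n]

# Everything outside the top N is merged into a single bar
other_total = sum(value for _, value in ranked[top_n:])

labels = [name for name, _ in top] + ['Other']
values = [value for _, value in top] + [other_total]

plt.bar(labels, values, color='steelblue')
plt.show()
```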

Sort meaningfully. Arrange bars by value (descending or ascending) unless natural ordering exists (time periods, ordinal scales). Random ordering makes patterns harder to spot.
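Sorting can be done before plotting by pairing each label with its value. The following is a minimal sketch that reuses the department expense figures from earlier, in descending order.

```python
import matplotlib.pyplot as plt

categories = ['Marketing', 'Sales', 'Engineering', 'Operations', 'Support']
expenses = [45000, 62000, 98000, 34000, 28000]

# Pair labels with values, then sort by value, largest first
pairs = sorted(zip(categories, expenses), key=lambda pair: pair[1], reverse=True)

# Unzip back into two parallel lists for plotting
sorted_categories = [name for name, _ in pairs]
sorted_expenses = [value for _, value in pairs]

plt.bar(sorted_categories, sorted_expenses, color='steelblue')
plt.ylabel('Expenses')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
```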

Performance matters for large datasets. Static bar charts render quickly, but animations and interactive updates can become slow. Within Matplotlib, blitting (redrawing only the artists that changed) helps; for web applications, a specialized library such as Plotly may be a better fit.

Matplotlib’s bar charts give you complete control over your visualizations. Master these fundamentals, apply consistent styling, and always prioritize clarity over decoration. Your data deserves accurate, honest representation.
