How to Create a Heatmap in Seaborn

Key Insights

  • Seaborn’s heatmap function excels at visualizing matrix-like data including correlation matrices, confusion matrices, and time-series patterns with minimal code
  • Annotate heatmaps when working with smaller datasets (under 20x20) and choose diverging colormaps for data with a meaningful midpoint, such as zero for correlations
  • Use sns.clustermap() instead of basic heatmaps when you need to discover natural groupings in your data through hierarchical clustering

Introduction to Heatmaps and Seaborn

Heatmaps transform numerical data into color-coded matrices, making patterns immediately visible that would be buried in spreadsheets. They’re essential for correlation analysis, model evaluation through confusion matrices, and spotting temporal patterns in time-series data.

Seaborn simplifies heatmap creation compared to raw matplotlib, offering sensible defaults and tight integration with pandas DataFrames. Where matplotlib’s imshow leaves tick labels, the colorbar, and an annotation loop to you, Seaborn handles these with single parameters. This efficiency matters when you’re iterating through exploratory data analysis or building production dashboards.

Setting Up Your Environment

Start with the standard data science stack. You’ll need Seaborn (built on matplotlib), pandas for data manipulation, and numpy for numerical operations.

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set style for better-looking plots
sns.set_theme(style="white")

# Load sample dataset
df = sns.load_dataset('tips')

# For demonstration, create a correlation matrix
tips_corr = df[['total_bill', 'tip', 'size']].corr()
print(tips_corr)

The correlation matrix is your first real-world heatmap use case. It shows relationships between numerical variables, with values ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation).
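As a quick sanity check on what a correlation matrix looks like, here is a minimal sketch on synthetic data (the columns x, y, and z are made up for illustration and are not part of the tips dataset). It confirms the properties the heatmap will rely on: symmetry, ones on the diagonal, and values bounded by -1 and 1.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: y moves with x, z moves against x
rng = np.random.default_rng(0)
x = rng.normal(size=200)
demo = pd.DataFrame({
    "x": x,
    "y": 2 * x + rng.normal(scale=0.5, size=200),
    "z": -x + rng.normal(scale=0.5, size=200),
})

corr = demo.corr()  # Pearson correlation is the pandas default

# Properties shared by every correlation matrix
is_symmetric = np.allclose(corr.values, corr.values.T)
diagonal_is_one = np.allclose(np.diag(corr.values), 1.0)
within_bounds = bool((np.abs(corr.values) <= 1 + 1e-12).all())

print(corr.round(2))
print(is_symmetric, diagonal_is_one, within_bounds)
```

Because the matrix is symmetric, half of the cells in any correlation heatmap repeat the other half.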

Creating a Basic Heatmap

The sns.heatmap() function requires only a 2D dataset. Feed it a correlation matrix, and you get an immediate visualization.

# Basic heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(tips_corr)
plt.title('Basic Correlation Heatmap')
plt.tight_layout()
plt.show()

This generates a functional heatmap, but it’s bare-bones. The default colormap (Seaborn’s sequential rocket palette) works, but you can’t read exact values from color alone, and a static image offers no way to inspect individual cells. The colorbar appears on the right, showing the scale.

For quick exploration, this suffices. For presentations or reports, you need customization.
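To illustrate what "a 2D dataset" can mean in practice, the sketch below passes the same values once as a DataFrame and once as a plain NumPy array. The 3x3 values are hypothetical placeholders, not the real tips correlations. With a DataFrame, the tick labels come from the index and columns; with an array, they fall back to integer positions.

```python
import matplotlib
matplotlib.use("Agg")  # Headless backend so this sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical values standing in for a correlation table
names = ["total_bill", "tip", "size"]
demo_corr = pd.DataFrame(
    [[1.0, 0.7, 0.6],
     [0.7, 1.0, 0.5],
     [0.6, 0.5, 1.0]],
    index=names,
    columns=names,
)

fig, (ax_df, ax_arr) = plt.subplots(1, 2, figsize=(11, 4))
sns.heatmap(demo_corr, ax=ax_df)              # labels taken from the DataFrame
sns.heatmap(demo_corr.to_numpy(), ax=ax_arr)  # labels fall back to integer positions

df_labels = [t.get_text() for t in ax_df.get_xticklabels()]
arr_labels = [t.get_text() for t in ax_arr.get_xticklabels()]
print(df_labels)
print(arr_labels)
```

Passing a labeled DataFrame is usually the easier route, since it saves setting tick labels by hand.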

Customizing Heatmap Appearance

The real power comes from Seaborn’s parameters. A handful of them turn a basic heatmap into a publication-ready visualization.

plt.figure(figsize=(10, 8))

# Enhanced heatmap with multiple customizations
sns.heatmap(
    tips_corr,
    annot=True,           # Show values in cells
    fmt='.2f',            # Format to 2 decimal places
    cmap='coolwarm',      # Diverging colormap
    center=0,             # Center colormap at zero
    square=True,          # Make cells square
    linewidths=1,         # Add gridlines
    cbar_kws={'label': 'Correlation Coefficient'}
)

plt.title('Customized Correlation Heatmap', fontsize=16, pad=20)
plt.tight_layout()
plt.show()

Parameter breakdown:

  • annot=True: Displays the actual values. Critical for smaller matrices where readers need exact numbers.
  • fmt='.2f': Controls number formatting. Use '.0%' for percentages or '.3f' for more precision.
  • cmap='coolwarm': Diverging colormaps (coolwarm, RdBu_r, seismic) work best for correlation data because they emphasize deviation from zero. Sequential colormaps (viridis, rocket) suit counts or measurements that run in one direction and have no meaningful midpoint.
  • center=0: Ensures zero correlation appears neutral (white/gray), making positive and negative correlations visually distinct.
  • square=True: Sets an equal aspect ratio so every cell is drawn as a square, even when the matrix itself is not square.
  • linewidths: Adds cell borders, improving readability in dense heatmaps.
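The fmt values above are ordinary Python format specifications, so you can preview them with the built-in format() before putting them in a plot. A small sketch:

```python
# Preview how different fmt strings render a value
two_decimals = format(0.6757, ".2f")
three_decimals = format(0.6757, ".3f")
percent = format(0.25, ".0%")
integer = format(42, "d")

print(two_decimals)    # 0.68
print(three_decimals)  # 0.676
print(percent)         # 25%
print(integer)         # 42

# The 'd' spec only accepts integers; a float raises ValueError
try:
    format(42.0, "d")
    float_error = None
except ValueError as err:
    float_error = str(err)
print(float_error)
```

The last case matters for heatmaps: if your matrix holds floats, an integer spec like 'd' will fail, so pick a float spec such as '.0f' instead.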

Choose colormaps deliberately. Seaborn’s default for heatmaps, rocket, is sequential and perceptually uniform but doesn’t highlight zero. For correlation matrices, use a diverging colormap centered at zero.
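One related option, sketched here as a suggestion rather than a requirement: pinning the color limits with vmin and vmax. Correlations always live in [-1, 1], so fixing the limits to that range puts zero at the midpoint of the colormap and keeps colors comparable across plots. The 2x2 matrix below is a made-up example of weak correlations.

```python
import matplotlib
matplotlib.use("Agg")  # Headless backend so this sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up matrix of weak correlations
weak_corr = pd.DataFrame(
    [[1.0, 0.2],
     [0.2, 1.0]],
    index=["a", "b"],
    columns=["a", "b"],
)

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    weak_corr,
    cmap="coolwarm",
    vmin=-1,          # Fix the scale to the full correlation range
    vmax=1,
    annot=True,
    fmt=".2f",
    ax=ax,
)

# The color limits now reflect the fixed range, not the data range
color_limits = ax.collections[0].get_clim()
print(color_limits)
```

Without fixed limits, the scale adapts to the data, so a correlation of 0.2 in one plot and 0.9 in another could end up with similar colors.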

Advanced Heatmap Techniques

Masking Values

Sometimes you only need half a correlation matrix since it’s symmetric. Masking eliminates redundancy.

# Create mask for upper triangle
mask = np.triu(np.ones_like(tips_corr, dtype=bool))

plt.figure(figsize=(10, 8))
sns.heatmap(
    tips_corr,
    mask=mask,
    annot=True,
    fmt='.2f',
    cmap='coolwarm',
    center=0,
    square=True,
    linewidths=1,
    cbar_kws={'label': 'Correlation'}
)

plt.title('Masked Correlation Heatmap (Lower Triangle)', fontsize=16)
plt.tight_layout()
plt.show()

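The np.triu() function creates a boolean mask that is True on and above the diagonal, and sns.heatmap() hides every cell where the mask is True, so the diagonal of ones disappears along with the upper triangle. This technique works for any symmetric matrix.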
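To see exactly which cells a mask hides, it helps to print it. The sketch below builds the mask for a 3x3 matrix and also shows a variant using the k=1 argument of np.triu(), which leaves the diagonal visible. Whether to keep the diagonal is a matter of taste; this is just one way to do it.

```python
import numpy as np

shape = (3, 3)

# True cells are hidden by sns.heatmap(mask=...)
mask_with_diag = np.triu(np.ones(shape, dtype=bool))       # upper triangle and diagonal
mask_keep_diag = np.triu(np.ones(shape, dtype=bool), k=1)  # only cells above the diagonal

print(mask_with_diag.astype(int))
print(mask_keep_diag.astype(int))

hidden_with_diag = int(mask_with_diag.sum())
hidden_keep_diag = int(mask_keep_diag.sum())
print(hidden_with_diag, hidden_keep_diag)
```

Either mask can be passed straight to the mask parameter of sns.heatmap().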

Hierarchical Clustering with Clustermap

When your data has many features, sns.clustermap() automatically groups similar rows and columns through hierarchical clustering.

# Load dataset with more features
iris = sns.load_dataset('iris')
iris_numeric = iris.drop('species', axis=1)

# Create clustermap
sns.clustermap(
    iris_numeric.corr(),
    annot=True,
    fmt='.2f',
    cmap='coolwarm',
    center=0,
    figsize=(10, 10),
    cbar_kws={'label': 'Correlation'},
    dendrogram_ratio=0.15
)

# clustermap creates and lays out its own figure, so plt.tight_layout() is not needed here
plt.show()

Clustermaps reorder your data to place similar items together, revealing structure that the original row and column order hides. The dendrograms on the sides show the clustering hierarchy. Use dendrogram_ratio to control dendrogram size relative to the heatmap.
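If you want the new ordering as data rather than just as a picture, the object returned by sns.clustermap() exposes it. The sketch below uses synthetic columns (a1, a2, b1, b2 are invented names) built so that the two "a" columns and the two "b" columns are near-duplicates, then reads the clustered column order back out. It assumes SciPy is installed, which clustermap needs.

```python
import matplotlib
matplotlib.use("Agg")  # Headless backend so this sketch runs without a display; omit in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: a1/a2 share one signal, b1/b2 share another
rng = np.random.default_rng(0)
base_a = rng.normal(size=100)
base_b = rng.normal(size=100)
demo = pd.DataFrame({
    "a1": base_a + rng.normal(scale=0.1, size=100),
    "b1": base_b + rng.normal(scale=0.1, size=100),
    "a2": base_a + rng.normal(scale=0.1, size=100),
    "b2": base_b + rng.normal(scale=0.1, size=100),
})
corr = demo.corr()

grid = sns.clustermap(corr, cmap="coolwarm", center=0, figsize=(6, 6))

# Positions of the original columns in clustered order
col_order = grid.dendrogram_col.reordered_ind
clustered_columns = [corr.columns[i] for i in col_order]
print(clustered_columns)
```

The interleaved input order (a1, b1, a2, b2) comes back with the related columns sitting next to each other.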

Real-World Applications

Confusion Matrix Visualization

Classification models need confusion matrices to diagnose errors. Heatmaps make patterns obvious.

from sklearn.metrics import confusion_matrix
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Generate sample classification data
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=10, 
                          n_clusters_per_class=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train classifier
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Create confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize
plt.figure(figsize=(8, 6))
sns.heatmap(
    cm,
    annot=True,
    fmt='d',              # Integer format for counts
    cmap='Blues',         # Sequential colormap for counts
    square=True,
    cbar_kws={'label': 'Count'}
)

plt.xlabel('Predicted Label', fontsize=12)
plt.ylabel('True Label', fontsize=12)
plt.title('Confusion Matrix', fontsize=14, pad=15)
plt.tight_layout()
plt.show()

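For confusion matrices, use sequential colormaps (Blues, Greens): counts start at zero and grow in one direction, so there is no meaningful midpoint for a diverging colormap to highlight. The fmt='d' parameter formats each annotation as an integer, which matches the integer counts that confusion_matrix returns.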
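When classes are imbalanced, raw counts can hide how well a small class is doing. A common variation, sketched here with made-up counts rather than the classifier output above, is to divide each row by its total so every cell shows the share of that true class. Note the switch from fmt='d' to a percentage format, since the values are no longer integers.

```python
import matplotlib
matplotlib.use("Agg")  # Headless backend so this sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up counts for illustration: rows are true classes, columns are predictions
cm_counts = np.array([
    [90,  5,  5],
    [10, 80, 10],
    [ 2,  8, 40],
])

# Row-normalize so each row sums to 1
cm_rates = cm_counts / cm_counts.sum(axis=1, keepdims=True)

class_names = ["class 0", "class 1", "class 2"]  # placeholder labels
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    cm_rates,
    annot=True,
    fmt=".0%",            # Percentages instead of integer counts
    cmap="Blues",
    vmin=0,
    vmax=1,
    square=True,
    xticklabels=class_names,
    yticklabels=class_names,
    cbar_kws={"label": "Share of true class"},
    ax=ax,
)
ax.set_xlabel("Predicted Label")
ax.set_ylabel("True Label")

annotations = [t.get_text() for t in ax.texts]
print(annotations)
```

The diagonal of the normalized matrix is the per-class recall, which makes weak classes easy to spot.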

Time-Series Patterns

Heatmaps excel at showing temporal patterns like hourly website traffic or seasonal trends.

# Create sample time-series data (hourly traffic by day of week)
np.random.seed(42)
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
hours = [f'{h:02d}:00' for h in range(24)]

# Simulate traffic patterns (higher during business hours on weekdays)
traffic_data = np.random.poisson(50, (7, 24))
traffic_data[0:5, 9:17] += np.random.poisson(100, (5, 8))  # Weekday business hours

traffic_df = pd.DataFrame(traffic_data, index=days, columns=hours)

# Create heatmap
plt.figure(figsize=(16, 6))
sns.heatmap(
    traffic_df,
    annot=False,          # Too many cells for annotations
    cmap='YlOrRd',        # Sequential colormap
    cbar_kws={'label': 'Visitors per Hour'}
)

plt.xlabel('Hour of Day', fontsize=12)
plt.ylabel('Day of Week', fontsize=12)
plt.title('Website Traffic Patterns', fontsize=14, pad=15)
plt.tight_layout()
plt.show()

For large heatmaps (over 20x20 cells), skip annotations. They become unreadable and clutter the visualization. Let the color gradient tell the story.
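Real traffic data rarely arrives as a tidy day-by-hour grid; it is more often a long log with one row per timestamp. Here is a minimal sketch of reshaping such a log into the matrix a heatmap needs, using pivot_table. The log itself is simulated, and the column names (timestamp, visitors) are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Simulated log: one row per hour for two full weeks, starting on a Monday
rng = np.random.default_rng(42)
timestamps = pd.date_range("2024-01-01", periods=24 * 14, freq=pd.Timedelta(hours=1))
log = pd.DataFrame({
    "timestamp": timestamps,
    "visitors": rng.poisson(50, size=len(timestamps)),
})

# Derive the two heatmap axes from the timestamp
log["day"] = log["timestamp"].dt.day_name()
log["hour"] = log["timestamp"].dt.hour

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Average visitors per (day, hour) cell; reindex to get calendar order
traffic_matrix = (
    log.pivot_table(index="day", columns="hour", values="visitors", aggfunc="mean")
       .reindex(day_order)
)
print(traffic_matrix.shape)  # (7, 24)
```

The reindex step matters: without it, pandas sorts the day names alphabetically, which scrambles the weekly pattern. The resulting DataFrame can go straight into sns.heatmap().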

Best Practices and Tips

Choose colormaps wisely: Diverging colormaps (coolwarm, RdBu_r) for data with meaningful midpoints. Sequential colormaps (viridis, rocket) for counts or measurements. Never use rainbow/jet—they’re perceptually non-uniform and mislead viewers.

Handle large datasets: For matrices over 50x50, consider aggregation or clustering. Display every cell, and the heatmap becomes a colored blur. Use sns.clustermap() to group similar items, or bin your data into coarser categories.
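One simple form of aggregation is block averaging: collapse each group of neighboring cells into its mean. The sketch below shrinks a random 100x100 matrix to 10x10 with a NumPy reshape. It assumes the matrix dimensions divide evenly by the block size and that neighboring rows and columns are meaningful to average together, which holds for ordered axes like time but not for arbitrary categories.

```python
import numpy as np

rng = np.random.default_rng(0)
big = rng.random((100, 100))  # stand-in for a large matrix

block = 10  # average over 10x10 blocks
n_rows, n_cols = big.shape

# Split each axis into (blocks, cells-per-block), then average within blocks
coarse = big.reshape(n_rows // block, block, n_cols // block, block).mean(axis=(1, 3))

print(coarse.shape)  # (10, 10)
```

The coarse matrix has few enough cells to annotate, at the cost of hiding variation inside each block.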

Annotation strategy: Annotate when you have fewer than 400 cells (roughly 20x20). Beyond that, annotations overlap and become counterproductive. If you still need values on a medium-sized matrix (20-50 cells per dimension), increase the figure size proportionally so the numbers have room.

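Figure sizing: Use figsize=(width, height) generously. Small heatmaps compress cells, making colors indistinguishable. A good rule: allocate at least 0.4 inches per cell dimension. For a 10x10 heatmap, that is 4 inches per side for the cells alone; add room for tick labels, the title, and the colorbar, and figsize=(8, 8) is a comfortable choice.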
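The per-cell rule is easy to turn into a small helper. The function below is hypothetical (heatmap_figsize is not part of Seaborn), and the fixed margin for labels and the colorbar is an assumed value you would tune for your own plots.

```python
def heatmap_figsize(n_rows, n_cols, inches_per_cell=0.4, margin=2.0):
    """Return a (width, height) tuple in inches for a heatmap.

    inches_per_cell follows the rule of thumb above; margin is an
    assumed allowance for tick labels, title, and colorbar.
    """
    width = n_cols * inches_per_cell + margin
    height = n_rows * inches_per_cell + margin
    return (width, height)


small = heatmap_figsize(10, 10)
wide = heatmap_figsize(7, 24)   # e.g. a day-by-hour traffic matrix
print(small)  # (6.0, 6.0)
print(wide)
```

Treat the result as a lower bound; you can pass it directly as plt.figure(figsize=...) and round up when annotations need more space.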

Color accessibility: Test your visualizations in grayscale or use colorblind-friendly palettes. Seaborn’s 'crest' and 'flare' palettes are perceptually uniform and change steadily in lightness, which helps them stay readable for colorblind viewers. Avoid red-green combinations.

Heatmaps are workhorses of data visualization. Master these techniques, and you’ll communicate complex numerical relationships with clarity that tables and line charts can’t match. The key is matching visualization choices to your data’s structure and your audience’s needs.
