How to Create a QQ Plot in Python
Key Insights
- QQ plots provide a visual method for assessing whether your data follows a specific distribution, with deviations from the diagonal line revealing skewness, heavy tails, or outliers at a glance.
- Python offers two primary approaches: scipy.stats.probplot() for quick generation and statsmodels.graphics.gofplots.qqplot() for more customization and statistical rigor.
- Always combine QQ plots with formal statistical tests like Shapiro-Wilk for small samples or Anderson-Darling for larger datasets; visual inspection alone isn't sufficient for production decisions.
Introduction to QQ Plots
A quantile-quantile plot, or QQ plot, is one of the most powerful visual tools for assessing whether your data follows a particular theoretical distribution. While histograms and density plots give you a general sense of your data’s shape, QQ plots provide a more rigorous comparison by plotting your sample quantiles against the quantiles you’d expect if the data came from your target distribution.
The most common use case is testing for normality. Before running linear regression, ANOVA, or t-tests, you need to verify that your residuals or data approximate a normal distribution. A QQ plot answers this question visually: if your points fall along a straight diagonal line, your data matches the theoretical distribution. Deviations from that line tell you exactly how your data differs.
You should reach for a QQ plot when you need to validate distributional assumptions before statistical modeling, diagnose why a model’s residuals look suspicious, or compare your empirical data against any theoretical distribution—not just normal.
Understanding the Theory Behind QQ Plots
The mechanics of a QQ plot are straightforward once you understand quantiles. A quantile represents a value below which a certain proportion of data falls. The median is the 50th percentile (0.5 quantile), meaning half the data falls below it.
A QQ plot works by sorting your sample data and pairing each observation with its corresponding theoretical quantile. For a dataset of n points, you calculate the expected quantile positions (typically using the formula (i - 0.5) / n for the i-th sorted observation) and then find what value the theoretical distribution would have at that quantile.
The resulting scatter plot has theoretical quantiles on the x-axis and your sample quantiles on the y-axis. If your data comes from the theoretical distribution, the points should fall along the line y = x. In practice, the reference line is often fitted to account for differences in location and scale—your data might be normally distributed but with a different mean or standard deviation than the standard normal.
The key insight is that systematic deviations from the line indicate systematic departures from the assumed distribution. Random scatter around the line is expected due to sampling variability.
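The mechanics above can be sketched in a few lines of numpy and scipy. This is a minimal illustration of the pairing step, not exactly what scipy.stats.probplot does internally (scipy uses a slightly different plotting-position formula, Filliben's order statistic medians, so its theoretical quantiles differ slightly):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=50, scale=10, size=100)

# Sort the sample: these are the empirical quantiles
sample_quantiles = np.sort(sample)

# Plotting positions (i - 0.5) / n for i = 1..n
n = len(sample)
positions = (np.arange(1, n + 1) - 0.5) / n

# Map each position to the corresponding standard normal quantile
theoretical_quantiles = stats.norm.ppf(positions)

# Each QQ plot point is (theoretical_quantiles[i], sample_quantiles[i])
print(theoretical_quantiles[:3])
print(sample_quantiles[:3])
```

Because the positions are symmetric around 0.5, the theoretical quantiles come out symmetric around zero, while the sample quantiles carry the data's own location and scale.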
Creating QQ Plots with Scipy and Statsmodels
Python provides two excellent libraries for generating QQ plots. Let’s start with the basics using both approaches.
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import statsmodels.api as sm

# Generate sample data
np.random.seed(42)
normal_data = np.random.normal(loc=50, scale=10, size=200)

# Method 1: Using scipy.stats.probplot
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
stats.probplot(normal_data, dist="norm", plot=axes[0])
axes[0].set_title("QQ Plot using scipy.stats.probplot")

# Method 2: Using statsmodels qqplot
# line='s' fits a standardized reference line; the data are not standard
# normal, so a plain 45-degree line would miss the points entirely
sm.qqplot(normal_data, line='s', ax=axes[1])
axes[1].set_title("QQ Plot using statsmodels.qqplot")

plt.tight_layout()
plt.show()
```
The scipy.stats.probplot() function returns the theoretical quantiles and the ordered sample values, along with a tuple containing the slope, intercept, and correlation coefficient r of the least-squares reference line. The plot parameter accepts a matplotlib axes object for direct plotting.
The statsmodels.qqplot() function offers more options for the reference line through the line parameter: '45' draws a 45-degree reference line, 's' draws a standardized line (theoretical quantiles scaled by the sample's standard deviation and shifted by its mean), 'r' draws a regression line, and 'q' draws a line through the first and third quartiles.
```python
# Accessing the underlying data from probplot
osm, osr = stats.probplot(normal_data, dist="norm", fit=False)
theoretical_quantiles = osm
sample_quantiles = osr
print(f"First 5 theoretical quantiles: {theoretical_quantiles[:5]}")
print(f"First 5 sample quantiles: {sample_quantiles[:5]}")
```
Customizing Your QQ Plot
Default plots rarely meet publication standards. Here’s how to create professional-quality QQ plots with proper styling.
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(42)
data = np.random.normal(loc=100, scale=15, size=150)

fig, ax = plt.subplots(figsize=(8, 6))

# Generate QQ plot data
(osm, osr), (slope, intercept, r) = stats.probplot(data, dist="norm")

# Plot points with custom styling
ax.scatter(osm, osr, c='steelblue', alpha=0.6, edgecolors='navy',
           linewidths=0.5, s=50, label='Sample quantiles')

# Add reference line
line_x = np.array([osm.min(), osm.max()])
line_y = slope * line_x + intercept
ax.plot(line_x, line_y, 'r-', linewidth=2, label=f'Reference line (R² = {r**2:.4f})')

# Add confidence bands (approximate 95% CI)
n = len(data)
# Rough delta-method SE for a quantile estimated as mean + z*sd
# (both the estimated mean and the estimated sd contribute sampling error)
se = np.sqrt((1 + osm**2 / 2) / n)
ci = 1.96 * se * np.std(data)
ax.fill_between(osm, slope * osm + intercept - ci,
                slope * osm + intercept + ci,
                alpha=0.2, color='red', label='95% Confidence band')

# Styling
ax.set_xlabel('Theoretical Quantiles', fontsize=12)
ax.set_ylabel('Sample Quantiles', fontsize=12)
ax.set_title('QQ Plot with Confidence Bands', fontsize=14, fontweight='bold')
ax.legend(loc='upper left')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```
The confidence bands help you assess whether deviations from the line are statistically meaningful or just random variation. Points falling outside the bands warrant further investigation.
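If you want a number rather than an eyeball judgment, you can count how many points escape the band. A self-contained sketch using the same rough delta-method standard error for an estimated quantile (an approximation, not an exact simultaneous band):

```python
import numpy as np
from scipy import stats

np.random.seed(42)
data = np.random.normal(loc=100, scale=15, size=150)

(osm, osr), (slope, intercept, r) = stats.probplot(data, dist="norm")

n = len(data)
# Approximate SE of a quantile estimated as mean + z*sd
se = np.sqrt((1 + osm**2 / 2) / n)
ci = 1.96 * se * np.std(data)

fitted = slope * osm + intercept
outside = np.sum((osr < fitted - ci) | (osr > fitted + ci))
print(f"{outside} of {n} points fall outside the approximate 95% band")
```

For genuinely normal data like this simulated sample, the count should be small; a large count, or a cluster of excursions at one tail, is the signal to investigate.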
Interpreting QQ Plot Patterns
Understanding what different patterns mean is crucial for practical data analysis. Let’s generate examples of common departures from normality.
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(42)
n = 300

# Generate different distributions
normal = np.random.normal(0, 1, n)
right_skewed = np.random.exponential(2, n)
left_skewed = -np.random.exponential(2, n)
heavy_tails = np.random.standard_t(df=3, size=n)
light_tails = np.random.uniform(-2, 2, n)
with_outliers = np.concatenate([np.random.normal(0, 1, n-5), [5, 6, -5, -6, 7]])

datasets = [
    (normal, "Normal Distribution"),
    (right_skewed, "Right Skewed"),
    (left_skewed, "Left Skewed"),
    (heavy_tails, "Heavy Tails (t-distribution)"),
    (light_tails, "Light Tails (Uniform)"),
    (with_outliers, "Normal with Outliers")
]

fig, axes = plt.subplots(2, 3, figsize=(14, 9))
axes = axes.flatten()
for ax, (data, title) in zip(axes, datasets):
    stats.probplot(data, dist="norm", plot=ax)
    ax.set_title(title, fontsize=11, fontweight='bold')
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```
Here’s how to interpret each pattern:
Normal distribution: Points follow the diagonal line closely with only minor random deviations.
Right skewed: The points form a convex, upward-bending curve, with the right tail pulling away from the line most dramatically. Points in the upper right exceed what normality predicts.
Left skewed: The mirror image—the plot curves downward, with the left tail showing values more extreme than expected.
Heavy tails: Both ends of the plot deviate from the line, with lower-than-expected values on the left and higher-than-expected on the right. This S-shaped pattern indicates more extreme values than a normal distribution produces.
Light tails: The opposite S-shape—values at both extremes are less extreme than normality would predict. The uniform distribution is a classic example.
Outliers: Most points follow the line, but a few isolated points at the extremes break away dramatically.
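These visual patterns track two summary statistics: skewness (asymmetry) and excess kurtosis (tail weight). A quick numerical check on the same generators confirms the diagnosis:

```python
import numpy as np
from scipy import stats

np.random.seed(42)
n = 300

right_skewed = np.random.exponential(2, n)
heavy_tails = np.random.standard_t(df=3, size=n)
light_tails = np.random.uniform(-2, 2, n)

print(f"Exponential skewness: {stats.skew(right_skewed):.2f}")       # positive
print(f"t(3) excess kurtosis: {stats.kurtosis(heavy_tails):.2f}")    # positive
print(f"Uniform excess kurtosis: {stats.kurtosis(light_tails):.2f}") # negative
```

Positive skewness matches the upward-bending curve, positive excess kurtosis matches the heavy-tail S-shape, and negative excess kurtosis matches the light-tail pattern.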
Comparing Against Non-Normal Distributions
QQ plots aren’t limited to testing normality. You can compare your data against any distribution available in scipy.
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(42)

# Generate exponential data
exp_data = np.random.exponential(scale=2.0, size=200)

fig, axes = plt.subplots(1, 3, figsize=(14, 4))

# Test against normal (wrong assumption)
stats.probplot(exp_data, dist="norm", plot=axes[0])
axes[0].set_title("Exponential Data vs Normal\n(Poor Fit)", fontweight='bold')

# Test against exponential (correct)
stats.probplot(exp_data, dist="expon", plot=axes[1])
axes[1].set_title("Exponential Data vs Exponential\n(Good Fit)", fontweight='bold')

# Test uniform data against uniform distribution
uniform_data = np.random.uniform(0, 10, 200)
stats.probplot(uniform_data, dist="uniform", plot=axes[2])
axes[2].set_title("Uniform Data vs Uniform\n(Good Fit)", fontweight='bold')

for ax in axes:
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```
The dist parameter accepts any continuous distribution from scipy.stats. For distributions with parameters, you can pass them using the sparams argument:
```python
# Compare against gamma distribution with specific shape parameter
gamma_data = np.random.gamma(shape=2, scale=2, size=200)
stats.probplot(gamma_data, dist="gamma", sparams=(2,), plot=plt.gca())
```
Practical Example and Best Practices
Let’s work through a complete workflow using a realistic scenario: validating assumptions before running a t-test on experimental data.
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import statsmodels.api as sm

# Simulate experimental data
np.random.seed(42)
df = pd.DataFrame({
    'control': np.random.normal(loc=45, scale=8, size=50),
    'treatment': np.random.normal(loc=52, scale=10, size=50)
})

# Add a few outliers to treatment to make it realistic
df.loc[48:49, 'treatment'] = [85, 88]

def assess_normality(data, name, ax):
    """Complete normality assessment with QQ plot and statistical tests."""
    # QQ Plot
    stats.probplot(data, dist="norm", plot=ax)
    ax.set_title(f'QQ Plot: {name}', fontweight='bold')
    ax.grid(True, alpha=0.3)

    # Statistical tests
    n = len(data)

    # Shapiro-Wilk (scipy warns that p-values are unreliable above n = 5000)
    if n <= 5000:
        shapiro_stat, shapiro_p = stats.shapiro(data)
    else:
        shapiro_stat, shapiro_p = np.nan, np.nan

    # Anderson-Darling (good for larger samples)
    anderson_result = stats.anderson(data, dist='norm')

    return {
        'n': n,
        'mean': np.mean(data),
        'std': np.std(data),
        'skewness': stats.skew(data),
        'kurtosis': stats.kurtosis(data),
        'shapiro_stat': shapiro_stat,
        'shapiro_p': shapiro_p,
        'anderson_stat': anderson_result.statistic,
        'anderson_cv_5pct': anderson_result.critical_values[2]  # 5% level
    }

# Run assessment
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
results = {}
for ax, col in zip(axes, ['control', 'treatment']):
    results[col] = assess_normality(df[col].values, col.capitalize(), ax)
plt.tight_layout()
plt.show()

# Print summary
print("\nNormality Assessment Summary")
print("=" * 60)
for group, stats_dict in results.items():
    print(f"\n{group.upper()}")
    print(f"  Sample size: {stats_dict['n']}")
    print(f"  Mean: {stats_dict['mean']:.2f}, Std: {stats_dict['std']:.2f}")
    print(f"  Skewness: {stats_dict['skewness']:.3f}")
    print(f"  Kurtosis: {stats_dict['kurtosis']:.3f}")
    print(f"  Shapiro-Wilk p-value: {stats_dict['shapiro_p']:.4f}")
    anderson_pass = stats_dict['anderson_stat'] < stats_dict['anderson_cv_5pct']
    print(f"  Anderson-Darling: {'PASS' if anderson_pass else 'FAIL'} at 5% level")
```
Best practices to remember:
- Sample size matters: QQ plots become more reliable with larger samples. Below 20 observations, even normally distributed data can look irregular. Above 200, minor deviations become visible but may not be practically significant.
- Combine visual and statistical tests: Use QQ plots for diagnosis and understanding, but back up your conclusions with formal tests. Shapiro-Wilk works well for small and moderate samples (scipy warns beyond 5,000 observations); Anderson-Darling weights the tails more heavily and handles larger datasets well.
- Consider practical significance: Perfect normality is rare in real data. Ask whether deviations are large enough to affect your analysis. Many statistical methods are robust to mild departures from normality, especially with larger samples.
- Check residuals, not raw data: For regression and ANOVA, it's the residuals that need to be normal, not the original variables.
- Document your assessment: Include QQ plots in your analysis reports. They communicate distributional properties more intuitively than test statistics alone.
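As a closing illustration of the residuals point, here is a minimal sketch (simulated data, illustrative variable names) that fits a line with np.polyfit and QQ-plots the residuals rather than the raw response:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(42)
x = np.linspace(0, 10, 100)
y = 3.0 * x + 5.0 + np.random.normal(0, 2, size=100)

# Fit a line and compute residuals
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# It is the residuals, not y, that should look normal on the QQ plot
fig, ax = plt.subplots(figsize=(6, 5))
stats.probplot(residuals, dist="norm", plot=ax)
ax.set_title("QQ Plot of Regression Residuals")
plt.tight_layout()
plt.show()
```

The raw response y here is far from normal (its mean shifts with x), yet the model assumptions can still hold: it is the residual distribution that the QQ plot needs to vet.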