Chi-Square Distribution in Python: Complete Guide

Key Insights

  • The chi-square distribution is fundamental to categorical data analysis, with degrees of freedom controlling its shape—low df creates right-skewed distributions while higher df approaches normality.
  • Python’s scipy.stats provides two distinct chi-square tests: chisquare() for goodness-of-fit (one variable against expected frequencies) and chi2_contingency() for independence testing (relationships between two categorical variables).
  • Always verify assumptions before running chi-square tests: every cell’s expected frequency should be at least 5, observations must be independent, and each observation must fall into exactly one category.

Introduction to Chi-Square Distribution

The chi-square (χ²) distribution is a continuous probability distribution that emerges naturally when you square standard normal random variables. If you take k independent standard normal variables and sum their squares, the result follows a chi-square distribution with k degrees of freedom.
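This relationship is easy to check by simulation. The sketch below squares and sums k = 5 standard normals many times, then compares the result to a chi-square distribution with 5 degrees of freedom (the sample size and seed are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
k = 5  # number of independent standard normals

# Each row: sum of squares of k standard normal draws
normals = rng.standard_normal(size=(100_000, k))
sum_of_squares = (normals ** 2).sum(axis=1)

# Compare the simulated values to chi2(k) with a Kolmogorov-Smirnov test
ks_stat, ks_p = stats.kstest(sum_of_squares, stats.chi2(df=k).cdf)
print(f"Sample mean: {sum_of_squares.mean():.3f}")
print(f"KS statistic: {ks_stat:.4f}, p-value: {ks_p:.4f}")
```

The sample mean lands close to k, and the KS test finds no systematic departure from the chi-square shape.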

This mathematical property makes chi-square invaluable for three core statistical tasks:

  1. Goodness-of-fit tests: Determining if observed categorical data matches expected frequencies
  2. Independence tests: Checking if two categorical variables are related
  3. Variance analysis: Constructing confidence intervals for population variance

In practice, you’ll encounter chi-square most often when working with categorical data—survey responses, classification outcomes, or any scenario where you’re counting observations in discrete categories rather than measuring continuous values.

Chi-Square Distribution Properties and Parameters

The chi-square distribution has a single parameter: degrees of freedom (df). This parameter fundamentally shapes the distribution’s behavior:

  • Low df (1-3): Heavily right-skewed, concentrated near zero
  • Moderate df (5-10): Still skewed but more spread out
  • High df (30+): Approaches a normal distribution (Central Limit Theorem in action)

The mean of a chi-square distribution equals its degrees of freedom, and the variance equals twice the degrees of freedom. This relationship explains why the distribution spreads out as df increases.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Visualize chi-square PDFs with different degrees of freedom
fig, ax = plt.subplots(figsize=(10, 6))
x = np.linspace(0, 30, 500)

degrees_of_freedom = [1, 2, 3, 5, 10, 15]
colors = plt.cm.viridis(np.linspace(0, 0.9, len(degrees_of_freedom)))

for df, color in zip(degrees_of_freedom, colors):
    chi2_dist = stats.chi2(df=df)
    y = chi2_dist.pdf(x)
    ax.plot(x, y, label=f'df = {df}', color=color, linewidth=2)

ax.set_xlabel('x')
ax.set_ylabel('Probability Density')
ax.set_title('Chi-Square Distribution: Effect of Degrees of Freedom')
ax.legend()
ax.set_ylim(0, 0.5)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Notice how df=1 creates an extreme right skew with values concentrated near zero, while df=15 produces a more symmetric, bell-like shape centered around 15.

Generating Chi-Square Distributions in Python

Python offers multiple approaches for working with chi-square distributions. Here’s how to generate samples and compute key statistics:

import numpy as np
from scipy import stats

# Set seed for reproducibility
np.random.seed(42)

df = 5  # degrees of freedom
n_samples = 10000

# Method 1: NumPy's random generator
samples_numpy = np.random.chisquare(df=df, size=n_samples)

# Method 2: SciPy's chi2 distribution
chi2_dist = stats.chi2(df=df)
samples_scipy = chi2_dist.rvs(size=n_samples)

# Compare theoretical vs empirical statistics
print(f"Theoretical mean: {df}")
print(f"NumPy sample mean: {samples_numpy.mean():.3f}")
print(f"SciPy sample mean: {samples_scipy.mean():.3f}")
print(f"\nTheoretical variance: {2 * df}")
print(f"NumPy sample variance: {samples_numpy.var():.3f}")
print(f"SciPy sample variance: {samples_scipy.var():.3f}")

For statistical inference, you’ll need PDF, CDF, and critical values:

from scipy import stats

df = 5
chi2_dist = stats.chi2(df=df)

# Probability density at x=3
pdf_value = chi2_dist.pdf(3)
print(f"PDF at x=3: {pdf_value:.4f}")

# Cumulative probability P(X <= 3)
cdf_value = chi2_dist.cdf(3)
print(f"P(X <= 3): {cdf_value:.4f}")

# Critical values for common significance levels
alpha_levels = [0.10, 0.05, 0.01]
for alpha in alpha_levels:
    critical_value = chi2_dist.ppf(1 - alpha)
    print(f"Critical value (α={alpha}): {critical_value:.4f}")

# Survival function: P(X > x) - useful for p-values
test_statistic = 11.07
p_value = chi2_dist.sf(test_statistic)
print(f"\nP(X > {test_statistic}): {p_value:.4f}")

Chi-Square Goodness-of-Fit Test

The goodness-of-fit test compares observed frequencies against expected frequencies for a single categorical variable. The test statistic is:

χ² = Σ (observed - expected)² / expected
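It is worth applying this formula by hand once to see that scipy does nothing mysterious. The counts below are made up for illustration:

```python
import numpy as np
from scipy import stats

observed = np.array([18, 22, 30, 30])
expected = np.array([25, 25, 25, 25])

# Apply the formula directly: sum of (observed - expected)^2 / expected
chi2_manual = ((observed - expected) ** 2 / expected).sum()

# scipy computes the same statistic
chi2_scipy, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(chi2_manual, chi2_scipy)  # both 4.32
```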

Let’s test whether a die is fair:

import numpy as np
from scipy import stats

# Observed frequencies from 600 dice rolls
observed = np.array([89, 113, 98, 104, 92, 104])

# Expected frequencies if die is fair (600/6 = 100 each)
expected = np.array([100, 100, 100, 100, 100, 100])

# Perform chi-square goodness-of-fit test
chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)

print("Chi-Square Goodness-of-Fit Test: Is the die fair?")
print(f"Observed frequencies: {observed}")
print(f"Expected frequencies: {expected}")
print(f"\nChi-square statistic: {chi2_stat:.4f}")
print(f"Degrees of freedom: {len(observed) - 1}")
print(f"P-value: {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print(f"\nReject H0 at α={alpha}: Die appears biased")
else:
    print(f"\nFail to reject H0 at α={alpha}: No evidence die is biased")

The degrees of freedom for goodness-of-fit equals (number of categories - 1). If you estimate parameters from the data (like fitting a Poisson distribution), subtract additional degrees of freedom for each estimated parameter.
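For example, when testing whether counts follow a Poisson distribution whose rate was estimated from the same data, pass ddof=1 to stats.chisquare so the reference distribution uses (categories - 1) - 1 degrees of freedom. A sketch with simulated data (the rate, sample size, and binning are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
counts = rng.poisson(lam=2.0, size=200)

# Bin values 0-4, lumping 5+ into the last category (6 categories total)
observed = np.bincount(np.minimum(counts, 5), minlength=6)

# Estimate the Poisson rate from the data (one estimated parameter)
lam_hat = counts.mean()

# Expected frequencies under the fitted Poisson, matching the binning
probs = stats.poisson.pmf(np.arange(5), lam_hat)
probs = np.append(probs, 1 - probs.sum())  # tail probability P(X >= 5)
expected = probs * counts.size

# ddof=1 subtracts one extra degree of freedom for the estimated rate:
# df = (6 - 1) - 1 = 4
chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected, ddof=1)
print(f"chi2 = {chi2_stat:.3f}, p = {p_value:.4f}")
```

Note that stats.chisquare requires the observed and expected totals to match, which the tail-lumping above guarantees.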

Chi-Square Test for Independence

The independence test examines whether two categorical variables are related. It uses a contingency table (crosstab) of observed frequencies:

import pandas as pd
import numpy as np
from scipy import stats

# Survey data: Treatment type vs. Outcome
data = {
    'treatment': ['A']*150 + ['B']*150 + ['C']*150,
    'outcome': (
        ['improved']*75 + ['no_change']*50 + ['worsened']*25 +
        ['improved']*60 + ['no_change']*55 + ['worsened']*35 +
        ['improved']*90 + ['no_change']*40 + ['worsened']*20
    )
}
survey = pd.DataFrame(data)

# Create contingency table
contingency_table = pd.crosstab(survey['treatment'], survey['outcome'])
print("Contingency Table:")
print(contingency_table)
print()

# Perform chi-square test for independence
chi2_stat, p_value, dof, expected_freq = stats.chi2_contingency(contingency_table)

print(f"Chi-square statistic: {chi2_stat:.4f}")
print(f"Degrees of freedom: {dof}")
print(f"P-value: {p_value:.4f}")
print(f"\nExpected frequencies (if independent):")
print(pd.DataFrame(
    expected_freq,
    index=contingency_table.index,
    columns=contingency_table.columns
).round(2))

alpha = 0.05
if p_value < alpha:
    print(f"\nReject H0: Treatment and outcome are NOT independent")
else:
    print(f"\nFail to reject H0: No evidence of association")

Degrees of freedom for independence tests equal (rows - 1) × (columns - 1).
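Both the degrees of freedom and the expected counts are easy to verify by hand on a toy 2×2 table (the counts below are made up for illustration):

```python
import numpy as np
from scipy import stats

table = np.array([[30, 10],
                  [20, 40]])

# Expected count per cell: row total * column total / grand total
row_totals = table.sum(axis=1, keepdims=True)
col_totals = table.sum(axis=0, keepdims=True)
expected_manual = row_totals * col_totals / table.sum()

chi2_stat, p_value, dof, expected = stats.chi2_contingency(table)
print(dof)  # (2 - 1) * (2 - 1) = 1
print(np.allclose(expected, expected_manual))  # True
```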

Practical Applications and Interpretation

Statistical significance alone doesn’t tell you how strong an association is. Cramér’s V provides an effect size measure ranging from 0 (no association) to 1 (perfect association):

import pandas as pd
import numpy as np
from scipy import stats

def cramers_v(contingency_table):
    """Calculate Cramér's V for effect size."""
    chi2 = stats.chi2_contingency(contingency_table)[0]
    n = contingency_table.sum().sum()
    min_dim = min(contingency_table.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))

def check_chi2_assumptions(contingency_table):
    """Check assumptions for chi-square test."""
    _, _, _, expected = stats.chi2_contingency(contingency_table)
    
    min_expected = expected.min()
    cells_below_5 = (expected < 5).sum()
    total_cells = expected.size
    pct_below_5 = (cells_below_5 / total_cells) * 100
    
    print("Assumption Check: Expected Frequencies")
    print(f"  Minimum expected frequency: {min_expected:.2f}")
    print(f"  Cells with expected < 5: {cells_below_5}/{total_cells} ({pct_below_5:.1f}%)")
    
    if min_expected < 1:
        print("  WARNING: Expected frequency below 1 - results unreliable")
        return False
    elif pct_below_5 > 20:
        print("  WARNING: >20% cells below 5 - consider Fisher's exact test")
        return False
    else:
        print("  Assumptions satisfied")
        return True

def complete_chi2_analysis(df, var1, var2, alpha=0.05):
    """Complete chi-square analysis workflow."""
    # Create contingency table
    table = pd.crosstab(df[var1], df[var2])
    print(f"Analyzing relationship: {var1} vs {var2}\n")
    print("Contingency Table:")
    print(table)
    print()
    
    # Check assumptions
    assumptions_met = check_chi2_assumptions(table)
    print()
    
    # Perform test
    chi2_stat, p_value, dof, expected = stats.chi2_contingency(table)
    
    # Calculate effect size
    v = cramers_v(table)
    
    # Interpret effect size
    if v < 0.1:
        effect_interpretation = "negligible"
    elif v < 0.3:
        effect_interpretation = "small"
    elif v < 0.5:
        effect_interpretation = "medium"
    else:
        effect_interpretation = "large"
    
    print("Results:")
    print(f"  χ² = {chi2_stat:.4f}, df = {dof}, p = {p_value:.4f}")
    print(f"  Cramér's V = {v:.4f} ({effect_interpretation} effect)")
    print()
    
    if p_value < alpha:
        print(f"Conclusion: Significant association (p < {alpha})")
    else:
        print(f"Conclusion: No significant association (p >= {alpha})")
    
    return {'chi2': chi2_stat, 'p_value': p_value, 'dof': dof, 'cramers_v': v}

# Example usage with customer data
np.random.seed(42)
n = 500
customer_data = pd.DataFrame({
    'age_group': np.random.choice(['18-30', '31-50', '51+'], n, p=[0.3, 0.45, 0.25]),
    'purchase_category': np.random.choice(['electronics', 'clothing', 'home'], n, p=[0.35, 0.40, 0.25])
})

results = complete_chi2_analysis(customer_data, 'age_group', 'purchase_category')

Summary and Best Practices

Choosing the Right Test:

  • Single categorical variable against theory → Goodness-of-fit (scipy.stats.chisquare)
  • Two categorical variables → Independence test (scipy.stats.chi2_contingency)
  • Small samples or sparse tables → Fisher’s exact test (scipy.stats.fisher_exact)

Minimum Sample Size Guidelines:

  • All expected frequencies ≥ 5 (traditional rule)
  • No expected frequency < 1 (hard requirement)
  • If >20% of cells have expected < 5, use Fisher’s exact test
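When a 2×2 table is too sparse for chi-square, Fisher’s exact test computes an exact p-value instead of relying on the chi-square approximation. A minimal sketch with made-up sparse counts:

```python
from scipy import stats

# Sparse 2x2 table: several expected counts fall below 5
table = [[2, 8],
         [7, 3]]

# Returns the sample odds ratio (a*d)/(b*c) and an exact p-value
odds_ratio, p_value = stats.fisher_exact(table, alternative='two-sided')
print(f"Odds ratio: {odds_ratio:.3f}, p-value: {p_value:.4f}")
```

Note that scipy’s fisher_exact only handles 2×2 tables; larger sparse tables typically require collapsing categories or simulation-based methods.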

Library Comparison:

  • scipy.stats: Clean API, sufficient for most needs
  • statsmodels: More detailed output, better for regression-style workflows
  • pingouin: Excellent for effect sizes and assumption checking

Common Pitfalls to Avoid:

  1. Using chi-square on continuous data (discretize first or use different tests)
  2. Including the same observation in multiple cells
  3. Confusing statistical significance with practical importance—always report effect sizes
  4. Ignoring assumption violations, especially with small samples

Chi-square tests are workhorses of categorical analysis. Master these fundamentals, and you’ll handle the majority of categorical data problems you encounter in practice.
