Chi-Square Distribution in Python: Complete Guide
The chi-square (χ²) distribution is a continuous probability distribution that emerges naturally when you square standard normal random variables.
Key Insights
- The chi-square distribution is fundamental to categorical data analysis, with degrees of freedom controlling its shape—low df creates right-skewed distributions while higher df approaches normality.
- Python’s scipy.stats provides two distinct chi-square tests: chisquare() for goodness-of-fit (one variable against expected frequencies) and chi2_contingency() for independence testing (relationships between two categorical variables).
- Always verify assumptions before running chi-square tests: expected frequencies should be at least 5 in each cell, observations must be independent, and sample sizes need to be adequate for meaningful results.
Introduction to Chi-Square Distribution
The chi-square (χ²) distribution is a continuous probability distribution that emerges naturally when you square standard normal random variables. If you take k independent standard normal variables and sum their squares, the result follows a chi-square distribution with k degrees of freedom.
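This defining property is easy to verify by simulation. The sketch below squares and sums k standard normals and compares the result to the theoretical chi-square distribution (the sample size and the Kolmogorov-Smirnov comparison are illustrative choices, not part of any standard recipe):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k = 4               # degrees of freedom
n = 100_000         # number of simulated sums

# Sum of squares of k independent standard normal variables
z = rng.standard_normal((n, k))
sums_of_squares = (z ** 2).sum(axis=1)

# Compare the empirical sample against the theoretical chi-square(k) CDF
ks_stat, ks_p = stats.kstest(sums_of_squares, stats.chi2(df=k).cdf)
print(f"KS statistic: {ks_stat:.4f} (small values mean a close match)")
print(f"Sample mean: {sums_of_squares.mean():.3f} (theory: {k})")
print(f"Sample variance: {sums_of_squares.var():.3f} (theory: {2 * k})")
```

The sample mean lands near k and the sample variance near 2k, matching the properties discussed below.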
This mathematical property makes chi-square invaluable for three core statistical tasks:
- Goodness-of-fit tests: Determining if observed categorical data matches expected frequencies
- Independence tests: Checking if two categorical variables are related
- Variance analysis: Constructing confidence intervals for population variance
In practice, you’ll encounter chi-square most often when working with categorical data—survey responses, classification outcomes, or any scenario where you’re counting observations in discrete categories rather than measuring continuous values.
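The third use case, variance analysis, gets less attention later in this guide, so here is a brief sketch: because (n-1)s²/σ² follows a chi-square distribution with n-1 degrees of freedom, its critical values bound a confidence interval for the population variance (the sample below is simulated purely for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=10, scale=2, size=50)   # true variance = 4

n = len(sample)
s2 = sample.var(ddof=1)    # unbiased sample variance
alpha = 0.05

# (n-1)s^2 / sigma^2 ~ chi-square(n-1), so inverting the chi-square
# critical values gives a two-sided CI for the population variance.
# Note the upper critical value produces the LOWER bound, and vice versa.
lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
print(f"Sample variance: {s2:.3f}")
print(f"95% CI for population variance: ({lower:.3f}, {upper:.3f})")
```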
Chi-Square Distribution Properties and Parameters
The chi-square distribution has a single parameter: degrees of freedom (df). This parameter fundamentally shapes the distribution’s behavior:
- Low df (1-3): Heavily right-skewed, concentrated near zero
- Moderate df (5-10): Still skewed but more spread out
- High df (30+): Approaches a normal distribution (Central Limit Theorem in action)
The mean of a chi-square distribution equals its degrees of freedom, and the variance equals twice the degrees of freedom. This relationship explains why the distribution spreads out as df increases.
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Visualize chi-square PDFs with different degrees of freedom
fig, ax = plt.subplots(figsize=(10, 6))
x = np.linspace(0, 30, 500)
degrees_of_freedom = [1, 2, 3, 5, 10, 15]
colors = plt.cm.viridis(np.linspace(0, 0.9, len(degrees_of_freedom)))

for df, color in zip(degrees_of_freedom, colors):
    chi2_dist = stats.chi2(df=df)
    y = chi2_dist.pdf(x)
    ax.plot(x, y, label=f'df = {df}', color=color, linewidth=2)

ax.set_xlabel('x')
ax.set_ylabel('Probability Density')
ax.set_title('Chi-Square Distribution: Effect of Degrees of Freedom')
ax.legend()
ax.set_ylim(0, 0.5)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```
Notice how df=1 creates an extreme right skew with values concentrated near zero, while df=15 produces a more symmetric, bell-like shape centered around 15.
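One rough way to quantify how the normal approximation improves with df is to compare each chi-square CDF against a Normal(k, 2k) approximation at the chi-square median (the choice of evaluation point and df values here is illustrative):

```python
import numpy as np
from scipy import stats

# For large df, chi-square(k) is approximately Normal(mean=k, variance=2k)
diffs = []
for k in [5, 30, 100]:
    chi2_dist = stats.chi2(df=k)
    normal_approx = stats.norm(loc=k, scale=np.sqrt(2 * k))
    # At the chi-square median the exact CDF is 0.5; measure how far
    # the normal approximation's CDF is from that value
    x = chi2_dist.median()
    diff = abs(chi2_dist.cdf(x) - normal_approx.cdf(x))
    diffs.append(diff)
    print(f"df={k:>3}: |chi2 CDF - normal CDF| at median = {diff:.4f}")
```

The discrepancy shrinks steadily as df grows, consistent with the Central Limit Theorem behavior described above.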
Generating Chi-Square Distributions in Python
Python offers multiple approaches for working with chi-square distributions. Here’s how to generate samples and compute key statistics:
```python
import numpy as np
from scipy import stats

# Set seed for reproducibility
np.random.seed(42)

df = 5              # degrees of freedom
n_samples = 10000

# Method 1: NumPy's random generator
samples_numpy = np.random.chisquare(df=df, size=n_samples)

# Method 2: SciPy's chi2 distribution
chi2_dist = stats.chi2(df=df)
samples_scipy = chi2_dist.rvs(size=n_samples)

# Compare theoretical vs empirical statistics
print(f"Theoretical mean: {df}")
print(f"NumPy sample mean: {samples_numpy.mean():.3f}")
print(f"SciPy sample mean: {samples_scipy.mean():.3f}")
print(f"\nTheoretical variance: {2 * df}")
print(f"NumPy sample variance: {samples_numpy.var():.3f}")
print(f"SciPy sample variance: {samples_scipy.var():.3f}")
```
For statistical inference, you’ll need PDF, CDF, and critical values:
```python
from scipy import stats

df = 5
chi2_dist = stats.chi2(df=df)

# Probability density at x=3
pdf_value = chi2_dist.pdf(3)
print(f"PDF at x=3: {pdf_value:.4f}")

# Cumulative probability P(X <= 3)
cdf_value = chi2_dist.cdf(3)
print(f"P(X <= 3): {cdf_value:.4f}")

# Critical values for common significance levels
alpha_levels = [0.10, 0.05, 0.01]
for alpha in alpha_levels:
    critical_value = chi2_dist.ppf(1 - alpha)
    print(f"Critical value (α={alpha}): {critical_value:.4f}")

# Survival function: P(X > x) - useful for p-values
test_statistic = 11.07
p_value = chi2_dist.sf(test_statistic)
print(f"\nP(X > {test_statistic}): {p_value:.4f}")
```
Chi-Square Goodness-of-Fit Test
The goodness-of-fit test compares observed frequencies against expected frequencies for a single categorical variable. The test statistic is:
χ² = Σ (observed - expected)² / expected
Let’s test whether a die is fair:
```python
import numpy as np
from scipy import stats

# Observed frequencies from 600 dice rolls
observed = np.array([89, 113, 98, 104, 92, 104])

# Expected frequencies if die is fair (600/6 = 100 each)
expected = np.array([100, 100, 100, 100, 100, 100])

# Perform chi-square goodness-of-fit test
chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)

print("Chi-Square Goodness-of-Fit Test: Is the die fair?")
print(f"Observed frequencies: {observed}")
print(f"Expected frequencies: {expected}")
print(f"\nChi-square statistic: {chi2_stat:.4f}")
print(f"Degrees of freedom: {len(observed) - 1}")
print(f"P-value: {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print(f"\nReject H0 at α={alpha}: Die appears biased")
else:
    print(f"\nFail to reject H0 at α={alpha}: No evidence die is biased")
```
The degrees of freedom for goodness-of-fit equals (number of categories - 1). If you estimate parameters from the data (like fitting a Poisson distribution), subtract additional degrees of freedom for each estimated parameter.
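scipy.stats.chisquare exposes a ddof argument for exactly this adjustment. A sketch with hypothetical event counts, where a Poisson rate is estimated from the data itself and one extra degree of freedom is removed (the counts and the rough rate estimate are invented for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical counts of intervals with 0, 1, 2, and 3+ events
observed = np.array([35, 40, 18, 7])
n = observed.sum()

# Rough rate estimate from the data itself (treats the 3+ bin as 3)
values = np.array([0, 1, 2, 3])
lam = (observed * values).sum() / n

# Expected frequencies under Poisson(lam); last cell collects P(X >= 3)
p = stats.poisson.pmf([0, 1, 2], lam)
p = np.append(p, 1 - p.sum())
expected = n * p

# ddof=1 removes one extra degree of freedom for the estimated rate,
# so the p-value is computed with df = 4 - 1 - 1 = 2
chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected, ddof=1)
print(f"chi2 = {chi2_stat:.4f}, p = {p_value:.4f}")
```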
Chi-Square Test for Independence
The independence test examines whether two categorical variables are related. It uses a contingency table (crosstab) of observed frequencies:
```python
import pandas as pd
import numpy as np
from scipy import stats

# Survey data: Treatment type vs. Outcome
data = {
    'treatment': ['A']*150 + ['B']*150 + ['C']*150,
    'outcome': (
        ['improved']*75 + ['no_change']*50 + ['worsened']*25 +
        ['improved']*60 + ['no_change']*55 + ['worsened']*35 +
        ['improved']*90 + ['no_change']*40 + ['worsened']*20
    )
}
df = pd.DataFrame(data)

# Create contingency table
contingency_table = pd.crosstab(df['treatment'], df['outcome'])
print("Contingency Table:")
print(contingency_table)
print()

# Perform chi-square test for independence
chi2_stat, p_value, dof, expected_freq = stats.chi2_contingency(contingency_table)

print(f"Chi-square statistic: {chi2_stat:.4f}")
print(f"Degrees of freedom: {dof}")
print(f"P-value: {p_value:.4f}")
print(f"\nExpected frequencies (if independent):")
print(pd.DataFrame(
    expected_freq,
    index=contingency_table.index,
    columns=contingency_table.columns
).round(2))

alpha = 0.05
if p_value < alpha:
    print(f"\nReject H0: Treatment and outcome are NOT independent")
else:
    print(f"\nFail to reject H0: No evidence of association")
```
Degrees of freedom for independence tests equal (rows - 1) × (columns - 1).
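To see where the statistic comes from, the expected frequencies and χ² can be computed by hand from the row and column totals and checked against SciPy (the 2×3 table below is hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 contingency table
observed = np.array([[30, 20, 10],
                     [20, 30, 40]])

# Expected frequency under independence: (row total x column total) / n
row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 3)
n = observed.sum()
expected = row_totals @ col_totals / n             # shape (2, 3)

chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, df=dof)

# correction=False disables the Yates continuity correction; SciPy only
# applies it to 2x2 tables anyway, so this just makes the comparison explicit
chi2_stat, p_value, dof_scipy, _ = stats.chi2_contingency(observed, correction=False)
print(f"Manual: chi2 = {chi2_manual:.4f}, df = {dof}, p = {p_manual:.4f}")
print(f"SciPy:  chi2 = {chi2_stat:.4f}, df = {dof_scipy}, p = {p_value:.4f}")
```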
Practical Applications and Interpretation
Statistical significance alone doesn’t tell you how strong an association is. Cramér’s V provides an effect size measure ranging from 0 (no association) to 1 (perfect association):
```python
import pandas as pd
import numpy as np
from scipy import stats

def cramers_v(contingency_table):
    """Calculate Cramér's V for effect size."""
    chi2 = stats.chi2_contingency(contingency_table)[0]
    n = contingency_table.sum().sum()
    min_dim = min(contingency_table.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))

def check_chi2_assumptions(contingency_table):
    """Check assumptions for chi-square test."""
    _, _, _, expected = stats.chi2_contingency(contingency_table)
    min_expected = expected.min()
    cells_below_5 = (expected < 5).sum()
    total_cells = expected.size
    pct_below_5 = (cells_below_5 / total_cells) * 100

    print("Assumption Check: Expected Frequencies")
    print(f"  Minimum expected frequency: {min_expected:.2f}")
    print(f"  Cells with expected < 5: {cells_below_5}/{total_cells} ({pct_below_5:.1f}%)")

    if min_expected < 1:
        print("  WARNING: Expected frequency below 1 - results unreliable")
        return False
    elif pct_below_5 > 20:
        print("  WARNING: >20% cells below 5 - consider Fisher's exact test")
        return False
    else:
        print("  Assumptions satisfied")
        return True

def complete_chi2_analysis(df, var1, var2, alpha=0.05):
    """Complete chi-square analysis workflow."""
    # Create contingency table
    table = pd.crosstab(df[var1], df[var2])
    print(f"Analyzing relationship: {var1} vs {var2}\n")
    print("Contingency Table:")
    print(table)
    print()

    # Check assumptions
    assumptions_met = check_chi2_assumptions(table)
    print()

    # Perform test
    chi2_stat, p_value, dof, expected = stats.chi2_contingency(table)

    # Calculate effect size
    v = cramers_v(table)

    # Interpret effect size
    if v < 0.1:
        effect_interpretation = "negligible"
    elif v < 0.3:
        effect_interpretation = "small"
    elif v < 0.5:
        effect_interpretation = "medium"
    else:
        effect_interpretation = "large"

    print("Results:")
    print(f"  χ² = {chi2_stat:.4f}, df = {dof}, p = {p_value:.4f}")
    print(f"  Cramér's V = {v:.4f} ({effect_interpretation} effect)")
    print()
    if p_value < alpha:
        print(f"Conclusion: Significant association (p < {alpha})")
    else:
        print(f"Conclusion: No significant association (p >= {alpha})")

    return {'chi2': chi2_stat, 'p_value': p_value, 'dof': dof, 'cramers_v': v}

# Example usage with customer data
np.random.seed(42)
n = 500
customer_data = pd.DataFrame({
    'age_group': np.random.choice(['18-30', '31-50', '51+'], n, p=[0.3, 0.45, 0.25]),
    'purchase_category': np.random.choice(['electronics', 'clothing', 'home'], n, p=[0.35, 0.40, 0.25])
})
results = complete_chi2_analysis(customer_data, 'age_group', 'purchase_category')
```
Summary and Best Practices
Choosing the Right Test:
- Single categorical variable against theory → Goodness-of-fit (scipy.stats.chisquare)
- Two categorical variables → Independence test (scipy.stats.chi2_contingency)
- Small samples or sparse tables → Fisher’s exact test (scipy.stats.fisher_exact)
Minimum Sample Size Guidelines:
- All expected frequencies ≥ 5 (traditional rule)
- No expected frequency < 1 (hard requirement)
- If >20% of cells have expected < 5, use Fisher’s exact test
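When a 2×2 table violates these guidelines, scipy.stats.fisher_exact computes an exact p-value with no minimum-frequency requirement. A sketch with a hypothetical sparse table:

```python
import numpy as np
from scipy import stats

# Hypothetical sparse 2x2 table where expected frequencies fall below 5
table = np.array([[8, 2],
                  [1, 5]])

# Fisher's exact test: no large-sample approximation involved
odds_ratio, p_value = stats.fisher_exact(table, alternative='two-sided')
print(f"Sample odds ratio: {odds_ratio:.3f}")
print(f"Fisher exact p-value: {p_value:.4f}")
```

Note that fisher_exact is limited to 2×2 tables; for larger sparse tables, collapsing categories or using simulation-based tests are common alternatives.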
Library Comparison:
- scipy.stats: Clean API, sufficient for most needs
- statsmodels: More detailed output, better for regression-style workflows
- pingouin: Excellent for effect sizes and assumption checking
Common Pitfalls to Avoid:
- Using chi-square on continuous data (discretize first or use different tests)
- Including the same observation in multiple cells
- Confusing statistical significance with practical importance—always report effect sizes
- Ignoring assumption violations, especially with small samples
Chi-square tests are workhorses of categorical analysis. Master these fundamentals, and you’ll handle the majority of categorical data problems you encounter in practice.