How to Calculate Eta Squared in Python


Key Insights

  • Eta squared (η²) measures the proportion of variance explained by a factor in ANOVA, with benchmarks of 0.01 (small), 0.06 (medium), and 0.14 (large) guiding interpretation.
  • Manual calculation using NumPy gives you full control and understanding, while libraries like pingouin provide convenience with built-in effect size reporting.
  • Partial eta squared is preferred for factorial designs because it isolates each factor’s contribution, whereas standard eta squared can be misleading when multiple factors are present.

Introduction to Eta Squared

Statistical significance tells you whether an effect exists. Effect size tells you whether anyone should care. Eta squared (η²) bridges this gap for ANOVA by quantifying how much of the total variance in your dependent variable is explained by your independent variable.

When you run an ANOVA and get a p-value of 0.001, you know something is happening. But is it a 2% difference or a 40% difference in explained variance? That distinction matters for practical decision-making, and eta squared provides the answer.

The standard interpretation benchmarks, established by Cohen, are:

  • Small effect: η² = 0.01 (1% of variance explained)
  • Medium effect: η² = 0.06 (6% of variance explained)
  • Large effect: η² = 0.14 (14% of variance explained)

Use eta squared when you have a one-way ANOVA and want a straightforward measure of effect size. For factorial designs with multiple factors, partial eta squared becomes the better choice. For repeated measures, generalized eta squared handles the complexity of correlated observations.

The Formula Behind Eta Squared

Eta squared is conceptually simple: it’s the ratio of between-group variance to total variance.

η² = SS_between / SS_total

Breaking down the components:

SS_between (Sum of Squares Between Groups): This measures how much the group means deviate from the grand mean. Large values indicate that group membership matters—the groups are genuinely different.

SS_total (Total Sum of Squares): This measures the total variability in your data, calculated as the sum of squared deviations of each observation from the grand mean.

The ratio gives you a proportion between 0 and 1. An eta squared of 0.25 means that 25% of the total variance in your outcome variable can be attributed to group membership.

There’s also SS_within (Sum of Squares Within Groups), which captures individual variation within each group. The relationship is:

SS_total = SS_between + SS_within

This decomposition is the foundation of ANOVA itself, and eta squared simply expresses the between-group portion as a proportion of the whole.
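The decomposition is easy to verify numerically. Here's a quick check on two toy groups (the values are arbitrary):

```python
import numpy as np

# Two small toy groups (values are arbitrary)
g1 = np.array([1.0, 2.0, 3.0])
g2 = np.array([4.0, 5.0, 6.0])

all_obs = np.concatenate([g1, g2])
grand_mean = all_obs.mean()

ss_total = np.sum((all_obs - grand_mean) ** 2)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in (g1, g2))
ss_within = sum(np.sum((g - g.mean()) ** 2) for g in (g1, g2))

print(np.isclose(ss_between + ss_within, ss_total))  # decomposition holds
print(ss_between / ss_total)                         # eta squared, ~0.771
```

Here SS_between = 13.5 and SS_within = 4.0, which sum exactly to SS_total = 17.5; the ratio 13.5 / 17.5 is the eta squared.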

Manual Calculation with NumPy

Understanding the manual calculation ensures you know exactly what’s happening under the hood. Here’s a function that computes eta squared from raw group data:

import numpy as np

def calculate_eta_squared(groups):
    """
    Calculate eta squared from a list of group arrays.
    
    Parameters
    ----------
    groups : list of array-like
        Each element contains observations for one group.
    
    Returns
    -------
    float
        Eta squared value between 0 and 1.
    """
    # Flatten all observations and compute grand mean
    all_observations = np.concatenate(groups)
    grand_mean = np.mean(all_observations)
    
    # Calculate SS_total: sum of squared deviations from grand mean
    ss_total = np.sum((all_observations - grand_mean) ** 2)
    
    # Calculate SS_between: weighted sum of squared group mean deviations
    ss_between = 0
    for group in groups:
        group_mean = np.mean(group)
        n_group = len(group)
        ss_between += n_group * (group_mean - grand_mean) ** 2
    
    # Eta squared is the ratio
    eta_squared = ss_between / ss_total
    
    return eta_squared


# Example usage
group_a = np.array([23, 25, 27, 22, 24])
group_b = np.array([31, 33, 29, 35, 32])
group_c = np.array([28, 26, 30, 27, 29])

eta_sq = calculate_eta_squared([group_a, group_b, group_c])
print(f"Eta squared: {eta_sq:.4f}")
# Output: Eta squared: 0.7725

This result indicates a large effect—group membership explains about 77% of the variance in our outcome variable. The groups are substantially different from each other.

Using scipy.stats with ANOVA Output

SciPy’s f_oneway function runs the ANOVA but doesn’t directly report effect sizes. However, you can build a wrapper that calculates eta squared alongside the standard ANOVA output:

from scipy import stats
import numpy as np

def anova_with_eta_squared(*groups):
    """
    Perform one-way ANOVA and return F-statistic, p-value, and eta squared.
    
    Parameters
    ----------
    *groups : array-like
        Variable number of group arrays.
    
    Returns
    -------
    dict
        Dictionary containing F-statistic, p-value, and eta squared.
    """
    # Run the ANOVA
    f_statistic, p_value = stats.f_oneway(*groups)
    
    # Calculate eta squared manually (scipy doesn't provide SS directly)
    all_observations = np.concatenate(groups)
    grand_mean = np.mean(all_observations)
    
    ss_total = np.sum((all_observations - grand_mean) ** 2)
    
    ss_between = sum(
        len(group) * (np.mean(group) - grand_mean) ** 2 
        for group in groups
    )
    
    eta_squared = ss_between / ss_total
    
    return {
        'F_statistic': f_statistic,
        'p_value': p_value,
        'eta_squared': eta_squared,
        'effect_size': interpret_eta_squared(eta_squared)
    }


def interpret_eta_squared(eta_sq):
    """Return qualitative interpretation of eta squared."""
    if eta_sq < 0.01:
        return 'negligible'
    elif eta_sq < 0.06:
        return 'small'
    elif eta_sq < 0.14:
        return 'medium'
    else:
        return 'large'


# Example with treatment groups
control = np.array([4.2, 3.8, 4.5, 4.1, 3.9, 4.3])
treatment_a = np.array([5.1, 5.4, 4.9, 5.2, 5.0, 5.3])
treatment_b = np.array([6.2, 5.8, 6.0, 6.1, 5.9, 6.3])

results = anova_with_eta_squared(control, treatment_a, treatment_b)
print(f"F(2, 15) = {results['F_statistic']:.2f}, p = {results['p_value']:.4f}")
print(f"η² = {results['eta_squared']:.3f} ({results['effect_size']} effect)")

This wrapper gives you everything you need for reporting in one function call.
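You don't even need the raw data: because F = (SS_between / df_between) / (SS_within / df_within), eta squared can be recovered algebraically from a published F-statistic and its degrees of freedom. A small helper (the function name is mine, not part of SciPy):

```python
def eta_squared_from_f(f_stat, df_between, df_within):
    """Recover eta squared from an F-statistic and its degrees of freedom.

    Rearranges F = (SS_between / df_between) / (SS_within / df_within)
    using SS_total = SS_between + SS_within.
    """
    return (f_stat * df_between) / (f_stat * df_between + df_within)


# Example: F(2, 15) = 12.0 implies eta squared of 24 / 39, about 0.615
print(eta_squared_from_f(12.0, 2, 15))
```

This is handy for meta-analysis or for adding effect sizes to results reported without them.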

Calculating Eta Squared with pingouin

The pingouin library is purpose-built for statistical analysis and includes effect sizes by default. It’s the most convenient option for routine analyses:

import pingouin as pg
import pandas as pd
import numpy as np

# Create sample data in long format (required by pingouin)
np.random.seed(42)
data = pd.DataFrame({
    'score': np.concatenate([
        np.random.normal(50, 10, 30),  # Group A
        np.random.normal(55, 10, 30),  # Group B
        np.random.normal(62, 10, 30)   # Group C
    ]),
    'group': ['A'] * 30 + ['B'] * 30 + ['C'] * 30
})

# Run ANOVA with effect size
anova_results = pg.anova(data=data, dv='score', between='group', detailed=True)
print(anova_results)

# Extract eta squared directly
eta_squared = anova_results['np2'].iloc[0]  # 'np2' equals eta squared in one-way designs
print(f"\nEta squared: {eta_squared:.4f}")

# Verify with manual calculation
groups = [data[data['group'] == g]['score'].values for g in ['A', 'B', 'C']]
manual_eta = calculate_eta_squared(groups)
print(f"Manual calculation: {manual_eta:.4f}")

Pingouin reports np2 (partial eta squared), which equals standard eta squared in one-way designs. The library also provides confidence intervals and handles more complex designs automatically.

For a cleaner output focused on effect size:

# Quick effect size extraction
result = pg.anova(data=data, dv='score', between='group')
print(f"Effect size (η²): {result['np2'].values[0]:.3f}")
print(f"F-statistic: {result['F'].values[0]:.2f}")
print(f"p-value: {result['p-unc'].values[0]:.4f}")

Partial Eta Squared for Factorial Designs

When you have multiple factors, standard eta squared becomes problematic. Each factor’s eta squared is calculated against the same total SS, so the values don’t partition cleanly and can be misleading.

Partial eta squared solves this by comparing each factor’s SS to the sum of its SS and the error SS:

partial η² = SS_effect / (SS_effect + SS_error)

Here’s how to calculate it for a two-way ANOVA using statsmodels:

import statsmodels.api as sm
from statsmodels.formula.api import ols
import pandas as pd
import numpy as np

# Create factorial design data
np.random.seed(123)
n_per_cell = 20

data = pd.DataFrame({
    'outcome': np.concatenate([
        np.random.normal(10, 2, n_per_cell),  # A1, B1
        np.random.normal(12, 2, n_per_cell),  # A1, B2
        np.random.normal(11, 2, n_per_cell),  # A2, B1
        np.random.normal(18, 2, n_per_cell),  # A2, B2 (interaction effect)
    ]),
    'factor_a': ['A1'] * 2 * n_per_cell + ['A2'] * 2 * n_per_cell,
    'factor_b': (['B1'] * n_per_cell + ['B2'] * n_per_cell) * 2
})

# Fit the two-way ANOVA model
model = ols('outcome ~ C(factor_a) * C(factor_b)', data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)

print("ANOVA Table:")
print(anova_table)
print()

# Calculate partial eta squared for each effect
def calculate_partial_eta_squared(anova_table):
    """Extract partial eta squared from statsmodels ANOVA table."""
    results = {}
    ss_residual = anova_table.loc['Residual', 'sum_sq']
    
    for effect in anova_table.index:
        if effect != 'Residual':
            ss_effect = anova_table.loc[effect, 'sum_sq']
            partial_eta_sq = ss_effect / (ss_effect + ss_residual)
            results[effect] = partial_eta_sq
    
    return results


partial_eta_values = calculate_partial_eta_squared(anova_table)

print("Partial Eta Squared Values:")
for effect, value in partial_eta_values.items():
    interpretation = interpret_eta_squared(value)
    print(f"  {effect}: {value:.4f} ({interpretation})")

This approach correctly isolates each factor’s contribution. The interaction term often reveals effects that main effects alone would miss.
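To make the difference between the two measures concrete, here's a toy comparison with made-up sums of squares (the numbers are illustrative, not taken from the model above):

```python
# Hypothetical sums of squares from a two-way ANOVA (illustrative values)
ss_a, ss_b, ss_ab, ss_error = 120.0, 80.0, 40.0, 160.0
ss_total = ss_a + ss_b + ss_ab + ss_error  # 400.0

# Standard eta squared: factor A is judged against ALL variance,
# including variance explained by B and the interaction
eta_a = ss_a / ss_total                    # 0.30

# Partial eta squared: factor A is judged only against itself plus error
partial_eta_a = ss_a / (ss_a + ss_error)   # ~0.43

print(f"eta squared: {eta_a:.2f}, partial eta squared: {partial_eta_a:.2f}")
```

Adding more factors inflates SS_total and drags standard eta squared down even when factor A's own effect is unchanged; partial eta squared stays put, which is why it's the conventional report for factorial designs.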

Practical Considerations and Reporting

Bias in Eta Squared

Eta squared is a biased estimator—it tends to overestimate the population effect size, especially with small samples. Omega squared (ω²) provides a less biased alternative:

def calculate_omega_squared(groups):
    """
    Calculate omega squared (less biased than eta squared).
    """
    all_obs = np.concatenate(groups)
    grand_mean = np.mean(all_obs)
    n_total = len(all_obs)
    k = len(groups)  # number of groups
    
    ss_total = np.sum((all_obs - grand_mean) ** 2)
    ss_between = sum(
        len(g) * (np.mean(g) - grand_mean) ** 2 
        for g in groups
    )
    
    # Calculate MS_within (mean square error)
    ss_within = ss_total - ss_between
    df_within = n_total - k
    ms_within = ss_within / df_within
    
    # Omega squared formula
    omega_squared = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)
    
    return max(0, omega_squared)  # Negative estimates are conventionally reported as 0


# Compare the two
eta = calculate_eta_squared([group_a, group_b, group_c])
omega = calculate_omega_squared([group_a, group_b, group_c])
print(f"Eta squared: {eta:.4f}")
print(f"Omega squared: {omega:.4f}")
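You can see the bias directly by simulating the null hypothesis: draw every group from the same distribution, so the true effect is zero, and average the two estimates over many replications (a quick sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 10  # groups, observations per group
etas, omegas = [], []

for _ in range(2000):
    # All groups share one distribution, so the true effect size is 0
    groups = [rng.normal(0, 1, n) for _ in range(k)]
    all_obs = np.concatenate(groups)
    grand_mean = all_obs.mean()
    ss_total = np.sum((all_obs - grand_mean) ** 2)
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ms_within = (ss_total - ss_between) / (k * n - k)
    etas.append(ss_between / ss_total)
    omegas.append(max(0, (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)))

print(f"mean eta squared under the null:   {np.mean(etas):.3f}")
print(f"mean omega squared under the null: {np.mean(omegas):.3f}")
```

Even with no real effect, eta squared averages roughly (k − 1)/(N − 1) ≈ 0.07 here, while omega squared sits much closer to zero.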

APA-Style Reporting

When reporting eta squared in publications, follow this format:

“There was a significant effect of treatment on performance, F(2, 45) = 12.34, p < .001, η² = .35, 95% CI [.18, .48].”

For partial eta squared in factorial designs:

“The main effect of factor A was significant, F(1, 76) = 8.92, p = .004, η²ₚ = .11.”
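A small formatting helper keeps these strings consistent across a report (the function name and exact formatting choices are mine, not an established API):

```python
def format_anova_apa(f_stat, df_between, df_within, p_value, eta_sq, partial=False):
    """Build an APA-style ANOVA results string (illustrative helper)."""
    symbol = "η²ₚ" if partial else "η²"
    # APA drops the leading zero for statistics that cannot exceed 1
    p_text = "p < .001" if p_value < 0.001 else f"p = {p_value:.3f}".replace("0.", ".", 1)
    eta_text = f"{eta_sq:.2f}".replace("0.", ".", 1)
    return f"F({df_between}, {df_within}) = {f_stat:.2f}, {p_text}, {symbol} = {eta_text}"


print(format_anova_apa(8.92, 1, 76, 0.004, 0.11, partial=True))
# F(1, 76) = 8.92, p = .004, η²ₚ = .11
```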

Confidence Intervals

Pingouin can calculate confidence intervals for effect sizes, which you should report when possible:

# Get confidence intervals with pingouin
result = pg.anova(data=data, dv='score', between='group')
# For more detailed CI, use pg.compute_effsize_from_t or bootstrap methods
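One bootstrap approach can be sketched with plain NumPy (the function below is my own illustration, not part of pingouin): resample within each group with replacement, recompute eta squared each time, and take percentile bounds.

```python
import numpy as np

def bootstrap_eta_squared_ci(groups, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for eta squared (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    def eta_sq(gs):
        all_obs = np.concatenate(gs)
        grand_mean = all_obs.mean()
        ss_total = np.sum((all_obs - grand_mean) ** 2)
        ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in gs)
        return ss_between / ss_total

    # Resample within each group to preserve the original group sizes
    boot = np.empty(n_boot)
    for i in range(n_boot):
        boot[i] = eta_sq([rng.choice(g, size=len(g), replace=True) for g in groups])

    lower, upper = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return eta_sq(groups), (lower, upper)


point, (lo, hi) = bootstrap_eta_squared_ci([
    np.array([4.2, 3.8, 4.5, 4.1, 3.9, 4.3]),
    np.array([5.1, 5.4, 4.9, 5.2, 5.0, 5.3]),
    np.array([6.2, 5.8, 6.0, 6.1, 5.9, 6.3]),
])
print(f"eta squared = {point:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Bootstrap intervals for variance-explained measures can be unstable with very small groups, so treat them as rough with samples this size.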

The key takeaway: always report effect sizes alongside p-values. A statistically significant result with η² = 0.02 tells a very different story than one with η² = 0.45. The former might be real but trivial; the latter represents a substantial, practically meaningful difference.
