T Distribution in Python: Complete Guide
Key Insights
- The t-distribution is your go-to tool when working with small samples (n < 30) or when population variance is unknown—which is almost always in real-world scenarios.
- Degrees of freedom directly control the distribution’s shape: lower df means heavier tails and more uncertainty; as df approaches infinity, the t-distribution converges to the normal distribution.
- SciPy’s scipy.stats.t module provides everything you need for t-distribution work, from probability calculations to hypothesis testing, but understanding the underlying assumptions is critical for valid results.
Introduction to the T Distribution
The t-distribution, also called Student’s t-distribution, exists because of a fundamental problem in statistics: we rarely know the true population variance. When William Sealy Gosset developed it in 1908 while working at Guinness Brewery, he solved a practical problem that still matters today.
Use the t-distribution when:
- Your sample size is small (typically n < 30)
- Population standard deviation is unknown (you’re estimating from sample data)
- Your data is approximately normally distributed
The key difference from the normal distribution is the heavier tails. This accounts for the additional uncertainty introduced by estimating variance from limited data. The shape is controlled by degrees of freedom (df), typically calculated as n - 1 for a single sample.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
# Compare t-distribution to normal at different df values
x = np.linspace(-4, 4, 1000)
normal = stats.norm.pdf(x)
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, normal, 'k-', linewidth=2, label='Normal (z)')
for df in [1, 3, 10, 30]:
    t_dist = stats.t.pdf(x, df)
    ax.plot(x, t_dist, '--', linewidth=1.5, label=f't (df={df})')
ax.set_xlabel('x')
ax.set_ylabel('Probability Density')
ax.set_title('T-Distribution vs Normal Distribution')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Notice how the t-distribution with df=1 has dramatically heavier tails. By df=30, it’s nearly indistinguishable from the normal distribution. This visual intuition matters when interpreting results.
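The heavier tails can also be quantified directly rather than eyeballed. A quick check using sf (the survival function, 1 − CDF) shows how much probability mass sits beyond 3 for each distribution:

```python
from scipy import stats

# P(X > 3): tail mass beyond 3 shrinks as df grows
for df in [1, 3, 10, 30]:
    print(f"t (df={df:>2}): P(X > 3) = {stats.t.sf(3, df):.4f}")
print(f"normal    : P(X > 3) = {stats.norm.sf(3):.4f}")
```

At df=1 roughly 10% of the mass lies beyond 3, versus about 0.1% for the normal, which is exactly why extreme t-statistics are less surprising at low degrees of freedom.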
Working with T Distributions in SciPy
SciPy’s scipy.stats.t object is your primary tool for t-distribution calculations. It follows the same interface as other continuous distributions in SciPy, making it easy to learn once and apply everywhere.
The three key parameters are:
- df: Degrees of freedom (required)
- loc: Location parameter, shifts the distribution (default: 0)
- scale: Scale parameter, stretches the distribution (default: 1)
from scipy import stats
import numpy as np
# Create a t-distribution with 10 degrees of freedom
t_dist = stats.t(df=10)
# Probability Density Function (PDF) - height of curve at a point
print(f"PDF at x=0: {t_dist.pdf(0):.4f}")
print(f"PDF at x=2: {t_dist.pdf(2):.4f}")
# Cumulative Distribution Function (CDF) - P(X <= x)
print(f"\nP(X <= 0): {t_dist.cdf(0):.4f}")
print(f"P(X <= 2): {t_dist.cdf(2):.4f}")
# Percent Point Function (PPF) - inverse of CDF
# Critical values for two-tailed test at alpha=0.05
alpha = 0.05
critical_value = t_dist.ppf(1 - alpha/2)
print(f"\nCritical value (two-tailed, α=0.05): ±{critical_value:.4f}")
# Generate random samples
np.random.seed(42)
samples = t_dist.rvs(size=1000)
print(f"\nSample mean: {samples.mean():.4f}")
print(f"Sample std: {samples.std():.4f}")
# Calculate probability between two values
prob_between = t_dist.cdf(1.5) - t_dist.cdf(-1.5)
print(f"\nP(-1.5 < X < 1.5): {prob_between:.4f}")
The ppf method is particularly useful for finding critical values in hypothesis testing. For a two-tailed test at α=0.05 with df=10, you’d reject the null hypothesis if your t-statistic exceeds approximately ±2.228.
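You can confirm the ±2.228 figure directly from ppf, and compare it to the one-tailed cutoff at the same α:

```python
from scipy import stats

# Two-tailed critical value at alpha = 0.05 with df = 10:
# split alpha across both tails, so look up the 97.5th percentile
t_crit_two = stats.t.ppf(0.975, df=10)

# One-tailed cutoff puts all of alpha in one tail (95th percentile)
t_crit_one = stats.t.ppf(0.95, df=10)

print(f"Two-tailed: ±{t_crit_two:.3f}")
print(f"One-tailed:  {t_crit_one:.3f}")
```

The one-tailed cutoff is smaller (about 1.812), which is why choosing between one- and two-tailed tests must happen before you see the data.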
One-Sample and Two-Sample T-Tests
Hypothesis testing with t-tests follows a straightforward pattern: calculate a test statistic, compare it to a critical value or p-value, and make a decision.
The one-sample t-test answers: “Does this sample come from a population with a specific mean?”
from scipy import stats
import numpy as np
# Scenario: A manufacturer claims widgets weigh 50 grams on average
# We sample 15 widgets and want to test this claim
np.random.seed(42)
widget_weights = np.array([51.2, 49.8, 52.1, 50.5, 48.9,
                           51.8, 50.2, 49.5, 52.3, 51.0,
                           50.8, 49.2, 51.5, 50.1, 52.0])
claimed_mean = 50.0
# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(widget_weights, claimed_mean)
print("One-Sample T-Test Results")
print("=" * 40)
print(f"Sample size: {len(widget_weights)}")
print(f"Sample mean: {widget_weights.mean():.3f}")
print(f"Sample std: {widget_weights.std(ddof=1):.3f}")
print(f"Claimed population mean: {claimed_mean}")
print(f"\nT-statistic: {t_stat:.4f}")
print(f"P-value (two-tailed): {p_value:.4f}")
alpha = 0.05
if p_value < alpha:
    print(f"\nResult: Reject H₀ at α={alpha}")
    print("Evidence suggests mean differs from claimed value.")
else:
    print(f"\nResult: Fail to reject H₀ at α={alpha}")
    print("Insufficient evidence to reject the claimed mean.")
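Under the hood, ttest_1samp computes t = (x̄ − μ₀) / (s / √n) and a two-tailed p-value from the t-distribution with n − 1 degrees of freedom. Recomputing it by hand (a sanity check, not an alternative API) reproduces SciPy's result exactly:

```python
import numpy as np
from scipy import stats

widget_weights = np.array([51.2, 49.8, 52.1, 50.5, 48.9,
                           51.8, 50.2, 49.5, 52.3, 51.0,
                           50.8, 49.2, 51.5, 50.1, 52.0])
claimed_mean = 50.0

n = len(widget_weights)
se = widget_weights.std(ddof=1) / np.sqrt(n)        # standard error of the mean
t_manual = (widget_weights.mean() - claimed_mean) / se
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)  # two-tailed p-value

t_scipy, p_scipy = stats.ttest_1samp(widget_weights, claimed_mean)
print(f"Manual: t={t_manual:.4f}, p={p_manual:.4f}")
print(f"SciPy:  t={t_scipy:.4f}, p={p_scipy:.4f}")
```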
For comparing two independent groups, use the two-sample t-test:
# Scenario: Compare test scores between two teaching methods
np.random.seed(42)
method_a = np.array([78, 82, 85, 79, 88, 91, 76, 84, 80, 87])
method_b = np.array([72, 75, 80, 71, 78, 74, 69, 77, 73, 76])
# Two-sample t-test (assuming equal variances)
t_stat, p_value = stats.ttest_ind(method_a, method_b)
print("Two-Sample T-Test Results")
print("=" * 40)
print(f"Method A: mean={method_a.mean():.2f}, std={method_a.std(ddof=1):.2f}")
print(f"Method B: mean={method_b.mean():.2f}, std={method_b.std(ddof=1):.2f}")
print(f"\nT-statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4f}")
# Welch's t-test (doesn't assume equal variances) - generally preferred
t_stat_welch, p_value_welch = stats.ttest_ind(method_a, method_b, equal_var=False)
print(f"\nWelch's t-test p-value: {p_value_welch:.4f}")
Use equal_var=False for Welch’s t-test when you can’t assume equal variances between groups. This is the safer default choice.
Paired T-Tests for Dependent Samples
When observations are naturally paired—before/after measurements, matched subjects, or repeated measures—use the paired t-test. It’s more powerful than an independent t-test because it controls for individual variation.
from scipy import stats
import numpy as np
# Scenario: Blood pressure before and after medication for 12 patients
np.random.seed(42)
bp_before = np.array([142, 138, 150, 148, 135, 140, 155, 145, 139, 152, 147, 141])
bp_after = np.array([138, 135, 142, 140, 132, 136, 148, 138, 134, 145, 140, 137])
# Calculate differences
differences = bp_before - bp_after
print("Paired T-Test: Blood Pressure Reduction")
print("=" * 50)
print(f"Mean BP before: {bp_before.mean():.2f} mmHg")
print(f"Mean BP after: {bp_after.mean():.2f} mmHg")
print(f"Mean difference: {differences.mean():.2f} mmHg")
print(f"Std of differences: {differences.std(ddof=1):.2f}")
# Perform paired t-test
t_stat, p_value = stats.ttest_rel(bp_before, bp_after)
print(f"\nT-statistic: {t_stat:.4f}")
print(f"P-value (two-tailed): {p_value:.6f}")
# One-tailed test (we expect reduction, so before > after)
p_value_one_tailed = p_value / 2 if t_stat > 0 else 1 - p_value / 2
print(f"P-value (one-tailed, reduction): {p_value_one_tailed:.6f}")
if p_value < 0.05:
    print("\nConclusion: Significant difference in blood pressure.")
    print(f"The medication reduced BP by an average of {differences.mean():.1f} mmHg.")
The paired t-test is mathematically equivalent to a one-sample t-test on the differences. This is the source of its power: differencing removes each subject's baseline variation, so instead of comparing two noisy groups, you're testing only whether the mean difference is zero.
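That equivalence is easy to verify: running ttest_1samp on the differences with a hypothesized mean of 0 reproduces ttest_rel exactly.

```python
import numpy as np
from scipy import stats

bp_before = np.array([142, 138, 150, 148, 135, 140, 155, 145, 139, 152, 147, 141])
bp_after  = np.array([138, 135, 142, 140, 132, 136, 148, 138, 134, 145, 140, 137])

# Paired test vs. one-sample test on the differences
t_rel, p_rel = stats.ttest_rel(bp_before, bp_after)
t_1s,  p_1s  = stats.ttest_1samp(bp_before - bp_after, 0.0)

print(f"ttest_rel:   t={t_rel:.6f}, p={p_rel:.6f}")
print(f"ttest_1samp: t={t_1s:.6f}, p={p_1s:.6f}")
```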
Confidence Intervals Using T Distribution
Confidence intervals quantify uncertainty around your estimate. For small samples with unknown population variance, use the t-distribution.
from scipy import stats
import numpy as np
def t_confidence_interval(data, confidence=0.95):
    """Calculate confidence interval using t-distribution."""
    n = len(data)
    mean = np.mean(data)
    se = stats.sem(data)  # Standard error of the mean
    # Get t critical value
    df = n - 1
    t_crit = stats.t.ppf((1 + confidence) / 2, df)
    margin_of_error = t_crit * se
    return mean - margin_of_error, mean + margin_of_error
# Sample data: response times in milliseconds
np.random.seed(42)
response_times = np.array([245, 312, 278, 295, 267, 289, 301, 256, 284, 273])
# Manual calculation
ci_manual = t_confidence_interval(response_times, 0.95)
# Using scipy's built-in method
ci_scipy = stats.t.interval(
    confidence=0.95,
    df=len(response_times) - 1,
    loc=np.mean(response_times),
    scale=stats.sem(response_times)
)
print("95% Confidence Interval for Mean Response Time")
print("=" * 50)
print(f"Sample mean: {response_times.mean():.2f} ms")
print(f"Sample size: {len(response_times)}")
print(f"\nManual calculation: ({ci_manual[0]:.2f}, {ci_manual[1]:.2f})")
print(f"SciPy t.interval: ({ci_scipy[0]:.2f}, {ci_scipy[1]:.2f})")
# Compare to z-interval (what you'd get assuming known variance)
z_crit = stats.norm.ppf(0.975)
z_margin = z_crit * stats.sem(response_times)
ci_z = (response_times.mean() - z_margin, response_times.mean() + z_margin)
print(f"\nZ-interval (wrong for small samples): ({ci_z[0]:.2f}, {ci_z[1]:.2f})")
print(f"\nNote: T-interval is wider, reflecting greater uncertainty.")
The t-interval is always wider than the z-interval for small samples. This isn’t a bug—it correctly reflects the additional uncertainty from estimating variance.
T Distribution with Pandas and Real-World Data
In practice, you’ll work with DataFrames and need to check assumptions before running t-tests.
import pandas as pd
import numpy as np
from scipy import stats
# Create sample dataset
np.random.seed(42)
data = pd.DataFrame({
    'group': ['A'] * 25 + ['B'] * 25,
    'score': np.concatenate([
        np.random.normal(75, 10, 25),
        np.random.normal(82, 12, 25)
    ])
})
def check_assumptions_and_test(df, group_col, value_col, alpha=0.05):
    """Complete t-test workflow with assumption checking."""
    groups = df[group_col].unique()
    group_a = df[df[group_col] == groups[0]][value_col]
    group_b = df[df[group_col] == groups[1]][value_col]
    print("=" * 60)
    print("T-TEST ANALYSIS REPORT")
    print("=" * 60)
    # Descriptive statistics
    print("\n1. DESCRIPTIVE STATISTICS")
    print(df.groupby(group_col)[value_col].agg(['count', 'mean', 'std']))
    # Check normality (Shapiro-Wilk test)
    print("\n2. NORMALITY CHECK (Shapiro-Wilk)")
    for group in groups:
        group_data = df[df[group_col] == group][value_col]
        stat, p = stats.shapiro(group_data)
        result = "Normal" if p > alpha else "Non-normal"
        print(f"   {group}: W={stat:.4f}, p={p:.4f} → {result}")
    # Check equal variances (Levene's test)
    print("\n3. EQUAL VARIANCE CHECK (Levene's)")
    stat, p = stats.levene(group_a, group_b)
    equal_var = p > alpha
    result = "Equal" if equal_var else "Unequal"
    print(f"   W={stat:.4f}, p={p:.4f} → {result} variances")
    # Perform appropriate t-test
    print("\n4. T-TEST RESULTS")
    if equal_var:
        t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
        test_type = "Student's t-test"
    else:
        t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
        test_type = "Welch's t-test"
    print(f"   Test used: {test_type}")
    print(f"   T-statistic: {t_stat:.4f}")
    print(f"   P-value: {p_value:.4f}")
    # Effect size (Cohen's d)
    pooled_std = np.sqrt(((len(group_a) - 1) * group_a.std()**2 +
                          (len(group_b) - 1) * group_b.std()**2) /
                         (len(group_a) + len(group_b) - 2))
    cohens_d = (group_a.mean() - group_b.mean()) / pooled_std
    print("\n5. EFFECT SIZE")
    print(f"   Cohen's d: {cohens_d:.4f}")
    # Conclusion
    print("\n6. CONCLUSION")
    if p_value < alpha:
        print(f"   Reject H₀: Significant difference between groups (p < {alpha})")
    else:
        print(f"   Fail to reject H₀: No significant difference (p >= {alpha})")
    return {'t_stat': t_stat, 'p_value': p_value, 'cohens_d': cohens_d}
# Run the analysis
results = check_assumptions_and_test(data, 'group', 'score')
This workflow ensures you’re not blindly applying tests. The Shapiro-Wilk test checks normality, Levene’s test checks variance equality, and Cohen’s d provides practical significance beyond statistical significance.
Summary and Best Practices
Choose the t-distribution when:
- Sample size is small (n < 30) and population variance is unknown
- You’re constructing confidence intervals from sample data
- You’re comparing means between groups or against a known value
Common pitfalls to avoid:
- Ignoring normality: T-tests are robust to mild non-normality, but severely skewed data requires alternatives like the Mann-Whitney U test
- Using independent t-test for paired data: You’ll lose statistical power and may miss real effects
- Forgetting to check equal variance: Use Welch’s t-test by default; it’s nearly as powerful as Student’s t-test when variances are equal and much better when they’re not
- Confusing statistical and practical significance: A p-value of 0.001 means nothing if Cohen’s d is 0.1
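The last pitfall is easy to demonstrate with synthetic data (for illustration only): given a large enough sample, even a trivially small effect produces a tiny p-value, while Cohen's d correctly flags it as negligible.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Two groups whose means differ by only 0.1 standard deviations
a = rng.normal(0.0, 1.0, 50_000)
b = rng.normal(0.1, 1.0, 50_000)

t_stat, p_value = stats.ttest_ind(a, b, equal_var=False)
pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled

print(f"p-value:   {p_value:.2e}")   # vanishingly small: "significant"
print(f"Cohen's d: {cohens_d:.3f}")  # yet the effect is negligible
```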
The t-distribution is foundational to inferential statistics. Master it, understand its assumptions, and you’ll have a reliable tool for drawing conclusions from limited data.