How to Perform a Paired T-Test in Excel

Key Insights

The paired t-test compares two related measurements from the same subjects, making it ideal for before/after studies, matched experiments, and repeated measures designs where observations are naturally linked.
Excel offers two approaches: the T.TEST() function for quick p-value calculations, and the Data Analysis ToolPak for fuller output including the t-statistic, degrees of freedom, and critical values.
Always verify the paired t-test assumptions—normality of differences, continuous data, and true pairing—before interpreting results, as violations can lead to misleading conclusions.

Introduction to Paired T-Tests

The paired t-test (also called the dependent samples t-test) determines whether the mean difference between two sets of related observations is statistically significant. Unlike the independent t-test that compares two separate groups, the paired version accounts for the inherent correlation between matched measurements.

You should reach for a paired t-test when your data involves:

Before/after measurements: Testing blood pressure before and after medication
Matched subjects: Comparing twins or matched pairs on some outcome
Repeated measures: Same participants tested under two conditions
Self-pairing: Left eye vs. right eye measurements on the same patients

The test relies on three key assumptions. First, the differences between paired observations should follow a normal distribution (though the test is robust to moderate violations with larger samples). Second, the data must be continuous or at least interval-scale. Third, each observation in one group must have exactly one corresponding observation in the other group—the pairing must be meaningful and intentional.

Preparing Your Data in Excel

Proper data structure prevents errors and simplifies analysis. Arrange your paired data in two adjacent columns with each row representing one subject or matched pair.

Here’s how your spreadsheet should look for a study measuring systolic blood pressure before and after a 12-week exercise program:

     A              B              C
1    Subject_ID     BP_Before      BP_After
2    1              142            135
3    2              138            132
4    3              156            148
5    4              147            140
6    5              139            136
7    6              151            142
8    7              144            138
9    8              160            151
10   9              135            130
11   10             148            141
12   11             153            145
13   12             141            137
14   13             158            149
15   14             145            139
16   15             150            143
17   16             137            133
18   17             162            152
19   18             149            142
20   19             143            138
21   20             155            147

Before running any analysis, check for missing values. A paired t-test requires complete pairs—if one measurement is missing, you must exclude the entire row. Use =COUNTBLANK(B2:B21) and =COUNTBLANK(C2:C21) to identify gaps. For this dataset, we have 20 complete pairs ready for analysis.
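
COUNTBLANK only catches empty cells, so a text entry such as "n/a" would slip through. As a minimal sketch, assuming the layout above (data in B2:C21) and using column E as a hypothetical spare helper column, the following formulas count the rows where both measurements are numeric and flag any incomplete pair:

=SUMPRODUCT(--ISNUMBER(B2:B21), --ISNUMBER(C2:C21))

E2: =IF(COUNT(B2:C2)=2, "OK", "incomplete")

The first formula should return 20 for this dataset. Copy the second down through E21 to see which rows, if any, would need to be excluded.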

Method 1: Using the T.TEST Function

The T.TEST() function provides the fastest route to a p-value. The syntax is:

=T.TEST(array1, array2, tails, type)

Each argument serves a specific purpose:

array1: First data range (e.g., before measurements)
array2: Second data range (e.g., after measurements)
tails: Use 1 for one-tailed test, 2 for two-tailed test
type: Use 1 for paired, 2 for equal variance independent, 3 for unequal variance independent

For our blood pressure example, enter this formula in any empty cell:

=T.TEST(B2:B21, C2:C21, 2, 1)

This returns approximately 1.79E-12 (Excel displays values this small in scientific notation). The extremely small p-value is strong evidence that mean systolic blood pressure differed before and after the exercise program.

For a one-tailed test where you specifically hypothesize that the “after” measurements will be lower:

=T.TEST(B2:B21, C2:C21, 1, 1)

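This returns half the two-tailed p-value, approximately 8.93E-13. The halved value is only meaningful when the observed difference points in the hypothesized direction, as it does here.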
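
One caveat is worth checking. Microsoft's documentation describes T.TEST as computing a non-negative t-statistic, so the one-tailed result appears to be the smaller tail whichever way the data actually moved. A hedged sketch of a direction-aware version, for the hypothesis that the "after" values are lower (so the mean of column B should exceed the mean of column C):

=IF(AVERAGE(B2:B21)>AVERAGE(C2:C21), T.TEST(B2:B21, C2:C21, 1, 1), 1-T.TEST(B2:B21, C2:C21, 1, 1))

For the blood pressure data the means do fall in the hypothesized direction, so this returns the same value as the plain one-tailed formula. If the means had moved the other way, it would return a value close to 1 instead of a misleadingly small one.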

The T.TEST() function is convenient but limited: it outputs only the p-value. You won’t get the t-statistic, degrees of freedom, or critical values. For those, use the Data Analysis ToolPak. Neither method reports a confidence interval for the mean difference, so that has to be calculated separately.

Method 2: Using the Data Analysis ToolPak

The ToolPak add-in produces a full results table in one step. First, verify it’s enabled:

1. Click File → Options → Add-ins
2. In the Manage dropdown, select Excel Add-ins and click Go
3. Check Analysis ToolPak and click OK

Now run the paired t-test:

1. Navigate to Data tab → Data Analysis (in the Analysis group)
2. Select t-Test: Paired Two Sample for Means → Click OK
3. Configure the dialog:
   - Variable 1 Range: $B$1:$B$21 (include header)
   - Variable 2 Range: $C$1:$C$21 (include header)
   - Hypothesized Mean Difference: 0 (testing if means are equal)
   - Labels: Check this box (since we included headers)
   - Alpha: 0.05 (standard significance level)
   - Output Range: Select where you want results
4. Click OK to generate the output table.

Interpreting Your Results

The ToolPak produces a comprehensive output table. Here’s what our blood pressure analysis returns:

t-Test: Paired Two Sample for Means

                          BP_Before    BP_After
Mean                      147.65       140.90
Variance                  62.98        39.88
Observations              20           20
Pearson Correlation       0.9906
Hypothesized Mean Diff    0
df                        19
t Stat                    15.9860
P(T<=t) one-tail          8.93E-13
t Critical one-tail       1.7291
P(T<=t) two-tail          1.79E-12
t Critical two-tail       2.0930

Let’s decode each component:

Mean: The before group averaged 147.65 mmHg; the after group averaged 140.90 mmHg. The mean difference is 6.75 mmHg.

Pearson Correlation: At 0.9906, the before and after measurements are highly correlated: subjects who started high tended to end high. This is the situation in which pairing pays off, because the variation between subjects cancels out of the differences.

df (degrees of freedom): For paired t-tests, df = n - 1 = 19, where n is the number of pairs.

t Stat: The calculated t-statistic of 15.99 measures how many standard errors the mean difference is from zero.

P(T<=t) two-tail: This is your key result. At 1.79E-12, it’s far below the 0.05 threshold, so you reject the null hypothesis of no mean difference.

t Critical two-tail: The threshold value (2.093) that the absolute value of t Stat must exceed for significance at α = 0.05. Since 15.99 > 2.093, the result is significant.

Statistical conclusion: Systolic blood pressure was significantly lower after the exercise program (t(19) = 15.99, p < 0.001), with an average decrease of 6.75 mmHg.
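
Most rows of the ToolPak table can be reproduced with worksheet functions, which is a useful sanity check on the dialog settings. The following is a sketch, assuming the same B2:C21 layout, α = 0.05, and 19 degrees of freedom; your_t_statistic is a placeholder for the t Stat value, not a real cell name:

Variance (BP_Before): =VAR.S(B2:B21)
Variance (BP_After): =VAR.S(C2:C21)
Pearson Correlation: =CORREL(B2:B21, C2:C21)
t Critical one-tail: =T.INV(1-0.05, 19)
t Critical two-tail: =T.INV.2T(0.05, 19)
P(T<=t) one-tail: =T.DIST.RT(ABS(your_t_statistic), 19)
P(T<=t) two-tail: =T.DIST.2T(ABS(your_t_statistic), 19)

If any of these disagree with the ToolPak table, the usual suspects are a range that includes or omits the header row inconsistently, or a Labels box left unchecked.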

Manual Calculation Approach

Calculating the t-statistic manually helps verify Excel’s output and deepens understanding. The paired t-test works on the differences between paired observations.

First, calculate the difference for each pair in column D:

D2: =B2-C2

Copy this formula down through D21. For our data, differences range from 3 to 10 mmHg.

Next, calculate the mean difference:

=AVERAGE(D2:D21)

Result: 6.75

Calculate the standard deviation of differences:

=STDEV.S(D2:D21)

Result: approximately 1.89

Calculate the standard error:

=STDEV.S(D2:D21)/SQRT(COUNT(D2:D21))

Result: approximately 0.422

Finally, calculate the t-statistic:

=AVERAGE(D2:D21)/(STDEV.S(D2:D21)/SQRT(COUNT(D2:D21)))

Result: approximately 15.99

This matches the ToolPak output. To get the p-value from the t-statistic:

=T.DIST.2T(ABS(your_t_statistic), 19)
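
The same column of differences also yields a confidence interval for the mean difference, which neither T.TEST() nor the ToolPak's paired output reports. A minimal sketch, assuming the differences sit in D2:D21 and a 95% confidence level (α = 0.05):

p-value (two-tailed): =T.DIST.2T(ABS(AVERAGE(D2:D21)/(STDEV.S(D2:D21)/SQRT(COUNT(D2:D21)))), COUNT(D2:D21)-1)
Margin of error: =CONFIDENCE.T(0.05, STDEV.S(D2:D21), COUNT(D2:D21))
Lower bound: =AVERAGE(D2:D21)-CONFIDENCE.T(0.05, STDEV.S(D2:D21), COUNT(D2:D21))
Upper bound: =AVERAGE(D2:D21)+CONFIDENCE.T(0.05, STDEV.S(D2:D21), COUNT(D2:D21))

CONFIDENCE.T multiplies the two-tailed critical t value (n - 1 degrees of freedom) by the standard error. For the blood pressure data the margin is about 0.88, giving an interval of roughly 5.87 to 7.63 mmHg for the mean reduction.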

Common Pitfalls and Best Practices

Assumption violations matter. The normality assumption applies to the differences, not the original variables. Check it with a histogram of column D or a Shapiro-Wilk test (available in specialized add-ins). With roughly 30 or more pairs, the Central Limit Theorem makes the test reasonably robust to moderate non-normality; treat that as a rule of thumb, not a guarantee.
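
Excel has no built-in normality test, but worksheet functions allow a rough screen. The following is an informal sketch, not a formal test, assuming the differences are in D2:D21 and using column G as a hypothetical spare helper column:

Skewness: =SKEW(D2:D21)
Excess kurtosis: =KURT(D2:D21)
G2: =NORM.S.INV((RANK.AVG(D2, $D$2:$D$21, 1)-0.5)/COUNT($D$2:$D$21))

Skewness and excess kurtosis near zero are consistent with a normal shape. Copy G2 down through G21 and scatter-plot column D against column G: points falling close to a straight line suggest the differences are approximately normal, while a pronounced curve or a few points far off the line suggest otherwise.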

Outliers distort results. A single extreme difference can inflate or deflate the t-statistic. Examine your differences for values more than 3 standard deviations from the mean. Consider whether outliers represent data errors, measurement problems, or genuine extreme responses.
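
The 3-standard-deviation screen can be sketched with two formulas, assuming the differences are in D2:D21 and using column F as a hypothetical spare helper column for the standardized values:

F2: =(D2-AVERAGE($D$2:$D$21))/STDEV.S($D$2:$D$21)

=SUMPRODUCT(--(ABS(D2:D21-AVERAGE(D2:D21))>3*STDEV.S(D2:D21)))

Copy F2 down through F21 to see each standardized difference; the second formula counts how many exceed 3 in absolute value. For the blood pressure data the count is 0: the most extreme difference (3 mmHg) sits about 2 standard deviations below the mean.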

Sample size affects power. With fewer than 10 pairs, you may lack statistical power to detect real effects. Use power analysis before data collection. As a rough guide, detecting a medium standardized effect (d = 0.5) with 80% power at α = 0.05 takes about 34 pairs; with 20 pairs, power for that effect falls to roughly 55 to 60 percent.
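
Excel has no dedicated power function, but a normal approximation gives a quick planning estimate. This is a sketch under stated assumptions: a two-tailed test at α = 0.05, a target power of 0.80, and an expected standardized effect of 0.5 (mean difference divided by the standard deviation of the differences). All three numbers are illustrative and should be replaced with your own.

=ROUNDUP(((NORM.S.INV(1-0.05/2)+NORM.S.INV(0.8))/0.5)^2, 0)

This returns 32 pairs. Because it uses the normal distribution in place of the t distribution, it is slightly optimistic for small samples; dedicated power software will typically ask for a couple more pairs.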

Pairing must be intentional. Don’t force pairing on unrelated data. If subjects weren’t actually matched or measured twice, use an independent t-test instead. Inappropriate pairing violates the test’s mathematical foundation.

Consider alternatives when assumptions fail. If differences are severely non-normal and your sample is small, the Wilcoxon signed-rank test offers a non-parametric alternative. It works on the ranks of the differences, so it doesn’t assume normality, though it does assume the differences are roughly symmetric around their median. Excel doesn’t include it natively, but you can implement it with formulas or use add-ins.
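
A formula-based version of the signed-rank test might look like the sketch below. It assumes the differences are in D2:D21 with no zero differences, uses columns H and I as hypothetical spare helper columns, and relies on the large-sample normal approximation without tie or continuity corrections, so treat the p-value as approximate. W_plus, W_minus, n, and z are placeholders for the computed values, not real cell names.

H2: =ABS(D2)
I2: =RANK.AVG(H2, $H$2:$H$21, 1)
W_plus: =SUMIF(D2:D21, ">0", I2:I21)
W_minus: =SUMIF(D2:D21, "<0", I2:I21)
z: =(MIN(W_plus, W_minus)-n*(n+1)/4)/SQRT(n*(n+1)*(2*n+1)/24)
p-value (two-tailed): =2*NORM.S.DIST(z, TRUE)

Copy H2 and I2 down through row 21 before computing the sums. In the blood pressure data every difference is positive, so W_plus is 210 (the sum of ranks 1 through 20) and W_minus is 0. For small samples or heavily tied data, an exact table or a statistics add-in is the safer route.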

Report effect sizes alongside p-values. A significant p-value tells you the effect isn’t zero—it doesn’t tell you if the effect matters practically. Calculate Cohen’s d for paired samples:

=AVERAGE(D2:D21)/STDEV.S(D2:D21)

For our data, d ≈ 3.57, indicating a very large effect. Cohen’s conventional benchmarks are 0.2 (small), 0.5 (medium), and 0.8 (large).
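
The formula above standardizes by the standard deviation of the differences (a variant often written d_z). That denominator shrinks as the before/after correlation rises, so d can look enormous when the pairs are tightly correlated, as they are here. One alternative, offered as an option and not as the method this analysis requires, divides the mean difference by the average of the two columns' standard deviations:

=AVERAGE(D2:D21)/AVERAGE(STDEV.S(B2:B21), STDEV.S(C2:C21))

For the blood pressure data this gives roughly 0.95, still large by the usual benchmarks but much closer to the scale those benchmarks were designed for. Whichever version you report, state which one it is.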

The paired t-test remains one of the most useful statistical tools for comparing related measurements. Excel makes execution straightforward—your job is ensuring the test fits your data and interpreting results thoughtfully.