How to Calculate R-Squared in Excel
Key Insights
- R-squared measures how well your independent variable(s) explain the variance in your dependent variable, ranging from 0 (no explanatory power) to 1 (perfect fit)
- Excel provides four distinct methods to calculate R-squared: the RSQ function for quick results, LINEST for comprehensive regression statistics, trendlines for visual analysis, and CORREL squared for correlation-based calculation
- A high R-squared doesn’t guarantee a good model—always examine residual plots and consider whether the relationship makes logical sense before drawing conclusions
Introduction to R-Squared
R-squared, also called the coefficient of determination, answers a fundamental question in regression analysis: how much of the variation in your dependent variable is explained by your independent variable(s)?
When you fit a regression line through your data, some points fall close to the line while others scatter further away. R-squared quantifies this fit as a proportion between 0 and 1. An R-squared of 0.85 means your model explains 85% of the variance in the outcome variable. The remaining 15% is unexplained variance—noise, missing variables, or inherent randomness.
This metric matters because it tells you whether your model has predictive power. If you’re building a sales forecast based on advertising spend, an R-squared of 0.92 suggests strong explanatory power. An R-squared of 0.15 tells you advertising spend alone doesn’t explain much about sales variation.
Excel offers multiple ways to calculate R-squared, each suited to different workflows. Let’s examine the underlying math first, then work through each method.
Understanding the R-Squared Formula
The mathematical definition of R-squared is:
R² = 1 - (SS_res / SS_tot)
Where:
- SS_res (Residual Sum of Squares) = Σ(yᵢ - ŷᵢ)² — the sum of squared differences between actual values and predicted values
- SS_tot (Total Sum of Squares) = Σ(yᵢ - ȳ)² — the sum of squared differences between actual values and the mean
Think of SS_tot as the total variance in your data. SS_res represents the variance your model fails to explain. The ratio SS_res/SS_tot gives you the proportion of unexplained variance. Subtracting from 1 flips this to give you the explained proportion.
Here’s how you’d calculate this manually in Excel with sample data:
| Row | A (X) | B (Y) | C (Predicted Y) | D (Residual²) | E ((Y-Mean)²) |
|-----|--------|--------|-----------------|------------------|------------------|
| 1 | X | Y | Y_hat | (Y-Y_hat)² | (Y-Y_mean)² |
| 2 | 1 | 2.1 | 2.2 | =($B2-C2)^2 | =($B2-$B$22)^2 |
| 3 | 2 | 4.3 | 4.0 | =($B3-C3)^2 | =($B3-$B$22)^2 |
| ... | ... | ... | ... | ... | ... |
| 21 | 20 | 38.5 | 38.0 | =($B21-C21)^2 | =($B21-$B$22)^2 |
| 22 | Mean Y | =AVERAGE(B2:B21) | | | |
| 23 | SS_res | =SUM(D2:D21) | | | |
| 24 | SS_tot | =SUM(E2:E21) | | | |
| 25 | R² | =1-(B23/B24) | | | |
This manual approach builds intuition but isn’t practical for regular use. Excel’s built-in functions handle this automatically.
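The worksheet logic above translates directly into code. Here is a minimal Python sketch of the same calculation, using only the three data rows visible in the table (so the result illustrates the mechanics rather than the full 20-row dataset):

```python
# Manual R-squared: 1 - SS_res/SS_tot, mirroring the worksheet columns.
# Only the three rows visible in the table above are used here.
y     = [2.1, 4.3, 38.5]   # actual values (column B)
y_hat = [2.2, 4.0, 38.0]   # model predictions (column C)

y_mean = sum(y) / len(y)                              # =AVERAGE(B2:B21)
ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))  # =SUM(D2:D21)
ss_tot = sum((a - y_mean) ** 2 for a in y)            # =SUM(E2:E21)

r_squared = 1 - ss_res / ss_tot                       # =1-(B23/B24)
print(round(r_squared, 4))  # → 0.9996
```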
Method 1: Using the RSQ Function
The RSQ function is the fastest path to R-squared in Excel. It takes two arguments: your known Y values (dependent variable) and known X values (independent variable).
=RSQ(known_y's, known_x's)
For a practical example, suppose you have advertising spend in column A (rows 2-20) and sales revenue in column B (rows 2-20):
=RSQ(B2:B20, A2:A20)
This returns the R-squared value directly. If the result is 0.847, your advertising spend explains 84.7% of the variance in sales revenue.
A few important notes on RSQ:
- The function works only for simple linear regression (one independent variable)
- Argument order doesn't change the result: R-squared is symmetric for a single predictor, so RSQ returns the same value regardless of which range you list first
- Empty cells and text values are ignored, but ensure your ranges have equal lengths
Here’s a complete working example:
| A (Ad Spend) | B (Sales) |
|--------------|-----------|
| 1000 | 15000 |
| 1500 | 18500 |
| 2000 | 22000 |
| 2500 | 24000 |
| 3000 | 28500 |
| 3500 | 31000 |
| 4000 | 35000 |
| 4500 | 38000 |
| 5000 | 42500 |
With the column headers in row 1, the data occupies rows 2-10:
Cell C2: =RSQ(B2:B10, A2:A10)
Result: 0.9965
An R-squared of 0.9965 indicates an exceptionally strong linear relationship: advertising spend explains nearly all sales variance in this dataset.
Method 2: Using the LINEST Function
LINEST provides comprehensive regression statistics beyond just R-squared. It’s an array function that returns slope, intercept, standard errors, R-squared, F-statistic, and more.
The syntax:
=LINEST(known_y's, known_x's, [const], [stats])
Set both optional arguments to TRUE to get the full statistics output:
=LINEST(B2:B20, A2:A20, TRUE, TRUE)
In modern Excel (Microsoft 365 and Excel 2021+), this spills automatically into a 5-row by 2-column array:
| Column 1 | Column 2 |
|-----------------|---------------|
| Slope | Intercept |
| Std Error Slope | Std Error Int |
| R-squared | Std Error Y |
| F-statistic | df (degrees of freedom) |
| SS_reg | SS_res |
To extract just R-squared, use INDEX:
=INDEX(LINEST(B2:B20, A2:A20, TRUE, TRUE), 3, 1)
This returns the value from row 3, column 1 of the LINEST output—the R-squared value.
In older Excel versions, you’ll need to enter LINEST as a legacy array formula. Select a 5x2 range, type the formula, and press Ctrl+Shift+Enter instead of just Enter.
LINEST becomes essential when you need multiple regression statistics simultaneously or when performing multiple regression with several independent variables.
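Outside Excel, LINEST's headline outputs (slope, intercept, R-squared) can be reproduced for comparison. A sketch, assuming NumPy is installed and reusing the ad spend data from Method 1:

```python
import numpy as np

x = np.array([1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000], dtype=float)
y = np.array([15000, 18500, 22000, 24000, 28500, 31000, 35000, 38000, 42500], dtype=float)

# np.polyfit with degree 1 fits y = slope*x + intercept (LINEST's first row).
slope, intercept = np.polyfit(x, y, 1)

# R-squared (LINEST row 3, column 1) from residual and total sums of squares.
y_hat     = slope * x + intercept
ss_res    = np.sum((y - y_hat) ** 2)
ss_tot    = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```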
Method 3: Using Excel’s Trendline Feature
For visual analysis, Excel’s trendline feature displays R-squared directly on your chart. This approach works well for presentations and exploratory analysis.
Step-by-step process:
- Select your X and Y data ranges
- Insert a Scatter chart (Insert → Charts → Scatter)
- Click on any data point in the chart to select the data series
- Right-click and select “Add Trendline”
- In the Format Trendline pane, choose “Linear” under Trendline Options
- Scroll down and check “Display R-squared value on chart”
The chart now shows the R² value near the trendline. You can also check “Display Equation on chart” to see the regression equation (y = mx + b format).
Chart Output Example (using the ad spend data from Method 1):
- Trendline equation: y = 6.7167x + 8127.8
- R² = 0.9965
This method has limitations:
- You can’t reference the displayed R² value in other formulas
- It only works for single-variable regression
- The displayed value rounds based on chart formatting
For reports where you need the visual representation alongside the statistic, this method excels. For calculations that feed into other analyses, use RSQ or LINEST instead.
Method 4: Manual Calculation with CORREL
For simple linear regression, R-squared equals the square of the Pearson correlation coefficient. This mathematical relationship provides another calculation path:
=CORREL(A2:A20, B2:B20)^2
The CORREL function returns a value between -1 and 1 indicating correlation strength and direction. Squaring it produces R-squared.
This equivalence only holds for simple linear regression (one independent variable). With multiple regression, you must use LINEST or manual SS calculations.
The CORREL approach is useful when you’re already working with correlation matrices or want to understand the relationship between correlation and explained variance. A correlation of 0.9 means R-squared of 0.81—the independent variable explains 81% of variance. A correlation of 0.5 gives R-squared of 0.25—only 25% explained variance.
| Correlation (r) | R-squared (r²) |
|-----------------|----------------|
| 0.99 | 0.98 |
| 0.90 | 0.81 |
| 0.70 | 0.49 |
| 0.50 | 0.25 |
| 0.30 | 0.09 |
This table illustrates why moderate correlations often disappoint in predictive power. A “decent” correlation of 0.5 explains only a quarter of variance.
Interpreting R-Squared Results
Calculating R-squared is straightforward. Interpreting it correctly requires more nuance.
General interpretation guidelines:
- 0.9 - 1.0: Excellent fit, but verify you’re not overfitting or measuring something trivial
- 0.7 - 0.9: Strong explanatory power, typical for well-specified models
- 0.5 - 0.7: Moderate fit, common in social sciences and business contexts
- 0.3 - 0.5: Weak but potentially meaningful relationship
- Below 0.3: Limited explanatory power, consider additional variables
These thresholds vary by field. Physics experiments might expect R-squared above 0.99. Marketing models might celebrate 0.4.
Critical limitations to understand:
R-squared always increases when you add variables, even useless ones. Use adjusted R-squared for multiple regression to penalize unnecessary complexity.
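Excel's LINEST doesn't return adjusted R-squared directly (the Analysis ToolPak's Regression output does), but the penalty formula is short enough to apply by hand. A Python sketch of how the adjustment works:

```python
# Adjusted R-squared penalizes extra predictors:
# R²_adj = 1 - (1 - R²) * (n - 1) / (n - k - 1)
def adjusted_r_squared(r_squared, n, k):
    """n = number of observations, k = number of independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# The same raw R-squared of 0.85 looks weaker as predictors pile up:
print(round(adjusted_r_squared(0.85, n=20, k=1), 4))  # → 0.8417
print(round(adjusted_r_squared(0.85, n=20, k=8), 4))  # → 0.7409
```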
High R-squared doesn’t prove causation. Two variables can correlate strongly without one causing the other.
R-squared can mislead with non-linear relationships. A perfect quadratic relationship might show low R-squared when fitted with a linear model. Always plot your data.
Outliers dramatically affect R-squared. A single extreme point can inflate or deflate the statistic. Examine your data visually before trusting the number.
Time series data often shows artificially high R-squared due to autocorrelation. Both variables trending upward over time creates spurious correlation.
The practical takeaway: R-squared is one diagnostic among many. Combine it with residual plots, domain knowledge, and logical reasoning about your model. A statistically strong R-squared means nothing if the underlying relationship doesn’t make sense.