Excel: How to Find the Regression Equation

Regression analysis is one of the most practical statistical tools you'll use in business and data analysis. At its core, a regression equation describes the relationship between two variables,...

Key Insights

  • Excel offers three distinct methods for finding regression equations: LINEST for comprehensive statistical output, SLOPE/INTERCEPT for simplicity, and trendlines for visual analysis
  • The regression equation y = mx + b lets you predict outcomes based on input variables, making it essential for sales forecasting, trend analysis, and data-driven decision making
  • Always check your R-squared value to assess how well your regression equation fits the data—values closer to 1 indicate stronger predictive power

Introduction to Regression Analysis in Excel

Regression analysis is one of the most practical statistical tools you’ll use in business and data analysis. At its core, a regression equation describes the relationship between two variables, allowing you to predict one based on the other.

The simple linear regression equation takes the form:

y = mx + b

Where m is the slope (how much y changes for each unit change in x) and b is the y-intercept (the value of y when x equals zero).

Real-world applications are everywhere. Sales teams use regression to forecast revenue based on marketing spend. Operations managers predict production costs based on output volume. Financial analysts estimate stock prices based on economic indicators. If you work with data, you need regression in your toolkit.

Excel makes this accessible without requiring statistical software. Let’s walk through three methods, from most comprehensive to most visual.

Preparing Your Data

Before running any regression analysis, your data needs to be structured correctly. Poor data preparation is the most common source of regression errors.

Organize your data in two adjacent columns: independent variable (X) in the first column, dependent variable (Y) in the second. The independent variable is what you’re using to predict; the dependent variable is what you’re predicting.

Here’s a sample dataset we’ll use throughout this article—advertising spend versus sales revenue:

| Row | A (Ad Spend $) | B (Sales $) |
|-----|----------------|-------------|
| 1   | Ad Spend       | Sales       |
| 2   | 1000           | 15000       |
| 3   | 1500           | 18500       |
| 4   | 2000           | 22000       |
| 5   | 2500           | 24500       |
| 6   | 3000           | 28000       |
| 7   | 3500           | 31000       |
| 8   | 4000           | 35500       |
| 9   | 4500           | 38000       |
| 10  | 5000           | 42000       |

Check for these common issues before proceeding:

  1. Missing values: Remove or interpolate any blank cells. Excel’s regression functions will error or produce incorrect results with gaps.
  2. Text in numeric columns: Ensure all data cells contain numbers, not text formatted as numbers.
  3. Outliers: Scan for obviously incorrect values that could skew your results.
  4. Consistent units: Both columns should use consistent measurement units throughout.

A quick validation formula to check for non-numeric values:

=SUMPRODUCT(--(ISNUMBER(A2:A10)))

This returns the count of numeric cells. If it doesn’t match your expected row count, you have data issues to resolve.

Method 1: Using the LINEST Function

LINEST is Excel’s most powerful regression function. It returns an array of statistics including slope, intercept, standard errors, R-squared, and more. This is my recommended approach for serious analysis.

The basic syntax:

=LINEST(known_y's, known_x's, [const], [stats])
  • known_y's: Your dependent variable range (sales in our example)
  • known_x's: Your independent variable range (ad spend)
  • const: TRUE to calculate intercept normally, FALSE to force intercept through zero
  • stats: TRUE to return additional regression statistics, FALSE for just slope and intercept

For our dataset, the basic formula:

=LINEST(B2:B10, A2:A10)

This returns two values in adjacent cells: slope first, then intercept. In modern Excel (365/2021), this spills automatically. In older versions, you’ll need to select two horizontal cells and enter as an array formula with Ctrl+Shift+Enter.

To get the full statistical output:

=LINEST(B2:B10, A2:A10, TRUE, TRUE)

This returns a 5-row by 2-column array:

| Column 1          | Column 2          |
|-------------------|-------------------|
| Slope (m)         | Intercept (b)     |
| Standard error m  | Standard error b  |
| R-squared         | Standard error y  |
| F-statistic       | Degrees freedom   |
| Regression SS     | Residual SS       |

For our sample data, you’ll get approximately:

| 6.75              | 8083.33           |
| 0.15              | 523.81            |
| 0.9964            | 466.67            |
| 1963.64           | 7                 |
| 427500000         | 1524444.44        |

The slope of 6.75 means each additional dollar in advertising generates approximately $6.75 in sales. The intercept of 8083.33 represents baseline sales with zero advertising. The R-squared of 0.9964 indicates an excellent fit—99.64% of sales variation is explained by advertising spend.

To extract specific values into separate cells:

=INDEX(LINEST(B2:B10, A2:A10, TRUE, TRUE), 1, 1)  ' Returns slope
=INDEX(LINEST(B2:B10, A2:A10, TRUE, TRUE), 1, 2)  ' Returns intercept
=INDEX(LINEST(B2:B10, A2:A10, TRUE, TRUE), 3, 1)  ' Returns R-squared

Method 2: Using Individual SLOPE and INTERCEPT Functions

If you just need the regression equation without additional statistics, Excel’s SLOPE and INTERCEPT functions offer a cleaner approach. These are easier to read and audit in spreadsheets shared with non-technical colleagues.

The SLOPE function calculates m:

=SLOPE(known_y's, known_x's)

The INTERCEPT function calculates b:

=INTERCEPT(known_y's, known_x's)

For our dataset:

=SLOPE(B2:B10, A2:A10)

Returns: 6.75

=INTERCEPT(B2:B10, A2:A10)

Returns: 8083.33

You can build a formatted equation string:

="y = " & ROUND(SLOPE(B2:B10, A2:A10), 2) & "x + " & ROUND(INTERCEPT(B2:B10, A2:A10), 2)

Returns: y = 6.75x + 8083.33

To also capture the R-squared value with this method, use the RSQ function:

=RSQ(B2:B10, A2:A10)

Returns: 0.9964

This approach is ideal when you’re building dashboards or reports where transparency matters. Anyone can click on the cell and immediately understand what’s being calculated.

Method 3: Adding a Trendline to a Scatter Chart

The visual approach works best for presentations and exploratory analysis. Excel can display the regression equation directly on a chart, making it immediately interpretable.

Follow these steps:

  1. Select your data range (A1:B10, including headers)

  2. Insert a scatter chart: Go to Insert → Charts → Scatter → Select “Scatter with only Markers”

  3. Add the trendline: Click on any data point in the chart, then right-click and select “Add Trendline”

  4. Configure trendline options: In the Format Trendline pane:

    • Select “Linear” under Trendline Options
    • Check “Display Equation on chart”
    • Check “Display R-squared value on chart”

The chart will now show your regression line with the equation and R² value displayed directly on the plot.

You can also access trendline options through:

Chart Design → Add Chart Element → Trendline → More Trendline Options

For forecasting, the trendline dialog lets you extend the line forward or backward:

  • Forward: Enter periods to project future values
  • Backward: Extend the line to see where it intercepts the y-axis

One limitation: the equation displayed on the chart rounds coefficients. For precise values, use LINEST or SLOPE/INTERCEPT. The chart equation is for visual reference, not calculation.

Interpreting and Using Your Regression Equation

Now that you have your equation, let’s put it to work. Our derived equation is:

Sales = 6.75 × Ad Spend + 8083.33

Understanding the coefficients:

The slope (6.75) is your return on investment metric. Every $1 increase in advertising correlates with a $6.75 increase in sales. This is actionable intelligence for budget allocation.

The intercept (8083.33) represents theoretical baseline sales with zero advertising. Be cautious interpreting this literally—it’s a mathematical construct. In reality, zero advertising might not yield $8,083 in sales; this is where the model extrapolates beyond your data range.

Making predictions:

To predict sales for a $6,000 advertising budget:

=6.75 * 6000 + 8083.33

Returns: $48,583.33

Or more dynamically, if your slope is in D2, intercept in D3, and new ad spend in E2:

=D2 * E2 + D3

Assessing model quality:

The R-squared value (0.9964) tells you that 99.64% of the variation in sales is explained by advertising spend. This is exceptionally high—real-world data rarely fits this well.

General R-squared interpretation:

  • 0.9+: Excellent fit, highly predictive
  • 0.7-0.9: Good fit, useful for forecasting
  • 0.5-0.7: Moderate fit, use with caution
  • Below 0.5: Poor fit, consider additional variables or non-linear models

Remember that correlation doesn’t imply causation. Your regression shows a relationship, not necessarily a causal mechanism.

Conclusion

Excel provides three solid methods for finding regression equations, each suited to different needs:

Use LINEST when you need comprehensive statistics for formal analysis or when building models that require standard errors and significance testing.

Use SLOPE and INTERCEPT for straightforward calculations in shared workbooks where formula transparency matters.

Use trendlines for presentations, exploratory analysis, and when visual context helps stakeholders understand the relationship.

Start with the visual trendline method to validate that a linear relationship exists, then switch to LINEST or SLOPE/INTERCEPT for your actual calculations. Always report the R-squared value alongside your equation—predictions without fit assessment are meaningless.

For more complex scenarios involving multiple independent variables, you’ll need Excel’s Data Analysis ToolPak or dedicated statistical software. But for single-variable regression, these three methods cover every practical use case you’ll encounter.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.