How to Perform Regression in Google Sheets

Regression analysis answers a simple question: how does one variable change when another changes? If you spend more on advertising, how much more revenue can you expect? If a student studies more...

Key Insights

  • Google Sheets’ LINEST function returns a complete regression output including coefficients, R², and standard errors—but you must use Ctrl+Shift+Enter to see all statistics
  • Trendlines on scatter plots provide quick visual regression analysis, but LINEST gives you the numerical precision needed for actual predictions and model evaluation
  • Multiple regression in Sheets works well for 3-5 variables, but beyond that complexity you should move to R, Python, or dedicated statistical software

Introduction to Regression Analysis

Regression analysis answers a simple question: how does one variable change when another changes? If you spend more on advertising, how much more revenue can you expect? If a student studies more hours, what grade improvement should they see?

Linear regression fits a straight line through your data points, giving you an equation in the form y = mx + b. The slope (m) tells you the rate of change, and the intercept (b) tells you the baseline value when x equals zero.

Use simple linear regression when you have one independent variable predicting one dependent variable. Use multiple regression when several factors influence your outcome—like predicting house prices from square footage, bedroom count, and neighborhood rating simultaneously.

Google Sheets handles both scenarios surprisingly well. You won’t get the diagnostic depth of R or SPSS, but for quick analysis and business decision-making, Sheets delivers.

Preparing Your Data

Regression requires clean, properly structured data. Your dependent variable (what you’re predicting) goes in one column. Your independent variables (predictors) go in adjacent columns.

Here’s a sample dataset tracking monthly sales against advertising spend:

| Month    | Ad Spend ($) | Sales ($)  |
|----------|--------------|------------|
| Jan      | 5000         | 42000      |
| Feb      | 7500         | 51000      |
| Mar      | 6000         | 45000      |
| Apr      | 8000         | 58000      |
| May      | 4500         | 38000      |
| Jun      | 9000         | 62000      |
| Jul      | 7000         | 49000      |
| Aug      | 10000        | 71000      |
| Sep      | 8500         | 59000      |
| Oct      | 6500         | 47000      |
| Nov      | 11000        | 75000      |
| Dec      | 12000        | 82000      |

Place this in cells A1:C13 with headers in row 1.

Before running regression, check for these issues:

Missing values: LINEST ignores rows where either x or y is blank, but inconsistent gaps can skew results. Either fill missing values with estimates or remove those rows entirely.

Text in numeric columns: A single text entry will break your formula. Use =ISNUMBER(A2) to verify cells contain numbers.

Outliers: Scan for data entry errors. A $120,000 ad spend month when others average $7,000 will dominate your regression line.

Clean data with this formula to flag non-numeric entries:

=IF(ISNUMBER(B2), "OK", "CHECK")

Drag this down your data column to identify problems quickly.

Using the LINEST Function

LINEST is the workhorse function for regression in Sheets. Its full syntax:

=LINEST(known_y's, known_x's, [calc_intercept], [verbose])

The third parameter (TRUE/FALSE) determines whether to calculate the y-intercept or force it through zero. Almost always use TRUE.

The fourth parameter is critical: set it to TRUE to get the full statistical output, not just coefficients.

For our sales data, enter this formula:

=LINEST(C2:C13, B2:B13, TRUE, TRUE)

Here’s the key step most people miss: select a 5-row by 2-column range (like E2:F6), enter the formula, then press Ctrl+Shift+Enter (or Cmd+Shift+Enter on Mac). This creates an array formula that displays all statistics.

The output matrix looks like this:

| Slope (m)      | Intercept (b)  |
| Std Error (m)  | Std Error (b)  |
| R²             | Std Error (y)  |
| F-statistic    | df (degrees)   |
| SS Regression  | SS Residual    |

For our sample data, you’ll get approximately:

| 5.89           | 12847          |
| 0.31           | 2456           |
| 0.97           | 2891           |
| 354.2          | 10             |
| 2962841667     | 83558333       |

Interpretation:

  • Slope (5.89): Each additional dollar in ad spend generates $5.89 in sales
  • Intercept (12847): Baseline sales of ~$12,847 with zero advertising
  • R² (0.97): 97% of sales variation is explained by ad spend—excellent fit
  • Standard errors: Used to calculate confidence intervals for coefficients

To predict sales for a $15,000 ad spend:

=E2*15000 + F2

This returns approximately $101,197 in predicted sales.

Creating Trendlines in Charts

Visual regression through trendlines helps communicate findings to non-technical stakeholders.

Step 1: Select your x and y data (B1:C13 including headers).

Step 2: Insert > Chart. Sheets usually defaults to a column chart—change it to Scatter chart in the Chart editor.

Step 3: Click the chart, then the three-dot menu > Edit chart.

Step 4: Go to Customize > Series.

Step 5: Check “Trendline” box. Additional options appear:

  • Type: Linear (default), Exponential, Polynomial, etc.
  • Line color and opacity
  • Label: Select “Use Equation” to display y = mx + b on the chart
  • Check “Show R²” to display the coefficient of determination

Your chart now shows the regression line with its equation. The visual makes it immediately clear whether the relationship is strong (points cluster near the line) or weak (points scattered widely).

For polynomial regression, change the trendline type and set the degree. A degree-2 polynomial fits curves with one bend; degree-3 handles S-curves.

Multiple Regression with LINEST

Real-world predictions often depend on multiple factors. Extend LINEST by providing multiple x columns.

Expand the dataset with two more predictors:

| Month | Ad Spend | Email Subs | Store Traffic | Sales  |
|-------|----------|------------|---------------|--------|
| Jan   | 5000     | 2500       | 8500          | 42000  |
| Feb   | 7500     | 2800       | 9200          | 51000  |
| Mar   | 6000     | 2600       | 8800          | 45000  |
| Apr   | 8000     | 3100       | 10500         | 58000  |
| May   | 4500     | 2400       | 7900          | 38000  |
| Jun   | 9000     | 3400       | 11200         | 62000  |
| Jul   | 7000     | 2900       | 9800          | 49000  |
| Aug   | 10000    | 3800       | 12500         | 71000  |
| Sep   | 8500     | 3200       | 10800         | 59000  |
| Oct   | 6500     | 2700       | 9100          | 47000  |
| Nov   | 11000    | 4200       | 13800         | 75000  |
| Dec   | 12000    | 4500       | 14500         | 82000  |

For three independent variables, select a 5-row by 4-column output range and enter:

=LINEST(E2:E13, B2:D13, TRUE, TRUE)

The output now shows coefficients for each predictor in reverse order:

| Coef(Traffic) | Coef(Email) | Coef(AdSpend) | Intercept |

Read right-to-left to match your column order. The prediction equation becomes:

Sales = (Coef_AdSpend × AdSpend) + (Coef_Email × EmailSubs) + (Coef_Traffic × Traffic) + Intercept

Build the prediction formula:

=SUMPRODUCT(B14:D14, $H$2:$J$2) + $K$2

Where H2:J2 contains your coefficients (reversed to match column order) and K2 is the intercept.

Using the Analysis ToolPak Add-on

LINEST lacks p-values and confidence intervals. For publication-quality regression output, install an add-on.

XLMiner Analysis ToolPak (free) replicates Excel’s Data Analysis features:

  1. Extensions > Add-ons > Get add-ons
  2. Search “XLMiner Analysis ToolPak”
  3. Install and authorize

To run regression:

  1. Extensions > XLMiner Analysis ToolPak > Start
  2. Click “Regression” in the sidebar
  3. Set Y Range (dependent variable with header)
  4. Set X Range (independent variables with header)
  5. Check “Labels” if you included headers
  6. Choose output location
  7. Click OK

The output includes:

  • Coefficients with standard errors, t-statistics, and p-values
  • Confidence intervals (95% by default) for each coefficient
  • ANOVA table showing model significance
  • Residual output for diagnostic checking

A p-value below 0.05 indicates statistical significance—that predictor genuinely influences the outcome rather than appearing related by chance.

Practical Tips and Limitations

Check for multicollinearity: When independent variables correlate strongly with each other (like ad spend and store traffic both increasing together), coefficient estimates become unstable. Calculate correlation between predictors:

=CORREL(B2:B13, C2:C13)

Correlations above 0.8 suggest you should drop one variable or combine them.

Examine residuals: Subtract predicted values from actual values. Plot residuals against predicted values—they should scatter randomly around zero. Patterns indicate your linear model misses something.

=C2 - (E$2*B2 + F$2)

Watch for extrapolation danger: Your model is valid within the range of observed data. Predicting sales for $50,000 ad spend when your max was $12,000 is speculation, not analysis.

Know when to upgrade tools: Google Sheets handles datasets up to a few thousand rows and 5-6 predictors comfortably. Beyond that, switch to Python’s scikit-learn, R’s lm() function, or dedicated tools like SPSS. You’ll gain better diagnostics, cross-validation, and handling of categorical variables.

Regression in Sheets won’t win any statistical methodology awards, but it solves 80% of business analysis needs with zero setup and instant shareability. Master LINEST, understand its output, and you’ll make better data-driven decisions without leaving your browser.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.