CORREL Function in Google Sheets: Complete Guide

The CORREL function in Google Sheets calculates the Pearson correlation coefficient between two datasets. This statistical measure quantifies the strength and direction of the linear relationship...

Key Insights

  • CORREL calculates the Pearson correlation coefficient between two datasets, returning a value from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 indicating no linear relationship.
  • The function ignores text values and empty cells but requires equal-sized arrays—mismatched ranges trigger errors that can silently skew your analysis.
  • Correlation measures association, not causation; a strong CORREL result between two variables doesn’t prove one causes the other, only that they move together.

Introduction to CORREL and Correlation Analysis

The CORREL function in Google Sheets calculates the Pearson correlation coefficient between two datasets. This statistical measure quantifies the strength and direction of the linear relationship between two variables, returning a value between -1 and 1.

Understanding correlation is fundamental to data analysis. A correlation of 1 means the variables move perfectly together—when one increases, the other increases proportionally. A correlation of -1 means they move in perfect opposition. A correlation near 0 suggests no linear relationship exists.

Real-world applications are everywhere. Marketing teams use CORREL to analyze the relationship between ad spend and conversions. Retailers examine how temperature affects ice cream sales. Financial analysts study correlations between stock prices to build diversified portfolios. Product managers correlate feature usage with customer retention.

The key word here is “linear.” CORREL measures linear relationships specifically. If your data follows a curved pattern, CORREL might return a weak correlation even when a strong non-linear relationship exists. Keep this limitation in mind as we explore the function.

CORREL Syntax and Parameters

The CORREL function uses a straightforward syntax:

=CORREL(data_y, data_x)

Both parameters are required:

  • data_y: The first range of values (dependent variable)
  • data_x: The second range of values (independent variable)

Despite the parameter names suggesting y and x, the order doesn’t affect the result. CORREL(A1:A10, B1:B10) returns the same value as CORREL(B1:B10, A1:A10).

Here’s a basic example with sales and marketing data:

| Row | A (Marketing Spend) | B (Sales) |
|-----|---------------------|-----------|
| 1   | 5000                | 45000     |
| 2   | 7500                | 52000     |
| 3   | 3000                | 38000     |
| 4   | 10000               | 68000     |
| 5   | 6000                | 49000     |

Formula: =CORREL(B1:B5, A1:A5)
Result: 0.987

This strong positive correlation (0.987) indicates that marketing spend and sales move closely together.

Critical requirements:

  • Both ranges must contain the same number of values
  • Values must be numeric (or convertible to numeric)
  • Empty cells and text values are ignored, but this can cause mismatched pair counts

When CORREL encounters text or empty cells, it excludes those positions from both arrays. If cell A3 is empty, the corresponding B3 value is also excluded from the calculation. This behavior can produce unexpected results if you’re not careful about data consistency.

Interpreting Correlation Results

Understanding what correlation values mean is crucial for proper analysis:

Correlation Range Interpretation Example
0.9 to 1.0 Very strong positive Height vs. shoe size
0.7 to 0.9 Strong positive Study hours vs. test scores
0.4 to 0.7 Moderate positive Income vs. happiness
0.1 to 0.4 Weak positive Coffee consumption vs. productivity
-0.1 to 0.1 No correlation Lottery numbers vs. weather
-0.4 to -0.1 Weak negative Exercise vs. body fat
-0.7 to -0.4 Moderate negative Price vs. demand
-1.0 to -0.7 Strong negative Altitude vs. temperature

Here’s sample data demonstrating different correlation strengths:

Strong Positive (r  0.95):
| X  | Y   |
|----|-----|
| 10 | 22  |
| 20 | 41  |
| 30 | 58  |
| 40 | 82  |
| 50 | 99  |

=CORREL(B2:B6, A2:A6)  0.998

No Correlation (r  0):
| X  | Y   |
|----|-----|
| 10 | 45  |
| 20 | 12  |
| 30 | 67  |
| 40 | 23  |
| 50 | 51  |

=CORREL(B2:B6, A2:A6)  0.089

The causation trap: A high correlation between ice cream sales and drowning deaths doesn’t mean ice cream causes drowning. Both increase during summer months—the lurking variable is temperature. Always think critically about what drives the relationship you’re measuring.

Practical Examples and Use Cases

Financial Analysis: Stock Correlations

Portfolio managers use correlation to diversify holdings. Stocks with low or negative correlations reduce overall portfolio risk.

| Date       | Stock A | Stock B | Stock C |
|------------|---------|---------|---------|
| 2024-01-01 | 100     | 50      | 75      |
| 2024-02-01 | 105     | 48      | 78      |
| 2024-03-01 | 102     | 52      | 74      |
| 2024-04-01 | 110     | 47      | 80      |
| 2024-05-01 | 108     | 49      | 82      |
| 2024-06-01 | 115     | 45      | 85      |

=CORREL(B2:B7, C2:C7) → -0.89 (A vs B: negatively correlated)
=CORREL(B2:B7, D2:D7) → 0.94 (A vs C: positively correlated)
=CORREL(C2:C7, D2:D7) → -0.76 (B vs C: negatively correlated)

Stock B provides diversification benefits when paired with Stock A or C due to the negative correlations.

Marketing Metrics: Website Traffic vs. Sales

Analyzing the relationship between traffic sources and revenue helps prioritize marketing channels:

| Week | Organic Traffic | Paid Traffic | Email Clicks | Revenue |
|------|-----------------|--------------|--------------|---------|
| 1    | 12000           | 5000         | 800          | 45000   |
| 2    | 14500           | 4200         | 950          | 52000   |
| 3    | 11000           | 6500         | 700          | 48000   |
| 4    | 16000           | 3800         | 1100         | 58000   |
| 5    | 13500           | 5500         | 850          | 51000   |
| 6    | 18000           | 4000         | 1250         | 65000   |

Organic vs Revenue: =CORREL(E2:E7, B2:B7) → 0.96
Paid vs Revenue: =CORREL(E2:E7, C2:C7) → -0.42
Email vs Revenue: =CORREL(E2:E7, D2:D7) → 0.98

This analysis reveals that organic traffic and email clicks strongly correlate with revenue, while paid traffic shows a weak negative correlation—a signal to investigate the paid campaign’s effectiveness.

Google Sheets offers several related statistical functions. Understanding when to use each prevents confusion:

Sample Data (A1:A6 and B1:B6):
| A  | B  |
|----|-----|
| 10 | 25  |
| 15 | 32  |
| 20 | 41  |
| 25 | 48  |
| 30 | 55  |
| 35 | 63  |

=CORREL(B1:B6, A1:A6)     → 0.9996
=PEARSON(B1:B6, A1:A6)    → 0.9996
=RSQ(B1:B6, A1:A6)        → 0.9992
=COVARIANCE.P(A1:A6, B1:B6) → 116.67
=COVARIANCE.S(A1:A6, B1:B6) → 140.00

CORREL vs. PEARSON: Functionally identical. Both return the Pearson correlation coefficient. Use whichever name you find more intuitive.

RSQ (R-squared): Returns the square of the correlation coefficient. RSQ tells you what proportion of variance in Y is explained by X. A CORREL of 0.9 means RSQ of 0.81, indicating 81% of the variance is explained.

COVARIANCE.P and COVARIANCE.S: Return covariance rather than correlation. Covariance measures how variables change together but isn’t normalized to a -1 to 1 scale. Use COVARIANCE.P for population data and COVARIANCE.S for samples.

Advanced Techniques and Troubleshooting

Conditional Correlation with FILTER

Calculate correlation for specific subsets of your data:

| Region | Ad Spend | Sales |
|--------|----------|-------|
| North  | 5000     | 48000 |
| South  | 7000     | 52000 |
| North  | 6500     | 55000 |
| South  | 8000     | 61000 |
| North  | 4500     | 42000 |

North region only:
=CORREL(
  FILTER(C2:C6, A2:A6="North"),
  FILTER(B2:B6, A2:A6="North")
)

Building a Correlation Matrix

For analyzing multiple variables, create a correlation matrix:

| Variable      | Traffic | Spend | Conversions | Revenue |
|---------------|---------|-------|-------------|---------|
| Traffic       | 1       |       |             |         |
| Spend         | =CORREL($B$2:$B$10,C$2:C$10) | 1 |   |         |
| Conversions   | =CORREL($B$2:$B$10,D$2:D$10) | =CORREL($C$2:$C$10,D$2:D$10) | 1 | |
| Revenue       | =CORREL($B$2:$B$10,E$2:E$10) | =CORREL($C$2:$C$10,E$2:E$10) | =CORREL($D$2:$D$10,E$2:E$10) | 1 |

Use absolute references strategically to make the matrix easy to build and maintain.

Common Errors and Fixes

#N/A Error: Usually caused by mismatched array sizes. Check that both ranges have the same number of cells.

#VALUE! Error: Non-numeric data that can’t be converted. Use ISNUMBER to identify problematic cells.

#DIV/0! Error: Occurs when one dataset has zero variance (all identical values). Correlation is undefined when there’s no variation.

Debugging formula:
=IF(STDEV(A1:A10)=0, "No variance in X",
  IF(STDEV(B1:B10)=0, "No variance in Y",
    CORREL(A1:A10, B1:B10)))

Best Practices and Limitations

Minimum sample size: While CORREL works with as few as two data points, meaningful correlations require larger samples. Aim for at least 30 observations for reliable results. Smaller samples produce unstable correlations that can change dramatically with new data.

Handle outliers carefully: A single extreme value can dramatically shift correlation results. Before running CORREL, visualize your data with a scatter plot. Consider removing or investigating outliers that don’t represent typical behavior.

Non-linear relationships: CORREL only detects linear relationships. Data following a U-shape, exponential curve, or other non-linear pattern may show weak correlation despite a strong underlying relationship. If you suspect non-linearity, transform your data (log, square root) or use different analytical methods.

Data preparation checklist:

  1. Ensure both ranges are the same size
  2. Remove or address missing values consistently
  3. Check for outliers using a scatter plot
  4. Verify the relationship appears linear
  5. Confirm you have sufficient sample size

CORREL is a powerful starting point for understanding relationships in your data. Use it to generate hypotheses, identify variables worth investigating further, and quantify the strength of linear associations. Just remember: it’s a tool for exploration, not proof of causation.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.