How to Calculate Percent Change in Pandas

Key Insights

  • Pandas’ pct_change() method calculates percent change using the formula (current - previous) / previous, returning decimal values (0.05 = 5% increase)
  • The periods parameter lets you calculate change over any interval—use periods=7 for week-over-week analysis on daily data or periods=12 for year-over-year on monthly data
  • Always handle the resulting NaN values deliberately; the first row(s) will always be NaN since there’s no previous value to compare against

Introduction

Percent change is one of the most fundamental calculations in data analysis. Whether you’re tracking stock returns, measuring revenue growth, analyzing user engagement metrics, or monitoring inventory levels, you need to understand how values change relative to their previous state.

The formula is straightforward: (current_value - previous_value) / previous_value. But implementing this manually across thousands of rows, handling edge cases, and applying it to grouped data gets tedious fast. That’s where Pandas’ pct_change() method comes in—it handles the heavy lifting while giving you control over the calculation details.
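As a minimal illustration of that formula in plain Python, before any Pandas is involved:

```python
# Plain-Python version of the percent-change formula
previous_value = 100
current_value = 110

pct = (current_value - previous_value) / previous_value
print(pct)        # 0.1, i.e. a 10% increase
print(pct * 100)  # the same change expressed in percentage terms
```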

This article covers everything you need to calculate percent change effectively in Pandas, from basic usage to grouped operations and manual alternatives for edge cases.

Basic Usage of pct_change()

The pct_change() method works on both Series and DataFrames. By default, it compares each value to the immediately preceding value.

import pandas as pd

# Sample stock prices
prices = pd.Series([100, 105, 103, 110, 108], 
                   index=pd.date_range('2024-01-01', periods=5, freq='D'),
                   name='price')

print("Original prices:")
print(prices)
print("\nPercent change:")
print(prices.pct_change())

Output:

Original prices:
2024-01-01    100
2024-01-02    105
2024-01-03    103
2024-01-04    110
2024-01-05    108
Freq: D, Name: price, dtype: int64

Percent change:
2024-01-01         NaN
2024-01-02    0.050000
2024-01-03   -0.019048
2024-01-04    0.067961
2024-01-05   -0.018182
Freq: D, Name: price, dtype: float64

Notice two things. First, the result is in decimal form—0.05 means a 5% increase, -0.019048 means approximately a 1.9% decrease. Multiply by 100 if you need percentage format. Second, the first value is NaN because there’s no previous value to compare against.

The underlying calculation for each row is:

  • Row 2: (105 - 100) / 100 = 0.05
  • Row 3: (103 - 105) / 105 = -0.019048
  • Row 4: (110 - 103) / 103 = 0.067961
  • Row 5: (108 - 110) / 110 = -0.018182
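If you want percentage units rather than decimals, multiply by 100. A short sketch using the same prices as above (the string formatting is just one display convention):

```python
import pandas as pd

prices = pd.Series([100, 105, 103, 110, 108])

# Convert decimal change to percentage points
pct = prices.pct_change() * 100
print(pct.round(2))

# Or format as display strings, leaving the leading NaN alone
print(pct.map('{:.2f}%'.format, na_action='ignore'))
```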

Key Parameters

The pct_change() method accepts several parameters that modify its behavior:

periods: Controls how many rows back to compare. Default is 1 (compare to immediately previous row).

fill_method: Historically used to handle missing values before calculation. Deprecated in recent Pandas versions—handle missing data explicitly instead.

limit: When using fill_method, limits how many consecutive NaNs to fill. Deprecated along with fill_method.

The periods parameter is particularly useful for time series analysis:

import pandas as pd
import numpy as np

# Daily sales data for two weeks
dates = pd.date_range('2024-01-01', periods=14, freq='D')
daily_sales = pd.Series([1000, 1050, 980, 1100, 1150, 1080, 1200,
                         1100, 1180, 1050, 1220, 1280, 1150, 1350], 
                        index=dates, name='sales')

# Day-over-day change (default)
daily_change = daily_sales.pct_change()

# Week-over-week change (compare to same day last week)
weekly_change = daily_sales.pct_change(periods=7)

comparison = pd.DataFrame({
    'sales': daily_sales,
    'daily_pct': daily_change,
    'weekly_pct': weekly_change
})

print(comparison.tail(7))

Output:

            sales  daily_pct  weekly_pct
2024-01-08   1100  -0.083333    0.100000
2024-01-09   1180   0.072727    0.123810
2024-01-10   1050  -0.110169    0.071429
2024-01-11   1220   0.161905    0.109091
2024-01-12   1280   0.049180    0.113043
2024-01-13   1150  -0.101562    0.064815
2024-01-14   1350   0.173913    0.125000

Week-over-week comparisons often reveal trends that daily fluctuations obscure. For monthly data, use periods=12 to get year-over-year change.
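A quick year-over-year sketch with made-up monthly revenue (the values and the 'MS' month-start frequency are arbitrary choices for illustration):

```python
import pandas as pd

# Two years of hypothetical monthly revenue, growing by 5 each month
months = pd.date_range('2023-01-01', periods=24, freq='MS')
revenue = pd.Series(range(100, 100 + 24 * 5, 5), index=months, name='revenue')

# Year-over-year: compare each month to the same month one year earlier
yoy = revenue.pct_change(periods=12)
print(yoy.tail(3))
```

The first twelve rows are NaN, since those months have no prior-year counterpart to compare against.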

Handling Missing Values

Missing values require careful handling. With fill_method=None, pct_change() produces NaN whenever the current or previous value is missing. Pass it explicitly: in Pandas 2.x the deprecated default still forward-fills before calculating, which would silently change the result:

import pandas as pd
import numpy as np

# Data with missing values
data = pd.Series([100, 105, np.nan, 110, 115, np.nan, np.nan, 125])

print("Original data:")
print(data)
print("\nPercent change (no preprocessing):")
print(data.pct_change(fill_method=None))

Output:

Original data:
0    100.0
1    105.0
2      NaN
3    110.0
4    115.0
5      NaN
6      NaN
7    125.0
dtype: float64

Percent change (no preprocessing):
0         NaN
1    0.050000
2         NaN
3         NaN
4    0.045455
5         NaN
6         NaN
7         NaN
dtype: float64

The old fill_method parameter is deprecated. Instead, handle missing values explicitly before calling pct_change():

import pandas as pd
import numpy as np

data = pd.Series([100, 105, np.nan, 110, 115, np.nan, np.nan, 125])

# Option 1: Forward fill before calculating
filled_data = data.ffill()
pct_with_ffill = filled_data.pct_change()

# Option 2: Interpolate missing values
interpolated_data = data.interpolate()
pct_with_interp = interpolated_data.pct_change()

comparison = pd.DataFrame({
    'original': data,
    'ffill': filled_data,
    'pct_ffill': pct_with_ffill,
    'interpolated': interpolated_data,
    'pct_interp': pct_with_interp
})

print(comparison)

Choose your fill strategy based on your data’s characteristics. Forward fill works well for prices (last known value persists). Interpolation suits metrics that change gradually. Sometimes dropping NaN rows entirely is the right call.
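Dropping rows, using the same series as above, looks like this — note that after dropna() each value is compared to the previous surviving value, so a single comparison may span a gap of several original rows:

```python
import pandas as pd
import numpy as np

data = pd.Series([100, 105, np.nan, 110, 115, np.nan, np.nan, 125])

# Drop missing rows first, then compare consecutive surviving values
pct_dropped = data.dropna().pct_change()
print(pct_dropped)
```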

Applying to DataFrames and GroupBy

When applied to a DataFrame, pct_change() calculates percent change for each column independently:

import pandas as pd

# Multi-product sales data
df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=6, freq='ME'),  # 'ME' = month end ('M' in older Pandas)
    'product_a': [1000, 1100, 1050, 1200, 1180, 1300],
    'product_b': [500, 520, 540, 530, 560, 590],
    'product_c': [2000, 2100, 2200, 2150, 2300, 2400]
})

df.set_index('date', inplace=True)

# Calculate percent change for all products at once
pct_changes = df.pct_change()
print(pct_changes)
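To keep the raw values and the changes side by side, one common pattern (the '_pct' suffix is just a naming convention) is to join the result back with a suffix:

```python
import pandas as pd

df = pd.DataFrame({
    'product_a': [1000, 1100, 1050],
    'product_b': [500, 520, 540],
})

# Join the original columns with their percent changes, suffixing the new ones
combined = df.join(df.pct_change().add_suffix('_pct'))
print(combined)
```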

For grouped calculations—like calculating growth per product category or per region—combine groupby() with pct_change():

import pandas as pd

# Sales data by product and month
sales_data = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr'] * 3,
    'product': ['Widget'] * 4 + ['Gadget'] * 4 + ['Gizmo'] * 4,
    'revenue': [1000, 1100, 1050, 1200,   # Widget
                500, 550, 600, 580,        # Gadget
                2000, 2200, 2100, 2400]    # Gizmo
})

# Make month an ordered categorical so sorting is chronological, not alphabetical
month_order = ['Jan', 'Feb', 'Mar', 'Apr']
sales_data['month'] = pd.Categorical(sales_data['month'],
                                     categories=month_order, ordered=True)
sales_data = sales_data.sort_values(['product', 'month'])

# Calculate month-over-month growth within each product
sales_data['mom_growth'] = sales_data.groupby('product')['revenue'].pct_change()

print(sales_data)

Output:

   month product  revenue  mom_growth
4    Jan  Gadget      500         NaN
5    Feb  Gadget      550    0.100000
6    Mar  Gadget      600    0.090909
7    Apr  Gadget      580   -0.033333
8    Jan   Gizmo     2000         NaN
9    Feb   Gizmo     2200    0.100000
10   Mar   Gizmo     2100   -0.045455
11   Apr   Gizmo     2400    0.142857
0    Jan  Widget     1000         NaN
1    Feb  Widget     1100    0.100000
2    Mar  Widget     1050   -0.045455
3    Apr  Widget     1200    0.142857

Each product’s percent change is calculated independently, with the first month of each product showing NaN.

Manual Calculation Alternative

Sometimes you need more control than pct_change() provides. Use shift() to implement the calculation manually:

import pandas as pd

df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=5, freq='D'),
    'value': [100, 105, 103, 110, 108]
})

# Manual percent change calculation
df['pct_change_manual'] = (df['value'] - df['value'].shift(1)) / df['value'].shift(1)

# Verify it matches built-in method
df['pct_change_builtin'] = df['value'].pct_change()

print(df)

Manual calculation is useful when you need:

  • Custom handling of edge cases (e.g., division by zero)
  • Absolute change alongside percent change
  • Different comparison logic (e.g., compare to a fixed baseline)
For example, a manual version can guard against division by zero, which the built-in method turns into inf:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'value': [100, 105, 0, 110, 108]  # Note the zero
})

# Built-in produces inf when dividing by zero
df['builtin'] = df['value'].pct_change()

# Manual with zero handling: treat a zero denominator as missing
previous = df['value'].shift(1)
df['manual_safe'] = (df['value'] - previous) / previous.replace(0, np.nan)

print(df)
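Comparing against a fixed baseline instead of the previous row uses the same arithmetic without shift(). Here the baseline is the first observation — an arbitrary choice for illustration:

```python
import pandas as pd

values = pd.Series([100, 105, 103, 110, 108])

# Change relative to a fixed baseline (the first observation), not the previous row
baseline = values.iloc[0]
vs_baseline = (values - baseline) / baseline
print(vs_baseline)
```

Every row now answers "how far are we from the starting point?" rather than "how much did we move since the last row?".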

Practical Use Case: Financial Analysis

Let’s put it all together with a realistic financial analysis example:

import pandas as pd
import numpy as np

# Simulated stock data for multiple tickers
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=60, freq='B')  # Business days

stock_data = pd.DataFrame({
    'date': dates.tolist() * 3,
    'ticker': ['AAPL'] * 60 + ['GOOGL'] * 60 + ['MSFT'] * 60,
    'close': np.concatenate([
        150 + np.cumsum(np.random.randn(60) * 2),  # AAPL
        140 + np.cumsum(np.random.randn(60) * 2.5),  # GOOGL
        380 + np.cumsum(np.random.randn(60) * 3),  # MSFT
    ])
})

# Calculate daily returns per stock
stock_data = stock_data.sort_values(['ticker', 'date'])
stock_data['daily_return'] = stock_data.groupby('ticker')['close'].pct_change()

# Calculate 5-day (weekly) returns
stock_data['weekly_return'] = stock_data.groupby('ticker')['close'].pct_change(periods=5)

# Summary statistics per ticker
summary = stock_data.groupby('ticker')['daily_return'].agg([
    ('mean_daily_return', 'mean'),
    ('std_daily_return', 'std'),
    ('min_daily_return', 'min'),
    ('max_daily_return', 'max')
])

# Annualized metrics (assuming 252 trading days)
summary['annualized_return'] = summary['mean_daily_return'] * 252
summary['annualized_volatility'] = summary['std_daily_return'] * np.sqrt(252)
summary['sharpe_ratio'] = summary['annualized_return'] / summary['annualized_volatility']

print("Daily Return Statistics by Ticker:")
print(summary.round(4))

This example demonstrates a complete workflow: loading data, calculating returns at multiple intervals, grouping by category, and deriving meaningful summary statistics.

For visualization, plot the cumulative returns:

# Calculate cumulative returns
stock_data['cumulative_return'] = stock_data.groupby('ticker')['daily_return'].transform(
    lambda x: (1 + x).cumprod() - 1
)

# Plot with: stock_data.pivot(index='date', columns='ticker', values='cumulative_return').plot()

The pct_change() method handles the tedious calculation work, letting you focus on analysis and interpretation. Master its parameters and combine it with groupby operations, and you’ll handle most percent change scenarios efficiently.
