How to Fill NaN with Zero in Pandas


Key Insights

  • The fillna(0) method is the most straightforward and performant way to replace NaN values with zero in Pandas, working on both DataFrames and Series
  • Prefer assignment (df = df.fillna(0)) over inplace=True for cleaner, more predictable code; inplace breaks method chaining, and the pandas developers have long discussed deprecating it
  • When working with mixed-type DataFrames, use select_dtypes() to target only numeric columns, avoiding unintended modifications to categorical or string data

Introduction

NaN (Not a Number) values are the bane of data analysis. They creep into your DataFrames from missing CSV fields, failed API calls, mismatched joins, and countless other sources. Before you can perform meaningful calculations, you need to handle these gaps.

Replacing NaN with zero is one of the most common data cleaning operations. It’s appropriate when missing values genuinely represent “nothing”—zero sales, zero clicks, zero occurrences. However, it’s not always the right choice. Filling NaN with zero in a temperature column would be misleading, while doing so in a revenue column might be exactly what you need.
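
To make the temperature point concrete, here is a minimal sketch with made-up readings (the Series name and values are illustrative assumptions, not data from a real source). It compares the mean of the observed values with the mean after zero-filling:

```python
import numpy as np
import pandas as pd

# Hypothetical readings: two of the four temperature values are missing
temps = pd.Series([21.0, np.nan, 23.0, np.nan], name="temperature_c")

# mean() skips NaN by default, so only the observed values are averaged
mean_observed = temps.mean()

# After zero-filling, the two invented 0.0 readings drag the mean down
mean_zero_filled = temps.fillna(0).mean()

print(mean_observed)      # 22.0
print(mean_zero_filled)   # 11.0
```

The zero-filled mean is half the observed one, which is why zero only makes sense when a missing value really means "none".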

This article covers every practical method for filling NaN values with zero in Pandas, from the basic one-liner to selective approaches for complex DataFrames.

Using fillna(0) - The Basic Approach

The fillna() method is your primary tool for handling missing values. Pass 0 as the argument, and every NaN in your DataFrame becomes zero.

import pandas as pd
import numpy as np

# Create a DataFrame with NaN values
df = pd.DataFrame({
    'product': ['Widget', 'Gadget', 'Sprocket', 'Gizmo'],
    'sales': [150, np.nan, 89, np.nan],
    'returns': [5, 2, np.nan, 1],
    'rating': [4.5, np.nan, 3.8, 4.2]
})

print("Original DataFrame:")
print(df)
print()

# Fill all NaN values with zero
df_filled = df.fillna(0)

print("After fillna(0):")
print(df_filled)

Output:

Original DataFrame:
    product  sales  returns  rating
0    Widget  150.0      5.0     4.5
1    Gadget    NaN      2.0     NaN
2  Sprocket   89.0      NaN     3.8
3     Gizmo    NaN      1.0     4.2

After fillna(0):
    product  sales  returns  rating
0    Widget  150.0      5.0     4.5
1    Gadget    0.0      2.0     0.0
2  Sprocket   89.0      0.0     3.8
3     Gizmo    0.0      1.0     4.2

This approach is clean and fast. Pandas optimizes fillna() internally, making it significantly faster than manual iteration or apply functions. For most use cases, this is all you need.

The method also works identically on Series objects:

sales_series = df['sales']
sales_filled = sales_series.fillna(0)
print(sales_filled)
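
Before and after filling, it can help to count how many values are actually missing. A small sketch, using a cut-down version of the DataFrame above (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'sales': [150, np.nan, 89, np.nan],
    'returns': [5, 2, np.nan, 1],
})

# isna().sum() counts the missing values in each column
missing_before = df.isna().sum()

# fillna(0) returns a new DataFrame; df itself is left unchanged
missing_after = df.fillna(0).isna().sum()

print(missing_before)
print(missing_after)
```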

Filling NaN in Specific Columns

Blanket replacement isn’t always appropriate. You might want to fill NaN with zero in sales figures while preserving NaN in rating columns (where zero would be a valid but misleading value).

Target a single column by selecting it first:

df = pd.DataFrame({
    'product': ['Widget', 'Gadget', 'Sprocket'],
    'sales': [150, np.nan, 89],
    'rating': [4.5, np.nan, 3.8]
})

# Fill NaN only in the 'sales' column
df['sales'] = df['sales'].fillna(0)

print(df)

Output:

    product  sales  rating
0    Widget  150.0     4.5
1    Gadget    0.0     NaN
2  Sprocket   89.0     3.8

Notice that the NaN in the rating column remains untouched.

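For multiple columns, you have two options. The first uses a loop:

# Assumes df contains every listed column. The DataFrame above has no
# 'returns' column, so add one (or drop it from the list) before running this.
columns_to_fill = ['sales', 'returns']

for col in columns_to_fill:
    df[col] = df[col].fillna(0)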

The second approach uses dictionary-based filling, which is more elegant:

df = pd.DataFrame({
    'product': ['Widget', 'Gadget', 'Sprocket'],
    'sales': [150, np.nan, 89],
    'returns': [5, np.nan, np.nan],
    'rating': [4.5, np.nan, 3.8]
})

# Fill specific columns with specific values
df = df.fillna({'sales': 0, 'returns': 0})

print(df)

The dictionary approach lets you fill different columns with different values in a single operation—useful when you need zeros for some columns and other defaults for others.
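
As a sketch of that mixed case, the dictionary below fills one column with zero and another with a placeholder string. The 'Unknown' label and the column choices are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget', None, 'Sprocket'],
    'sales': [150, np.nan, 89],
    'rating': [4.5, np.nan, 3.8],
})

# Keys are column names, values are the fill value for that column.
# Columns not listed (here 'rating') are left untouched.
fill_values = {'sales': 0, 'product': 'Unknown'}
df = df.fillna(fill_values)

print(df)
```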

In-Place Modification vs. Creating a Copy

Pandas offers an inplace parameter that modifies the original DataFrame directly:

# Using inplace=True
df.fillna(0, inplace=True)

# Equivalent to assignment
df = df.fillna(0)

Both achieve the same result, but I strongly recommend using assignment. Here’s why:

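The inplace parameter has an uncertain future. The pandas developers have long discussed deprecating inplace across many methods, although fillna(inplace=True) on a whole DataFrame still works in current releases. One case is already affected: calling df['col'].fillna(0, inplace=True) on a selected column emits a FutureWarning in pandas 2.2 and no longer updates the parent DataFrame once Copy-on-Write is enabled.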

Assignment is more explicit. When you write df = df.fillna(0), it’s immediately clear that df is being reassigned. The inplace=True pattern hides the mutation, making code harder to reason about.

Method chaining doesn’t work with inplace. Modern Pandas code often chains methods together:

# This works
df_clean = (df
    .fillna(0)
    .drop_duplicates()
    .reset_index(drop=True))

# This doesn't work - inplace returns None
df_clean = (df
    .fillna(0, inplace=True)  # Returns None, breaks the chain
    .drop_duplicates())

The only argument for inplace=True is memory efficiency with very large DataFrames. Measure before relying on it: depending on the pandas version and the dtypes involved, the in-place call may still copy data internally, so the saving is often smaller than expected.

Using replace() as an Alternative

The replace() method offers another way to swap NaN for zero:

import numpy as np

df = pd.DataFrame({
    'a': [1, np.nan, 3],
    'b': [np.nan, 5, 6]
})

# Using replace with np.nan
df_replaced = df.replace(np.nan, 0)

print(df_replaced)

Output:

     a    b
0  1.0  0.0
1  0.0  5.0
2  3.0  6.0

The replace() method is more general-purpose—it can swap any value for any other value. For strictly NaN replacement, fillna() is more semantic and slightly faster. Use replace() when you’re already using it for other substitutions and want to handle NaN in the same operation:

# Replace multiple values at once
df = df.replace({
    np.nan: 0,
    -999: 0,  # Common placeholder for missing data
    'N/A': 'Unknown'
})

One caveat: how replace() with np.nan treats None values in object-type columns may depend on your pandas version, so it is not a dependable way to catch both. If your DataFrame has mixed None and np.nan, use fillna() instead; it treats both as missing.
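
A short sketch of the fillna() side of this, using a hypothetical object-dtype column that mixes real values with both None and np.nan:

```python
import numpy as np
import pandas as pd

# dtype=object keeps None as None instead of converting it to NaN
raw = pd.Series([12, 'unknown', None, np.nan], dtype=object)

# isna() flags both None and np.nan as missing
print(raw.isna().tolist())   # [False, False, True, True]

# fillna(0) replaces both kinds of missing value
filled = raw.fillna(0)
print(filled.tolist())       # [12, 'unknown', 0, 0]
```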

Filling NaN with Zero in Numeric Columns Only

Real-world DataFrames contain mixed types. You might have product names, categories, dates, and numeric values all in one table. Filling NaN with zero across the board would corrupt your string columns: the integer 0 ends up sitting among text values, leaving a mixed-type column that trips up later string operations.

The solution is select_dtypes():

df = pd.DataFrame({
    'product': ['Widget', None, 'Sprocket'],
    'category': ['Electronics', 'Hardware', None],
    'sales': [150, np.nan, 89],
    'returns': [5, np.nan, 3],
    'rating': [4.5, np.nan, 3.8]
})

print("Original DataFrame:")
print(df)
print()

# Get only numeric columns
numeric_cols = df.select_dtypes(include=[np.number]).columns

# Fill NaN with zero only in numeric columns
df[numeric_cols] = df[numeric_cols].fillna(0)

print("After filling numeric columns:")
print(df)

Output:

Original DataFrame:
    product     category  sales  returns  rating
0    Widget  Electronics  150.0      5.0     4.5
1      None     Hardware    NaN      NaN     NaN
2  Sprocket         None   89.0      3.0     3.8

After filling numeric columns:
    product     category  sales  returns  rating
0    Widget  Electronics  150.0      5.0     4.5
1      None     Hardware    0.0      0.0     0.0
2  Sprocket         None   89.0      3.0     3.8

The None values in product and category remain unchanged while all numeric NaN values become zero.

You can be more specific with select_dtypes():

# Only float columns
float_cols = df.select_dtypes(include=['float64']).columns

# Only integer columns (though NaN converts int to float)
int_cols = df.select_dtypes(include=['int64']).columns

# All numeric types
numeric_cols = df.select_dtypes(include=['number']).columns
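
The note about NaN converting integers to floats is easy to check. In this sketch (the 'units' column is a made-up example), a column of whole numbers with a missing value is stored as float64, so an int64 selection does not pick it up:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'units': [1, 2, 3],          # no missing values, stays int64
    'sales': [150, np.nan, 89],  # NaN forces float64
})

int_cols = df.select_dtypes(include=['int64']).columns.tolist()
float_cols = df.select_dtypes(include=['float64']).columns.tolist()

print(df.dtypes)
print(int_cols)    # ['units']
print(float_cols)  # ['sales']
```

Selecting with 'number' avoids this surprise because it covers both integer and float columns.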

Conclusion

Filling NaN with zero is a fundamental Pandas operation. Here’s a quick reference for choosing the right approach:

Scenario                                Method
Fill all NaN in DataFrame               df = df.fillna(0)
Fill NaN in one column                  df['col'] = df['col'].fillna(0)
Fill NaN in multiple specific columns   df = df.fillna({'col1': 0, 'col2': 0})
Fill NaN only in numeric columns        df[numeric_cols] = df[numeric_cols].fillna(0)
Replace NaN along with other values     df = df.replace({np.nan: 0, -999: 0})

A few final guidelines:

Use fillna(0) as your default. It’s the most readable, performant, and idiomatic approach.

Avoid inplace=True. It breaks method chaining, makes code harder to follow, and its long-term future in pandas is uncertain.

Think before you fill. Zero isn’t always the right replacement. For time series, forward-fill (ffill) or backward-fill (bfill) might be more appropriate. For statistical analysis, the mean or median might preserve your data’s distribution better. Zero is correct when absence genuinely means zero—not when it means “unknown.”

Check your dtypes after filling. Filling NaN does not restore a column's original type. A column that became float64 because it contained NaN stays float64 after filling, so whole numbers still display as 150.0 rather than 150. If you need integers, explicitly cast with astype(int) after filling.
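
A minimal sketch of that cast, using a hypothetical 'returns' column. The exact integer width that astype(int) produces can depend on the platform and NumPy version, so the example does not assume a specific one:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'returns': [5, np.nan, 3]})

# The NaN forces the column to float64
print(df['returns'].dtype)

# Filling removes the NaN but keeps the float dtype
filled = df['returns'].fillna(0)
print(filled.dtype)

# Cast only after filling: astype(int) raises an error if NaN is still present
as_int = filled.astype(int)
print(as_int.tolist())   # [5, 0, 3]
```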
