How to Delete a Column in Pandas

Key Insights

  • The drop() method is the most versatile option, supporting single columns, multiple columns, and conditional deletion while returning a new DataFrame by default
  • Use del or pop() for in-place modifications when you need to conserve memory or want to capture the removed column’s data
  • Avoid inplace=True in modern pandas code—reassignment is clearer, more debuggable, and works better with method chaining

Introduction

Deleting columns from a DataFrame is one of the most frequent operations in data cleaning. Whether you’re removing irrelevant features before model training, dropping columns with too many null values, or simply tidying up imported data, you’ll reach for column deletion constantly.

Pandas offers several ways to accomplish this task, each with different trade-offs. Some return new DataFrames, others modify in place. Some handle multiple columns elegantly, others work best for single-column operations. This article covers all the practical approaches so you can pick the right tool for each situation.

Let’s start with some sample data we’ll use throughout:

import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 2, 3, 4, 5],
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'email': ['alice@example.com', 'bob@example.com', 'charlie@example.com', 'diana@example.com', 'eve@example.com'],
    'age': [28, 34, 22, 45, 31],
    'temp_score': [0.82, 0.91, 0.67, 0.88, 0.73],
    'debug_flag': [True, False, True, False, True]
})

Using the drop() Method

The drop() method is the workhorse for column deletion. It’s flexible, explicit, and works well in method chains.

Basic Single Column Deletion

# Remove a single column
df_clean = df.drop('temp_score', axis=1)

# Equivalent using the columns parameter (more readable)
df_clean = df.drop(columns='temp_score')

The axis=1 parameter tells pandas to operate on columns rather than rows. I prefer the columns= syntax because it’s self-documenting—you don’t need to remember what axis=1 means.
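As a quick check of the two spellings, the sketch below uses a cut-down version of the sample data (the variable names by_axis and by_columns are only illustrative). It also shows what happens if the axis is left out, which is an easy mistake to make:

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 2, 3],
    'age': [28, 34, 22],
    'temp_score': [0.82, 0.91, 0.67],
})

by_axis = df.drop('temp_score', axis=1)
by_columns = df.drop(columns='temp_score')

# Both spellings produce the same result
print(by_axis.equals(by_columns))   # True

# drop() returned new objects; the original still has the column
print('temp_score' in df.columns)   # True

# Without axis=1 or columns=, drop() looks for a row label instead
try:
    df.drop('temp_score')
except KeyError as exc:
    print(f"KeyError: {exc}")
```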

Dropping Multiple Columns

Pass a list to remove several columns at once:

# Remove multiple columns
df_clean = df.drop(columns=['temp_score', 'debug_flag'])

# Or with axis parameter
df_clean = df.drop(['temp_score', 'debug_flag'], axis=1)

Handling Missing Columns with errors

When you’re not certain a column exists, use the errors parameter to avoid exceptions:

# This raises KeyError if 'nonexistent' doesn't exist
df_clean = df.drop(columns='nonexistent')

# This silently ignores missing columns
df_clean = df.drop(columns='nonexistent', errors='ignore')

# Useful for conditional cleanup
columns_to_remove = ['temp_score', 'debug_flag', 'maybe_exists']
df_clean = df.drop(columns=columns_to_remove, errors='ignore')
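One downside of errors='ignore' is that typos in column names go unnoticed. A possible middle ground, sketched here with a smaller illustrative DataFrame, is to report the missing names yourself before dropping:

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 2],
    'temp_score': [0.82, 0.91],
    'debug_flag': [True, False],
})
columns_to_remove = ['temp_score', 'debug_flag', 'maybe_exists']

# Default errors='raise': a single missing label aborts the whole call
try:
    df.drop(columns=columns_to_remove)
except KeyError as exc:
    print(f"KeyError: {exc}")

# Report what is missing instead of ignoring it silently
missing = [col for col in columns_to_remove if col not in df.columns]
if missing:
    print(f"Skipping columns not found: {missing}")

df_clean = df.drop(columns=columns_to_remove, errors='ignore')
print(df_clean.columns.tolist())  # ['user_id']
```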

The inplace Parameter

You’ll see inplace=True in older code:

# Modifies df directly, returns None
df.drop(columns='temp_score', inplace=True)

I recommend avoiding this pattern. Reassignment is clearer and allows method chaining:

# Better: explicit reassignment
df = df.drop(columns='temp_score')

# Alternatively, start from the original df and do the drop as one step in a chain
df = (df
    .drop(columns='temp_score')
    .rename(columns={'user_id': 'id'})
    .set_index('id'))
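A common slip with inplace=True is assigning its return value. The sketch below, on a two-column illustrative DataFrame, shows why that goes wrong in the pandas versions where the call returns None, as described above:

```python
import pandas as pd

df = pd.DataFrame({'user_id': [1, 2], 'temp_score': [0.82, 0.91]})

# The call mutates df and hands back None
result = df.drop(columns='temp_score', inplace=True)
print(result)               # None
print(df.columns.tolist())  # ['user_id']

# So writing `df = df.drop(..., inplace=True)` would replace df with None,
# and the next method call on it would fail with AttributeError.
```

Plain reassignment avoids this trap because drop() always returns the DataFrame you want to keep working with.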

Using the del Keyword

Python’s built-in del statement works directly on DataFrame columns:

# Delete a single column in place
del df['debug_flag']

This approach modifies the DataFrame directly—there’s no new object created. It’s concise but has limitations:

  • Only works for single columns
  • No error handling options
  • Can’t be used in method chains
  • Modifies the original DataFrame (call df.copy() first if you need to keep the original)

Use del when you’re doing quick interactive work and want minimal typing. For production code, drop() is usually clearer.

# Common pattern in exploratory analysis
del df['temp_score']
del df['debug_flag']
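If you like the brevity of del but need the original untouched, one option is to copy first. This sketch (with an illustrative variable named trimmed) also shows the lack of error handling in practice:

```python
import pandas as pd

df = pd.DataFrame({'user_id': [1, 2], 'debug_flag': [True, False]})

# Take an explicit copy first if the original must stay intact
trimmed = df.copy()
del trimmed['debug_flag']

print(df.columns.tolist())       # ['user_id', 'debug_flag']
print(trimmed.columns.tolist())  # ['user_id']

# del has no errors= option: a missing column raises KeyError
try:
    del trimmed['debug_flag']
except KeyError as exc:
    print(f"KeyError: {exc}")
```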

Using the pop() Method

The pop() method removes a column and returns it simultaneously. This is valuable when you need to extract data while cleaning:

# Remove and capture the column
scores = df.pop('temp_score')

print(scores)
# 0    0.82
# 1    0.91
# 2    0.67
# 3    0.88
# 4    0.73
# Name: temp_score, dtype: float64

print(df.columns.tolist())
# ['user_id', 'name', 'email', 'age', 'debug_flag']

This is particularly useful when splitting data:

# Extract target variable for machine learning
y = df.pop('age')
X = df.drop(columns=['name', 'email'])  # Remove non-numeric features

# Now X contains features, y contains the target

Like del, pop() modifies the DataFrame in place and only handles one column at a time.
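Because pop() mutates its DataFrame, rerunning a notebook cell that calls it will fail the second time. When that matters, a non-mutating split gives the same pieces. The following is a sketch on a reduced version of the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 2, 3],
    'age': [28, 34, 22],
    'temp_score': [0.82, 0.91, 0.67],
})

# Non-mutating split: df keeps all of its columns
y = df['age']
X = df.drop(columns='age')

print(X.columns.tolist())   # ['user_id', 'temp_score']
print('age' in df.columns)  # True

# pop() yields the same target Series but removes 'age' from df itself
y_popped = df.pop('age')
print(y_popped.equals(y))   # True
print('age' in df.columns)  # False
```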

Selecting Columns to Keep (Inverse Approach)

Sometimes it’s easier to specify which columns you want to keep rather than which to delete. This is especially true when you’re keeping a few columns from a wide DataFrame.

Direct Column Selection

# Keep only specific columns
df_subset = df[['user_id', 'name', 'age']]

# Using .loc for the same result
df_subset = df.loc[:, ['user_id', 'name', 'age']]

Programmatic Column Selection

Build column lists dynamically when you have naming conventions:

# Keep columns that don't start with 'temp_' or 'debug_'
keep_cols = [col for col in df.columns 
             if not col.startswith(('temp_', 'debug_'))]
df_clean = df[keep_cols]
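The same selection can be written with a boolean mask over the column labels instead of a list comprehension. This is a sketch of that alternative; the name is_scratch is made up for the example:

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 2],
    'age': [28, 34],
    'temp_score': [0.82, 0.91],
    'debug_flag': [True, False],
})

# Boolean mask over the column labels: True marks a column to remove
is_scratch = (df.columns.str.startswith('temp_')
              | df.columns.str.startswith('debug_'))

# ~ inverts the mask, so .loc keeps everything that is not scratch data
df_clean = df.loc[:, ~is_scratch]
print(df_clean.columns.tolist())  # ['user_id', 'age']
```

The list comprehension is easier to extend with arbitrary Python logic, while the mask version stays close to how row filtering is usually written.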

Using filter() with Regex

The filter() method selects columns matching a pattern. Combine it with drop() for pattern-based deletion:

# Add some columns with a common prefix for demonstration
df['meta_created'] = '2024-01-01'
df['meta_updated'] = '2024-01-15'
df['meta_version'] = 1

# Remove all columns starting with 'meta_'
meta_cols = df.filter(regex='^meta_').columns
df_clean = df.drop(columns=meta_cols)

# Or in one line
df_clean = df.drop(columns=df.filter(regex='^meta_').columns)

You can also use filter() to keep columns matching a pattern:

# Keep only columns containing 'id' or 'name'
df_subset = df.filter(regex='id|name')
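Note that the regex is searched anywhere in the column name, so a short pattern can match more than intended. The sketch below uses an illustrative DataFrame (the valid_from column is invented to show the effect) and also demonstrates the like= and items= parameters of filter():

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1],
    'name': ['Alice'],
    'valid_from': ['2024-01-01'],
    'meta_version': [1],
})

# 'id' also appears inside 'valid_from', so both columns match
print(df.filter(regex='id').columns.tolist())    # ['user_id', 'valid_from']

# Anchoring the pattern narrows the match
print(df.filter(regex='_id$').columns.tolist())  # ['user_id']

# like= does a plain substring match, no regex needed
print(df.filter(like='meta_').columns.tolist())  # ['meta_version']

# items= keeps exact names and skips any that do not exist
print(df.filter(items=['name', 'missing']).columns.tolist())  # ['name']
```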

Using select_dtypes() for Type-Based Selection

Remove columns based on their data type:

# Remove all boolean columns
bool_cols = df.select_dtypes(include='bool').columns
df_clean = df.drop(columns=bool_cols)

# Keep only numeric columns
df_numeric = df.select_dtypes(include='number')

# Remove object (string) columns
df_clean = df.select_dtypes(exclude='object')
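Two details are worth checking on your own data: 'number' does not include boolean columns, and the dtype pandas assigns to string columns can differ between versions. The sketch below, on a reduced version of the sample data, sidesteps the second point by excluding the types it does not want rather than naming the string dtype:

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 2],
    'name': ['Alice', 'Bob'],
    'temp_score': [0.82, 0.91],
    'debug_flag': [True, False],
})

# Integer and float columns count as 'number'; booleans do not
numeric_cols = df.select_dtypes(include='number').columns.tolist()
print(numeric_cols)  # ['user_id', 'temp_score']

bool_cols = df.select_dtypes(include='bool').columns.tolist()
print(bool_cols)     # ['debug_flag']

# Everything that is neither numeric nor boolean (here: the text column)
other_cols = df.select_dtypes(exclude=['number', 'bool']).columns
df_clean = df.drop(columns=other_cols)
print(df_clean.columns.tolist())  # ['user_id', 'temp_score', 'debug_flag']
```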

Performance Considerations

For most DataFrames, the performance difference between methods is negligible. However, with large datasets or repeated operations, some patterns matter.

Memory Efficiency

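del and pop() modify the existing DataFrame instead of building a second one. inplace=True has the same visible effect, though it may still copy data internally (see The inplace Debate below). drop() without inplace returns a new DataFrame, and the original stays in memory for as long as something references it: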

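# Memory footprint before any deletion (deep=True also counts string contents)
print(f"Before: {df.memory_usage(deep=True).sum()} bytes")

# drop() returns a new DataFrame; the original df is still in memory
df_new = df.drop(columns='age')

# del modifies df in place; no second DataFrame is created
del df['age']
print(f"After: {df.memory_usage(deep=True).sum()} bytes")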

For DataFrames consuming significant memory, in-place operations can matter. But in most cases, the clarity of reassignment outweighs the memory cost.

Avoid Repeated Single-Column Drops

Don’t do this:

# Inefficient: creates multiple intermediate DataFrames
df = df.drop(columns='col1')
df = df.drop(columns='col2')
df = df.drop(columns='col3')

Do this instead:

# Efficient: single operation
df = df.drop(columns=['col1', 'col2', 'col3'])

The inplace Debate

Despite what you might expect, inplace=True doesn’t always avoid copying data internally. Pandas often creates a copy anyway and then assigns it back. The pandas development team has discussed deprecating inplace for this reason.

Stick with reassignment for clearer, more predictable code.

Summary Table

| Method | Syntax | Returns | In-Place | Multiple Columns | Best Use Case |
|---|---|---|---|---|---|
| drop() | df.drop(columns='col') | New DataFrame | No (by default) | Yes | General purpose, method chaining |
| drop() with inplace | df.drop(columns='col', inplace=True) | None | Yes | Yes | Legacy code (avoid in new code) |
| del | del df['col'] | Nothing | Yes | No | Quick interactive deletion |
| pop() | df.pop('col') | Removed column | Yes | No | When you need the deleted data |
| Column selection | df[['col1', 'col2']] | New DataFrame | No | N/A | Keeping few columns from many |
| filter() + drop() | df.drop(columns=df.filter(regex='pattern').columns) | New DataFrame | No | Yes | Pattern-based deletion |

Practical Recommendations

For most situations, use drop() with the columns parameter and reassignment:

df = df.drop(columns=['unwanted1', 'unwanted2'])

Use pop() when you’re extracting a column for separate use, like splitting features from a target variable. Use del only in interactive sessions where brevity matters more than clarity.

When you’re keeping a small subset of columns from a wide DataFrame, column selection is more readable than listing everything to drop:

# When keeping 3 columns from 50, this is clearer
df = df[['id', 'name', 'value']]

For pattern-based operations, combine filter() or list comprehensions with drop(). This handles real-world scenarios like removing all temporary columns, metadata fields, or columns matching a naming convention.
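If pattern-based cleanup comes up often, it can be wrapped in a small helper. The function below is a hypothetical sketch, not part of pandas; the name drop_matching and its signature are assumptions made for illustration:

```python
import re

import pandas as pd


def drop_matching(df, pattern):
    """Return a new DataFrame without columns whose names match the regex.

    Hypothetical helper for illustration; not a pandas API.
    """
    matched = [col for col in df.columns if re.search(pattern, str(col))]
    return df.drop(columns=matched)


df = pd.DataFrame({
    'user_id': [1, 2],
    'temp_score': [0.82, 0.91],
    'meta_created': ['2024-01-01', '2024-01-01'],
    'meta_version': [1, 1],
})

# Remove temporary and metadata columns in one call
df_clean = drop_matching(df, r'^(?:temp_|meta_)')
print(df_clean.columns.tolist())  # ['user_id']

# A pattern that matches nothing leaves every column in place
print(drop_matching(df, r'^zzz_').columns.tolist() == df.columns.tolist())  # True
```

Because the helper returns a new DataFrame, it can be used with reassignment or passed to pipe() inside a method chain.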
