Pandas - Move Column to First/Last Position

Key Insights

  • Use insert() with pop() to move columns to specific positions without creating unnecessary copies of your DataFrame
  • The reindex() method provides a declarative approach for repositioning multiple columns simultaneously
  • insert() and pop() change the DataFrame in place, while column-list selection and reindex() return a new DataFrame, so the in-place route generally needs less extra memory

Moving a Column to First Position

A common in-place way to move a column to the first position is to combine insert() and pop(). The pop() method removes the column from the DataFrame and returns it as a Series, while insert() places it at the specified index.

import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['NYC', 'LA', 'Chicago'],
    'salary': [70000, 80000, 90000]
})

# Move 'salary' to first position
col = df.pop('salary')
df.insert(0, 'salary', col)

print(df)

Output:

   salary     name  age     city
0   70000    Alice   25      NYC
1   80000      Bob   30       LA
2   90000  Charlie   35  Chicago

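This approach modifies the DataFrame in place rather than building a new one. The insert() method takes three arguments: the position index (0 for first), the column name, and the column data. Because insert() raises a ValueError when the column name already exists (unless allow_duplicates=True is passed), pop() has to remove the column first.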
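
The same pair of calls can place a column at any index, not only the first. A minimal sketch, assuming a hypothetical target position of 1 (the `target` variable is illustrative and not part of the original example):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['NYC', 'LA', 'Chicago'],
    'salary': [70000, 80000, 90000]
})

# Hypothetical target: make 'salary' the second column (index 1).
# The index is interpreted after pop() has already removed the column.
target = 1
col = df.pop('salary')
df.insert(target, col.name, col)

print(list(df.columns))  # ['name', 'salary', 'age', 'city']
```

Note that the position counts the columns that remain after pop(), which matters when the column is moved to the right of where it started.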

Moving a Column to Last Position

Moving a column to the last position follows the same pattern, but you use len(df.columns) as the insertion index:

df = pd.DataFrame({
    'id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['NYC', 'LA', 'Chicago']
})

# Move 'id' to last position
col = df.pop('id')
df.insert(len(df.columns), 'id', col)

print(df)

Output:

      name  age     city  id
0    Alice   25      NYC   1
1      Bob   30       LA   2
2  Charlie   35  Chicago   3

Alternatively, you can append the column directly without calculating the length:

# Move 'name' to last position
col = df.pop('name')
df[col.name] = col

print(df)

Using Column Lists for Reordering

For more complex reordering scenarios, build a list of column names in the order you want and use it to select the columns:

df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4, 5, 6],
    'c': [7, 8, 9],
    'd': [10, 11, 12]
})

# Move 'c' to first position
cols = ['c'] + [col for col in df.columns if col != 'c']
df = df[cols]

print(df)

Output:

   c  a  b   d
0  7  1  4  10
1  8  2  5  11
2  9  3  6  12

This method creates a new DataFrame object, which uses more memory but provides clearer intent when reordering multiple columns:

# Move 'd' to first and 'a' to last
cols = ['d', 'b', 'c', 'a']
df = df[cols]

print(df)

Reordering Multiple Columns Simultaneously

When you need to move several columns to the beginning or end, list comprehensions provide a clean solution:

df = pd.DataFrame({
    'id': [1, 2, 3],
    'timestamp': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'value': [100, 200, 300],
    'category': ['A', 'B', 'C'],
    'status': ['active', 'inactive', 'active']
})

# Move 'value' and 'category' to front
priority_cols = ['value', 'category']
remaining_cols = [col for col in df.columns if col not in priority_cols]
df = df[priority_cols + remaining_cols]

print(df)

Output:

   value category  id   timestamp   status
0    100        A   1  2024-01-01   active
1    200        B   2  2024-01-02  inactive
2    300        C   3  2024-01-03   active

To move columns to the end:

# Move 'id' and 'timestamp' to end
last_cols = ['id', 'timestamp']
first_cols = [col for col in df.columns if col not in last_cols]
df = df[first_cols + last_cols]

print(df)
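
The two list-comprehension patterns above can be folded into one helper. This is only a sketch: `reorder_columns` and its `front`/`back` parameters are hypothetical names chosen for illustration, not a pandas API.

```python
import pandas as pd

def reorder_columns(df, front=(), back=()):
    """Return a new DataFrame with `front` columns first and `back` columns last.

    Hypothetical helper for illustration; not part of pandas.
    """
    front, back = list(front), list(back)
    moved = set(front) | set(back)
    # Columns not named in either list keep their original relative order
    middle = [col for col in df.columns if col not in moved]
    return df[front + middle + back]

df = pd.DataFrame({
    'id': [1, 2, 3],
    'timestamp': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'value': [100, 200, 300],
    'category': ['A', 'B', 'C'],
    'status': ['active', 'inactive', 'active']
})

result = reorder_columns(df, front=['value'], back=['id'])
print(list(result.columns))
# ['value', 'timestamp', 'category', 'status', 'id']
```

Because the helper selects with a column list, it returns a new DataFrame and leaves the input unchanged; a name that is not in the DataFrame raises a KeyError.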

Using reindex() for Column Reordering

The reindex() method provides a declarative approach that’s particularly useful when working with column subsets:

df = pd.DataFrame({
    'x': [1, 2, 3],
    'y': [4, 5, 6],
    'z': [7, 8, 9]
})

# Move 'z' to first position
df = df.reindex(columns=['z', 'x', 'y'])

print(df)

Output:

   z  x  y
0  7  1  4
1  8  2  5
2  9  3  6

On its own, reindex() does not raise an error for a label that is not in the DataFrame; it adds that label as a new column filled with NaN. To skip missing columns instead, filter the desired order against df.columns first:

# Specify desired order; filter out columns that do not exist
desired_order = ['z', 'missing_col', 'x', 'y']
existing_cols = [col for col in desired_order if col in df.columns]
df = df.reindex(columns=existing_cols)
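
To make the behaviour with unknown labels concrete, here is a small sketch that compares reindex() with and without the filtering step. The names `with_missing` and `filtered` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3],
    'y': [4, 5, 6],
    'z': [7, 8, 9]
})

desired_order = ['z', 'missing_col', 'x', 'y']

# Without filtering: the unknown label becomes an all-NaN column
with_missing = df.reindex(columns=desired_order)
print(list(with_missing.columns))                # ['z', 'missing_col', 'x', 'y']
print(with_missing['missing_col'].isna().all())  # True

# With filtering: only columns that exist are kept
existing_cols = [col for col in desired_order if col in df.columns]
filtered = df.reindex(columns=existing_cols)
print(list(filtered.columns))                    # ['z', 'x', 'y']
```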

Practical Function for Column Movement

Encapsulate the logic in a reusable function for consistent column management:

def move_column(df, col_name, position='first'):
    """
    Move a column to first or last position.

    Note: the input DataFrame is modified in place and also returned.

    Parameters:
    -----------
    df : pd.DataFrame
        Input DataFrame
    col_name : str
        Name of column to move
    position : str
        Either 'first' or 'last'

    Returns:
    --------
    pd.DataFrame
        The same DataFrame object, with reordered columns
    """
    if col_name not in df.columns:
        raise ValueError(f"Column '{col_name}' not found in DataFrame")
    # Validate position before pop() so a bad value cannot drop the column
    if position not in ('first', 'last'):
        raise ValueError("position must be 'first' or 'last'")

    col = df.pop(col_name)

    if position == 'first':
        df.insert(0, col_name, col)
    else:
        df[col_name] = col

    return df

# Usage
df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4, 5, 6],
    'c': [7, 8, 9]
})

df = move_column(df, 'b', 'first')
print(df)
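
The function above calls pop() and insert() on the DataFrame it receives, so the caller's object is changed as well as returned. If that side effect is unwanted, one option is a variant built on column-list selection. The sketch below uses a hypothetical name, `moved_column`, and trades some extra memory for leaving the input untouched.

```python
import pandas as pd

def moved_column(df, col_name, position='first'):
    """Return a new DataFrame with col_name first or last; the input is untouched.

    Hypothetical non-mutating variant for illustration.
    """
    if col_name not in df.columns:
        raise ValueError(f"Column '{col_name}' not found in DataFrame")
    if position not in ('first', 'last'):
        raise ValueError("position must be 'first' or 'last'")

    others = [col for col in df.columns if col != col_name]
    cols = [col_name] + others if position == 'first' else others + [col_name]
    return df[cols]

original = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
result = moved_column(original, 'b', 'first')

print(list(result.columns))    # ['b', 'a', 'c']
print(list(original.columns))  # ['a', 'b', 'c']
```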

Performance Considerations

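For large DataFrames, the insert() and pop() combination is generally expected to be cheaper than selecting with a new column list, because it avoids building a second DataFrame. The script below times both approaches; results vary with the pandas version and the data layout: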

import pandas as pd
import time

# Create large DataFrame
df = pd.DataFrame({f'col_{i}': range(1000000) for i in range(50)})

# Method 1: insert/pop (perf_counter is the clock intended for timing)
start = time.perf_counter()
col = df.pop('col_25')
df.insert(0, 'col_25', col)
print(f"insert/pop: {time.perf_counter() - start:.4f}s")

# Method 2: column list (rebuild the DataFrame so both start from the same state)
df = pd.DataFrame({f'col_{i}': range(1000000) for i in range(50)})
start = time.perf_counter()
cols = ['col_25'] + [col for col in df.columns if col != 'col_25']
df = df[cols]
print(f"column list: {time.perf_counter() - start:.4f}s")

The insert()/pop() method modifies the DataFrame in place, avoiding the overhead of creating a new DataFrame object. The gap tends to grow with the size of the DataFrame, so measure on your own data before relying on it.
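
A single timed run can be noisy. As a rough sketch, the comparison can be repeated with the standard-library timeit module and the best run reported. The helper names and the frame size here are assumptions made for illustration, and no particular timings are implied.

```python
import timeit
import pandas as pd

def make_frame(rows=100_000, cols=20):
    # Sizes are arbitrary assumptions; scale them up to match your data
    return pd.DataFrame({f'col_{i}': range(rows) for i in range(cols)})

def via_insert_pop(df):
    # In place: the same DataFrame object comes back
    col = df.pop('col_10')
    df.insert(0, 'col_10', col)
    return df

def via_column_list(df):
    # Selection: a new DataFrame object is returned
    cols = ['col_10'] + [c for c in df.columns if c != 'col_10']
    return df[cols]

for label, func in [('insert/pop', via_insert_pop), ('column list', via_column_list)]:
    times = []
    for _ in range(5):
        # Fresh frame per run so construction cost and earlier
        # mutations stay out of the measurement
        df = make_frame()
        times.append(timeit.timeit(lambda: func(df), number=1))
    print(f"{label}: best of 5 = {min(times):.4f}s")
```

Taking the minimum of several runs reduces the influence of background activity on the machine; it does not remove differences between pandas versions.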
