Pandas - Add Row to DataFrame (append/concat)

Key Insights

  • The append() method has been deprecated since Pandas 1.4.0 and was removed entirely in 2.0; use pd.concat() or loc[] indexing to add rows to DataFrames
  • pd.concat() is the recommended approach for adding multiple rows, while loc[] works best for single row additions with minimal overhead
  • Understanding the performance implications of different methods is critical—repeated concatenations create new objects and should be avoided in loops

Why append() is Deprecated

Pandas deprecated the append() method because it was inefficient and created confusion about in-place operations. The method always returned a new DataFrame, leading developers to mistakenly chain multiple append calls in loops, creating severe performance bottlenecks.

import pandas as pd

# Old deprecated approach - DO NOT USE
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
new_row = pd.DataFrame({'A': [5], 'B': [6]})
df = df.append(new_row, ignore_index=True)  # FutureWarning in 1.4; removed in pandas 2.0

The deprecation forces developers toward more explicit and performant patterns using pd.concat() or direct indexing methods.
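As a point of comparison, the deprecated call above maps directly onto pd.concat(); this sketch shows the drop-in replacement for the same data:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
new_row = pd.DataFrame({'A': [5], 'B': [6]})

# Replacement for df.append(new_row, ignore_index=True)
df = pd.concat([df, new_row], ignore_index=True)

print(df)
#    A  B
# 0  1  3
# 1  2  4
# 2  5  6
```

The only real change is that the rows to add must already be a DataFrame (or Series), which pd.concat() can combine with the original in a single call.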

Adding a Single Row with loc[]

For adding a single row to an existing DataFrame, loc[] indexing provides the most straightforward and performant solution. The assignment modifies the DataFrame in-place; using len(df) as the label appends a new row at the next position, provided the DataFrame has a default RangeIndex (with a custom index, len(df) could collide with an existing label and overwrite that row instead).

import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'age': [25, 30],
    'city': ['New York', 'London']
})

# Add a single row using loc[]
df.loc[len(df)] = ['Charlie', 35, 'Paris']

print(df)
#       name  age      city
# 0    Alice   25  New York
# 1      Bob   30    London
# 2  Charlie   35     Paris

You can also use a specific index value:

df.loc[10] = ['David', 28, 'Berlin']

print(df)
#       name  age      city
# 0    Alice   25  New York
# 1      Bob   30    London
# 2  Charlie   35     Paris
# 10   David   28    Berlin

When adding rows with dictionary values, ensure all columns are present or handle missing values explicitly:

df = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'age': [25, 30],
    'city': ['New York', 'London']
})

# Add row from dictionary
new_data = {'name': 'Eve', 'age': 27, 'city': 'Tokyo'}
df.loc[len(df)] = new_data

print(df)
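One defensive sketch for the partial-dictionary case: build the row explicitly from the DataFrame's columns so any missing key becomes None rather than relying on implicit behavior (the partial record and dict-comprehension helper here are illustrative, not a pandas API):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'age': [25, 30],
    'city': ['New York', 'London']
})

# Partial record: 'city' is missing
partial = {'name': 'Frank', 'age': 41}

# Fill every expected column explicitly; absent keys default to None
row = {col: partial.get(col) for col in df.columns}
df.loc[len(df)] = row

print(df)
```

The appended row carries None/NaN in the city column, which you can then handle deliberately (e.g. with fillna()) instead of discovering the gap later.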

Adding Multiple Rows with concat()

The pd.concat() function is the recommended approach for adding one or more rows to a DataFrame. It creates a new DataFrame object, so assign the result back to your variable.

import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse'],
    'price': [1200, 25],
    'stock': [15, 150]
})

# Create new rows as a DataFrame
new_rows = pd.DataFrame({
    'product': ['Keyboard', 'Monitor'],
    'price': [75, 300],
    'stock': [80, 45]
})

# Concatenate and reset index
df = pd.concat([df, new_rows], ignore_index=True)

print(df)
#     product  price  stock
# 0    Laptop   1200     15
# 1     Mouse     25    150
# 2  Keyboard     75     80
# 3   Monitor    300     45

The ignore_index=True parameter resets the index to a sequential range. Without it, the original indices are preserved:

df = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
new_rows = pd.DataFrame({'A': [3, 4]}, index=[0, 1])

df = pd.concat([df, new_rows])
print(df)
#    A
# 0  1
# 1  2
# 0  3  # Duplicate index!
# 1  4

# With ignore_index=True
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
#    A
# 0  1
# 1  2
# 2  3
# 3  4

Adding Rows from Dictionaries

When working with dictionary data, convert it to a DataFrame before concatenating:

df = pd.DataFrame({
    'user_id': [101, 102],
    'username': ['alice', 'bob'],
    'score': [850, 920]
})

# Single dictionary as a row
new_user = {'user_id': 103, 'username': 'charlie', 'score': 780}
df = pd.concat([df, pd.DataFrame([new_user])], ignore_index=True)

# Multiple dictionaries
new_users = [
    {'user_id': 104, 'username': 'david', 'score': 890},
    {'user_id': 105, 'username': 'eve', 'score': 950}
]
df = pd.concat([df, pd.DataFrame(new_users)], ignore_index=True)

print(df)
#    user_id username  score
# 0      101    alice    850
# 1      102      bob    920
# 2      103  charlie    780
# 3      104    david    890
# 4      105      eve    950

Performance Considerations

Never use concatenation or row addition inside loops. Each operation creates a new DataFrame object, leading to O(n²) time complexity:

import pandas as pd
import time

# BAD: Concatenating in a loop
start = time.time()
df = pd.DataFrame(columns=['A', 'B', 'C'])
for i in range(1000):
    new_row = pd.DataFrame({'A': [i], 'B': [i*2], 'C': [i*3]})
    df = pd.concat([df, new_row], ignore_index=True)
print(f"Loop concat: {time.time() - start:.3f}s")

# GOOD: Build list then create DataFrame
start = time.time()
rows = []
for i in range(1000):
    rows.append({'A': i, 'B': i*2, 'C': i*3})
df = pd.DataFrame(rows)
print(f"List then DataFrame: {time.time() - start:.3f}s")

Exact timings vary by system, but the loop concatenation is typically orders of magnitude slower than the list approach, and the gap widens as the number of rows grows.
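The same pitfall applies to row-by-row loc[] assignment, which also grows the DataFrame one row at a time. A quick sketch you can time on your own machine:

```python
import pandas as pd
import time

# Also avoid: growing a DataFrame one row at a time with loc[]
start = time.time()
df = pd.DataFrame(columns=['A', 'B', 'C'])
for i in range(1000):
    df.loc[len(df)] = [i, i * 2, i * 3]
print(f"Loop loc: {time.time() - start:.3f}s")
```

It is usually faster than concat-in-a-loop but still far slower than collecting rows in a list first, so the list-then-DataFrame pattern remains the right default.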

Handling Missing Columns

When adding rows with mismatched columns, Pandas fills missing values with NaN:

df = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'age': [25, 30]
})

# New row with extra column
new_row = pd.DataFrame({
    'name': ['Charlie'],
    'age': [35],
    'city': ['Paris']
})

df = pd.concat([df, new_row], ignore_index=True)

print(df)
#       name  age   city
# 0    Alice   25    NaN
# 1      Bob   30    NaN
# 2  Charlie   35  Paris

To avoid this, explicitly define columns or use fillna():

# Define all columns upfront
df = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'age': [25, 30],
    'city': [None, None]
})

# Or fill missing values
df = pd.concat([df, new_row], ignore_index=True).fillna('Unknown')

Adding Rows at Specific Positions

To insert rows at specific positions, use pd.concat() with slicing:

df = pd.DataFrame({
    'id': [1, 2, 4, 5],
    'value': [10, 20, 40, 50]
})

# Insert row at index 2
new_row = pd.DataFrame({'id': [3], 'value': [30]})

df = pd.concat([
    df.iloc[:2],      # First 2 rows
    new_row,          # New row
    df.iloc[2:]       # Remaining rows
], ignore_index=True)

print(df)
#    id  value
# 0   1     10
# 1   2     20
# 2   3     30
# 3   4     40
# 4   5     50

Practical Pattern: Batch Processing

When processing data in batches, accumulate rows in a list and create a single DataFrame:

import pandas as pd

def process_batch(data_source):
    """Process data in batches and return DataFrame."""
    rows = []
    
    for record in data_source:
        # Process each record
        processed = {
            'id': record['id'],
            'value': record['raw_value'] * 2,
            'category': record['type'].upper()
        }
        rows.append(processed)
    
    return pd.DataFrame(rows)

# Simulate data source
data = [
    {'id': 1, 'raw_value': 10, 'type': 'a'},
    {'id': 2, 'raw_value': 20, 'type': 'b'},
    {'id': 3, 'raw_value': 30, 'type': 'a'}
]

result = process_batch(data)
print(result)
#    id  value category
# 0   1     20        A
# 1   2     40        B
# 2   3     60        A

This pattern is efficient, readable, and avoids the performance pitfalls of repeated concatenation. For adding rows to existing DataFrames, use pd.concat() once after collecting all new data rather than incrementally updating the DataFrame.
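When the new rows must be appended to an existing DataFrame, the same idea applies: collect first, concatenate once. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2], 'value': [10, 20]})

# Collect all new rows as dictionaries before touching the DataFrame
new_rows = [{'id': i, 'value': i * 10} for i in range(3, 6)]

# Single concat at the end instead of one per row
df = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)

print(df)
#    id  value
# 0   1     10
# 1   2     20
# 2   3     30
# 3   4     40
# 4   5     50
```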
