Pandas - Write DataFrame to Excel (to_excel)

Key Insights

The to_excel() method writes DataFrames to Excel files with extensive formatting options including sheet names, index control, and column selection
Use ExcelWriter for writing multiple DataFrames to different sheets in a single workbook with shared formatting and performance optimization
Advanced features include conditional formatting, column width adjustment, and formula integration through the xlsxwriter or openpyxl engines

Basic DataFrame to Excel Export

The to_excel() method provides a straightforward way to export pandas DataFrames to Excel files. The method requires the openpyxl or xlsxwriter library as the underlying engine.

import pandas as pd

# Create sample DataFrame
df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'price': [1200, 25, 75, 350],
    'quantity': [5, 50, 30, 10]
})

# Basic export
df.to_excel('products.xlsx')

# Export without index
df.to_excel('products_no_index.xlsx', index=False)

# Export without header
df.to_excel('products_no_header.xlsx', header=False)

# Export specific columns
df.to_excel('products_partial.xlsx', columns=['product', 'price'], index=False)

The index=False parameter is commonly used to exclude the DataFrame’s index from the output, which is typically desired when the index doesn’t contain meaningful data.

Specifying Sheet Names and Starting Position

Control where your data appears within the Excel workbook by specifying sheet names and cell positions.

import pandas as pd

df = pd.DataFrame({
    'region': ['North', 'South', 'East', 'West'],
    'revenue': [45000, 38000, 52000, 41000]
})

# Custom sheet name
df.to_excel('revenue.xlsx', sheet_name='Q1_Revenue', index=False)

# Start writing from specific cell
df.to_excel('revenue_positioned.xlsx', 
            sheet_name='Data',
            startrow=2, 
            startcol=1,
            index=False)

The startrow and startcol parameters allow you to position your data anywhere in the worksheet, useful when creating reports with headers or combining multiple data sections.

Writing Multiple DataFrames to One Workbook

Use ExcelWriter to write multiple DataFrames to different sheets within a single Excel file.

import pandas as pd

sales_df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar'],
    'sales': [15000, 18000, 22000]
})

expenses_df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar'],
    'expenses': [8000, 9500, 11000]
})

profit_df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar'],
    'profit': [7000, 8500, 11000]
})

# Write multiple sheets
with pd.ExcelWriter('financial_report.xlsx', engine='xlsxwriter') as writer:
    sales_df.to_excel(writer, sheet_name='Sales', index=False)
    expenses_df.to_excel(writer, sheet_name='Expenses', index=False)
    profit_df.to_excel(writer, sheet_name='Profit', index=False)

The context manager ensures proper file closure and resource cleanup. The engine parameter specifies which library to use for writing.

Formatting with XlsxWriter Engine

The xlsxwriter engine provides extensive formatting capabilities including cell formatting, conditional formatting, and charts.

import pandas as pd

df = pd.DataFrame({
    'employee': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'salary': [75000, 68000, 82000, 71000],
    'bonus': [5000, 3500, 6500, 4200]
})

with pd.ExcelWriter('formatted_report.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Salaries', index=False)
    
    workbook = writer.book
    worksheet = writer.sheets['Salaries']
    
    # Define formats
    currency_format = workbook.add_format({'num_format': '$#,##0'})
    header_format = workbook.add_format({
        'bold': True,
        'bg_color': '#4472C4',
        'font_color': 'white',
        'border': 1
    })
    
    # Apply currency format to salary and bonus columns
    worksheet.set_column('B:C', 12, currency_format)
    
    # Format header row
    for col_num, value in enumerate(df.columns.values):
        worksheet.write(0, col_num, value, header_format)
    
    # Set column widths
    worksheet.set_column('A:A', 15)

This approach separates data writing from formatting, giving you full control over the Excel output appearance.

Conditional Formatting

Apply conditional formatting rules to highlight specific data patterns automatically.

import pandas as pd

df = pd.DataFrame({
    'product': ['Widget A', 'Widget B', 'Widget C', 'Widget D'],
    'sales': [1200, 800, 1500, 950],
    'target': [1000, 1000, 1000, 1000]
})

with pd.ExcelWriter('sales_analysis.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Performance', index=False)
    
    workbook = writer.book
    worksheet = writer.sheets['Performance']
    
    # Highlight sales above target in green
    green_format = workbook.add_format({'bg_color': '#C6EFCE', 'font_color': '#006100'})
    
    # Highlight sales below target in red
    red_format = workbook.add_format({'bg_color': '#FFC7CE', 'font_color': '#9C0006'})
    
    # Apply conditional formatting to sales column (B2:B5)
    worksheet.conditional_format('B2:B5', {
        'type': 'cell',
        'criteria': '>=',
        'value': 1000,
        'format': green_format
    })
    
    worksheet.conditional_format('B2:B5', {
        'type': 'cell',
        'criteria': '<',
        'value': 1000,
        'format': red_format
    })

Appending to Existing Excel Files

When working with existing Excel files, you can append new sheets while preserving existing data using openpyxl.

import pandas as pd
from openpyxl import load_workbook

df_new = pd.DataFrame({
    'quarter': ['Q4'],
    'revenue': [58000]
})

# Append to existing file
with pd.ExcelWriter('revenue.xlsx', 
                    engine='openpyxl', 
                    mode='a',
                    if_sheet_exists='replace') as writer:
    df_new.to_excel(writer, sheet_name='Q4_Revenue', index=False)

The mode='a' parameter opens the file in append mode. The if_sheet_exists parameter controls behavior when a sheet name already exists: 'replace', 'new', or 'overlay'.

Writing Large DataFrames Efficiently

For large DataFrames, optimize memory usage and performance by controlling the engine and using chunking strategies.

import pandas as pd
import numpy as np

# Create large DataFrame
large_df = pd.DataFrame({
    'id': range(100000),
    'value': np.random.randn(100000),
    'category': np.random.choice(['A', 'B', 'C'], 100000)
})

# Use xlsxwriter for better performance with large datasets
with pd.ExcelWriter('large_dataset.xlsx', engine='xlsxwriter') as writer:
    large_df.to_excel(writer, sheet_name='Data', index=False)
    
    # Disable constant memory optimization if needed
    workbook = writer.book
    workbook.use_zip64()

For extremely large datasets exceeding Excel’s row limit (1,048,576 rows), consider splitting into multiple sheets:

chunk_size = 1000000
num_chunks = len(large_df) // chunk_size + 1

with pd.ExcelWriter('chunked_data.xlsx', engine='xlsxwriter') as writer:
    for i in range(num_chunks):
        start_idx = i * chunk_size
        end_idx = min((i + 1) * chunk_size, len(large_df))
        chunk_df = large_df.iloc[start_idx:end_idx]
        chunk_df.to_excel(writer, sheet_name=f'Data_{i+1}', index=False)

Handling DateTime and Numeric Formatting

Properly format datetime and numeric columns for Excel compatibility.

import pandas as pd
from datetime import datetime, timedelta

df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=5),
    'percentage': [0.15, 0.23, 0.18, 0.31, 0.27],
    'amount': [1234.56, 2345.67, 3456.78, 4567.89, 5678.90]
})

with pd.ExcelWriter('formatted_data.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Data', index=False)
    
    workbook = writer.book
    worksheet = writer.sheets['Data']
    
    # Format date column
    date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
    worksheet.set_column('A:A', 12, date_format)
    
    # Format percentage column
    percent_format = workbook.add_format({'num_format': '0.0%'})
    worksheet.set_column('B:B', 12, percent_format)
    
    # Format amount column
    money_format = workbook.add_format({'num_format': '$#,##0.00'})
    worksheet.set_column('C:C', 12, money_format)

This ensures that Excel displays your data with appropriate formatting, making it immediately usable for analysis and reporting without manual adjustments.