Pandas - Write DataFrame to Excel (to_excel)
The `to_excel()` method provides a straightforward way to export pandas DataFrames to Excel files. The method requires the `openpyxl` or `xlsxwriter` library as the underlying engine.
Key Insights
- The
to_excel()method writes DataFrames to Excel files with extensive formatting options including sheet names, index control, and column selection - Use
ExcelWriterfor writing multiple DataFrames to different sheets in a single workbook with shared formatting and performance optimization - Advanced features include conditional formatting, column width adjustment, and formula integration through the
xlsxwriteroropenpyxlengines
Basic DataFrame to Excel Export
The to_excel() method provides a straightforward way to export pandas DataFrames to Excel files. The method requires the openpyxl or xlsxwriter library as the underlying engine.
import pandas as pd
# Create sample DataFrame
df = pd.DataFrame({
'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
'price': [1200, 25, 75, 350],
'quantity': [5, 50, 30, 10]
})
# Basic export
df.to_excel('products.xlsx')
# Export without index
df.to_excel('products_no_index.xlsx', index=False)
# Export without header
df.to_excel('products_no_header.xlsx', header=False)
# Export specific columns
df.to_excel('products_partial.xlsx', columns=['product', 'price'], index=False)
The index=False parameter is commonly used to exclude the DataFrame’s index from the output, which is typically desired when the index doesn’t contain meaningful data.
Specifying Sheet Names and Starting Position
Control where your data appears within the Excel workbook by specifying sheet names and cell positions.
import pandas as pd
df = pd.DataFrame({
'region': ['North', 'South', 'East', 'West'],
'revenue': [45000, 38000, 52000, 41000]
})
# Custom sheet name
df.to_excel('revenue.xlsx', sheet_name='Q1_Revenue', index=False)
# Start writing from specific cell
df.to_excel('revenue_positioned.xlsx',
sheet_name='Data',
startrow=2,
startcol=1,
index=False)
The startrow and startcol parameters allow you to position your data anywhere in the worksheet, useful when creating reports with headers or combining multiple data sections.
Writing Multiple DataFrames to One Workbook
Use ExcelWriter to write multiple DataFrames to different sheets within a single Excel file.
import pandas as pd
sales_df = pd.DataFrame({
'month': ['Jan', 'Feb', 'Mar'],
'sales': [15000, 18000, 22000]
})
expenses_df = pd.DataFrame({
'month': ['Jan', 'Feb', 'Mar'],
'expenses': [8000, 9500, 11000]
})
profit_df = pd.DataFrame({
'month': ['Jan', 'Feb', 'Mar'],
'profit': [7000, 8500, 11000]
})
# Write multiple sheets
with pd.ExcelWriter('financial_report.xlsx', engine='xlsxwriter') as writer:
sales_df.to_excel(writer, sheet_name='Sales', index=False)
expenses_df.to_excel(writer, sheet_name='Expenses', index=False)
profit_df.to_excel(writer, sheet_name='Profit', index=False)
The context manager ensures proper file closure and resource cleanup. The engine parameter specifies which library to use for writing.
Formatting with XlsxWriter Engine
The xlsxwriter engine provides extensive formatting capabilities including cell formatting, conditional formatting, and charts.
import pandas as pd
df = pd.DataFrame({
'employee': ['Alice', 'Bob', 'Charlie', 'Diana'],
'salary': [75000, 68000, 82000, 71000],
'bonus': [5000, 3500, 6500, 4200]
})
with pd.ExcelWriter('formatted_report.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer, sheet_name='Salaries', index=False)
workbook = writer.book
worksheet = writer.sheets['Salaries']
# Define formats
currency_format = workbook.add_format({'num_format': '$#,##0'})
header_format = workbook.add_format({
'bold': True,
'bg_color': '#4472C4',
'font_color': 'white',
'border': 1
})
# Apply currency format to salary and bonus columns
worksheet.set_column('B:C', 12, currency_format)
# Format header row
for col_num, value in enumerate(df.columns.values):
worksheet.write(0, col_num, value, header_format)
# Set column widths
worksheet.set_column('A:A', 15)
This approach separates data writing from formatting, giving you full control over the Excel output appearance.
Conditional Formatting
Apply conditional formatting rules to highlight specific data patterns automatically.
import pandas as pd
df = pd.DataFrame({
'product': ['Widget A', 'Widget B', 'Widget C', 'Widget D'],
'sales': [1200, 800, 1500, 950],
'target': [1000, 1000, 1000, 1000]
})
with pd.ExcelWriter('sales_analysis.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer, sheet_name='Performance', index=False)
workbook = writer.book
worksheet = writer.sheets['Performance']
# Highlight sales above target in green
green_format = workbook.add_format({'bg_color': '#C6EFCE', 'font_color': '#006100'})
# Highlight sales below target in red
red_format = workbook.add_format({'bg_color': '#FFC7CE', 'font_color': '#9C0006'})
# Apply conditional formatting to sales column (B2:B5)
worksheet.conditional_format('B2:B5', {
'type': 'cell',
'criteria': '>=',
'value': 1000,
'format': green_format
})
worksheet.conditional_format('B2:B5', {
'type': 'cell',
'criteria': '<',
'value': 1000,
'format': red_format
})
Appending to Existing Excel Files
When working with existing Excel files, you can append new sheets while preserving existing data using openpyxl.
import pandas as pd
from openpyxl import load_workbook
df_new = pd.DataFrame({
'quarter': ['Q4'],
'revenue': [58000]
})
# Append to existing file
with pd.ExcelWriter('revenue.xlsx',
engine='openpyxl',
mode='a',
if_sheet_exists='replace') as writer:
df_new.to_excel(writer, sheet_name='Q4_Revenue', index=False)
The mode='a' parameter opens the file in append mode. The if_sheet_exists parameter controls behavior when a sheet name already exists: 'replace', 'new', or 'overlay'.
Writing Large DataFrames Efficiently
For large DataFrames, optimize memory usage and performance by controlling the engine and using chunking strategies.
import pandas as pd
import numpy as np
# Create large DataFrame
large_df = pd.DataFrame({
'id': range(100000),
'value': np.random.randn(100000),
'category': np.random.choice(['A', 'B', 'C'], 100000)
})
# Use xlsxwriter for better performance with large datasets
with pd.ExcelWriter('large_dataset.xlsx', engine='xlsxwriter') as writer:
large_df.to_excel(writer, sheet_name='Data', index=False)
# Disable constant memory optimization if needed
workbook = writer.book
workbook.use_zip64()
For extremely large datasets exceeding Excel’s row limit (1,048,576 rows), consider splitting into multiple sheets:
chunk_size = 1000000
num_chunks = len(large_df) // chunk_size + 1
with pd.ExcelWriter('chunked_data.xlsx', engine='xlsxwriter') as writer:
for i in range(num_chunks):
start_idx = i * chunk_size
end_idx = min((i + 1) * chunk_size, len(large_df))
chunk_df = large_df.iloc[start_idx:end_idx]
chunk_df.to_excel(writer, sheet_name=f'Data_{i+1}', index=False)
Handling DateTime and Numeric Formatting
Properly format datetime and numeric columns for Excel compatibility.
import pandas as pd
from datetime import datetime, timedelta
df = pd.DataFrame({
'date': pd.date_range('2024-01-01', periods=5),
'percentage': [0.15, 0.23, 0.18, 0.31, 0.27],
'amount': [1234.56, 2345.67, 3456.78, 4567.89, 5678.90]
})
with pd.ExcelWriter('formatted_data.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer, sheet_name='Data', index=False)
workbook = writer.book
worksheet = writer.sheets['Data']
# Format date column
date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
worksheet.set_column('A:A', 12, date_format)
# Format percentage column
percent_format = workbook.add_format({'num_format': '0.0%'})
worksheet.set_column('B:B', 12, percent_format)
# Format amount column
money_format = workbook.add_format({'num_format': '$#,##0.00'})
worksheet.set_column('C:C', 12, money_format)
This ensures that Excel displays your data with appropriate formatting, making it immediately usable for analysis and reporting without manual adjustments.