How to Write to Excel in Pandas
Pandas makes exporting data to Excel straightforward, but the simplicity of `df.to_excel()` hides a wealth of options that can transform your output from a raw data dump into a polished,...
Key Insights
- Use
to_excel()for simple exports, but switch toExcelWriterwhen you need multiple sheets, custom formatting, or control over the Excel engine - Choose
openpyxlfor reading/modifying existing files andxlsxwriterfor creating new files with advanced formatting—never mix them up - Always set
index=Falseunless your index contains meaningful data; Excel users rarely want to see pandas’ default integer index
Pandas makes exporting data to Excel straightforward, but the simplicity of df.to_excel() hides a wealth of options that can transform your output from a raw data dump into a polished, business-ready spreadsheet. While CSV remains the go-to format for data pipelines and technical workflows, Excel output shines when your audience includes analysts, managers, or anyone who expects to manipulate data in a familiar spreadsheet environment.
This guide covers everything from basic exports to multi-sheet workbooks with custom formatting. By the end, you’ll know exactly which parameters and engines to use for any Excel export scenario.
Basic Excel Export with to_excel()
The simplest way to write a DataFrame to Excel requires just one line of code. Pandas handles the file creation, sheet naming, and data serialization automatically.
import pandas as pd
# Create sample data
df = pd.DataFrame({
'product': ['Widget A', 'Widget B', 'Widget C'],
'price': [29.99, 49.99, 19.99],
'quantity': [100, 75, 200]
})
# Write to Excel
df.to_excel('products.xlsx')
This creates an Excel file with a single sheet named “Sheet1” containing your data. The DataFrame’s index appears in column A, and the column headers occupy the first row.
Before running this code, ensure you have an Excel engine installed. Pandas doesn’t include one by default:
pip install openpyxl # Recommended for most use cases
# or
pip install xlsxwriter # Better for formatting-heavy exports
The default engine depends on what’s installed, but you can specify one explicitly with the engine parameter when needed.
Customizing Output Options
The basic export works, but real-world scenarios demand more control. The to_excel() method accepts numerous parameters that let you fine-tune the output.
import pandas as pd
df = pd.DataFrame({
'product': ['Widget A', 'Widget B', 'Widget C'],
'price': [29.99, 49.99, 19.99],
'quantity': [100, 75, 200],
'category': ['Electronics', 'Electronics', 'Hardware']
})
# Export with customizations
df.to_excel(
'products_custom.xlsx',
index=False, # Remove the index column
sheet_name='Inventory', # Custom sheet name
columns=['product', 'price', 'quantity'], # Select specific columns
header=True, # Include column headers (default)
startrow=2, # Start data at row 3 (0-indexed)
startcol=1 # Start data at column B
)
The index=False parameter deserves special attention. Unless your index contains meaningful data like dates or unique identifiers, you almost always want to exclude it. Excel users find the unexplained integer column confusing.
The startrow and startcol parameters prove useful when you need to leave room for titles, logos, or summary statistics above or beside your data. Just remember they’re zero-indexed: startrow=2 means your data begins on row 3 in Excel terms.
You can also control headers with more granularity:
# Custom header names
df.to_excel(
'products_renamed.xlsx',
index=False,
header=['Product Name', 'Unit Price', 'Stock Level', 'Department']
)
Writing Multiple Sheets with ExcelWriter
Single-sheet exports only get you so far. Business reports often require multiple datasets in one workbook—perhaps sales by region on separate sheets, or raw data alongside summary statistics.
The ExcelWriter context manager handles this elegantly:
import pandas as pd
# Create multiple DataFrames
sales_q1 = pd.DataFrame({
'region': ['North', 'South', 'East', 'West'],
'revenue': [150000, 120000, 180000, 95000]
})
sales_q2 = pd.DataFrame({
'region': ['North', 'South', 'East', 'West'],
'revenue': [165000, 135000, 175000, 110000]
})
summary = pd.DataFrame({
'metric': ['Total Q1', 'Total Q2', 'Growth'],
'value': [545000, 585000, '7.3%']
})
# Write all DataFrames to one workbook
with pd.ExcelWriter('quarterly_report.xlsx', engine='openpyxl') as writer:
sales_q1.to_excel(writer, sheet_name='Q1 Sales', index=False)
sales_q2.to_excel(writer, sheet_name='Q2 Sales', index=False)
summary.to_excel(writer, sheet_name='Summary', index=False)
The context manager ensures the file is properly saved and closed, even if an error occurs. Each to_excel() call within the block adds a new sheet to the same workbook.
You can also access the underlying workbook and worksheet objects for advanced manipulation:
with pd.ExcelWriter('report_with_metadata.xlsx', engine='openpyxl') as writer:
df.to_excel(writer, sheet_name='Data', index=False)
# Access the workbook to set properties
workbook = writer.book
workbook.properties.title = 'Q2 Sales Report'
workbook.properties.creator = 'Data Team'
Formatting and Styling
Raw data exports rarely impress stakeholders. The xlsxwriter engine provides extensive formatting capabilities that can transform your output into presentation-ready spreadsheets.
import pandas as pd
df = pd.DataFrame({
'product': ['Widget A', 'Widget B', 'Widget C', 'Widget D'],
'revenue': [15000.50, 23450.75, 8900.25, 31200.00],
'units_sold': [500, 780, 296, 1040],
'margin': [0.15, 0.22, 0.08, 0.25]
})
with pd.ExcelWriter('formatted_report.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer, sheet_name='Sales', index=False)
# Access xlsxwriter objects
workbook = writer.book
worksheet = writer.sheets['Sales']
# Define formats
currency_format = workbook.add_format({
'num_format': '$#,##0.00',
'align': 'right'
})
percent_format = workbook.add_format({
'num_format': '0.0%',
'align': 'center'
})
header_format = workbook.add_format({
'bold': True,
'bg_color': '#4472C4',
'font_color': 'white',
'align': 'center'
})
# Apply formats to columns (column index, width, format)
worksheet.set_column('A:A', 15) # Product column width
worksheet.set_column('B:B', 12, currency_format) # Revenue with currency
worksheet.set_column('C:C', 12) # Units sold
worksheet.set_column('D:D', 10, percent_format) # Margin as percentage
# Format header row
for col_num, value in enumerate(df.columns):
worksheet.write(0, col_num, value, header_format)
# Add conditional formatting for margin column
worksheet.conditional_format('D2:D5', {
'type': 'cell',
'criteria': '>=',
'value': 0.20,
'format': workbook.add_format({'bg_color': '#C6EFCE'})
})
This example demonstrates currency formatting, percentage display, column width adjustment, styled headers, and conditional formatting that highlights margins above 20%. The xlsxwriter engine excels at these tasks, making it the preferred choice for reports destined for executive review.
Handling Common Issues
Excel exports in pandas come with several gotchas that can derail your workflow. Here’s how to handle the most frequent problems.
Missing Dependencies
The error ModuleNotFoundError: No module named 'openpyxl' means you need to install an engine:
pip install openpyxl xlsxwriter
Install both to maximize flexibility. Use openpyxl for reading and modifying existing files, xlsxwriter for creating new files with complex formatting.
Appending to Existing Files
By default, to_excel() overwrites existing files. To add a new sheet to an existing workbook, use openpyxl with the mode='a' parameter:
import pandas as pd
new_data = pd.DataFrame({
'month': ['July', 'August', 'September'],
'sales': [42000, 38000, 51000]
})
# Append a new sheet to an existing file
with pd.ExcelWriter(
'quarterly_report.xlsx',
engine='openpyxl',
mode='a',
if_sheet_exists='replace' # Options: 'error', 'new', 'replace', 'overlay'
) as writer:
new_data.to_excel(writer, sheet_name='Q3 Sales', index=False)
The if_sheet_exists parameter controls behavior when the sheet name already exists. Use 'replace' to overwrite the sheet, 'new' to create a uniquely named sheet, or 'overlay' to write over existing cells without clearing the sheet first.
Large Dataset Performance
Excel files have a row limit of 1,048,576 rows. For large datasets approaching this limit, consider chunking your data or using a different format entirely:
# For very large datasets, consider chunking
chunk_size = 500000
for i, chunk in enumerate(range(0, len(large_df), chunk_size)):
subset = large_df.iloc[chunk:chunk + chunk_size]
subset.to_excel(f'data_part_{i+1}.xlsx', index=False)
For datasets that fit within limits but export slowly, xlsxwriter generally outperforms openpyxl for write operations. You can also disable the default column width calculation with engine_kwargs:
df.to_excel('large_file.xlsx', engine='xlsxwriter')
Conclusion
Pandas provides robust Excel export capabilities that scale from quick data dumps to polished business reports. For simple exports, to_excel() with index=False handles most needs. When you require multiple sheets or formatting, reach for ExcelWriter with the appropriate engine.
Choose openpyxl when you need to read existing files or append sheets to workbooks. Choose xlsxwriter when creating new files with extensive formatting requirements—it’s faster and offers more styling options. Never try to use xlsxwriter to modify existing files; it only supports creating new ones.
With these tools and patterns, you can automate Excel report generation that previously required manual spreadsheet work, saving hours of tedious formatting while producing consistent, professional output.