Pandas - Get Day of Week from Datetime

• Pandas provides multiple methods to extract day of week from datetime objects, including `dt.dayofweek`, `dt.weekday()`, and `dt.day_name()`, each serving different formatting needs

Key Insights

• Pandas provides multiple methods to extract day of week from datetime objects, including dt.dayofweek, dt.weekday(), and dt.day_name(), each serving different formatting needs • The dt.dayofweek and dt.weekday() attributes return integers (0-6), while dt.day_name() returns the full day name as a string, with Monday as 0 in both numeric approaches • Performance considerations matter when working with large datasets—vectorized operations using dt accessor are significantly faster than applying custom functions row-by-row

Understanding Pandas Datetime Day of Week Extraction

Extracting the day of week from datetime columns is a common requirement in time series analysis, data aggregation, and business intelligence workflows. Pandas offers built-in functionality through the datetime accessor that makes this operation straightforward and efficient.

The datetime accessor (dt) provides several attributes and methods specifically designed for temporal data manipulation. When working with datetime columns, you can access day of week information in multiple formats depending on your use case.

import pandas as pd
import numpy as np

# Create sample datetime data
dates = pd.date_range('2024-01-01', periods=10, freq='D')
df = pd.DataFrame({'date': dates})

print(df)
        date
0 2024-01-01
1 2024-01-02
2 2024-01-03
3 2024-01-04
4 2024-01-05
5 2024-01-06
6 2024-01-07
7 2024-01-08
8 2024-01-09
9 2024-01-10

Using dt.dayofweek for Integer Representation

The dt.dayofweek attribute returns an integer representation where Monday is 0 and Sunday is 6. This is the most common approach when you need numerical day identifiers for grouping or conditional logic.

df['day_of_week'] = df['date'].dt.dayofweek

print(df)
        date  day_of_week
0 2024-01-01            0
1 2024-01-02            1
2 2024-01-03            2
3 2024-01-04            3
4 2024-01-05            4
5 2024-01-06            5
6 2024-01-07            6
7 2024-01-08            0
8 2024-01-09            1
9 2024-01-10            2

The integer format is particularly useful for filtering weekdays versus weekends:

# Filter for weekdays only (Monday-Friday)
weekdays = df[df['day_of_week'] < 5]
print(weekdays)

# Filter for weekends
weekends = df[df['day_of_week'] >= 5]
print(weekends)

Using dt.weekday() Method

The dt.weekday() method is functionally identical to dt.dayofweek. Both return the same integer values with Monday as 0. The choice between them is purely stylistic.

df['weekday_method'] = df['date'].dt.weekday()

# Verify they produce identical results
print((df['day_of_week'] == df['weekday_method']).all())  # True

Extracting Day Names with dt.day_name()

When you need human-readable day names instead of integers, use dt.day_name(). This method returns the full name of the day as a string.

df['day_name'] = df['date'].dt.day_name()

print(df[['date', 'day_name']])
        date   day_name
0 2024-01-01     Monday
1 2024-01-02    Tuesday
2 2024-01-03  Wednesday
3 2024-01-04   Thursday
4 2024-01-05     Friday
5 2024-01-06   Saturday
6 2024-01-07     Sunday
7 2024-01-08     Monday
8 2024-01-09    Tuesday
9 2024-01-10  Wednesday

This is ideal for reporting, visualization labels, and any output that end users will read directly.

Working with String Datetime Columns

If your datetime data is stored as strings, convert it to datetime type first using pd.to_datetime():

# Sample data with string dates
df_strings = pd.DataFrame({
    'date_str': ['2024-01-15', '2024-02-20', '2024-03-10']
})

# Convert to datetime
df_strings['date'] = pd.to_datetime(df_strings['date_str'])

# Extract day of week
df_strings['day_of_week'] = df_strings['date'].dt.dayofweek
df_strings['day_name'] = df_strings['date'].dt.day_name()

print(df_strings)
    date_str       date  day_of_week  day_name
0 2024-01-15 2024-01-15            0    Monday
1 2024-02-20 2024-02-20            1   Tuesday
2 2024-03-10 2024-03-10            6    Sunday

Practical Application: Sales Analysis by Day of Week

Here’s a realistic example analyzing sales patterns across different days:

# Create sample sales data
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=90, freq='D')
sales_df = pd.DataFrame({
    'date': dates,
    'sales': np.random.randint(1000, 5000, size=90)
})

# Add day of week information
sales_df['day_of_week'] = sales_df['date'].dt.dayofweek
sales_df['day_name'] = sales_df['date'].dt.day_name()

# Calculate average sales by day of week
avg_sales = sales_df.groupby('day_name')['sales'].mean().round(2)

# Sort by day of week order
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
avg_sales = avg_sales.reindex(day_order)

print(avg_sales)
day_name
Monday       2847.54
Tuesday      2901.15
Wednesday    2756.77
Thursday     3041.85
Friday       2885.31
Saturday     2923.08
Sunday       2798.15
Name: sales, dtype: float64

Creating Custom Day Categories

You can create custom categories based on day of week, such as grouping days into business segments:

def categorize_day(day_num):
    if day_num < 5:
        return 'Weekday'
    else:
        return 'Weekend'

sales_df['day_category'] = sales_df['day_of_week'].apply(categorize_day)

# Analyze sales by category
category_stats = sales_df.groupby('day_category')['sales'].agg(['mean', 'sum', 'count'])
print(category_stats)

Alternatively, use vectorized operations for better performance:

sales_df['day_category'] = np.where(sales_df['day_of_week'] < 5, 'Weekday', 'Weekend')

Handling Timezone-Aware Datetimes

When working with timezone-aware datetime objects, the day of week extraction works identically:

# Create timezone-aware dates
tz_dates = pd.date_range('2024-01-01', periods=5, freq='D', tz='US/Eastern')
tz_df = pd.DataFrame({'date': tz_dates})

tz_df['day_of_week'] = tz_df['date'].dt.dayofweek
tz_df['day_name'] = tz_df['date'].dt.day_name()

print(tz_df)

Performance Considerations

When working with large datasets, the dt accessor methods are highly optimized. Avoid using apply() with custom functions when built-in methods are available:

# Create large dataset for performance testing
large_dates = pd.date_range('2020-01-01', periods=1000000, freq='H')
large_df = pd.DataFrame({'date': large_dates})

# Efficient approach
%timeit large_df['date'].dt.dayofweek

# Inefficient approach (avoid this)
%timeit large_df['date'].apply(lambda x: x.dayofweek)

The vectorized dt.dayofweek approach is typically 10-100x faster than using apply().

Integration with GroupBy Operations

Day of week extraction is particularly powerful when combined with groupby aggregations:

# Group by day of week and calculate multiple statistics
sales_stats = sales_df.groupby('day_of_week').agg({
    'sales': ['mean', 'median', 'std', 'count']
}).round(2)

print(sales_stats)

You can also create pivot tables with day of week as one dimension:

# Add month for pivot table
sales_df['month'] = sales_df['date'].dt.month

pivot = sales_df.pivot_table(
    values='sales',
    index='month',
    columns='day_name',
    aggfunc='mean'
).round(2)

print(pivot)

Handling Missing Values

The datetime accessor gracefully handles NaT (Not a Time) values:

# Create data with missing values
dates_with_na = pd.Series([pd.Timestamp('2024-01-01'), pd.NaT, pd.Timestamp('2024-01-03')])
df_na = pd.DataFrame({'date': dates_with_na})

df_na['day_of_week'] = df_na['date'].dt.dayofweek
df_na['day_name'] = df_na['date'].dt.day_name()

print(df_na)

The result shows NaN for missing datetime values, allowing you to handle them appropriately in your data pipeline.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.