How to Create a Gantt Chart in Matplotlib

Key Insights

  • Matplotlib’s barh() function combined with date arithmetic creates functional Gantt charts without a dedicated Gantt library, giving you full control over visualization details.
  • Converting dates to numerical values using matplotlib’s date conversion utilities is essential for proper timeline rendering and axis formatting.
  • For projects with more than 50 tasks or complex interactive requirements, consider Plotly or dedicated libraries, but Matplotlib excels for static reports and customization.

Introduction to Gantt Charts and Use Cases

Gantt charts visualize project schedules by displaying tasks as horizontal bars along a timeline. Each bar’s position indicates when a task starts, and its length represents the task’s duration. Project managers use them to track progress, identify bottlenecks, and communicate timelines to stakeholders.

While specialized libraries like Plotly or dedicated project management tools offer Gantt chart functionality, Matplotlib provides a lightweight solution when you need full customization control or want to avoid additional dependencies. It’s particularly valuable when generating static reports, embedding charts in automated documentation, or creating publication-ready figures.

In this article, you’ll build a complete Gantt chart implementation from scratch, starting with basic horizontal bars and progressing to advanced features like task dependencies and milestone markers.

Setting Up the Environment and Data Structure

You’ll need two third-party libraries, matplotlib for visualization and pandas for data management, plus Python’s built-in datetime module for handling dates. The data structure should capture task names, start dates, and durations at minimum.

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
from datetime import datetime, timedelta

# Create sample project data
data = {
    'Task': ['Requirements Gathering', 'System Design', 'Database Setup', 
             'Backend Development', 'Frontend Development', 'Testing', 
             'Deployment'],
    'Start': ['2024-01-01', '2024-01-08', '2024-01-15', 
              '2024-01-20', '2024-01-25', '2024-02-10', '2024-02-20'],
    'Duration': [7, 10, 5, 15, 20, 10, 3],  # days
    'Category': ['Planning', 'Planning', 'Development', 
                 'Development', 'Development', 'QA', 'Operations']
}

df = pd.DataFrame(data)
df['Start'] = pd.to_datetime(df['Start'])
df['End'] = df['Start'] + pd.to_timedelta(df['Duration'], unit='D')

This structure uses pandas DataFrames because they simplify date arithmetic and data manipulation. The Category column will later enable color-coding by project phase.

Creating a Basic Gantt Chart with Horizontal Bars

Matplotlib’s barh() function creates horizontal bars, perfect for Gantt charts. The challenge is converting dates to numerical values that matplotlib can plot. Use date2num() from matplotlib.dates for this conversion.

fig, ax = plt.subplots(figsize=(12, 6))

# Convert dates to numerical format
start_num = mdates.date2num(df['Start'])
duration_num = df['Duration']

# Create horizontal bars
ax.barh(df['Task'], duration_num, left=start_num, height=0.4)

# Format x-axis as dates
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))

# Rotate date labels for readability
plt.xticks(rotation=45, ha='right')

# Labels and title
ax.set_xlabel('Date')
ax.set_ylabel('Tasks')
ax.set_title('Project Timeline')

plt.tight_layout()
plt.show()

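The left parameter in barh() sets where each bar starts, while the width sets its duration. Because Matplotlib date numbers count days, a duration in days can be passed as the width directly. Setting height=0.4 leaves a clear gap between adjacent bars (the default of 0.8 nearly fills each row). The date formatter ensures x-axis labels display as readable dates rather than numerical values. Note that barh() draws the first task at the bottom of the chart; call ax.invert_yaxis() if you want the schedule to read from top to bottom.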
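To make the "date numbers count days" point concrete, here is a minimal sketch of what date2num() returns. The two dates are illustrative and not taken from the sample project.

```python
from datetime import datetime

import matplotlib.dates as mdates

# Two illustrative dates one week apart
start = datetime(2024, 1, 1)
end = datetime(2024, 1, 8)

# date2num() converts datetimes to floats measured in days
start_num = mdates.date2num(start)
end_num = mdates.date2num(end)

# The difference between two date numbers is the elapsed time in days,
# which is why a duration in days works as the bar width
width = end_num - start_num
print(width)  # 7.0

# num2date() reverses the conversion (it returns a timezone-aware datetime)
print(mdates.num2date(start_num + width).date())  # 2024-01-08
```

If your durations are stored in hours instead, dividing by 24 before passing them as the width should keep the bars to scale.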

Customizing the Gantt Chart

Color-coding by category improves readability. Define a color map and apply it based on the Category column. Add gridlines and adjust styling for a professional appearance.

fig, ax = plt.subplots(figsize=(14, 7))

# Define color map for categories
color_map = {
    'Planning': '#FF6B6B',
    'Development': '#4ECDC4',
    'QA': '#45B7D1',
    'Operations': '#FFA07A'
}

# Create bars with category-based colors
for idx, row in df.iterrows():
    start_num = mdates.date2num(row['Start'])
    ax.barh(row['Task'], row['Duration'], left=start_num, 
            height=0.5, color=color_map[row['Category']], 
            alpha=0.8, edgecolor='black', linewidth=0.5)

# Format x-axis with better date display
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %d'))
ax.xaxis.set_major_locator(mdates.WeekdayLocator(interval=1))
ax.xaxis.set_minor_locator(mdates.DayLocator())

# Add gridlines
ax.grid(True, axis='x', alpha=0.3, linestyle='--', linewidth=0.5)

# Improve layout
plt.xticks(rotation=45, ha='right')
ax.set_xlabel('Timeline', fontsize=12, fontweight='bold')
ax.set_ylabel('Tasks', fontsize=12, fontweight='bold')
ax.set_title('Software Development Project Schedule', 
             fontsize=14, fontweight='bold', pad=20)

# Add legend
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor=color, label=category) 
                   for category, color in color_map.items()]
ax.legend(handles=legend_elements, loc='upper right')

plt.tight_layout()
plt.show()

This version adds visual hierarchy through colors, making it easy to identify which phase each task belongs to. The gridlines help readers trace task timelines accurately.
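The row-by-row loop above can also be collapsed into a single barh() call by mapping the Category column to colors first. The following is a sketch under the assumption of a reduced three-task table (the `tasks` frame here is illustrative, not the article's full data set).

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative subset of the project data
tasks = pd.DataFrame({
    "Task": ["Requirements Gathering", "Backend Development", "Testing"],
    "Start": pd.to_datetime(["2024-01-01", "2024-01-20", "2024-02-10"]),
    "Duration": [7, 15, 10],  # days
    "Category": ["Planning", "Development", "QA"],
})

color_map = {
    "Planning": "#FF6B6B",
    "Development": "#4ECDC4",
    "QA": "#45B7D1",
}

# Look up one color per row; a category missing from color_map would
# produce NaN here, so keep the map in sync with the data
bar_colors = tasks["Category"].map(color_map).tolist()

fig, ax = plt.subplots(figsize=(8, 3))
bars = ax.barh(
    tasks["Task"],
    tasks["Duration"],
    left=mdates.date2num(tasks["Start"]),
    height=0.5,
    color=bar_colors,
    alpha=0.8,
    edgecolor="black",
    linewidth=0.5,
)
ax.invert_yaxis()  # list the first task at the top
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
fig.tight_layout()
```

The result should match the looped version visually; the vectorized form mainly saves code and avoids iterrows() on larger tables. Call plt.show() or fig.savefig() afterwards to view the chart.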

Adding Advanced Features

Task dependencies and milestones transform a basic timeline into an actionable project management tool. Use annotate() to draw arrows between dependent tasks, and scatter plots for milestone markers.

# Add dependency information to dataframe
df['Dependencies'] = [[], [0], [1], [2], [2], [3, 4], [5]]

fig, ax = plt.subplots(figsize=(15, 8))

# Plot bars
y_positions = range(len(df))
for idx, row in df.iterrows():
    start_num = mdates.date2num(row['Start'])
    ax.barh(idx, row['Duration'], left=start_num, 
            height=0.5, color=color_map[row['Category']], 
            alpha=0.8, edgecolor='black', linewidth=0.5)

# Add dependency arrows
for idx, row in df.iterrows():
    for dep_idx in row['Dependencies']:
        # Calculate arrow positions
        dep_end = mdates.date2num(df.iloc[dep_idx]['End'])
        task_start = mdates.date2num(row['Start'])
        
        # Draw arrow from dependency end to task start
        ax.annotate('', xy=(task_start, idx), 
                    xytext=(dep_end, dep_idx),
                    arrowprops=dict(arrowstyle='->', color='gray', 
                                    lw=1.5, alpha=0.6))

# Add milestone markers (e.g., end of Testing phase)
milestone_date = mdates.date2num(df.iloc[5]['End'])
ax.scatter(milestone_date, 5, marker='D', s=200, 
           color='gold', edgecolor='black', linewidth=2, 
           zorder=5, label='Milestone: Testing Complete')

# Format axes
ax.set_yticks(y_positions)
ax.set_yticklabels(df['Task'])
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %d'))
ax.xaxis.set_major_locator(mdates.WeekdayLocator(interval=1))
ax.grid(True, axis='x', alpha=0.3, linestyle='--')

plt.xticks(rotation=45, ha='right')
ax.set_xlabel('Timeline', fontsize=12, fontweight='bold')
ax.set_title('Project Schedule with Dependencies', 
             fontsize=14, fontweight='bold', pad=20)

# Combined legend
legend_elements = [Patch(facecolor=color, label=category) 
                   for category, color in color_map.items()]
legend_elements.append(plt.Line2D([0], [0], marker='D', color='w', 
                       markerfacecolor='gold', markersize=10, 
                       label='Milestone', markeredgecolor='black'))
ax.legend(handles=legend_elements, loc='upper right')

plt.tight_layout()
plt.show()

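The dependency arrows show which tasks must complete before others can begin. An arrow that points backward in time signals a scheduling conflict, because the dependent task starts before its prerequisite ends. Following the longest chain of arrows also helps you trace the critical path by eye, although the chart does not compute it for you.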
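Conflicts can also be checked in code rather than read off the chart. The sketch below rebuilds the sample schedule and lists every task that starts before one of its dependencies ends; the `conflicts` list and its tuple layout are illustrative choices, not part of the chart code above.

```python
import pandas as pd

tasks = pd.DataFrame({
    "Task": ["Requirements Gathering", "System Design", "Database Setup",
             "Backend Development", "Frontend Development", "Testing",
             "Deployment"],
    "Start": pd.to_datetime(["2024-01-01", "2024-01-08", "2024-01-15",
                             "2024-01-20", "2024-01-25", "2024-02-10",
                             "2024-02-20"]),
    "Duration": [7, 10, 5, 15, 20, 10, 3],  # days
    "Dependencies": [[], [0], [1], [2], [2], [3, 4], [5]],
})
tasks["End"] = tasks["Start"] + pd.to_timedelta(tasks["Duration"], unit="D")

# Collect (task, dependency, overlap in days) for each violated dependency
conflicts = []
for _, row in tasks.iterrows():
    for dep_idx in row["Dependencies"]:
        dep = tasks.iloc[dep_idx]
        if dep["End"] > row["Start"]:
            overlap_days = (dep["End"] - row["Start"]).days
            conflicts.append((row["Task"], dep["Task"], overlap_days))

for task, dep, days in conflicts:
    print(f"{task} starts {days} day(s) before {dep} ends")
```

With the sample data this reports two overlaps: Database Setup begins 3 days before System Design ends, and Testing begins 4 days before Frontend Development ends.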

Best Practices and Tips

Chart sizing matters significantly. For 5-10 tasks, use figsize=(12, 6). Add one inch of height for every five additional tasks to maintain readability. For projects exceeding 50 tasks, consider splitting into multiple charts by phase or team.
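The sizing rule can be written as a small helper. This is a sketch: the function name `gantt_figsize` and its parameters are hypothetical, and rounding partial groups of five tasks up to a full inch is an assumption on top of the rule as stated.

```python
import math


def gantt_figsize(n_tasks, base_width=12.0, base_height=6.0,
                  base_tasks=10, tasks_per_inch=5):
    """Return a (width, height) tuple in inches for a chart with n_tasks rows."""
    # Tasks beyond the baseline of 10 each contribute to extra height
    extra_tasks = max(0, n_tasks - base_tasks)
    # Assumption: round partial groups of five up to a whole inch
    extra_height = math.ceil(extra_tasks / tasks_per_inch)
    return (base_width, base_height + extra_height)


print(gantt_figsize(8))   # (12.0, 6.0)
print(gantt_figsize(12))  # (12.0, 7.0)
print(gantt_figsize(20))  # (12.0, 8.0)
```

The result can be passed straight to plt.subplots(figsize=...).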

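Export charts in a format that suits the destination: raster (PNG) for quick sharing on the web, vector (PDF or SVG) for presentations and publications. Call savefig() before plt.show(), since the figure may be discarded once an interactive window closes, leaving you with a blank file:

# High-resolution PNG for web
plt.savefig('gantt_chart.png', dpi=300, bbox_inches='tight')

# Vector format for publications
plt.savefig('gantt_chart.pdf', bbox_inches='tight')

# SVG for web graphics
plt.savefig('gantt_chart.svg', bbox_inches='tight')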

Performance degrades with hundreds of tasks due to matplotlib’s rendering overhead. For large projects or interactive requirements, Plotly offers better performance and built-in zoom/pan capabilities. However, Matplotlib’s customization flexibility and lack of JavaScript dependencies make it ideal for automated reporting pipelines and scientific publications.
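For an automated reporting pipeline, a common pattern is to select a non-interactive backend so that no display is required. The sketch below uses the Agg backend and a temporary directory; the output location and the two placeholder bars are assumptions for illustration only.

```python
import os
import tempfile

import matplotlib

# Agg renders to files only, so the script can run on a server
# or in a scheduled job without a display
matplotlib.use("Agg")

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 2))
# Placeholder bars: durations in days, offsets from an arbitrary day 0
ax.barh(["Task A", "Task B"], [3, 5], left=[0, 3], height=0.5)
ax.set_xlabel("Days from project start")

# Assumed output location for this sketch
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "gantt_chart.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")

# Close the figure to release memory when generating many charts in a loop
plt.close(fig)
print(out_path)
```

Saving through the figure object (fig.savefig) rather than plt.savefig avoids ambiguity about which figure is current when a script produces several charts.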

Avoid common pitfalls: always use tight_layout() or bbox_inches='tight' to prevent label clipping, convert dates consistently using pandas datetime objects, and test your chart with varying task counts to ensure scaling works properly.

Conclusion and Further Resources

You’ve built a complete Gantt chart system using matplotlib and pandas, from basic horizontal bars to advanced dependency visualization. This approach gives you full control over styling, integrates seamlessly with pandas data pipelines, and requires no additional dependencies beyond the standard scientific Python stack.

Extend this implementation by adding progress bars within tasks, implementing resource allocation views, or creating interactive tooltips with mplcursors. For dynamic updates, wrap the plotting code in functions that accept DataFrames and configuration dictionaries.
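As one possible starting point for these extensions, the sketch below wraps the plotting code in a function that takes a DataFrame and a configuration dictionary, and overlays a progress bar when a Progress column is present. The function name `plot_gantt`, the configuration keys, and the Progress column (a fraction from 0 to 1) are all hypothetical choices for this example.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd


def plot_gantt(tasks, config=None):
    """Draw a Gantt chart from a DataFrame with Task, Start, Duration columns.

    An optional Progress column (0 to 1) adds a darker bar for completed work.
    """
    settings = {
        "figsize": (10, 4),
        "bar_color": "#4ECDC4",
        "progress_color": "#1A7F78",
        "height": 0.5,
        "title": "Project Timeline",
    }
    settings.update(config or {})

    fig, ax = plt.subplots(figsize=settings["figsize"])
    start_num = mdates.date2num(tasks["Start"])
    y = list(range(len(tasks)))

    # Full planned duration, drawn semi-transparent
    ax.barh(y, tasks["Duration"], left=start_num,
            height=settings["height"], color=settings["bar_color"], alpha=0.5)

    # Completed portion drawn on top, starting at the same date
    if "Progress" in tasks.columns:
        done = tasks["Duration"] * tasks["Progress"]
        ax.barh(y, done, left=start_num,
                height=settings["height"], color=settings["progress_color"])

    ax.set_yticks(y)
    ax.set_yticklabels(tasks["Task"])
    ax.invert_yaxis()  # first task at the top
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
    ax.set_title(settings["title"])
    fig.tight_layout()
    return fig, ax


demo = pd.DataFrame({
    "Task": ["Design", "Build", "Test"],
    "Start": pd.to_datetime(["2024-01-01", "2024-01-08", "2024-01-20"]),
    "Duration": [7, 12, 5],  # days
    "Progress": [1.0, 0.5, 0.0],
})
fig, ax = plot_gantt(demo, {"title": "Demo Schedule"})
```

Returning the figure and axes lets the caller add dependency arrows or milestones afterwards, or save the chart with fig.savefig().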

Consult the matplotlib documentation for advanced customization options, particularly the dates module for complex timeline formatting and the patches module for custom shapes. The pandas documentation offers powerful date manipulation techniques that simplify complex scheduling logic.

This foundation scales from simple personal projects to automated enterprise reporting systems, proving that sometimes the best tool is the one you already have.
