Python File Handling: read, write, and append Operations

Key Insights

  • Always use context managers (with statement) for file operations to ensure proper resource cleanup, even when exceptions occur
  • Choose the right file mode for your use case: ‘r’ for reading, ‘w’ for creating/overwriting, ‘a’ for appending, and combined modes like ‘r+’ for read-write operations
  • Iterate over file objects directly in loops rather than loading entire files into memory—this approach scales better for large files and reduces memory footprint

Introduction to File Handling in Python

File I/O operations form the backbone of data persistence in Python applications. Whether you’re processing CSV files, managing application logs, or storing user preferences, understanding file handling is non-negotiable for any Python developer.

Python’s built-in open() function provides a straightforward interface for file operations. The function accepts a file path and a mode parameter that determines how the file will be accessed:

# Basic syntax
file = open('filename.txt', mode='r')

# Common modes
open('data.txt', 'r')   # Read mode (default)
open('output.txt', 'w')  # Write mode (creates/overwrites)
open('log.txt', 'a')     # Append mode
open('data.txt', 'r+')   # Read and write
open('data.txt', 'rb')   # Read binary

The mode parameter is critical—using ‘w’ mode on an existing file will erase all its content, while ‘r’ mode will fail if the file doesn’t exist. Understanding these behaviors prevents data loss and runtime errors.
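To see both behaviors concretely, here's a minimal sketch (it works in a temporary directory so no real files are touched) showing that ‘w’ truncates an existing file on open, while ‘r’ raises FileNotFoundError for a missing one:

```python
import os
import tempfile

# Use a temporary directory so we don't clobber real files
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')

with open(path, 'w') as f:          # creates the file
    f.write('original content\n')

with open(path, 'w') as f:          # 'w' truncates it immediately on open
    f.write('replacement\n')

with open(path, 'r') as f:
    print(f.read())                 # only 'replacement' survives

try:
    open(os.path.join(tmpdir, 'missing.txt'), 'r')
except FileNotFoundError:
    print('missing.txt does not exist')
```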

Reading Files

Python offers multiple methods to read file content, each suited for different scenarios. Choose based on your file size and processing requirements.

The read() method loads the entire file content into memory as a single string:

with open('document.txt', 'r') as file:
    content = file.read()
    print(content)

This approach works well for small files but becomes problematic with large datasets. You can also specify the number of characters to read:

with open('document.txt', 'r') as file:
    # Read first 100 characters
    chunk = file.read(100)
    print(chunk)
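Building on that, a common pattern for large files is looping over fixed-size chunks until read() returns an empty string. Here's a small sketch (the function name and chunk size are my own choices, not part of any standard API):

```python
def count_chars(path, chunk_size=4096):
    """Count characters by reading the file in fixed-size chunks."""
    total = 0
    with open(path, 'r') as file:
        # read() returns an empty string at end-of-file, ending the loop
        while chunk := file.read(chunk_size):
            total += len(chunk)
    return total
```

The := assignment expression requires Python 3.8+; on older versions, assign inside the loop body and break on an empty string.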

For line-by-line processing, use readline() or readlines():

# readline() - reads one line at a time
with open('document.txt', 'r') as file:
    first_line = file.readline()
    second_line = file.readline()
    print(first_line, second_line)

# readlines() - returns list of all lines
with open('document.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())  # strip() removes trailing newlines

The most memory-efficient approach is iterating directly over the file object:

with open('large_file.txt', 'r') as file:
    for line in file:
        # Process each line individually
        # File is read line-by-line, not loaded entirely into memory
        if 'ERROR' in line:
            print(line.strip())

This iterator approach is my preferred method for processing large files. It reads one line at a time, keeping memory usage constant regardless of file size.

Writing to Files

Write mode (‘w’) creates a new file or completely overwrites an existing one. This destructive behavior is intentional—use it when you need a clean slate.

# Create and write to a new file
with open('output.txt', 'w') as file:
    file.write('First line\n')
    file.write('Second line\n')

Note that write() doesn’t add newlines automatically. You must include ‘\n’ explicitly. This gives you precise control over formatting:

# Writing formatted strings
data = {'name': 'Alice', 'score': 95}
with open('results.txt', 'w') as file:
    file.write(f"Name: {data['name']}\n")
    file.write(f"Score: {data['score']}\n")
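An alternative worth knowing: print() accepts a file argument and appends the newline for you, so the same output can be written without explicit ‘\n’:

```python
data = {'name': 'Alice', 'score': 95}
with open('results.txt', 'w') as file:
    # print() adds the trailing newline automatically
    print(f"Name: {data['name']}", file=file)
    print(f"Score: {data['score']}", file=file)
```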

For writing multiple lines at once, use writelines():

lines = ['Line 1\n', 'Line 2\n', 'Line 3\n']
with open('output.txt', 'w') as file:
    file.writelines(lines)

Be aware that writelines() doesn’t add separators between items. If your list items lack newlines, they’ll be concatenated without breaks.
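One way to handle that case is joining the items with ‘\n’ before writing — a minimal sketch:

```python
items = ['Line 1', 'Line 2', 'Line 3']  # no trailing newlines
with open('output.txt', 'w') as file:
    # join() inserts '\n' between items; add one trailing newline at the end
    file.write('\n'.join(items) + '\n')
```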

Appending to Files

Append mode (‘a’) adds content to the end of a file without erasing existing data. This is essential for logs, incremental data collection, and any scenario where you need to preserve historical content.

# Append to existing file
with open('log.txt', 'a') as file:
    file.write('New log entry\n')

Here’s a practical logging scenario that demonstrates the difference between write and append modes:

from datetime import datetime

# Using write mode - ERASES previous logs
with open('app_log.txt', 'w') as file:
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    file.write(f'[{timestamp}] Application started\n')

# Using append mode - PRESERVES previous logs
with open('app_log.txt', 'a') as file:
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    file.write(f'[{timestamp}] User logged in\n')

After running this code, ‘app_log.txt’ contains both entries from this run—but the write-mode open first erased anything the file held from previous runs before writing “Application started”, and the append-mode open then added “User logged in” after it. In production, you’d use append mode for both operations so logs from earlier runs survive.

Context Managers and Best Practices

The with statement is the correct way to handle files in Python. It automatically closes the file when the block exits, even if an exception occurs:

# Correct: Using context manager
with open('data.txt', 'r') as file:
    content = file.read()
    # File automatically closes after this block

# Incorrect: Manual file handling
file = open('data.txt', 'r')
content = file.read()
file.close()  # Might not execute if an exception occurs

The context manager approach prevents resource leaks. Without it, exceptions can leave files open, potentially causing lock issues or data corruption.
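Under the hood, the with statement behaves roughly like a try/finally block. This sketch (the helper name is mine, for illustration) shows the equivalent manual version:

```python
def read_with_cleanup(path):
    """Roughly what `with open(path) as file: ...` does for you."""
    file = open(path, 'r')
    try:
        return file.read()
    finally:
        file.close()  # runs even if read() raises an exception
```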

Handle specific exceptions to make your code more robust:

try:
    with open('config.txt', 'r') as file:
        config = file.read()
except FileNotFoundError:
    print('Config file not found, using defaults')
    config = 'default_settings'
except PermissionError:
    print('Permission denied to read config file')
    config = None

Always anticipate failure modes. File operations can fail for numerous reasons: missing files, permission issues, a full disk, or a disconnected network drive. Defensive programming saves debugging time.

Working with File Paths

Python’s pathlib module provides an object-oriented approach to file paths that works across operating systems:

from pathlib import Path

# Create Path objects
data_dir = Path('data')
file_path = data_dir / 'output.txt'  # Works on Windows and Unix

# Check file existence before operations
if file_path.exists():
    with open(file_path, 'r') as file:
        content = file.read()
else:
    print(f'{file_path} does not exist')

# Create parent directories if needed
file_path.parent.mkdir(parents=True, exist_ok=True)
with open(file_path, 'w') as file:
    file.write('Data content')

The pathlib approach is cleaner than string concatenation and handles platform differences automatically. Use it for any non-trivial path manipulation.
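Path objects also provide read_text() and write_text() shortcuts that open, transfer, and close the file in a single call—handy for small files (the file name here is hypothetical):

```python
from pathlib import Path

settings = Path('settings.txt')
settings.write_text('debug=true\n')  # create/overwrite and close in one call
print(settings.read_text())          # read the whole file and close it
```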

Practical Example: Log File Manager

Here’s a complete example combining read, write, and append operations in a reusable class:

from pathlib import Path
from datetime import datetime

class LogManager:
    def __init__(self, log_file):
        self.log_file = Path(log_file)
        self.log_file.parent.mkdir(parents=True, exist_ok=True)
    
    def write_log(self, message, level='INFO'):
        """Append a timestamped log entry"""
        timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        log_entry = f'[{timestamp}] [{level}] {message}\n'
        
        with open(self.log_file, 'a') as file:
            file.write(log_entry)
    
    def read_logs(self, filter_level=None):
        """Read and optionally filter log entries"""
        if not self.log_file.exists():
            return []
        
        logs = []
        with open(self.log_file, 'r') as file:
            for line in file:
                if filter_level is None or f'[{filter_level}]' in line:
                    logs.append(line.strip())
        return logs
    
    def clear_logs(self):
        """Remove all log entries"""
        with open(self.log_file, 'w') as file:
            pass  # Opening in 'w' mode truncates the file
    
    def get_log_stats(self):
        """Return statistics about the log file"""
        if not self.log_file.exists():
            return {'total_lines': 0, 'size_bytes': 0}
        
        line_count = 0
        with open(self.log_file, 'r') as file:
            for line in file:
                line_count += 1
        
        return {
            'total_lines': line_count,
            'size_bytes': self.log_file.stat().st_size
        }

# Usage example
logger = LogManager('logs/app.log')

# Write logs
logger.write_log('Application started')
logger.write_log('Database connection failed', level='ERROR')
logger.write_log('Retrying connection', level='WARNING')

# Read all logs
all_logs = logger.read_logs()
print('All logs:', all_logs)

# Read only errors
errors = logger.read_logs(filter_level='ERROR')
print('Errors:', errors)

# Get statistics
stats = logger.get_log_stats()
print(f"Total entries: {stats['total_lines']}, Size: {stats['size_bytes']} bytes")

This log manager demonstrates real-world file handling patterns: checking existence before reading, using append mode for logs, providing filtered reads, and combining multiple file operations in a cohesive interface.

File handling in Python is straightforward once you understand the mode behaviors and embrace context managers. Start with these fundamentals, handle exceptions appropriately, and your file operations will be both reliable and maintainable.
