Python - Check if File/Directory Exists

The `pathlib` module, introduced in Python 3.4, provides an object-oriented interface for filesystem paths. This is the recommended approach for modern Python applications.

Key Insights

  • Python provides multiple methods to check file/directory existence through os.path, pathlib.Path, and os modules, each with different use cases and performance characteristics
  • The pathlib.Path approach is the modern, object-oriented solution that handles cross-platform path operations more elegantly than legacy os.path methods
  • Race conditions between checking existence and performing file operations require proper exception handling rather than relying solely on existence checks

The pathlib module, introduced in Python 3.4, provides an object-oriented interface for filesystem paths. This is the recommended approach for modern Python applications.

from pathlib import Path

# Check if file exists
file_path = Path('/path/to/file.txt')
if file_path.exists():
    print("File exists")

# Check if it's specifically a file
if file_path.is_file():
    print("Path points to a file")

# Check if it's specifically a directory
dir_path = Path('/path/to/directory')
if dir_path.is_dir():
    print("Path points to a directory")

The exists() method returns True for both files and directories, while is_file() and is_dir() provide more specific checks. These methods also return False for broken symbolic links, which is important for robust code.

from pathlib import Path

def check_path_type(path_str):
    path = Path(path_str)
    
    if not path.exists():
        return "Does not exist"
    elif path.is_file():
        return "Regular file"
    elif path.is_dir():
        return "Directory"
    elif path.is_symlink():
        return "Symbolic link (possibly broken)"
    else:
        return "Special file (socket, FIFO, etc.)"

# Usage
print(check_path_type('/etc/hosts'))        # Regular file
print(check_path_type('/tmp'))              # Directory
print(check_path_type('/nonexistent'))      # Does not exist

Using os.path Module

The os.path module provides traditional functions for path operations. While still widely used in legacy code, it’s less elegant than pathlib for most use cases.

import os

# Check if path exists (file or directory)
if os.path.exists('/path/to/file.txt'):
    print("Path exists")

# Check specifically for file
if os.path.isfile('/path/to/file.txt'):
    print("It's a file")

# Check specifically for directory
if os.path.isdir('/path/to/directory'):
    print("It's a directory")

# Check if it's a symbolic link
if os.path.islink('/path/to/symlink'):
    print("It's a symbolic link")

The os.path functions work with strings rather than Path objects, which can lead to more verbose code when performing multiple operations:

import os

def get_file_info(filepath):
    if not os.path.exists(filepath):
        return None
    
    return {
        'exists': True,
        'is_file': os.path.isfile(filepath),
        'is_dir': os.path.isdir(filepath),
        'is_link': os.path.islink(filepath),
        'size': os.path.getsize(filepath) if os.path.isfile(filepath) else None,
        'absolute_path': os.path.abspath(filepath)
    }

# Usage
info = get_file_info('example.txt')
if info:
    print(f"File size: {info['size']} bytes")

Symbolic links require special attention because they can point to non-existent targets, creating edge cases in existence checks.

from pathlib import Path
import os

# pathlib approach
symlink_path = Path('/path/to/symlink')

# exists() returns False for broken symlinks
print(symlink_path.exists())  # False if target doesn't exist

# is_symlink() returns True even if target doesn't exist
print(symlink_path.is_symlink())  # True

# Resolve the symlink to get the actual target
try:
    resolved = symlink_path.resolve(strict=True)
    print(f"Target exists: {resolved}")
except FileNotFoundError:
    print("Symlink target does not exist")

# os.path approach
symlink_str = '/path/to/symlink'

# lexists() returns True even for broken symlinks
print(os.path.lexists(symlink_str))  # True

# exists() returns False for broken symlinks
print(os.path.exists(symlink_str))   # False if broken

# Check if symlink and verify target
if os.path.islink(symlink_str):
    target = os.readlink(symlink_str)
    print(f"Symlink points to: {target}")
    print(f"Target exists: {os.path.exists(target)}")

Race Conditions and EAFP Pattern

Checking if a file exists before operating on it creates a race condition—the file could be deleted or created between the check and the operation. Python’s EAFP (Easier to Ask for Forgiveness than Permission) pattern handles this better.

from pathlib import Path

# LBYL (Look Before You Leap) - Not recommended
def read_file_lbyl(filepath):
    path = Path(filepath)
    if path.exists() and path.is_file():
        # File could be deleted here by another process
        with open(filepath, 'r') as f:
            return f.read()
    return None

# EAFP (Easier to Ask for Forgiveness than Permission) - Recommended
def read_file_eafp(filepath):
    try:
        with open(filepath, 'r') as f:
            return f.read()
    except FileNotFoundError:
        return None
    except PermissionError:
        print(f"Permission denied: {filepath}")
        return None
    except IsADirectoryError:
        print(f"Path is a directory: {filepath}")
        return None

# Usage
content = read_file_eafp('data.txt')
if content:
    print(content)

The EAFP approach is more robust because it handles the actual error condition rather than trying to predict it.

Checking Multiple Paths

When working with multiple potential file locations, you often need to find the first existing file from a list of candidates.

from pathlib import Path

def find_config_file(possible_paths):
    """Find the first existing config file from a list of paths."""
    for path_str in possible_paths:
        path = Path(path_str)
        if path.is_file():
            return path
    return None

# Usage
config_locations = [
    './config.yml',
    '~/.config/myapp/config.yml',
    '/etc/myapp/config.yml'
]

# Expand user paths
expanded_paths = [Path(p).expanduser() for p in config_locations]
config_path = find_config_file(expanded_paths)

if config_path:
    print(f"Using config: {config_path}")
else:
    print("No config file found")

For directory existence with automatic creation:

from pathlib import Path

def ensure_directory_exists(dir_path):
    """Create directory if it doesn't exist."""
    path = Path(dir_path)
    path.mkdir(parents=True, exist_ok=True)
    return path

# Usage
log_dir = ensure_directory_exists('./logs/application')
log_file = log_dir / 'app.log'

# Now you can safely write to the log file
with open(log_file, 'a') as f:
    f.write('Application started\n')

Performance Considerations

When checking existence for many files, the method you choose can impact performance:

from pathlib import Path
import os
import time

def benchmark_existence_checks(paths, iterations=1000):
    # Benchmark pathlib
    start = time.perf_counter()
    for _ in range(iterations):
        for path_str in paths:
            Path(path_str).exists()
    pathlib_time = time.perf_counter() - start
    
    # Benchmark os.path
    start = time.perf_counter()
    for _ in range(iterations):
        for path_str in paths:
            os.path.exists(path_str)
    ospath_time = time.perf_counter() - start
    
    return {
        'pathlib': pathlib_time,
        'os.path': ospath_time
    }

# Test with sample paths
test_paths = ['/etc/hosts', '/tmp', '/nonexistent', '/var/log']
results = benchmark_existence_checks(test_paths)

print(f"pathlib: {results['pathlib']:.4f}s")
print(f"os.path: {results['os.path']:.4f}s")

Generally, os.path functions are slightly faster for simple existence checks, but pathlib offers better code organization and readability, making it the preferred choice for most applications. The performance difference is negligible unless you’re checking thousands of paths in a tight loop.

Cross-Platform Path Handling

Path separators differ between operating systems. Both pathlib and os.path handle this automatically, but pathlib does it more elegantly:

from pathlib import Path
import os

# pathlib automatically uses correct separator
config_path = Path('config') / 'settings' / 'app.yml'
print(config_path)  # config/settings/app.yml on Unix, config\settings\app.yml on Windows

# os.path requires explicit joining
config_path_str = os.path.join('config', 'settings', 'app.yml')
print(config_path_str)

# Check existence works the same on all platforms
if config_path.exists():
    print("Config file found")

The pathlib approach with the / operator is cleaner and less error-prone than string concatenation or os.path.join().

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.