Integration Testing: Testing Component Interactions

Key Insights

  • Integration tests catch bugs that unit tests miss by verifying that components work together correctly—testing the seams where systems connect is where most production bugs hide.
  • The choice between mocking dependencies and using real services involves tradeoffs; contract testing offers a middle ground that maintains speed while ensuring compatibility.
  • Proper test isolation through transaction rollbacks, container orchestration, and careful state management prevents the flaky tests that erode team confidence in your test suite.

Beyond Unit Tests

Unit tests verify that individual functions work correctly in isolation. Integration tests verify that your components actually work together. This distinction matters because most production bugs don’t live inside a single function—they hide in the interactions between components.

Consider a simple user registration flow: your validation logic might be perfect, your database repository might work flawlessly, and your email service might send messages correctly. But when you wire them together, you discover that your repository expects a different date format than your validator produces, or your email service times out under the database transaction’s lock.

Integration tests catch these interaction bugs. They exercise the real code paths your application uses in production, including serialization, network calls, database queries, and message passing. If unit tests answer “does this function work?”, integration tests answer “does this system work?”

Types of Integration Testing

Four main strategies exist for integration testing, each with distinct tradeoffs.

Big-bang integration tests all components together simultaneously. You build everything, wire it up, and run tests against the complete system. This approach is simple to understand but difficult to debug—when a test fails, any component could be the culprit. Big-bang works best for small systems or final validation before release.

Bottom-up integration starts with the lowest-level components (databases, external services) and progressively adds higher-level components. You test your repository layer first, then your service layer with the real repository, then your API layer with real services. This approach catches low-level bugs early but requires test drivers to simulate higher-level components initially.

Top-down integration works in reverse: start with the API layer using stubs for lower components, then progressively replace stubs with real implementations. This lets you validate user-facing behavior early but requires maintaining stubs that may drift from real implementations.

Sandwich integration combines both approaches, testing from the top and bottom simultaneously and meeting in the middle. This works well for large teams that can parallelize testing efforts.

For most projects, I recommend bottom-up integration. Database and external service interactions cause the most integration bugs, and catching them early saves debugging time.

Setting Up the Test Environment

Reliable integration tests require reproducible environments. Docker Compose provides the most practical solution for managing test dependencies.

# docker-compose.test.yml
version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: testdb
      POSTGRES_USER: testuser
      POSTGRES_PASSWORD: testpass
    ports:
      - "5433:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U testuser -d testdb"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6380:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 5s
      retries: 5

  rabbitmq:
    image: rabbitmq:3-management-alpine
    ports:
      - "5673:5672"
      - "15673:15672"
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "check_running"]
      interval: 10s
      timeout: 10s
      retries: 5

Use different ports than your development environment to avoid conflicts. The healthchecks ensure services are ready before tests run.
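
Healthchecks gate the containers, but the test process itself can still race them on startup. A small polling helper makes the wait explicit; this is an illustrative sketch, not part of the Compose file, and the port numbers referenced are the mapped test ports from above.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll a TCP port until it accepts connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the service is accepting traffic
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# e.g. wait for the Postgres container mapped to 5433:
# wait_for_port("localhost", 5433)
```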

Configuration management becomes critical. Your application needs different connection strings for test, development, and production environments:

# config.py
import os
from dataclasses import dataclass

@dataclass
class DatabaseConfig:
    host: str
    port: int
    name: str
    user: str
    password: str
    
    @classmethod
    def from_env(cls, prefix: str = ""):
        return cls(
            host=os.getenv(f"{prefix}DB_HOST", "localhost"),
            port=int(os.getenv(f"{prefix}DB_PORT", "5432")),
            name=os.getenv(f"{prefix}DB_NAME", "app"),
            user=os.getenv(f"{prefix}DB_USER", "postgres"),
            password=os.getenv(f"{prefix}DB_PASSWORD", ""),
        )

# In tests, set TEST_DB_* environment variables
test_config = DatabaseConfig.from_env(prefix="TEST_")

Writing Effective Integration Tests

Good integration tests focus on component boundaries. Test the contracts between systems, not internal implementation details.

# test_user_registration.py
import pytest
import httpx
from datetime import datetime, timedelta

class TestUserRegistration:
    """Integration tests for user registration flow."""
    
    @pytest.fixture
    def client(self, app):
        """HTTP client for the app under test; the `app` fixture
        (not shown) is assumed to start the server on port 8000."""
        return httpx.Client(base_url="http://localhost:8000")
    
    @pytest.fixture
    def mock_email_service(self, wiremock):
        """Stub email service responses."""
        wiremock.stub_for(
            wiremock.post("/api/emails/send")
            .will_return(status=202, json={"messageId": "test-123"})
        )
        return wiremock
    
    def test_successful_registration_creates_user_and_sends_email(
        self, client, db_session, mock_email_service
    ):
        # Arrange
        payload = {
            "email": "newuser@example.com",
            "password": "SecurePass123!",
            "name": "Test User"
        }
        
        # Act
        response = client.post("/api/users/register", json=payload)
        
        # Assert - API response
        assert response.status_code == 201
        data = response.json()
        assert data["email"] == payload["email"]
        assert "id" in data
        assert "password" not in data  # Never expose passwords
        
        # Assert - Database state
        user = db_session.query(User).filter_by(email=payload["email"]).first()
        assert user is not None
        assert user.name == payload["name"]
        assert user.created_at > datetime.utcnow() - timedelta(seconds=5)
        
        # Assert - External service called
        email_requests = mock_email_service.find_requests(
            method="POST", path="/api/emails/send"
        )
        assert len(email_requests) == 1
        assert email_requests[0].json()["to"] == payload["email"]
    
    def test_duplicate_email_returns_conflict(self, client, db_session):
        # Arrange - Create existing user
        existing_user = User(email="existing@example.com", name="Existing")
        db_session.add(existing_user)
        db_session.commit()
        
        # Act
        response = client.post("/api/users/register", json={
            "email": "existing@example.com",
            "password": "SecurePass123!",
            "name": "New User"
        })
        
        # Assert
        assert response.status_code == 409
        assert "already exists" in response.json()["detail"].lower()
    
    def test_email_service_failure_rolls_back_user_creation(
        self, client, db_session, wiremock
    ):
        # Arrange - Email service returns error
        wiremock.stub_for(
            wiremock.post("/api/emails/send")
            .will_return(status=500, json={"error": "Service unavailable"})
        )
        
        # Act
        response = client.post("/api/users/register", json={
            "email": "newuser@example.com",
            "password": "SecurePass123!",
            "name": "Test User"
        })
        
        # Assert - Request failed
        assert response.status_code == 503
        
        # Assert - No user was created (transaction rolled back)
        user = db_session.query(User).filter_by(email="newuser@example.com").first()
        assert user is None

Notice how this test verifies multiple integration points: HTTP handling, database persistence, and external service communication. It also tests failure scenarios—the email service failure test catches a common bug where partial operations leave the system in an inconsistent state.

Managing Test Data and State

Test pollution—where one test affects another—destroys confidence in your test suite. The transaction rollback pattern provides the cleanest solution for database tests:

# conftest.py
import pytest
from sqlalchemy import create_engine, event
from sqlalchemy.orm import sessionmaker

from app.models import Base  # your project's declarative base

@pytest.fixture(scope="session")
def engine():
    """Create database engine once per test session."""
    return create_engine(
        "postgresql://testuser:testpass@localhost:5433/testdb"
    )

@pytest.fixture(scope="session")
def tables(engine):
    """Create all tables before tests, drop after."""
    Base.metadata.create_all(engine)
    yield
    Base.metadata.drop_all(engine)

@pytest.fixture
def db_session(engine, tables):
    """
    Provide a transactional database session that rolls back after each test.
    """
    connection = engine.connect()
    transaction = connection.begin()
    session = sessionmaker(bind=connection)()
    
    # Begin a nested transaction (savepoint)
    nested = connection.begin_nested()
    
    # Restart savepoint after each commit
    @event.listens_for(session, "after_transaction_end")
    def restart_savepoint(session, transaction):
        nonlocal nested
        if transaction.nested and not transaction._parent.nested:
            nested = connection.begin_nested()
    
    yield session
    
    # Rollback everything
    session.close()
    transaction.rollback()
    connection.close()

This pattern wraps each test in a transaction that rolls back automatically. Your tests can commit data normally, but nothing persists between tests.
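
The isolation guarantee can be demonstrated with a plain DB-API connection; this minimal sqlite3 sketch stands in for the Postgres setup above and shows that whatever a test writes disappears before the next test runs.

```python
import sqlite3

def run_in_rollback(conn, test_fn):
    """Run test_fn inside an explicit transaction, then roll it back."""
    conn.execute("BEGIN")
    try:
        test_fn(conn)
    finally:
        conn.execute("ROLLBACK")

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE users (email TEXT)")

def fake_test(c):
    c.execute("INSERT INTO users VALUES ('a@example.com')")
    # Inside the transaction the row is visible
    assert c.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1

run_in_rollback(conn, fake_test)
# After the rollback, no state leaks into the next "test"
assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
```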

For non-transactional cleanup, implement explicit teardown:

@pytest.fixture
def clean_redis(redis_client):
    """Ensure clean Redis state for each test."""
    yield redis_client
    redis_client.flushdb()

Mocking vs. Real Dependencies

The mocking debate has a simple answer: use real dependencies when practical, mock when necessary, and use contract testing to verify your mocks stay accurate.

Real dependencies provide the highest confidence but introduce complexity and slowness. Use them for:

  • Databases (with transaction rollback)
  • Caches (Redis, Memcached)
  • Message queues in simple scenarios

Mock external HTTP services that you don’t control. WireMock provides contract verification to ensure your mocks match reality:

# test_payment_service.py
import pytest
from wiremock.client import WireMock

@pytest.fixture
def payment_api_stub(wiremock_server):
    """Configure payment API stubs with contract verification."""
    wm = WireMock(host="localhost", port=8080)
    
    # Stub successful payment
    wm.register(
        wm.post("/v1/charges")
        .with_request_body(containing="amount")
        .will_return(
            status=200,
            json={
                "id": "ch_test123",
                "status": "succeeded",
                "amount": 1000
            }
        )
    )
    
    yield wm
    
    # Verify all stubs were called (contract verification)
    wm.verify_that(
        wm.post("/v1/charges"),
        called_at_least_once=True
    )
    wm.reset()

Contract testing tools like Pact take this further by generating contracts from provider tests and verifying consumer expectations match.
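
Before adopting a full contract-testing tool, a hand-rolled schema check already catches the most common drift: fields your mock returns that the real payload lacks, or types that changed. This helper is an illustrative sketch, not a Pact or WireMock feature.

```python
def check_contract(payload: dict, schema: dict) -> list:
    """Return a list of violations: missing fields or wrong types."""
    errors = []
    for field, expected_type in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return errors

# Schema matching the stubbed /v1/charges response above
charge_schema = {"id": str, "status": str, "amount": int}

assert check_contract(
    {"id": "ch_test123", "status": "succeeded", "amount": 1000}, charge_schema
) == []
assert check_contract({"id": "ch_1", "amount": "1000"}, charge_schema) == [
    "missing field: status",
    "amount: expected int, got str",
]
```

Run the check against both your stub definitions and, periodically, a recorded real response; when they disagree, the mock has drifted.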

Integration Testing in CI/CD Pipelines

Integration tests in CI require careful orchestration. GitHub Actions service containers simplify dependency management:

# .github/workflows/integration-tests.yml
name: Integration Tests

on: [push, pull_request]

jobs:
  integration-tests:
    runs-on: ubuntu-latest
    
    services:
      postgres:
        image: postgres:15-alpine
        env:
          POSTGRES_DB: testdb
          POSTGRES_USER: testuser
          POSTGRES_PASSWORD: testpass
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5          
      
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5          

    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'
      
      - name: Install dependencies
        run: pip install -r requirements-test.txt
      
      - name: Run integration tests
        env:
          TEST_DB_HOST: localhost
          TEST_DB_PORT: 5432
          TEST_DB_NAME: testdb
          TEST_DB_USER: testuser
          TEST_DB_PASSWORD: testpass
          TEST_REDIS_URL: redis://localhost:6379
        run: |
          pytest tests/integration \
            --tb=short \
            -x \
            --timeout=30 \
            -n auto          

Key practices for CI integration tests:

  1. Use --timeout (from the pytest-timeout plugin) to fail slow tests quickly
  2. Use -x to stop on first failure during development
  3. Parallelize with -n auto (from pytest-xdist) but ensure test isolation
  4. Retry flaky tests sparingly—fix the root cause instead
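
The stable flags belong in shared configuration so local runs and CI agree; a minimal example (the paths are an assumption about project layout):

```ini
# pytest.ini
[pytest]
addopts = --tb=short --timeout=30
testpaths = tests/integration
```

Leave -x and -n auto on the command line: stopping on first failure and parallelism are per-run choices, not suite defaults.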

For flaky tests, the pytest-rerunfailures plugin provides retry logic as a last resort:

@pytest.mark.flaky(reruns=2, reruns_delay=1)
def test_occasionally_flaky_external_service():
    """This test may fail due to external service latency."""
    pass

But treat every @flaky marker as technical debt. Flaky tests indicate either poor isolation or unreliable dependencies that need better stubbing.

Integration tests require more investment than unit tests but catch the bugs that actually break production. Start with your most critical paths, ensure proper isolation, and build confidence in your component interactions.
