Integration Testing: Testing Component Interactions
Key Insights
- Integration tests catch bugs that unit tests miss by verifying that components work together correctly—testing the seams where systems connect is where most production bugs hide.
- The choice between mocking dependencies and using real services involves tradeoffs; contract testing offers a middle ground that maintains speed while ensuring compatibility.
- Proper test isolation through transaction rollbacks, container orchestration, and careful state management prevents the flaky tests that erode team confidence in your test suite.
Beyond Unit Tests
Unit tests verify that individual functions work correctly in isolation. Integration tests verify that your components actually work together. This distinction matters because most production bugs don’t live inside a single function—they hide in the interactions between components.
Consider a simple user registration flow: your validation logic might be perfect, your database repository might work flawlessly, and your email service might send messages correctly. But when you wire them together, you discover that your repository expects a different date format than your validator produces, or your email service times out under the database transaction’s lock.
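The date-format mismatch can be made concrete with a toy sketch (all names hypothetical): each function passes its own unit tests, yet the seam between them breaks because the validator hands the repository a string where it expects a `datetime`.

```python
from datetime import datetime


def validate_signup(form: dict) -> dict:
    """Validator: passes its unit tests; emits the birthdate as a string."""
    datetime.strptime(form["birthdate"], "%Y-%m-%d")  # format check only
    return {"email": form["email"], "birthdate": form["birthdate"]}


def save_user(record: dict) -> str:
    """Repository: passes its unit tests; assumes birthdate is a datetime."""
    return record["birthdate"].strftime("%Y-%m-%d")


clean = validate_signup({"email": "a@example.com", "birthdate": "1990-01-31"})
try:
    save_user(clean)
    seam_ok = True
except AttributeError:  # str has no .strftime -- only an integration test sees this
    seam_ok = False
```

Only a test that runs both functions together exercises the broken seam.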
Integration tests catch these interaction bugs. They exercise the real code paths your application uses in production, including serialization, network calls, database queries, and message passing. If unit tests answer “does this function work?”, integration tests answer “does this system work?”
Types of Integration Testing
Four main strategies exist for integration testing, each with distinct tradeoffs.
Big-bang integration tests all components together simultaneously. You build everything, wire it up, and run tests against the complete system. This approach is simple to understand but difficult to debug—when a test fails, any component could be the culprit. Big-bang works best for small systems or final validation before release.
Bottom-up integration starts with the lowest-level components (databases, external services) and progressively adds higher-level components. You test your repository layer first, then your service layer with the real repository, then your API layer with real services. This approach catches low-level bugs early but requires test drivers to simulate higher-level components initially.
Top-down integration works in reverse: start with the API layer using stubs for lower components, then progressively replace stubs with real implementations. This lets you validate user-facing behavior early but requires maintaining stubs that may drift from real implementations.
Sandwich integration combines both approaches, testing from the top and bottom simultaneously and meeting in the middle. This works well for large teams that can parallelize testing efforts.
For most projects, I recommend bottom-up integration. Database and external service interactions cause the most integration bugs, and catching them early saves debugging time.
Setting Up the Test Environment
Reliable integration tests require reproducible environments. Docker Compose provides the most practical solution for managing test dependencies.
```yaml
# docker-compose.test.yml
version: '3.8'
services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: testdb
      POSTGRES_USER: testuser
      POSTGRES_PASSWORD: testpass
    ports:
      - "5433:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U testuser -d testdb"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6380:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 5s
      retries: 5

  rabbitmq:
    image: rabbitmq:3-management-alpine
    ports:
      - "5673:5672"
      - "15673:15672"
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "check_running"]
      interval: 10s
      timeout: 10s
      retries: 5
```
Map to different host ports (5433, 6380, 5673) than your development environment uses to avoid conflicts. The healthchecks ensure services are ready before tests run.
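When a test script starts these containers itself, a small readiness helper avoids racing the healthchecks. This is a stdlib sketch, not tied to any particular orchestration library:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.5)  # not up yet; back off briefly and retry
    return False
```

Recent Docker Compose releases can also block until healthchecks pass with `docker compose up --wait`, which makes a helper like this unnecessary in many setups.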
Configuration management becomes critical. Your application needs different connection strings for test, development, and production environments:
```python
# config.py
import os
from dataclasses import dataclass


@dataclass
class DatabaseConfig:
    host: str
    port: int
    name: str
    user: str
    password: str

    @classmethod
    def from_env(cls, prefix: str = ""):
        return cls(
            host=os.getenv(f"{prefix}DB_HOST", "localhost"),
            port=int(os.getenv(f"{prefix}DB_PORT", "5432")),
            name=os.getenv(f"{prefix}DB_NAME", "app"),
            user=os.getenv(f"{prefix}DB_USER", "postgres"),
            password=os.getenv(f"{prefix}DB_PASSWORD", ""),
        )


# In tests, set TEST_DB_* environment variables
test_config = DatabaseConfig.from_env(prefix="TEST_")
```
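A quick check of the prefix behavior (self-contained, so it re-declares a trimmed copy of the dataclass; field set reduced for brevity):

```python
import os
from dataclasses import dataclass


@dataclass
class DatabaseConfig:
    host: str
    port: int

    @classmethod
    def from_env(cls, prefix: str = ""):
        return cls(
            host=os.getenv(f"{prefix}DB_HOST", "localhost"),
            port=int(os.getenv(f"{prefix}DB_PORT", "5432")),
        )


os.environ["TEST_DB_HOST"] = "testhost"
os.environ["TEST_DB_PORT"] = "5433"

cfg = DatabaseConfig.from_env(prefix="TEST_")  # reads TEST_DB_* variables
dev = DatabaseConfig.from_env()                # unprefixed: falls back to defaults
```

The same code path serves every environment; only the prefix and the environment variables change.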
Writing Effective Integration Tests
Good integration tests focus on component boundaries. Test the contracts between systems, not internal implementation details.
```python
# test_user_registration.py
import pytest
import httpx
from datetime import datetime, timedelta

from app.models import User  # adjust to wherever your User model lives


class TestUserRegistration:
    """Integration tests for user registration flow."""

    @pytest.fixture
    def client(self, app):
        """HTTP client configured for testing."""
        return httpx.Client(base_url="http://localhost:8000")

    @pytest.fixture
    def mock_email_service(self, wiremock):
        """Stub email service responses."""
        wiremock.stub_for(
            wiremock.post("/api/emails/send")
            .will_return(status=202, json={"messageId": "test-123"})
        )
        return wiremock

    def test_successful_registration_creates_user_and_sends_email(
        self, client, db_session, mock_email_service
    ):
        # Arrange
        payload = {
            "email": "newuser@example.com",
            "password": "SecurePass123!",
            "name": "Test User"
        }

        # Act
        response = client.post("/api/users/register", json=payload)

        # Assert - API response
        assert response.status_code == 201
        data = response.json()
        assert data["email"] == payload["email"]
        assert "id" in data
        assert "password" not in data  # Never expose passwords

        # Assert - Database state
        user = db_session.query(User).filter_by(email=payload["email"]).first()
        assert user is not None
        assert user.name == payload["name"]
        assert user.created_at > datetime.utcnow() - timedelta(seconds=5)

        # Assert - External service called
        email_requests = mock_email_service.find_requests(
            method="POST", path="/api/emails/send"
        )
        assert len(email_requests) == 1
        assert email_requests[0].json()["to"] == payload["email"]

    def test_duplicate_email_returns_conflict(self, client, db_session):
        # Arrange - Create existing user
        existing_user = User(email="existing@example.com", name="Existing")
        db_session.add(existing_user)
        db_session.commit()

        # Act
        response = client.post("/api/users/register", json={
            "email": "existing@example.com",
            "password": "SecurePass123!",
            "name": "New User"
        })

        # Assert
        assert response.status_code == 409
        assert "already exists" in response.json()["detail"].lower()

    def test_email_service_failure_rolls_back_user_creation(
        self, client, db_session, wiremock
    ):
        # Arrange - Email service returns error
        wiremock.stub_for(
            wiremock.post("/api/emails/send")
            .will_return(status=500, json={"error": "Service unavailable"})
        )

        # Act
        response = client.post("/api/users/register", json={
            "email": "newuser@example.com",
            "password": "SecurePass123!",
            "name": "Test User"
        })

        # Assert - Request failed
        assert response.status_code == 503

        # Assert - No user was created (transaction rolled back)
        user = db_session.query(User).filter_by(email="newuser@example.com").first()
        assert user is None
```
Notice how this test verifies multiple integration points: HTTP handling, database persistence, and external service communication. It also tests failure scenarios—the email service failure test catches a common bug where partial operations leave the system in an inconsistent state.
Managing Test Data and State
Test pollution—where one test affects another—destroys confidence in your test suite. The transaction rollback pattern provides the cleanest solution for database tests:
```python
# conftest.py
import pytest
from sqlalchemy import create_engine, event
from sqlalchemy.orm import sessionmaker

from app.models import Base  # adjust to wherever your declarative Base lives


@pytest.fixture(scope="session")
def engine():
    """Create database engine once per test session."""
    return create_engine(
        "postgresql://testuser:testpass@localhost:5433/testdb"
    )


@pytest.fixture(scope="session")
def tables(engine):
    """Create all tables before tests, drop after."""
    Base.metadata.create_all(engine)
    yield
    Base.metadata.drop_all(engine)


@pytest.fixture
def db_session(engine, tables):
    """
    Provide a transactional database session that rolls back after each test.
    """
    connection = engine.connect()
    transaction = connection.begin()
    session = sessionmaker(bind=connection)()

    # Begin a nested transaction (savepoint)
    nested = connection.begin_nested()

    # Restart savepoint after each commit
    @event.listens_for(session, "after_transaction_end")
    def restart_savepoint(session, transaction):
        nonlocal nested
        if transaction.nested and not transaction._parent.nested:
            nested = connection.begin_nested()

    yield session

    # Roll back everything
    session.close()
    transaction.rollback()
    connection.close()
```
This pattern wraps each test in a transaction that rolls back automatically. Your tests can commit data normally, but nothing persists between tests.
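The savepoint mechanics in miniature, using stdlib `sqlite3` rather than SQLAlchemy, just to illustrate what the fixture relies on: writes made after a savepoint vanish when you roll back to it.

```python
import sqlite3

# isolation_level=None disables implicit transaction management,
# so we control SAVEPOINT/ROLLBACK explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (email TEXT)")

conn.execute("SAVEPOINT test_case")  # start of a "test"
conn.execute("INSERT INTO users VALUES ('a@example.com')")
rows_during = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

conn.execute("ROLLBACK TO SAVEPOINT test_case")  # end of the "test": undo its writes
rows_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Inside the savepoint the row is visible (`rows_during` is 1); after the rollback the table is empty again (`rows_after` is 0), which is exactly the isolation the fixture gives each test.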
For non-transactional cleanup, implement explicit teardown:
```python
@pytest.fixture
def clean_redis(redis_client):
    """Ensure clean Redis state for each test."""
    yield redis_client
    redis_client.flushdb()
```
Mocking vs. Real Dependencies
The mocking debate has a simple answer: use real dependencies when practical, mock when necessary, and use contract testing to verify your mocks stay accurate.
Real dependencies provide the highest confidence but introduce complexity and slowness. Use them for:
- Databases (with transaction rollback)
- Caches (Redis, Memcached)
- Message queues in simple scenarios
Mock external HTTP services that you don’t control. WireMock provides contract verification to ensure your mocks match reality:
```python
# test_payment_service.py
import pytest
from wiremock.client import WireMock


@pytest.fixture
def payment_api_stub(wiremock_server):
    """Configure payment API stubs with contract verification."""
    wm = WireMock(host="localhost", port=8080)

    # Stub successful payment
    wm.register(
        wm.post("/v1/charges")
        .with_request_body(containing="amount")
        .will_return(
            status=200,
            json={
                "id": "ch_test123",
                "status": "succeeded",
                "amount": 1000
            }
        )
    )

    yield wm

    # Verify all stubs were called (contract verification)
    wm.verify_that(
        wm.post("/v1/charges"),
        called_at_least_once=True
    )
    wm.reset()
```
Contract testing tools like Pact take this further by generating contracts from provider tests and verifying consumer expectations match.
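The core idea behind contract verification can be sketched in a few lines of plain Python (field names echo the payment stub above; real tools add request matching, contract versioning, and broker workflows):

```python
def satisfies_contract(response: dict, contract: dict) -> bool:
    """Check that every field the consumer depends on exists with the expected type."""
    return all(
        field in response and isinstance(response[field], expected)
        for field, expected in contract.items()
    )


# The consumer's expectations of POST /v1/charges
charge_contract = {"id": str, "status": str, "amount": int}

stub_response = {"id": "ch_test123", "status": "succeeded", "amount": 1000}
drifted_response = {"id": "ch_test123", "amount": "1000"}  # missing field, wrong type
```

A stub that drifts from the provider's real responses fails this check, which is the signal that your mocks no longer match reality.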
Integration Testing in CI/CD Pipelines
Integration tests in CI require careful orchestration. GitHub Actions service containers simplify dependency management:
```yaml
# .github/workflows/integration-tests.yml
name: Integration Tests

on: [push, pull_request]

jobs:
  integration-tests:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:15-alpine
        env:
          POSTGRES_DB: testdb
          POSTGRES_USER: testuser
          POSTGRES_PASSWORD: testpass
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: pip install -r requirements-test.txt

      - name: Run integration tests
        env:
          TEST_DB_HOST: localhost
          TEST_DB_PORT: 5432
          TEST_DB_NAME: testdb
          TEST_DB_USER: testuser
          TEST_DB_PASSWORD: testpass
          TEST_REDIS_URL: redis://localhost:6379
        run: |
          pytest tests/integration \
            --tb=short \
            -x \
            --timeout=30 \
            -n auto
```
Key practices for CI integration tests:
- Use `--timeout` to fail slow tests quickly
- Use `-x` to stop on first failure during development
- Parallelize with `-n auto` but ensure test isolation
- Retry flaky tests sparingly—fix the root cause instead
For flaky tests, implement retry logic as a last resort:
```python
# Requires the pytest-rerunfailures plugin
@pytest.mark.flaky(reruns=2, reruns_delay=1)
def test_occasionally_flaky_external_service():
    """This test may fail due to external service latency."""
    pass
```
But treat every `@pytest.mark.flaky` marker as technical debt. Flaky tests indicate either poor isolation or unreliable dependencies that need better stubbing.
Integration tests require more investment than unit tests but catch the bugs that actually break production. Start with your most critical paths, ensure proper isolation, and build confidence in your component interactions.