Design an E-Commerce Platform: Product Catalog and Orders

Key Insights

  • Separate your product catalog and order services with clear boundaries—the catalog optimizes for reads while orders prioritize consistency and state management
  • Use the saga pattern for checkout flows to handle distributed transactions without two-phase commits, accepting eventual consistency where appropriate
  • Implement the outbox pattern for reliable event publishing to prevent data inconsistencies when services or message brokers fail

Introduction & Requirements Analysis

E-commerce platforms face a fundamental tension: product catalogs need to serve millions of reads per second with sub-100ms latency, while order processing demands strong consistency guarantees that can survive partial system failures. Getting this balance wrong means either slow browsing experiences that kill conversions or lost orders that kill your business.

Functional requirements for our platform include product browsing with search and filtering, shopping cart management, checkout with inventory reservation, and order tracking. Non-functional requirements demand horizontal scalability to handle traffic spikes (think Black Friday), high availability (99.9%+ uptime), and eventual consistency between services with strong consistency within transaction boundaries.
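That 99.9% target translates directly into a downtime budget you can plan around. A quick sketch of the arithmetic (the helper name is ours, for illustration only):

```typescript
// Illustrative helper: convert an availability target into a downtime budget.
function downtimeBudgetMinutesPerYear(availability: number): number {
  const minutesPerYear = 365 * 24 * 60; // 525,600
  return minutesPerYear * (1 - availability);
}

// 99.9% ("three nines") allows roughly 525 minutes — about 8.76 hours — per year;
// 99.99% shrinks that to under an hour.
const threeNines = downtimeBudgetMinutesPerYear(0.999);
const fourNines = downtimeBudgetMinutesPerYear(0.9999);
```

Each extra nine cuts the budget by 10x, which is why availability targets should be set per service rather than platform-wide.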

The architecture decisions we make here will determine whether your platform gracefully handles 10x traffic spikes or falls over during your biggest sales event.

High-Level System Architecture

We’ll use a microservices architecture with clear service boundaries. The Product Catalog Service owns all product data and search functionality. The Order Service manages the order lifecycle from cart to fulfillment. The Inventory Service tracks stock levels and handles reservations.

graph TB
    subgraph "Client Layer"
        Web[Web App]
        Mobile[Mobile App]
    end
    
    subgraph "Edge Layer"
        CDN[CDN - Product Images]
        Gateway[API Gateway]
    end
    
    subgraph "Service Layer"
        Catalog[Product Catalog Service]
        Order[Order Service]
        Inventory[Inventory Service]
        User[User Service]
    end
    
    subgraph "Data Layer"
        CatalogDB[(Catalog DB - PostgreSQL)]
        OrderDB[(Order DB - PostgreSQL)]
        InventoryDB[(Inventory DB - PostgreSQL)]
        Redis[(Redis Cache)]
        ES[(Elasticsearch)]
        Kafka[Kafka]
    end
    
    Web --> CDN
    Mobile --> CDN
    Web --> Gateway
    Mobile --> Gateway
    Gateway --> Catalog
    Gateway --> Order
    Gateway --> Inventory
    Catalog --> CatalogDB
    Catalog --> Redis
    Catalog --> ES
    Order --> OrderDB
    Order --> Kafka
    Inventory --> InventoryDB
    Inventory --> Kafka

Services communicate synchronously for queries (REST/gRPC) and asynchronously for state changes (Kafka events). This hybrid approach gives us fast reads while ensuring reliable event processing for critical operations.

// Service interface definitions
interface ProductCatalogService {
  getProduct(id: string): Promise<Product>;
  searchProducts(query: SearchQuery): Promise<PaginatedResult<Product>>;
  getProductsByCategory(categoryId: string): Promise<Product[]>;
}

interface OrderService {
  createOrder(request: CreateOrderRequest): Promise<Order>;
  getOrder(id: string): Promise<Order>;
  cancelOrder(id: string): Promise<void>;
}

interface InventoryService {
  checkAvailability(productId: string, quantity: number): Promise<boolean>;
  reserveInventory(reservationId: string, items: LineItem[]): Promise<Reservation>;
  confirmReservation(reservationId: string): Promise<void>;
  releaseReservation(reservationId: string): Promise<void>;
}

Product Catalog Service Design

The catalog is read-heavy—expect 1000:1 read-to-write ratios. Design accordingly.

-- PostgreSQL schema for products
CREATE TABLE categories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    slug VARCHAR(255) UNIQUE NOT NULL,
    parent_id UUID REFERENCES categories(id),
    path LTREE NOT NULL, -- For hierarchical queries
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sku VARCHAR(100) UNIQUE NOT NULL,
    name VARCHAR(500) NOT NULL,
    description TEXT,
    category_id UUID REFERENCES categories(id),
    base_price DECIMAL(10,2) NOT NULL,
    status VARCHAR(50) DEFAULT 'draft',
    attributes JSONB DEFAULT '{}',
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE product_variants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    product_id UUID REFERENCES products(id) ON DELETE CASCADE,
    sku VARCHAR(100) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    price_modifier DECIMAL(10,2) DEFAULT 0,
    attributes JSONB NOT NULL, -- {"size": "L", "color": "blue"}
    image_urls TEXT[]
);

CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_products_status ON products(status);
CREATE INDEX idx_products_attributes ON products USING GIN(attributes);
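The `path LTREE` column is what makes hierarchical queries cheap: Postgres's ltree `<@` ("is descendant of") operator fetches an entire category subtree in one query. A sketch of how the repository might build such a query (the function name and `db.query` interface are assumptions matching the repository code below):

```typescript
// Sketch: fetch all active products in a category and its descendants using
// the Postgres ltree "is descendant of" operator (<@). The helper only builds
// the query; executing it requires the db client used elsewhere in this design.
function productsInSubtreeQuery(categoryPath: string): { sql: string; params: string[] } {
  return {
    sql: `SELECT p.*
          FROM products p
          JOIN categories c ON c.id = p.category_id
          WHERE c.path <@ $1::ltree AND p.status = 'active'`,
    params: [categoryPath], // e.g. 'electronics.audio' matches audio and everything under it
  };
}
```

A GiST index on `categories(path)` keeps these subtree scans fast as the hierarchy grows.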

The repository pattern with caching provides clean separation and performance:

class ProductRepository {
  constructor(
    private db: Database,
    private cache: RedisCache,
    private searchClient: ElasticsearchClient
  ) {}

  async findById(id: string): Promise<Product | null> {
    const cacheKey = `product:${id}`;
    
    // Check cache first
    const cached = await this.cache.get<Product>(cacheKey);
    if (cached) return cached;
    
    // Query database
    const result = await this.db.query<ProductRow>(
      `SELECT p.*, array_agg(pv.*) as variants
       FROM products p
       LEFT JOIN product_variants pv ON pv.product_id = p.id
       WHERE p.id = $1 AND p.status = 'active'
       GROUP BY p.id`,
      [id]
    );
    
    if (!result.rows[0]) return null;
    
    const product = this.mapToProduct(result.rows[0]);
    
    // Cache for 5 minutes
    await this.cache.set(cacheKey, product, 300);
    
    return product;
  }

  async search(query: SearchQuery): Promise<PaginatedResult<Product>> {
    // Use Elasticsearch for complex searches
    const esQuery = {
      bool: {
        must: [
          { match: { status: 'active' } },
          query.text ? { multi_match: { 
            query: query.text, 
            fields: ['name^3', 'description'] 
          }} : { match_all: {} }
        ],
        filter: [
          query.categoryId ? { term: { category_id: query.categoryId } } : null,
          // Use explicit null checks: a minPrice of 0 is falsy but still a valid filter
          query.minPrice != null ? { range: { base_price: { gte: query.minPrice } } } : null,
          query.maxPrice != null ? { range: { base_price: { lte: query.maxPrice } } } : null
        ].filter(Boolean)
      }
    };

    const response = await this.searchClient.search({
      index: 'products',
      body: { query: esQuery, from: query.offset, size: query.limit }
    });

    return {
      items: response.hits.hits.map(hit => hit._source as Product),
      total: response.hits.total.value,
      offset: query.offset,
      limit: query.limit
    };
  }
}

Order Service & State Management

Orders follow a strict lifecycle. Model this explicitly with a state machine:

enum OrderStatus {
  PENDING = 'pending',
  INVENTORY_RESERVED = 'inventory_reserved',
  PAYMENT_PENDING = 'payment_pending',
  PAYMENT_CONFIRMED = 'payment_confirmed',
  SHIPPED = 'shipped',
  DELIVERED = 'delivered',
  CANCELLED = 'cancelled',
  REFUNDED = 'refunded'
}

const ORDER_TRANSITIONS: Record<OrderStatus, OrderStatus[]> = {
  [OrderStatus.PENDING]: [OrderStatus.INVENTORY_RESERVED, OrderStatus.CANCELLED],
  [OrderStatus.INVENTORY_RESERVED]: [OrderStatus.PAYMENT_PENDING, OrderStatus.CANCELLED],
  [OrderStatus.PAYMENT_PENDING]: [OrderStatus.PAYMENT_CONFIRMED, OrderStatus.CANCELLED],
  [OrderStatus.PAYMENT_CONFIRMED]: [OrderStatus.SHIPPED, OrderStatus.REFUNDED],
  [OrderStatus.SHIPPED]: [OrderStatus.DELIVERED],
  [OrderStatus.DELIVERED]: [OrderStatus.REFUNDED],
  [OrderStatus.CANCELLED]: [],
  [OrderStatus.REFUNDED]: []
};

class Order {
  constructor(
    public readonly id: string,
    public status: OrderStatus,
    public readonly items: OrderItem[],
    public readonly customerId: string,
    private events: DomainEvent[] = []
  ) {}

  // Derived total, referenced by the checkout saga and outbox writes
  // (assumes each OrderItem carries a price and quantity)
  get total(): number {
    return this.items.reduce((sum, item) => sum + item.price * item.quantity, 0);
  }

  transition(newStatus: OrderStatus): void {
    const allowedTransitions = ORDER_TRANSITIONS[this.status];
    if (!allowedTransitions.includes(newStatus)) {
      throw new InvalidStateTransitionError(this.status, newStatus);
    }
    
    const previousStatus = this.status;
    this.status = newStatus;
    this.events.push(new OrderStatusChangedEvent(this.id, previousStatus, newStatus));
  }

  getUncommittedEvents(): DomainEvent[] {
    return [...this.events];
  }

  clearEvents(): void {
    this.events = [];
  }
}
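The payoff of the explicit transition table is that illegal moves fail fast instead of corrupting order state. A usage sketch, with the table trimmed to a few states so the snippet runs on its own (the full table is `ORDER_TRANSITIONS` above):

```typescript
// Standalone sketch of the transition guard from Order.transition.
type Status = 'pending' | 'inventory_reserved' | 'payment_pending' | 'cancelled';

const TRANSITIONS: Record<Status, Status[]> = {
  pending: ['inventory_reserved', 'cancelled'],
  inventory_reserved: ['payment_pending', 'cancelled'],
  payment_pending: [],
  cancelled: [],
};

// Only moves listed in the table are legal; everything else is rejected.
function canTransition(from: Status, to: Status): boolean {
  return TRANSITIONS[from].includes(to);
}
```

Skipping a state (say, `pending` straight to `payment_pending`) is rejected, which is exactly the property that keeps the checkout saga honest about its intermediate steps.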

The checkout flow spans multiple services. Use the saga pattern to coordinate:

class CheckoutSaga {
  private steps: SagaStep[] = [];

  constructor(
    private inventoryService: InventoryService,
    private paymentService: PaymentService,
    private orderRepository: OrderRepository
  ) {}

  async execute(order: Order): Promise<void> {
    try {
      // Step 1: Reserve inventory
      const reservation = await this.inventoryService.reserveInventory(
        order.id,
        order.items
      );
      this.steps.push({
        name: 'inventory_reservation',
        compensate: () => this.inventoryService.releaseReservation(reservation.id)
      });
      order.transition(OrderStatus.INVENTORY_RESERVED);

      // Step 2: Process payment
      // Enter PAYMENT_PENDING first - the state machine forbids jumping
      // straight from INVENTORY_RESERVED to PAYMENT_CONFIRMED
      order.transition(OrderStatus.PAYMENT_PENDING);
      const payment = await this.paymentService.charge(
        order.customerId,
        order.total
      );
      this.steps.push({
        name: 'payment',
        compensate: () => this.paymentService.refund(payment.id)
      });
      order.transition(OrderStatus.PAYMENT_CONFIRMED);

      // Step 3: Confirm reservation
      await this.inventoryService.confirmReservation(reservation.id);
      
      await this.orderRepository.save(order);
      
    } catch (error) {
      await this.compensate();
      throw error;
    }
  }

  private async compensate(): Promise<void> {
    // Execute compensating actions in reverse order
    // (copy first: Array.prototype.reverse mutates in place)
    for (const step of [...this.steps].reverse()) {
      try {
        await step.compensate();
      } catch (compensationError) {
        // Log and continue - manual intervention may be needed
        console.error(`Compensation failed for ${step.name}:`, compensationError);
      }
    }
  }
}

For idempotent order creation, use client-generated idempotency keys:

async createOrder(request: CreateOrderRequest): Promise<Order> {
  // Check for existing order with same idempotency key
  const existing = await this.orderRepository.findByIdempotencyKey(
    request.idempotencyKey
  );
  if (existing) return existing;

  const order = new Order(
    generateId(),
    OrderStatus.PENDING,
    request.items,
    request.customerId
  );

  // Store with idempotency key; a unique constraint on the key guards
  // against concurrent duplicate requests racing past the check above
  await this.orderRepository.save(order, request.idempotencyKey);
  
  return order;
}
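The observable behavior is simple: a retried request with the same key gets the original order back, never a duplicate. An in-memory sketch of that contract (the `InMemoryOrderStore` name and shapes are illustrative, not part of the service above):

```typescript
// In-memory sketch of idempotent creation: replaying the same key returns
// the first order instead of creating a second one.
interface StoredOrder { id: string; customerId: string }

class InMemoryOrderStore {
  private byKey = new Map<string, StoredOrder>();
  private nextId = 1;

  create(idempotencyKey: string, customerId: string): StoredOrder {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing; // replayed request: hand back the same order
    const order = { id: `order-${this.nextId++}`, customerId };
    this.byKey.set(idempotencyKey, order);
    return order;
  }
}
```

In production the same guarantee comes from the database's unique constraint on the key, which also covers two requests arriving concurrently.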

Data Consistency & Event-Driven Communication

The outbox pattern ensures events are published reliably:

// Domain events
interface DomainEvent {
  eventId: string;
  eventType: string;
  aggregateId: string;
  occurredAt: Date;
  payload: unknown;
}

class OrderPlacedEvent implements DomainEvent {
  eventId = generateId();
  eventType = 'order.placed';
  occurredAt = new Date();
  
  constructor(
    public aggregateId: string,
    public payload: { items: OrderItem[]; customerId: string; total: number }
  ) {}
}

// Outbox pattern implementation
class OutboxRepository {
  async saveWithEvents(
    order: Order,
    transaction: Transaction
  ): Promise<void> {
    // Save order
    await transaction.query(
      `INSERT INTO orders (id, status, customer_id, items, total)
       VALUES ($1, $2, $3, $4, $5)`,
      [order.id, order.status, order.customerId, order.items, order.total]
    );

    // Save events to outbox in same transaction
    for (const event of order.getUncommittedEvents()) {
      await transaction.query(
        `INSERT INTO outbox (id, event_type, aggregate_id, payload, created_at)
         VALUES ($1, $2, $3, $4, $5)`,
        [event.eventId, event.eventType, event.aggregateId, event.payload, event.occurredAt]
      );
    }
    
    order.clearEvents();
  }
}

// Background worker publishes events
class OutboxPublisher {
  async processOutbox(): Promise<void> {
    // FOR UPDATE SKIP LOCKED only holds row locks for the duration of a
    // transaction, so the select-publish-mark cycle must run inside one
    await this.db.transaction(async (tx) => {
      const events = await tx.query(
        `SELECT * FROM outbox WHERE published_at IS NULL
         ORDER BY created_at LIMIT 100 FOR UPDATE SKIP LOCKED`
      );

      for (const event of events.rows) {
        await this.kafka.publish(event.event_type, event.payload);
        await tx.query(
          `UPDATE outbox SET published_at = NOW() WHERE id = $1`,
          [event.id]
        );
      }
    });
  }
}
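Note the delivery semantics this loop implies: if the worker crashes after publishing but before marking `published_at`, the event is re-sent on the next run, so the outbox gives at-least-once delivery and consumers must deduplicate on `eventId`. A runnable in-memory sketch of the drain loop (all shapes here are illustrative):

```typescript
// In-memory sketch of the outbox drain loop and its at-least-once semantics.
interface OutboxRow { id: string; eventType: string; payload: unknown; publishedAt?: Date }

function drainOutbox(rows: OutboxRow[], publish: (r: OutboxRow) => void): number {
  let published = 0;
  for (const row of rows) {
    if (row.publishedAt) continue;  // already delivered, skip
    publish(row);                   // a crash here means this row is re-sent next run
    row.publishedAt = new Date();   // mark only after a successful publish
    published++;
  }
  return published;
}
```

A second drain over the same rows is a no-op, which is what makes the background worker safe to run on a timer.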

Scaling Strategies & Performance

For the read-heavy catalog, implement CQRS to separate read and write models:

// Write model - handles commands
class ProductCommandHandler {
  constructor(private repository: ProductRepository) {}

  async handle(command: UpdateProductCommand): Promise<void> {
    const product = await this.repository.findById(command.productId);
    if (!product) throw new ProductNotFoundError(command.productId);
    product.update(command.changes);
    await this.repository.save(product);
    // Event published via outbox triggers read model update
  }
}

// Read model - optimized for queries
class ProductQueryHandler {
  constructor(
    private readReplica: Database,
    private cache: RedisCache
  ) {}

  async getProductDetails(id: string): Promise<ProductDetailView> {
    const cached = await this.cache.get<ProductDetailView>(`product-view:${id}`);
    if (cached) return cached;

    // Query from read replica with denormalized data
    const result = await this.readReplica.query(
      `SELECT * FROM product_detail_view WHERE id = $1`,
      [id]
    );
    
    const view = result.rows[0];
    await this.cache.set(`product-view:${id}`, view, 300);
    return view;
  }
}
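The piece that connects the two sides is a projector: a consumer of the outbox-published events that keeps the denormalized read model current. A minimal in-memory sketch (event and view shapes are illustrative, standing in for the `product_detail_view` updater):

```typescript
// Sketch of a read-model projector: applies product-change events to a
// denormalized view, merging each change on top of the current state.
interface ProductUpdatedEvent { productId: string; changes: Record<string, unknown> }

class ProductViewProjector {
  private views = new Map<string, Record<string, unknown>>();

  apply(event: ProductUpdatedEvent): void {
    const current = this.views.get(event.productId) ?? { id: event.productId };
    this.views.set(event.productId, { ...current, ...event.changes });
  }

  get(productId: string): Record<string, unknown> | undefined {
    return this.views.get(productId);
  }
}
```

Because the projector is just an event consumer, the read model can be rebuilt from scratch by replaying the event stream, which is the standard recovery path when a view gets corrupted.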

Configure read replicas at the database level:

# PostgreSQL replica configuration
primary:
  host: db-primary.internal
  max_connections: 200
  
replicas:
  - host: db-replica-1.internal
    max_connections: 500
  - host: db-replica-2.internal
    max_connections: 500

routing:
  read_queries: replicas  # Round-robin across replicas
  write_queries: primary
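The routing policy in that config is simple enough to sketch in application code: writes always hit the primary, reads round-robin across replicas. A minimal sketch, using the illustrative host names from the configuration above:

```typescript
// Sketch of the read/write routing policy: writes go to the primary,
// reads rotate across replicas (falling back to the primary if none exist).
class QueryRouter {
  private next = 0;

  constructor(private primary: string, private replicas: string[]) {}

  route(kind: 'read' | 'write'): string {
    if (kind === 'write' || this.replicas.length === 0) return this.primary;
    const host = this.replicas[this.next % this.replicas.length];
    this.next++;
    return host;
  }
}
```

One caveat this sketch ignores: replica reads are eventually consistent, so read-your-own-writes flows (like showing an order confirmation) should be pinned to the primary.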

Summary & Further Considerations

The architecture we’ve designed separates concerns appropriately: the Product Catalog Service optimizes for read performance with caching, search integration, and read replicas. The Order Service prioritizes consistency with explicit state machines, saga-based transactions, and the outbox pattern for reliable event publishing.

Key decisions to remember: use synchronous communication for queries, asynchronous events for state changes, and always implement idempotency for operations that modify state.

For production systems, you’ll also need to address payment gateway integration (treat it as another saga step), recommendation engines (consume order events to build user preferences), and multi-region deployment (consider CRDTs for shopping carts, strong consistency for orders). Each of these deserves its own deep dive, but the foundation we’ve built here will support those extensions cleanly.
