Design a Ride-Sharing Service: Location-Based Matching
Key Insights
- Geohashing provides the best balance of simplicity and performance for proximity queries at scale, enabling efficient driver lookups with Redis’s built-in geospatial commands
- An expanding ring search pattern prevents over-fetching while still surfacing strong candidates quickly, starting with a small radius and widening only when too few quality options are found nearby
- Distributed locks with short TTLs are essential for preventing double-assignment of drivers to concurrent ride requests, but your system must gracefully handle lock failures
System Requirements & Scale
Before diving into architecture, let’s establish what we’re building. A ride-sharing service needs to match riders with nearby drivers in real-time, track locations continuously, and manage the full trip lifecycle.
Functional requirements:
- Riders request rides with pickup and dropoff locations
- System matches riders with available drivers within acceptable distance
- Real-time location tracking for both parties during trips
- Trip state management (requested, matched, pickup, in-progress, completed)
Non-functional requirements:
- Matching latency under 1 second at the 99th percentile
- 99.9% availability for core matching functionality
- Eventual consistency acceptable for location data (stale by up to 5 seconds is fine)
- Strong consistency required for trip state and driver assignment
Scale assumptions:
- 10 million daily active users across 100 cities
- 500,000 concurrent drivers during peak hours
- 50,000 location updates per second
- 10,000 ride requests per second at peak
These numbers drive every architectural decision that follows.
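A quick back-of-envelope check makes those numbers concrete. The per-update payload size and per-driver index overhead below are illustrative assumptions, not measured values:

```python
# Back-of-envelope capacity estimates from the stated scale assumptions.
UPDATES_PER_SEC = 50_000
BYTES_PER_UPDATE = 100          # driver_id + lat/lng + timestamp (assumed)
CONCURRENT_DRIVERS = 500_000
BYTES_PER_DRIVER_INDEXED = 200  # geo sorted-set entry + metadata hash (assumed)

ingest_mb_per_sec = UPDATES_PER_SEC * BYTES_PER_UPDATE / 1_000_000
index_memory_mb = CONCURRENT_DRIVERS * BYTES_PER_DRIVER_INDEXED / 1_000_000

print(f"Location ingest: ~{ingest_mb_per_sec:.0f} MB/s")
print(f"Driver index memory: ~{index_memory_mb:.0f} MB")
```

Roughly 5 MB/s of ingest and a driver index in the low hundreds of megabytes: the write rate, not memory, is the pressure point, which is why the batching and throttling strategies later matter.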
High-Level Architecture
The system decomposes into four core services, connected through an event-driven architecture:
// Core service interfaces
type LocationService interface {
UpdateDriverLocation(ctx context.Context, driverID string, lat, lng float64) error
FindNearbyDrivers(ctx context.Context, lat, lng float64, radiusKm float64) ([]Driver, error)
GetDriverLocation(ctx context.Context, driverID string) (*Location, error)
}
type MatchingService interface {
RequestMatch(ctx context.Context, req *RideRequest) (*MatchResult, error)
ConfirmMatch(ctx context.Context, matchID string, driverID string) error
CancelMatch(ctx context.Context, matchID string) error
}
type TripService interface {
CreateTrip(ctx context.Context, riderID, driverID string, pickup, dropoff Location) (*Trip, error)
UpdateTripState(ctx context.Context, tripID string, state TripState) error
GetTrip(ctx context.Context, tripID string) (*Trip, error)
}
type NotificationService interface {
NotifyDriver(ctx context.Context, driverID string, msg *DriverNotification) error
NotifyRider(ctx context.Context, riderID string, msg *RiderNotification) error
}
Message queues (Kafka or SQS) decouple these services. When a match is confirmed, the Matching Service publishes an event that Trip Service consumes to create the trip record. This prevents cascading failures and allows independent scaling.
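To make the decoupling concrete, here is a minimal in-memory sketch of that event flow. The topic name `match.confirmed` and the event fields are illustrative; a real deployment would publish through a Kafka or SQS client instead of this stand-in bus:

```python
from collections import defaultdict
from typing import Callable, Dict, List

class InMemoryBus:
    """Stand-in for Kafka/SQS to illustrate the publish/subscribe flow."""
    def __init__(self):
        self.subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]):
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict):
        for handler in self.subscribers[topic]:
            handler(event)

created_trips = []

def on_match_confirmed(event: dict):
    # Trip Service consumes the event and creates the trip record
    created_trips.append({"rider_id": event["rider_id"], "driver_id": event["driver_id"]})

bus = InMemoryBus()
bus.subscribe("match.confirmed", on_match_confirmed)

# Matching Service publishes after a driver accepts
bus.publish("match.confirmed", {"rider_id": "r1", "driver_id": "d42"})
```

The Matching Service never calls the Trip Service directly, so a Trip Service outage only delays trip creation rather than failing the match.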
Geospatial Data Storage & Indexing
The core challenge is answering “which drivers are near this location?” thousands of times per second. Three main approaches exist:
Geohashing encodes latitude/longitude into a string where shared prefixes indicate proximity. A geohash like 9q8yy represents a rectangular cell; all points in that cell share the prefix.
Quadtrees recursively subdivide space into four quadrants, creating a tree structure optimized for spatial queries.
R-trees group nearby objects into bounding rectangles, commonly used in databases like PostGIS.
For ride-sharing, geohashing wins on simplicity and Redis compatibility. Here’s why: Redis provides native geospatial commands that use geohashing internally, giving us O(log N + M) proximity queries where M is the number of results.
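A minimal geohash encoder, sketched here from the standard base32 bit-interleaving algorithm, makes the shared-prefix property visible:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def encode_geohash(lat: float, lng: float, precision: int = 7) -> str:
    """Interleave longitude/latitude bisections into base32 characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    bits, char, result = 0, 0, []
    even = True  # a geohash starts with a longitude bit
    while len(result) < precision:
        if even:  # bisect longitude
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                char, lng_lo = (char << 1) | 1, mid
            else:
                char, lng_hi = char << 1, mid
        else:     # bisect latitude
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                char, lat_lo = (char << 1) | 1, mid
            else:
                char, lat_hi = char << 1, mid
        even = not even
        bits += 1
        if bits == 5:  # every 5 bits become one base32 character
            result.append(BASE32[char])
            bits, char = 0, 0
    return "".join(result)

# Two points a few hundred meters apart in San Francisco
a = encode_geohash(37.7749, -122.4194)
b = encode_geohash(37.7770, -122.4170)
```

Both points land in the 9q8yy cell mentioned earlier, so a prefix comparison alone establishes proximity. Redis performs the same interleaving internally (at 52-bit precision) when you call GEOADD.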
import time

import redis
import geohash
class DriverLocationStore:
def __init__(self, redis_client: redis.Redis):
self.redis = redis_client
self.key = "drivers:locations"
def update_driver_location(self, driver_id: str, lat: float, lng: float):
"""Store driver location with geospatial indexing."""
# GEOADD uses geohashing internally
self.redis.geoadd(self.key, (lng, lat, driver_id))
# Also store in a hash for quick lookups and metadata
self.redis.hset(f"driver:{driver_id}", mapping={
"lat": lat,
"lng": lng,
"updated_at": int(time.time()),
"geohash": geohash.encode(lat, lng, precision=7)
})
def find_nearby_drivers(self, lat: float, lng: float, radius_km: float, limit: int = 50):
"""Find drivers within radius, sorted by distance."""
        # GEORADIUS (superseded by GEOSEARCH in Redis 6.2+) returns drivers sorted by distance
results = self.redis.georadius(
self.key,
longitude=lng,
latitude=lat,
radius=radius_km,
unit="km",
withdist=True,
withcoord=True,
count=limit,
sort="ASC"
)
return [
{
"driver_id": r[0].decode(),
"distance_km": r[1],
"lng": r[2][0],
"lat": r[2][1]
}
for r in results
]
Real-Time Location Tracking
Drivers send location updates continuously. WebSockets provide the persistent connection, but you need strategies to handle the firehose of data.
Batching: Collect updates for 1-2 seconds before processing. A driver moving at 30 mph (roughly 13 m/s) travels about 27 meters in 2 seconds—acceptable granularity.
Throttling: Ignore updates if the driver hasn’t moved significantly (less than 10 meters) since the last recorded position.
State filtering: Only track drivers who are available. Drivers on active trips update a different data structure.
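The batching strategy can be sketched with a simple time-window buffer. The explicit `now` parameter is a manually advanced clock for illustration; production code would flush from an asyncio task and write each batch in one Redis pipeline:

```python
from typing import Dict, List, Tuple

class LocationBatcher:
    """Buffer location updates and flush them as one batch per time window."""
    def __init__(self, window_seconds: float = 2.0):
        self.window = window_seconds
        self.buffer: Dict[str, Tuple[float, float]] = {}  # latest position per driver
        self.last_flush = 0.0

    def add(self, now: float, driver_id: str, lat: float, lng: float) -> List[tuple]:
        # Later updates overwrite earlier ones within the same window
        self.buffer[driver_id] = (lat, lng)
        if now - self.last_flush >= self.window:
            batch = [(d, *pos) for d, pos in self.buffer.items()]
            self.buffer.clear()
            self.last_flush = now
            return batch  # caller writes the whole batch in one pipeline
        return []

batcher = LocationBatcher(window_seconds=2.0)
batcher.add(0.5, "d1", 37.77, -122.41)            # buffered, window still open
batcher.add(1.0, "d1", 37.78, -122.42)            # overwrites d1's earlier update
flushed = batcher.add(2.1, "d2", 37.79, -122.40)  # window elapsed: flush both
```

Keeping only the latest position per driver means a chatty client costs one index write per window, not one per update.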
import math
import time
from typing import Dict, Tuple

import geohash

class LocationUpdateHandler:
def __init__(self, location_store: DriverLocationStore, min_distance_meters: float = 10):
self.store = location_store
self.min_distance = min_distance_meters
self.last_locations: Dict[str, Tuple[float, float, float]] = {} # driver_id -> (lat, lng, timestamp)
async def handle_location_update(self, driver_id: str, lat: float, lng: float, driver_state: str):
"""Process incoming location update with filtering."""
# Only index available drivers for matching
if driver_state != "available":
# Store location for trip tracking, but not in matching index
await self._store_trip_location(driver_id, lat, lng)
return
# Check if driver has moved enough to warrant an update
if driver_id in self.last_locations:
last_lat, last_lng, last_time = self.last_locations[driver_id]
distance = self._haversine_distance(last_lat, last_lng, lat, lng)
if distance < self.min_distance:
return # Skip update, driver hasn't moved enough
# Update location store
self.store.update_driver_location(driver_id, lat, lng)
self.last_locations[driver_id] = (lat, lng, time.time())
# Assign to geohash bucket for partitioning
bucket = geohash.encode(lat, lng, precision=4) # ~40km cells
await self._publish_to_bucket(bucket, driver_id, lat, lng)
def _haversine_distance(self, lat1: float, lng1: float, lat2: float, lng2: float) -> float:
"""Calculate distance between two points in meters."""
R = 6371000 # Earth's radius in meters
phi1, phi2 = math.radians(lat1), math.radians(lat2)
delta_phi = math.radians(lat2 - lat1)
delta_lambda = math.radians(lng2 - lng1)
a = math.sin(delta_phi/2)**2 + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda/2)**2
return 2 * R * math.atan2(math.sqrt(a), math.sqrt(1-a))
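As a sanity check on that haversine helper, one degree of longitude at the equator should come out near 111.2 km (2πR/360 with R = 6,371 km). The function is restated standalone here so the check is self-contained:

```python
import math

def haversine_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    # Same formula as _haversine_distance above, standalone for testing
    R = 6371000  # Earth's radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = math.radians(lat2 - lat1)
    delta_lambda = math.radians(lng2 - lng1)
    a = math.sin(delta_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    return 2 * R * math.atan2(math.sqrt(a), math.sqrt(1 - a))

d = haversine_m(0.0, 0.0, 0.0, 1.0)  # one degree of longitude at the equator
```

At the 10-meter throttling threshold the spherical-Earth error is negligible, so there is no need for a costlier ellipsoidal formula here.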
Matching Algorithm Design
The matching algorithm uses an expanding ring search. Start with a small radius (500 meters), find candidates, and only expand if you don’t have enough quality options.
from typing import List, Optional

class MatchingEngine:
INITIAL_RADIUS_KM = 0.5
MAX_RADIUS_KM = 10.0
RADIUS_INCREMENT = 0.5
MIN_CANDIDATES = 3
def __init__(self, location_store: DriverLocationStore, driver_service: DriverService):
self.location_store = location_store
self.driver_service = driver_service
async def find_best_driver(self, pickup_lat: float, pickup_lng: float) -> Optional[ScoredDriver]:
"""Find the best available driver using expanding ring search."""
radius = self.INITIAL_RADIUS_KM
best_candidates = []
while radius <= self.MAX_RADIUS_KM:
nearby = self.location_store.find_nearby_drivers(
pickup_lat, pickup_lng, radius, limit=50
)
# Filter to only available drivers
available = await self._filter_available(nearby)
if len(available) >= self.MIN_CANDIDATES:
# Score and rank candidates
scored = await self._score_candidates(available, pickup_lat, pickup_lng)
return max(scored, key=lambda x: x.score)
            best_candidates = available  # each larger radius returns a superset, so replace rather than extend
radius += self.RADIUS_INCREMENT
# Return best from whatever we found, or None
if best_candidates:
scored = await self._score_candidates(best_candidates, pickup_lat, pickup_lng)
return max(scored, key=lambda x: x.score)
return None
async def _score_candidates(
self, candidates: List[dict], pickup_lat: float, pickup_lng: float
) -> List[ScoredDriver]:
"""Score drivers based on multiple factors."""
scored = []
for candidate in candidates:
driver_id = candidate["driver_id"]
driver_info = await self.driver_service.get_driver(driver_id)
# Weights for scoring factors
distance_score = 1.0 / (1.0 + candidate["distance_km"]) # Closer is better
rating_score = driver_info.rating / 5.0 # Normalize to 0-1
acceptance_score = driver_info.acceptance_rate # Already 0-1
# Weighted combination
total_score = (
0.5 * distance_score +
0.3 * rating_score +
0.2 * acceptance_score
)
scored.append(ScoredDriver(
driver_id=driver_id,
distance_km=candidate["distance_km"],
score=total_score,
eta_minutes=self._estimate_eta(candidate["distance_km"])
))
return scored
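The weighted scoring can be checked in isolation. The weights match the code above; the sample rating and acceptance-rate values are illustrative:

```python
def score_driver(distance_km: float, rating: float, acceptance_rate: float) -> float:
    """Same weighted combination as _score_candidates above."""
    distance_score = 1.0 / (1.0 + distance_km)  # closer is better
    rating_score = rating / 5.0                 # normalize to 0-1
    return 0.5 * distance_score + 0.3 * rating_score + 0.2 * acceptance_rate

# A nearby average driver should outrank a distant top-rated one
near_ok = score_driver(distance_km=0.5, rating=4.0, acceptance_rate=0.8)
far_great = score_driver(distance_km=5.0, rating=5.0, acceptance_rate=1.0)
```

Because `1 / (1 + d)` decays quickly, distance dominates in practice: even a perfect rating and acceptance rate (0.5 total) cannot make up for being several kilometers away. Tune the weights against real acceptance data before trusting them.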
Scaling & Partitioning Strategy
Geographic sharding is natural for ride-sharing. Drivers in New York never match with riders in Los Angeles. Each city (or region) gets its own partition.
from typing import Dict, List

import redis
import geohash

class ShardRouter:
def __init__(self, shard_config: Dict[str, List[str]]):
# Maps geohash prefixes to shard addresses
# e.g., {"9q8": ["redis-sf-1", "redis-sf-2"], "dr5": ["redis-nyc-1"]}
self.shard_map = shard_config
self.connections: Dict[str, redis.Redis] = {}
def get_shard_for_location(self, lat: float, lng: float) -> redis.Redis:
"""Route to correct shard based on location."""
gh = geohash.encode(lat, lng, precision=3) # ~150km cells
# Find matching shard
for prefix, shards in self.shard_map.items():
if gh.startswith(prefix):
                # Note: built-in hash() on strings is salted per process; use a
                # stable hash (e.g. zlib.crc32) plus consistent hashing in production
                shard_idx = hash(gh) % len(shards)
return self._get_connection(shards[shard_idx])
# Default shard for unmapped regions
return self._get_connection("redis-default")
def find_nearby_drivers_cross_shard(
self, lat: float, lng: float, radius_km: float
) -> List[dict]:
"""Handle queries that might span shard boundaries."""
# Get all geohash cells that intersect with the search radius
center_gh = geohash.encode(lat, lng, precision=3)
        neighbors = geohash.neighbors(center_gh)  # the 8 adjacent cells
        all_cells = [center_gh] + list(neighbors)
# Query each relevant shard
results = []
seen_shards = set()
for cell in all_cells:
shard = self._get_shard_for_geohash(cell)
if shard not in seen_shards:
seen_shards.add(shard)
shard_results = self._query_shard(shard, lat, lng, radius_km)
results.extend(shard_results)
        # Deduplicate (a driver near a boundary can appear in multiple
        # shard queries) and sort by distance
        unique = {r["driver_id"]: r for r in results}
        return sorted(unique.values(), key=lambda x: x["distance_km"])
Failure Handling & Consistency
The most critical consistency requirement: never assign one driver to two riders simultaneously. Distributed locks solve this.
import redis

class DriverReservationLock:
LOCK_TTL_SECONDS = 30 # Lock expires if not released
def __init__(self, redis_client: redis.Redis):
self.redis = redis_client
async def reserve_driver(self, driver_id: str, ride_request_id: str) -> bool:
"""Attempt to reserve a driver atomically."""
lock_key = f"driver_lock:{driver_id}"
# SET NX with expiration - atomic operation
acquired = self.redis.set(
lock_key,
ride_request_id,
nx=True, # Only set if not exists
ex=self.LOCK_TTL_SECONDS
)
if not acquired:
# Driver already reserved by another request
return False
# Double-check driver is still available (belt and suspenders)
driver_state = await self._get_driver_state(driver_id)
if driver_state != "available":
self.redis.delete(lock_key)
return False
return True
async def confirm_reservation(self, driver_id: str, ride_request_id: str) -> bool:
"""Confirm reservation and update driver state."""
lock_key = f"driver_lock:{driver_id}"
# Verify we still hold the lock
current_holder = self.redis.get(lock_key)
if current_holder != ride_request_id.encode():
return False
# Update driver state atomically
await self._update_driver_state(driver_id, "matched")
# Release the lock
self.redis.delete(lock_key)
return True
async def release_reservation(self, driver_id: str, ride_request_id: str):
"""Release reservation if driver declines or timeout."""
lock_key = f"driver_lock:{driver_id}"
# Only release if we hold the lock (Lua script for atomicity)
script = """
if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("del", KEYS[1])
else
return 0
end
"""
self.redis.eval(script, 1, lock_key, ride_request_id)
For trip creation idempotency, use the ride request ID as a natural idempotency key. Before creating a trip, check if one already exists for that request. This handles retries gracefully without creating duplicate trips.
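A minimal in-memory sketch of that idempotency check (the dict stands in for the trips table; a real implementation would rely on a unique index on `ride_request_id` so that concurrent check-then-insert races are resolved by the database):

```python
import uuid
from typing import Dict

class TripStore:
    """Idempotent trip creation keyed by ride request ID."""
    def __init__(self):
        self.trips_by_request: Dict[str, dict] = {}

    def create_trip(self, ride_request_id: str, rider_id: str, driver_id: str) -> dict:
        # If a trip already exists for this request, return it instead of duplicating
        existing = self.trips_by_request.get(ride_request_id)
        if existing is not None:
            return existing
        trip = {
            "trip_id": str(uuid.uuid4()),
            "ride_request_id": ride_request_id,
            "rider_id": rider_id,
            "driver_id": driver_id,
            "state": "matched",
        }
        self.trips_by_request[ride_request_id] = trip
        return trip

store = TripStore()
first = store.create_trip("req-1", "rider-9", "driver-3")
retry = store.create_trip("req-1", "rider-9", "driver-3")  # retried request is a no-op
```

The retry returns the original trip rather than minting a second `trip_id`, which is exactly the behavior you want when the match-confirmed event is redelivered.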
The combination of geospatial indexing, expanding ring search, and distributed locking gives you a matching system that’s fast, fair, and consistent. Start with this foundation, then optimize based on your specific traffic patterns and geographic distribution.