Design an Authentication System: SSO and OAuth

Key Insights

OAuth 2.0 handles authorization (what you can access), while SSO with OIDC handles authentication (who you are)—most production systems need both working together
The authorization code flow with PKCE should be your default choice for web and mobile applications; client credentials flow is only for machine-to-machine communication
Store tokens in HTTP-only cookies for web apps, never in localStorage, and implement token rotation with short-lived access tokens (15 minutes) paired with longer refresh tokens (7 days)

The Authentication Landscape

Every application eventually faces the same question: how do we know who our users are, and what should they be allowed to do? These are two distinct problems. Authentication verifies identity. Authorization determines access rights. Conflating them leads to security holes and architectural headaches.

OAuth 2.0 is an authorization framework—it grants third-party applications limited access to resources without exposing credentials. Single Sign-On (SSO) is an authentication pattern—it lets users authenticate once and access multiple applications. OpenID Connect (OIDC) bridges both worlds by adding an identity layer on top of OAuth 2.0.

Use OAuth 2.0 when you need to grant API access to third parties or between your own services. Use SSO when you have multiple applications and want unified user sessions. Use OIDC when you need both authentication and authorization in a standards-compliant way. Most enterprise systems end up implementing all three.

OAuth 2.0 Fundamentals

OAuth 2.0 defines four roles: resource owner (the user), client (your application), authorization server (issues tokens), and resource server (hosts protected APIs). The framework supports multiple grant types for different scenarios.

The authorization code flow is the standard for web applications. The client redirects users to the authorization server, which authenticates them and returns an authorization code. The client exchanges this code for tokens server-side, keeping credentials secure.

┌──────────┐                               ┌───────────────┐
│  Client  │──(1) Authorization Request───▶│   Auth Server │
│  (App)   │                               │               │
└──────────┘                               └───────────────┘
     │                                            │
     │◀────────(2) Authorization Code─────────────│
     │                                            │
     │───(3) Code + Client Secret────────────────▶│
     │                                            │
     │◀────────(4) Access Token + Refresh Token───│
     │                                            │
     │───(5) API Request + Access Token──────────▶│ Resource
     │                                            │ Server
     │◀────────(6) Protected Resource─────────────│
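Steps 2 and 3 of the diagram can be sketched on the client side as follows. This is a minimal illustration under stated assumptions, not a complete client: the function names and example URLs are hypothetical, the `state` check assumes the authorization request carried a random `state` value (recommended by OAuth 2.0 but not shown in the diagram), and the secret is sent in the form body, although many authorization servers prefer HTTP Basic authentication for this.

```python
from urllib.parse import parse_qs, urlparse
import hmac


def extract_authorization_code(callback_url: str, expected_state: str) -> str:
    """Step 2: read the code from the redirect and verify the state value."""
    params = parse_qs(urlparse(callback_url).query)
    if "error" in params:
        raise ValueError(f"authorization failed: {params['error'][0]}")
    returned_state = params.get("state", [""])[0]
    # Constant-time comparison; a mismatch suggests a forged (CSRF) callback
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch")
    if "code" not in params:
        raise ValueError("callback carries no authorization code")
    return params["code"][0]


def build_token_request(
    code: str, client_id: str, client_secret: str, redirect_uri: str
) -> dict:
    """Step 3: form fields for the server-side POST to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,  # must match the value sent in step 1
        "client_id": client_id,
        "client_secret": client_secret,
    }
```

The returned dictionary would be sent as a form-encoded POST body from the backend, so the client secret never reaches the browser.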

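Public clients (mobile apps, SPAs) cannot keep a client secret, so they should always use PKCE (Proof Key for Code Exchange); current security guidance recommends it for confidential clients as well. The client sends a hash of a random code verifier (the code challenge) with the authorization request and reveals the verifier only when it exchanges the code. An intercepted authorization code is therefore useless to an attacker who does not hold the verifier.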

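import base64
import hashlib
import secrets
from urllib.parse import urlencode

# Placeholder values for illustration; substitute your own configuration
AUTH_SERVER = "https://auth.yourcompany.com"
CLIENT_ID = "your-client-id"
REDIRECT_URI = "https://app.yourcompany.com/callback"

def generate_pkce_pair():
    # Generate a cryptographically random code verifier (43 URL-safe characters)
    code_verifier = secrets.token_urlsafe(32)

    # Create the code challenge using the S256 method
    code_challenge = base64.urlsafe_b64encode(
        hashlib.sha256(code_verifier.encode("ascii")).digest()
    ).decode("ascii").rstrip("=")

    return code_verifier, code_challenge

# During authorization request
verifier, challenge = generate_pkce_pair()
state = secrets.token_urlsafe(16)  # CSRF protection; verify it on the callback

# urlencode escapes the redirect URI and the spaces in the scope list
auth_url = f"{AUTH_SERVER}/authorize?" + urlencode({
    "response_type": "code",
    "client_id": CLIENT_ID,
    "redirect_uri": REDIRECT_URI,
    "code_challenge": challenge,
    "code_challenge_method": "S256",
    "scope": "openid profile email",
    "state": state,
})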

Access tokens are short-lived credentials (typically 15-60 minutes) that grant API access. Refresh tokens are long-lived and used to obtain new access tokens without user interaction. JWTs are commonly used for access tokens because they’re self-contained and can be validated without database lookups.

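import uuid
from datetime import datetime, timedelta, timezone

import jwt  # PyJWT; RS256 also requires the "cryptography" package

def create_access_token(user_id: str, scopes: list[str], private_key: str) -> str:
    # RS256 signs with the RSA private key (PEM); the matching public key verifies
    now = datetime.now(timezone.utc)
    payload = {
        "sub": user_id,
        "iat": now,
        "exp": now + timedelta(minutes=15),
        "jti": str(uuid.uuid4()),  # unique ID so the token can be revoked individually
        "scope": " ".join(scopes),
        "iss": "https://auth.yourcompany.com",
        "aud": "https://api.yourcompany.com"
    }
    return jwt.encode(payload, private_key, algorithm="RS256")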

def validate_access_token(token: str, public_key: str) -> dict:
    return jwt.decode(
        token,
        public_key,
        algorithms=["RS256"],
        audience="https://api.yourcompany.com",
        issuer="https://auth.yourcompany.com"
    )

Single Sign-On Architecture

SSO eliminates the need for users to authenticate separately with each application. When a user logs into one service, they gain access to all connected services within the same trust domain.

Two protocols dominate SSO implementations: SAML 2.0 and OpenID Connect. SAML uses XML-based assertions and is prevalent in enterprise environments. OIDC uses JSON and JWTs, making it simpler to implement and better suited for modern applications.

The Identity Provider (IdP) authenticates users and issues identity assertions. Service Providers (SP) consume these assertions to grant access. In OIDC terminology, the IdP is the OpenID Provider, and SPs are Relying Parties.

OIDC discovery simplifies client configuration. Clients fetch provider metadata from a well-known endpoint:

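import httpx

async def discover_oidc_config(issuer: str) -> dict:
    discovery_url = f"{issuer.rstrip('/')}/.well-known/openid-configuration"
    async with httpx.AsyncClient() as client:
        response = await client.get(discovery_url)
        response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
        config = response.json()

    # The metadata must name the issuer we asked for, or it cannot be trusted
    if config["issuer"] != issuer:
        raise ValueError("Issuer mismatch in discovery document")

    # Returns endpoints for authorization, token, userinfo, jwks, etc.
    # userinfo_endpoint, scopes_supported, and claims_supported are optional metadata
    return {
        "authorization_endpoint": config["authorization_endpoint"],
        "token_endpoint": config["token_endpoint"],
        "userinfo_endpoint": config.get("userinfo_endpoint"),
        "jwks_uri": config["jwks_uri"],
        "supported_scopes": config.get("scopes_supported", []),
        "supported_claims": config.get("claims_supported", [])
    }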

Session management across services requires careful coordination. When a user logs out of one application, should they be logged out everywhere? Single Logout (SLO) propagates logout events across all connected services, but it adds complexity and potential failure points.
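One way to picture the logout fan-out is a registry that remembers which applications joined each SSO session. The sketch below is a single-process, in-memory illustration under stated assumptions: the class name, method names, and the `notify` callback are hypothetical, and a real deployment would deliver notifications over the network (for example, back-channel logout requests) and retry failures.

```python
from collections import defaultdict
from typing import Callable


class SsoSessionRegistry:
    """Tracks which relying parties hold a session derived from each SSO session."""

    def __init__(self) -> None:
        self._participants: dict[str, set[str]] = defaultdict(set)

    def record_login(self, sso_session_id: str, client_id: str) -> None:
        self._participants[sso_session_id].add(client_id)

    def logout(
        self, sso_session_id: str, notify: Callable[[str, str], None]
    ) -> list[str]:
        """End the SSO session; return the clients whose notification failed."""
        failed = []
        for client_id in sorted(self._participants.pop(sso_session_id, set())):
            try:
                notify(client_id, sso_session_id)
            except Exception:
                # One unreachable service must not block logout everywhere else
                failed.append(client_id)
        return failed
```

The returned list makes the failure mode explicit: a service that was down during logout may still hold a live local session, which is why short local session lifetimes remain a useful backstop.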

System Design: Core Components

A production authentication system requires several interconnected components:

┌─────────────────────────────────────────────────────────────┐
│                       API Gateway                           │
│              (Token validation, rate limiting)              │
└─────────────────────────┬───────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────┐
│                    Auth Service                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   OAuth     │  │   Session   │  │   User Directory    │  │
│  │   Handler   │  │   Manager   │  │   Integration       │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────┬───────────────────────────────────┘
                          │
          ┌───────────────┼───────────────┐
          ▼               ▼               ▼
    ┌──────────┐   ┌──────────┐   ┌──────────────┐
    │  Redis   │   │ Postgres │   │  LDAP/IdP    │
    │ (tokens) │   │ (users)  │   │  (external)  │
    └──────────┘   └──────────┘   └──────────────┘

The token store handles refresh tokens and revocation lists. Redis works well for its speed and built-in expiration:

from redis import Redis
from datetime import datetime, timedelta, timezone
import json

class TokenStore:
    def __init__(self, redis: Redis):
        self.redis = redis

    def store_refresh_token(
        self,
        token_id: str,
        user_id: str,
        client_id: str,
        expires_in: int = 604800  # 7 days
    ):
        token_data = {
            "user_id": user_id,
            "client_id": client_id,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "rotated_from": None
        }
        self.redis.setex(
            f"refresh_token:{token_id}",
            timedelta(seconds=expires_in),
            json.dumps(token_data)
        )
        # Index the token under its user so revoke_all_user_tokens can find it
        self.redis.sadd(f"user_tokens:{user_id}", token_id)

    def revoke_token(self, token_id: str):
        self.redis.delete(f"refresh_token:{token_id}")

    def revoke_all_user_tokens(self, user_id: str):
        # Walk the set of the user's active tokens and delete each one
        token_ids = self.redis.smembers(f"user_tokens:{user_id}")
        for tid in token_ids:
            # redis-py returns bytes unless the client uses decode_responses=True
            if isinstance(tid, bytes):
                tid = tid.decode()
            self.redis.delete(f"refresh_token:{tid}")
        self.redis.delete(f"user_tokens:{user_id}")

Security Considerations

Token security requires defense in depth. Never store tokens in localStorage—it’s vulnerable to XSS attacks. Use HTTP-only, secure cookies with SameSite attributes:

from fastapi import Response

def set_token_cookie(response: Response, token: str, max_age: int):
    response.set_cookie(
        key="refresh_token",
        value=token,
        httponly=True,
        secure=True,  # HTTPS only
        samesite="strict",
        max_age=max_age,
        path="/auth"  # Limit cookie scope
    )

Implement token rotation: when a refresh token is used, issue a new one and invalidate the old. This limits the damage from token theft:

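class InvalidTokenError(Exception):
    """Refresh token is unknown or has expired."""

class SecurityError(Exception):
    """Refresh token was presented more than once."""

# Assumes an async token_store exposing get, mark_used, and revoke_all_user_tokens,
# a create_refresh_token helper that persists the new token before returning it,
# and a PRIVATE_KEY constant holding the signing key. The stored "scopes" field
# is also an assumption.
async def rotate_refresh_token(old_token: str) -> tuple[str, str]:
    token_data = await token_store.get(old_token)
    if not token_data:
        raise InvalidTokenError("Token not found or expired")

    # Check if token was already used (replay attack)
    if token_data.get("used"):
        # Potential token theft: revoke all user tokens
        await token_store.revoke_all_user_tokens(token_data["user_id"])
        raise SecurityError("Token reuse detected")

    # Mark old token as used (keep briefly for replay detection)
    await token_store.mark_used(old_token)

    # Issue new tokens
    new_access = create_access_token(
        token_data["user_id"], token_data.get("scopes", []), PRIVATE_KEY
    )
    new_refresh = create_refresh_token(token_data["user_id"])

    return new_access, new_refresh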
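The reuse-detection logic is easier to follow in a version that runs without Redis. The sketch below is a simplified, in-memory illustration: the class and method names are hypothetical, tokens never expire, and a real store would need the check-and-mark step to be atomic.

```python
import secrets


class TokenReuseError(Exception):
    """A refresh token that was already rotated has been presented again."""


class InMemoryRotator:
    def __init__(self) -> None:
        # token -> {"user_id": ..., "used": ...}
        self._tokens: dict[str, dict] = {}

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {"user_id": user_id, "used": False}
        return token

    def rotate(self, old_token: str) -> str:
        record = self._tokens.get(old_token)
        if record is None:
            raise KeyError("unknown or revoked refresh token")
        if record["used"]:
            # Either the attacker or the victim is replaying; cut off both
            self.revoke_user(record["user_id"])
            raise TokenReuseError("refresh token reuse detected")
        record["used"] = True
        return self.issue(record["user_id"])

    def revoke_user(self, user_id: str) -> None:
        self._tokens = {
            t: r for t, r in self._tokens.items() if r["user_id"] != user_id
        }
```

Note the consequence of a detected replay: the newest token is revoked too, so the legitimate user has to log in again. That is the intended trade-off, since the server cannot tell which party holds the stolen copy.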

Rate limiting prevents brute-force attacks. Apply strict limits to login endpoints and token endpoints:

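from fastapi import HTTPException, Request
from redis.asyncio import Redis  # async client, so Redis calls can be awaited

class RateLimiter:
    def __init__(self, redis: Redis):
        self.redis = redis

    async def check_rate_limit(
        self,
        key: str,
        max_requests: int,
        window_seconds: int
    ) -> bool:
        current = await self.redis.incr(key)
        if current == 1:
            # First hit in this window starts the countdown
            await self.redis.expire(key, window_seconds)
        return current <= max_requests

limiter = RateLimiter(Redis())  # defaults to localhost:6379; adjust for your deployment

# Usage in middleware
async def login_rate_limit(request: Request):
    # request.client can be None, so fall back to a shared bucket
    client_ip = request.client.host if request.client else "unknown"
    if not await limiter.check_rate_limit(
        f"login:{client_ip}",
        max_requests=5,
        window_seconds=300
    ):
        raise HTTPException(status_code=429, detail="Too many login attempts")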
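The counter above is a fixed-window limiter, which can admit up to twice the limit when a burst straddles a window boundary. A sliding-window log avoids that by remembering individual request times. The sketch below is a different technique from the Redis counter, shown in memory for a single process; the class name and the injectable `clock` parameter are assumptions made for illustration and testability.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

The cost is memory proportional to the number of accepted requests per key, which is usually acceptable for login endpoints where the limit is small.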

Scaling and High Availability

Stateless authentication using JWTs scales horizontally: any server can validate tokens without shared state. The trade-off is that you lose immediate revocation, because a signed token stays valid until it expires. Implement a revocation list check for sensitive operations:

import jwt
from redis.asyncio import Redis  # async client, so Redis calls can be awaited

class TokenValidator:
    def __init__(self, redis: Redis, public_key: str):
        self.redis = redis
        self.public_key = public_key

    async def validate(self, token: str) -> dict:
        # Decode and verify signature, expiry, audience, and issuer
        payload = jwt.decode(
            token,
            self.public_key,
            algorithms=["RS256"],
            audience="https://api.yourcompany.com",
            issuer="https://auth.yourcompany.com"
        )

        # Check revocation list (cached in Redis).
        # Tokens issued without a jti cannot be revoked individually.
        jti = payload.get("jti")
        if jti and await self.is_revoked(jti):
            raise jwt.InvalidTokenError("Token has been revoked")

        return payload

    async def is_revoked(self, jti: str) -> bool:
        return bool(await self.redis.sismember("revoked_tokens", jti))
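A single set of revoked IDs grows without bound, yet an entry only matters until the token it names would have expired anyway. The sketch below shows that idea in memory; the class name and the injectable `clock` are assumptions for illustration, not part of the design above.

```python
import time


class RevocationList:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._revoked: dict[str, float] = {}  # jti -> token expiry (epoch seconds)

    def revoke(self, jti: str, token_exp: float) -> None:
        self._revoked[jti] = token_exp

    def is_revoked(self, jti: str) -> bool:
        self._purge()
        return jti in self._revoked

    def _purge(self) -> None:
        # An expired token fails validation on its own, so its entry can go
        now = self._clock()
        self._revoked = {j: exp for j, exp in self._revoked.items() if exp > now}

    def __len__(self) -> int:
        return len(self._revoked)
```

With Redis, a similar effect could be had by storing each revoked `jti` as its own key with an expiry equal to the token's remaining lifetime, so the list stays bounded by the access token lifetime.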

For cross-region deployments, replicate your token store and use region-aware routing. Consider eventual consistency implications—a token revoked in one region might remain valid briefly in another.

Implementation Checklist and Common Pitfalls

Before launching, verify these essentials:

PKCE enabled for all public clients
Token rotation implemented for refresh tokens
Secure cookie settings (HttpOnly, Secure, SameSite)
Rate limiting on all authentication endpoints
Audit logging for login attempts, token issuance, and revocations
Scope validation on every protected endpoint
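The last item, scope validation, can be as small as a set comparison. The sketch below assumes the access token carries a space-delimited `scope` claim, as in `create_access_token` above; the function name is hypothetical.

```python
def has_required_scopes(token_claims: dict, required: set[str]) -> bool:
    """True only if every required scope was granted to the token."""
    granted = set(token_claims.get("scope", "").split())
    return required <= granted
```

An endpoint would call this after signature validation and answer with a 403 when it returns False.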

Common mistakes that lead to breaches:

Storing tokens in localStorage: Use HTTP-only cookies instead
Long-lived access tokens: Keep them to about 15 minutes; use refresh tokens for longevity
Missing audience validation: Always verify the aud claim matches your API
Implicit flow for SPAs: Use authorization code with PKCE instead
Trusting client-provided redirect URIs: Validate against an exact-match allowlist of pre-registered URIs
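For the last pitfall, redirect URI validation should be an exact string comparison against what the client registered, not a prefix or pattern match. The registry contents and names below are hypothetical placeholders.

```python
# Hypothetical registry; in practice this comes from the client registration store
REGISTERED_REDIRECT_URIS = {
    "web-app": {"https://app.example.com/callback"},
}


def is_allowed_redirect(client_id: str, redirect_uri: str) -> bool:
    """Exact match only: no prefix, wildcard, or substring comparison."""
    return redirect_uri in REGISTERED_REDIRECT_URIS.get(client_id, set())
```

Exact matching rejects look-alike hosts and path tricks that a prefix check would let through.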

Choose OIDC for new implementations: it is simpler than SAML, and because it is built on OAuth 2.0 it covers authentication and authorization in one protocol stack. Reserve SAML for enterprise integrations that require it. Use OAuth 2.0 client credentials flow only for service-to-service communication where no user is involved.

Authentication is foundational infrastructure. Get it wrong, and nothing else matters. Get it right, and you’ve built a secure foundation for everything that follows.