Open Redirect: URL Validation Strategies
Key Insights
- Open redirects are deceptively dangerous—they transform your trusted domain into a phishing launchpad and can facilitate OAuth token theft, making them far more severe than many developers realize.
- Allowlist-based validation is the only truly secure approach; regex patterns and blocklists inevitably fall to creative bypass techniques like unicode normalization, double encoding, and parser differentials.
- Defense requires understanding attacker methodology—testing your redirect validation against known bypass patterns is as important as implementing the validation itself.
Introduction to Open Redirect Vulnerabilities
Open redirects occur when an application accepts user-controlled input and uses it to redirect users to an external URL without proper validation. OWASP covers them under "Unvalidated Redirects and Forwards" (CWE-601 in MITRE's catalog), and they appear regularly in bug bounty reports, yet many developers dismiss them as low-severity issues.
This dismissal is a mistake. Open redirects enable sophisticated attacks:
Phishing amplification: An attacker sends https://yourbank.com/redirect?url=https://evil-bank.com. Victims see your legitimate domain, trust the link, and land on a credential-harvesting page.
OAuth token theft: During OAuth flows, attackers manipulate the redirect URI to capture authorization codes or tokens. Even with redirect URI validation on the OAuth provider side, open redirects in your application can chain into token theft.
Malware distribution: Security-conscious users check URLs before clicking. Because the link points at your trusted domain, it passes that check, and your domain becomes a vector for malware delivery.
The vulnerability is trivial to exploit and difficult to detect in production. Let’s fix that.
How Open Redirects Occur
Open redirects typically appear in three scenarios: post-login redirects, OAuth callbacks, and marketing/tracking systems. Here are two vulnerable patterns I see constantly:
# Flask - VULNERABLE
from flask import Flask, request, redirect, render_template

app = Flask(__name__)

@app.route('/login', methods=['POST'])
def login():
    # Authentication logic here... (authenticate() is defined elsewhere)
    user = authenticate(request.form['username'], request.form['password'])
    if user:
        # Dangerous: blindly trusting the 'next' parameter
        next_url = request.args.get('next', '/')
        return redirect(next_url)
    return render_template('login.html', error='Invalid credentials')
// Express.js - VULNERABLE
// The handler must be async because it awaits the token exchange
app.get('/oauth/callback', async (req, res) => {
  const { code, state } = req.query;
  // Exchange code for token...
  const token = await exchangeCodeForToken(code);
  // Dangerous: redirect destination from query parameter
  const returnTo = req.query.return_to || '/dashboard';
  res.redirect(returnTo);
});
An attacker crafts https://yourapp.com/login?next=https://evil.com/fake-login and sends it to victims. After legitimate authentication, users land on the attacker’s page—often a clone of your login page claiming the session expired.
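To make the mismatch concrete, here is a small sketch that pulls apart the crafted link from the paragraph above using only the standard library. The variable names are mine; the point is simply that the host the victim reads and the host they end up on are different.

```python
from urllib.parse import parse_qs, urlsplit

# The crafted link from the example above
crafted = "https://yourapp.com/login?next=https://evil.com/fake-login"

link = urlsplit(crafted)
# The value a vulnerable handler would pass straight to redirect()
next_url = parse_qs(link.query)["next"][0]
target = urlsplit(next_url)

print("Host shown in the link:", link.hostname)
print("Host after the redirect:", target.hostname)
```

Nothing in the outer URL looks suspicious on its own, which is why the check has to happen on the value of the parameter, server-side, before the redirect is issued.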
Allowlist-Based Validation
The most secure approach is explicit allowlisting. You define exactly which destinations are permitted, and everything else is rejected.
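from urllib.parse import urlparse
from typing import Set

class RedirectValidator:
    def __init__(self, allowed_domains: Set[str], allow_subdomains: bool = False):
        self.allowed_domains = {d.lower() for d in allowed_domains}
        self.allow_subdomains = allow_subdomains

    def is_allowed(self, url: str) -> bool:
        # Browsers treat backslashes as slashes and strip tabs/newlines,
        # but urlparse does not, so reject both before parsing
        if not url or '\\' in url or any(ord(c) < 0x20 for c in url):
            return False
        try:
            parsed = urlparse(url)
        except ValueError:
            return False
        # Reject non-http(s) schemes
        if parsed.scheme and parsed.scheme.lower() not in ('http', 'https'):
            return False
        # Allow relative URLs (no scheme, no netloc) with a single leading slash
        if not parsed.scheme and not parsed.netloc:
            return url.startswith('/') and not url.startswith('//')
        # Reject embedded credentials such as https://myapp.com:x@evil.com
        if parsed.username is not None or parsed.password is not None:
            return False
        # hostname drops the port and userinfo and lowercases the host
        host = parsed.hostname
        if not host:
            return False
        # Exact domain match
        if host in self.allowed_domains:
            return True
        # Subdomain match (if enabled)
        if self.allow_subdomains:
            for domain in self.allowed_domains:
                if host.endswith('.' + domain):
                    return True
        return False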
# Usage
validator = RedirectValidator(
    allowed_domains={'myapp.com', 'auth.myapp.com'},
    allow_subdomains=True
)

@app.route('/login', methods=['POST'])
def login():
    user = authenticate(request.form['username'], request.form['password'])
    if user:
        next_url = request.args.get('next', '/')
        if not validator.is_allowed(next_url):
            next_url = '/'  # Default to safe location
        return redirect(next_url)
    return render_template('login.html', error='Invalid credentials')
Be cautious with subdomain allowlisting. If attackers can create subdomains (via subdomain takeover or user-generated content), they can bypass your validation.
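A minimal sketch of why the matching rule matters, comparing three ways of checking a host. The function names and the hostnames (evilmyapp.com, abandoned.myapp.com) are hypothetical and chosen only for illustration.

```python
def naive_match(host: str, domain: str) -> bool:
    # Buggy: a plain suffix check also matches unrelated registrations
    return host.endswith(domain)

def dotted_match(host: str, domain: str) -> bool:
    # Exact match, or a suffix match anchored on a label boundary
    return host == domain or host.endswith("." + domain)

def explicit_match(host: str, allowed_hosts: frozenset) -> bool:
    # Only hosts that were listed one by one are accepted
    return host in allowed_hosts

ALLOWED = frozenset({"myapp.com", "auth.myapp.com"})

for host in ("evilmyapp.com", "abandoned.myapp.com", "auth.myapp.com"):
    print(host,
          naive_match(host, "myapp.com"),
          dotted_match(host, "myapp.com"),
          explicit_match(host, ALLOWED))
```

The dotted check closes the evilmyapp.com hole but still trusts every subdomain, including one an attacker has taken over. Enumerating hosts explicitly is more maintenance, but it keeps the trust boundary where you can see it.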
URL Parsing and Validation Techniques
Attackers exploit inconsistencies between how your validator parses URLs and how browsers interpret them. Here are bypass techniques you must handle:
// Comprehensive URL validator handling common bypasses
class SafeRedirectValidator {
  constructor(allowedHosts) {
    this.allowedHosts = new Set(allowedHosts.map(h => h.toLowerCase()));
  }

  isValidRedirect(url) {
    if (!url || typeof url !== 'string') {
      return false;
    }
    // Normalize and trim
    url = url.trim();
    // Block javascript: and data: schemes (case-insensitive, tolerating interleaved whitespace)
    const schemePattern = /^[\s]*(?:j[\s]*a[\s]*v[\s]*a[\s]*s[\s]*c[\s]*r[\s]*i[\s]*p[\s]*t|d[\s]*a[\s]*t[\s]*a)[\s]*:/i;
    if (schemePattern.test(url)) {
      return false;
    }
    // Block protocol-relative URLs that could redirect externally
    // Handles //evil.com, \/\/evil.com, /\/evil.com, etc.
    const protocolRelativePattern = /^[\s]*[\/\\]{2}/;
    if (protocolRelativePattern.test(url)) {
      return false;
    }
    // Block URLs with credentials (user:pass@host)
    // These can be used for phishing: https://trusted.com@evil.com
    if (url.includes('@') && /^https?:\/\/[^\/]*@/.test(url)) {
      return false;
    }
    try {
      // Resolve against a placeholder base so relative paths parse too
      const base = 'https://placeholder.local';
      const parsed = new URL(url, base);
      // Only http(s) destinations are acceptable; this also rejects vbscript:, file:, etc.
      if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        return false;
      }
      if (parsed.origin === base) {
        // Relative input: insist on a rooted path so "https:evil.com" cannot slip through
        if (!url.startsWith('/')) {
          return false;
        }
      } else if (!this.allowedHosts.has(parsed.hostname)) {
        // The input named its own host, and that host is not on the allowlist
        return false;
      }
      // Ensure path doesn't contain dangerous sequences after normalization
      const normalizedPath = decodeURIComponent(parsed.pathname);
      if (normalizedPath.includes('//') || normalizedPath.includes('\\')) {
        return false;
      }
      return true;
    } catch (e) {
      // URL parsing or decoding failed - reject
      return false;
    }
  }
}
// Usage
const validator = new SafeRedirectValidator(['myapp.com', 'www.myapp.com']);
// These should all return false:
console.log(validator.isValidRedirect('javascript:alert(1)')); // false
console.log(validator.isValidRedirect('//evil.com')); // false
console.log(validator.isValidRedirect('\\/\\/evil.com')); // false
console.log(validator.isValidRedirect('https://trusted.com@evil.com')); // false
console.log(validator.isValidRedirect('https://evil.com/path')); // false
// These should return true:
console.log(validator.isValidRedirect('/dashboard')); // true
console.log(validator.isValidRedirect('https://myapp.com/settings')); // true
Key bypass patterns to block:
- Protocol confusion: javascript:, data:, vbscript:
- Backslash substitution: \/\/evil.com (browsers normalize to //evil.com)
- Credential-based confusion: https://yoursite.com@evil.com
- Unicode normalization: Homoglyph attacks using similar-looking characters
- Double encoding: %252f decodes to %2f, then to /
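Three of these patterns are easy to reproduce with the standard library. The sketch below is illustrative only: fully_decode is a hypothetical helper, and the NFKC step covers compatibility characters such as fullwidth letters, not true cross-script homoglyphs (a Cyrillic "а" stays Cyrillic).

```python
import unicodedata
from urllib.parse import unquote, urlsplit

def fully_decode(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            return decoded
        value = decoded
    raise ValueError("too many layers of percent-encoding")

# Double encoding: a single decoding pass leaves an encoded slash behind
once = unquote("%252f")        # "%2f"
twice = fully_decode("%252f")  # "/"

# Unicode normalization: fullwidth "e" and "." fold to ASCII under NFKC
lookalike = "\uff45vil\uff0ecom"
folded = unicodedata.normalize("NFKC", lookalike)  # "evil.com"

# Backslash substitution: urlsplit sees a plain path with no host,
# while browsers rewrite the backslash and read //evil.com
parts = urlsplit("/\\evil.com")

print(repr(once), repr(twice))
print(repr(folded))
print(repr(parts.netloc), repr(parts.path))
```

A validator that inspects the value after one decode, or before normalization, is checking a different string from the one the browser will act on. That gap is what each of these tricks exploits.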
Relative URL Strategies
The simplest secure approach: only allow relative paths. This eliminates external redirects entirely.
import posixpath
from urllib.parse import urlparse

def validate_relative_redirect(url: str, base_path: str = '/') -> str:
    """
    Validates and normalizes a relative redirect URL.
    Returns the safe URL or raises ValueError.
    """
    if not url:
        return base_path
    url = url.strip()
    # Must start with / and not //
    if not url.startswith('/') or url.startswith('//'):
        raise ValueError('URL must be a relative path starting with /')
    # Browsers treat backslashes as slashes (/\evil.com becomes //evil.com)
    # and strip tabs and newlines, so reject both
    if '\\' in url or any(ord(c) < 0x20 for c in url):
        raise ValueError('Backslashes and control characters not allowed')
    # Parse and check for external redirect attempts
    parsed = urlparse(url)
    if parsed.scheme or parsed.netloc:
        raise ValueError('External hosts not allowed')
    # Normalize the path to prevent traversal
    # posixpath.normpath resolves . and .. segments and collapses repeated slashes
    # (PurePosixPath would leave .. segments in place)
    normalized = posixpath.normpath(parsed.path)
    # Reconstruct with query string if present
    if parsed.query:
        return f"{normalized}?{parsed.query}"
    return normalized
# Usage in Flask
@app.route('/auth/callback')
def auth_callback():
    try:
        return_to = validate_relative_redirect(
            request.args.get('return_to', '/dashboard')
        )
    except ValueError:
        return_to = '/dashboard'
    return redirect(return_to)
This approach is restrictive but eliminates entire classes of vulnerabilities. Use it when you don’t need external redirects.
Signed/Tokenized Redirects
When you need to allow arbitrary redirects (marketing campaigns, email links), cryptographically sign the URLs to prevent tampering:
import hmac
import hashlib
import base64
import time
from urllib.parse import urlencode

class SignedRedirectManager:
    def __init__(self, secret_key: str, max_age_seconds: int = 3600):
        self.secret_key = secret_key.encode()
        self.max_age = max_age_seconds

    def create_signed_url(self, redirect_path: str, destination: str) -> str:
        """Generate a signed redirect URL."""
        timestamp = str(int(time.time()))
        # Create signature over destination + timestamp
        message = f"{destination}|{timestamp}".encode()
        signature = hmac.new(self.secret_key, message, hashlib.sha256).digest()
        sig_b64 = base64.urlsafe_b64encode(signature).decode().rstrip('=')
        params = urlencode({
            'url': destination,
            'ts': timestamp,
            'sig': sig_b64
        })
        return f"{redirect_path}?{params}"

    def verify_and_get_url(self, url: str, timestamp: str, signature: str) -> str:
        """Verify signature and return destination if valid."""
        # Missing query parameters arrive as None
        if not url or not timestamp or not signature:
            raise ValueError('Missing redirect parameters')
        # Check timestamp (parse first, so an expired link is not
        # reported as an invalid timestamp)
        try:
            ts = int(timestamp)
        except (ValueError, TypeError):
            raise ValueError('Invalid timestamp')
        if time.time() - ts > self.max_age:
            raise ValueError('Redirect link expired')
        # Verify signature
        message = f"{url}|{timestamp}".encode()
        expected_sig = hmac.new(self.secret_key, message, hashlib.sha256).digest()
        expected_b64 = base64.urlsafe_b64encode(expected_sig).decode().rstrip('=')
        # Compare as bytes: compare_digest raises TypeError on non-ASCII str input
        if not hmac.compare_digest(signature.encode(), expected_b64.encode()):
            raise ValueError('Invalid signature')
        return url
# Usage
from flask import abort  # in addition to the Flask imports shown earlier

manager = SignedRedirectManager(secret_key='your-256-bit-secret-key-here')

# Generate link for email campaign
signed_url = manager.create_signed_url(
    '/r',
    'https://partner-site.com/promo?ref=myapp'
)
# Result: /r?url=https%3A%2F%2Fpartner-site.com%2Fpromo...&ts=1699...&sig=abc...

# Verify on redirect endpoint
@app.route('/r')
def tracked_redirect():
    try:
        destination = manager.verify_and_get_url(
            request.args.get('url'),
            request.args.get('ts'),
            request.args.get('sig')
        )
        return redirect(destination)
    except ValueError as e:
        abort(400, str(e))
This approach lets you redirect anywhere while ensuring the URL was generated by your application, not crafted by an attacker.
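To see the tamper resistance in isolation, here is a stripped-down sketch of the same idea with the class and expiry logic removed. The key, the timestamp, and the helper names sign and is_authentic are placeholders, and it uses a hex digest instead of base64 purely for brevity.

```python
import hashlib
import hmac

SECRET = b"demo-only-secret"  # placeholder key, not for real use

def sign(destination: str, timestamp: str) -> str:
    # Same message layout as above: destination and timestamp joined by "|"
    message = f"{destination}|{timestamp}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

def is_authentic(destination: str, timestamp: str, signature: str) -> bool:
    # Constant-time comparison against a freshly computed signature
    expected = sign(destination, timestamp)
    return hmac.compare_digest(expected.encode(), signature.encode())

ts = "1700000000"
good = "https://partner-site.com/promo?ref=myapp"
sig = sign(good, ts)

untouched = is_authentic(good, ts, sig)
swapped = is_authentic("https://evil.com/", ts, sig)
restamped = is_authentic(good, "1800000000", sig)

print(untouched, swapped, restamped)
```

Changing either the destination or the timestamp invalidates the signature, so an attacker cannot swap the target or extend a link's lifetime without knowing the key.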
Testing and Bypass Prevention
Your validation is only as good as your test coverage. Here’s a test suite covering known bypass patterns:
import pytest

class TestRedirectValidation:
    @pytest.fixture
    def validator(self):
        # Assumes a Python port of SafeRedirectValidator that exposes is_valid_redirect()
        return SafeRedirectValidator(['myapp.com', 'cdn.myapp.com'])

    # Legitimate URLs that should pass
    @pytest.mark.parametrize('url', [
        '/dashboard',
        '/users/123/profile',
        '/search?q=test',
        'https://myapp.com/settings',
        'https://cdn.myapp.com/assets/logo.png',
    ])
    def test_valid_urls_accepted(self, validator, url):
        assert validator.is_valid_redirect(url) is True

    # External redirects
    @pytest.mark.parametrize('url', [
        'https://evil.com',
        'https://evil.com/path',
        'http://attacker.org',
    ])
    def test_external_urls_blocked(self, validator, url):
        assert validator.is_valid_redirect(url) is False

    # Protocol-relative bypasses
    @pytest.mark.parametrize('url', [
        '//evil.com',
        '///evil.com',
        '\\/\\/evil.com',
        '/\\/evil.com',
        '\\\\evil.com',
    ])
    def test_protocol_relative_blocked(self, validator, url):
        assert validator.is_valid_redirect(url) is False

    # Dangerous schemes
    @pytest.mark.parametrize('url', [
        'javascript:alert(1)',
        'JAVASCRIPT:alert(1)',
        'data:text/html,<script>alert(1)</script>',
        'vbscript:msgbox(1)',
        'java\tscript:alert(1)',  # Tab character bypass
    ])
    def test_dangerous_schemes_blocked(self, validator, url):
        assert validator.is_valid_redirect(url) is False

    # Credential-based confusion
    @pytest.mark.parametrize('url', [
        'https://myapp.com@evil.com',
        'https://user:pass@evil.com',
        'https://myapp.com%40evil.com',  # Encoded @
    ])
    def test_credential_confusion_blocked(self, validator, url):
        assert validator.is_valid_redirect(url) is False

    # Domain confusion
    @pytest.mark.parametrize('url', [
        'https://myapp.com.evil.com',
        'https://notmyapp.com',
        'https://myapp.com.evil.com/path',
    ])
    def test_domain_confusion_blocked(self, validator, url):
        assert validator.is_valid_redirect(url) is False
Run these tests against every change to your redirect handling code. Add new test cases when you encounter novel bypass techniques in security advisories or bug bounty reports.
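The suite above instantiates SafeRedirectValidator from Python and calls is_valid_redirect, while the class shown earlier is JavaScript. One way to give the tests something to run against is a small Python counterpart. What follows is my own sketch under that assumption, not a line-for-line port: it rejects every protocol-relative input, every backslash, and any whitespace or control character, which may be stricter than you need.

```python
from urllib.parse import urlsplit

class SafeRedirectValidator:
    """Hypothetical Python counterpart exposing is_valid_redirect()."""

    def __init__(self, allowed_hosts):
        self.allowed_hosts = {h.lower() for h in allowed_hosts}

    def is_valid_redirect(self, url) -> bool:
        if not isinstance(url, str) or not url:
            return False
        # Browsers treat "\" like "/" and drop tabs and newlines, so any
        # backslash, whitespace, or control character is rejected outright
        if "\\" in url or any(ord(c) <= 0x20 or ord(c) == 0x7F for c in url):
            return False
        try:
            parts = urlsplit(url)
        except ValueError:
            return False
        if not parts.scheme and not parts.netloc:
            # Relative reference: require exactly one leading slash
            return url.startswith("/") and not url.startswith("//")
        # Absolute or protocol-relative input: demand an explicit http(s) scheme
        if parts.scheme.lower() not in ("http", "https"):
            return False
        # Reject user:pass@host forms such as https://myapp.com@evil.com
        if parts.username is not None or parts.password is not None:
            return False
        # hostname is lowercased and excludes the port; None when no host is present
        return parts.hostname in self.allowed_hosts

validator = SafeRedirectValidator(["myapp.com", "cdn.myapp.com"])
for candidate in ("/dashboard", "https://myapp.com/settings",
                  "//evil.com", "https://myapp.com@evil.com"):
    print(candidate, validator.is_valid_redirect(candidate))
```

Treat the test suite as the specification: if you loosen any of these rules, the parametrized cases will tell you which bypass you just reopened.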
Open redirects are preventable. Choose allowlists over blocklists, validate with battle-tested URL parsers, and test against known bypass patterns. Your users are trusting your domain—don’t let attackers abuse that trust.