NumPy - Random Float (np.random.rand, random_sample)

Key Insights

NumPy provides multiple methods for generating random floats; np.random.rand() and np.random.random_sample() draw from the same uniform distribution over [0.0, 1.0) and differ only in how the shape is passed
The legacy numpy.random functions remain widely used, but numpy.random.Generator offers better statistical properties and performance and is the recommended approach for new code
Scaling and shifting random floats yields custom ranges, dedicated methods cover other distributions such as the normal, and seeding makes results reproducible

Understanding NumPy’s Random Float Generation

NumPy offers several approaches to generate random floating-point numbers. The most common methods—np.random.rand() and np.random.random_sample()—both produce uniformly distributed floats in the half-open interval [0.0, 1.0), meaning values can be 0.0 but never quite reach 1.0.

import numpy as np

# Both draw from the same uniform distribution; the values themselves differ
rand_values = np.random.rand(5)
sample_values = np.random.random_sample(5)

print(f"rand(): {rand_values}")
print(f"random_sample(): {sample_values}")
# Output example:
# rand(): [0.5488135  0.71518937 0.60276338 0.54488318 0.4236548 ]
# random_sample(): [0.64589411 0.43758721 0.891773   0.96366276 0.38344152]
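
As a quick sanity check of the half-open interval, the sketch below draws a large sample and confirms that every value satisfies 0.0 <= x < 1.0. The seed and the sample size are arbitrary choices made for illustration only.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only so the check is repeatable

samples = np.random.random_sample(1_000_000)

# Every value lies in [0.0, 1.0): the lower bound is allowed, the upper is not
all_in_range = bool(np.all((samples >= 0.0) & (samples < 1.0)))

print(f"All in [0.0, 1.0): {all_in_range}")
print(f"dtype: {samples.dtype}")
```

A check like this cannot prove that 1.0 is impossible, but it does show the documented bounds holding over a million draws.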

Legacy vs Modern Random Number Generation

NumPy’s random module has evolved significantly. The legacy numpy.random functions use a global RandomState instance, while the modern approach uses Generator objects with improved algorithms.

# Legacy approach (still works, widely used)
legacy_random = np.random.rand(3, 3)

# Modern approach (recommended)
rng = np.random.default_rng(seed=42)
modern_random = rng.random((3, 3))

print("Legacy:\n", legacy_random)
print("\nModern:\n", modern_random)

The modern Generator approach offers:

Better statistical properties with PCG64 algorithm
Independent random streams
Straightforward use across threads or processes by giving each worker its own Generator
Improved performance for large arrays
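
The first two points in the list above can be illustrated with a short sketch. It inspects the bit generator behind default_rng() (PCG64 in current NumPy releases, as far as this author knows) and builds three independent streams from one parent SeedSequence. The seed value and the number of streams are assumptions chosen for the example.

```python
import numpy as np

# default_rng() wraps a bit generator; inspect which one is in use
rng = np.random.default_rng(42)
bit_generator_name = type(rng.bit_generator).__name__
print(f"Bit generator: {bit_generator_name}")

# Independent streams: spawn child seeds from one parent SeedSequence,
# then build one Generator per child (for example, one per worker)
parent = np.random.SeedSequence(42)
streams = [np.random.default_rng(child) for child in parent.spawn(3)]

draws = [stream.random(4) for stream in streams]
for i, values in enumerate(draws):
    print(f"stream {i}: {values}")
```

Spawning from a single parent seed keeps the whole run reproducible while avoiding hand-picked seeds for each stream.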

Generating Multi-Dimensional Arrays

Both methods support creating arrays of any shape. The syntax differs slightly between rand() and random_sample().

# np.random.rand() takes dimensions as separate arguments
matrix_2d = np.random.rand(3, 4)
tensor_3d = np.random.rand(2, 3, 4)

# np.random.random_sample() takes a tuple
matrix_2d_alt = np.random.random_sample((3, 4))
tensor_3d_alt = np.random.random_sample((2, 3, 4))

print(f"2D shape: {matrix_2d.shape}")
print(f"3D shape: {tensor_3d.shape}")
# Output:
# 2D shape: (3, 4)
# 3D shape: (2, 3, 4)

# Modern Generator approach
rng = np.random.default_rng()
modern_matrix = rng.random((3, 4))
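
Mixing up the two calling conventions is an easy mistake to make. The sketch below shows what happens when a tuple is passed to rand(), and how unpacking lets one shape variable serve all three calls. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# rand() wants separate integers; passing a tuple raises TypeError
try:
    np.random.rand((3, 4))
    rand_accepts_tuple = True
except TypeError:
    rand_accepts_tuple = False

# random_sample() and Generator.random() take a single size argument
shape = (3, 4)
a = np.random.random_sample(shape)
b = rng.random(shape)

# Unpacking the tuple lets rand() reuse the same shape variable
c = np.random.rand(*shape)

print(rand_accepts_tuple, a.shape, b.shape, c.shape)
```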

Scaling Random Floats to Custom Ranges

The default [0.0, 1.0) range is rarely what you need in practice. Scale and shift these values to any range using basic arithmetic.

# Generate floats in range [min, max)
def random_range(min_val, max_val, size):
    return np.random.rand(*size) * (max_val - min_val) + min_val

# Examples
temps_celsius = random_range(-10, 35, (10,))
prices = random_range(9.99, 99.99, (5,))
percentages = random_range(0, 100, (3, 3))

print(f"Temperatures: {temps_celsius}")
print(f"Prices: {prices}")
print(f"\nPercentages:\n{percentages}")

# Using modern Generator with uniform method (cleaner)
rng = np.random.default_rng()
temps_modern = rng.uniform(-10, 35, size=10)
print(f"\nModern uniform: {temps_modern}")
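
The same scale-and-shift arithmetic extends to a different range per column through broadcasting. The following is a minimal sketch; the bounds are hypothetical values (temperature, humidity, pressure) invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-column bounds: temperature, humidity, pressure
low = np.array([-10.0, 0.0, 950.0])
high = np.array([35.0, 100.0, 1050.0])

# Manual scale-and-shift: unit floats of shape (5, 3) broadcast against (3,)
manual = low + (high - low) * rng.random((5, 3))

# uniform() broadcasts array-valued low/high in the same way
direct = rng.uniform(low, high, size=(5, 3))

print(manual.round(2))
print(direct.round(2))
```

Each column stays inside its own bounds, so one call can fill a whole table of differently scaled features.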

Reproducible Random Numbers with Seeds

Seeding makes results repeatable, which is critical for debugging, testing, and scientific work.

# Legacy seeding
np.random.seed(42)
result1 = np.random.rand(5)

np.random.seed(42)
result2 = np.random.rand(5)

print(f"First run:  {result1}")
print(f"Second run: {result2}")
print(f"Identical: {np.array_equal(result1, result2)}")
# Output: Identical: True

# Modern seeding (preferred)
rng1 = np.random.default_rng(seed=42)
rng2 = np.random.default_rng(seed=42)

modern1 = rng1.random(5)
modern2 = rng2.random(5)

print(f"\nModern identical: {np.array_equal(modern1, modern2)}")
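
Reproducibility does not always require restarting from the seed. A Generator's position in its stream can be captured and restored through the bit generator's state attribute, as this small sketch shows. The seed and the number of draws are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
rng.random(3)  # advance the stream a little

# Snapshot the bit generator's state (a plain dict)
saved_state = rng.bit_generator.state

first = rng.random(5)

# Restore the snapshot and draw again: the same values come back
rng.bit_generator.state = saved_state
replay = rng.random(5)

print(f"Replay identical: {np.array_equal(first, replay)}")
```

This can help when resuming a long simulation from a checkpoint.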

Performance Comparison and Best Practices

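For large-scale random number generation, performance matters. The modern Generator is often faster than the legacy functions, but the gap depends on the NumPy version, the hardware, and the array size, so measure on your own setup.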

import time

# Benchmark legacy vs modern (perf_counter is a monotonic, high-resolution clock)
size = (10000, 1000)

# Legacy
start = time.perf_counter()
legacy_large = np.random.rand(*size)
legacy_time = time.perf_counter() - start

# Modern
rng = np.random.default_rng()
start = time.perf_counter()
modern_large = rng.random(size)
modern_time = time.perf_counter() - start

print(f"Legacy time: {legacy_time:.4f}s")
print(f"Modern time: {modern_time:.4f}s")
print(f"Speedup: {legacy_time/modern_time:.2f}x")
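
A single timed run is noisy. One possible refinement, sketched here with the standard library's timeit module, is to repeat each measurement and keep the best time. The array size and repeat counts are arbitrary, and the resulting ratio will vary by machine.

```python
import timeit

import numpy as np

size = (1000, 1000)
rng = np.random.default_rng()

# Take the best of several repeats to reduce noise from other processes
legacy_best = min(timeit.repeat(lambda: np.random.rand(*size), number=5, repeat=3))
modern_best = min(timeit.repeat(lambda: rng.random(size), number=5, repeat=3))

print(f"Legacy best: {legacy_best:.4f}s for 5 calls")
print(f"Modern best: {modern_best:.4f}s for 5 calls")
print(f"Ratio (legacy / modern): {legacy_best / modern_best:.2f}")
```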

Common Distributions Beyond Uniform

While rand() and random_sample() generate uniform distributions, NumPy provides many other distributions.

rng = np.random.default_rng(seed=42)

# Normal (Gaussian) distribution
normal_data = rng.normal(loc=0, scale=1, size=1000)

# Exponential distribution
exponential_data = rng.exponential(scale=2.0, size=1000)

# Beta distribution
beta_data = rng.beta(a=2, b=5, size=1000)

# Log-normal distribution
lognormal_data = rng.lognormal(mean=0, sigma=1, size=1000)

print(f"Normal mean: {normal_data.mean():.3f}, std: {normal_data.std():.3f}")
print(f"Exponential mean: {exponential_data.mean():.3f}")
print(f"Beta mean: {beta_data.mean():.3f}")
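
The scale-and-shift idea used for uniform ranges carries over to the normal distribution: multiply standard-normal draws by the desired standard deviation and add the desired mean. This is a minimal sketch; the loc and scale values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

loc, scale = 10.0, 2.0  # illustrative mean and standard deviation

# Shift and scale standard-normal draws, mirroring the uniform case
shifted = loc + scale * rng.standard_normal(100_000)

print(f"mean: {shifted.mean():.3f}, std: {shifted.std():.3f}")
```

The sample mean and standard deviation should land close to loc and scale, which is the same result rng.normal(loc, scale, size) is designed to give.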

Practical Application: Monte Carlo Simulation

Random floats power Monte Carlo simulations. Here’s a simple example estimating π.

def estimate_pi(n_samples):
    rng = np.random.default_rng(seed=42)
    
    # Generate random points in unit square
    x = rng.random(n_samples)
    y = rng.random(n_samples)
    
    # Check if points fall inside the quarter of the unit circle lying in this square
    inside_circle = (x**2 + y**2) <= 1
    
    # π ≈ 4 * (points inside circle / total points)
    pi_estimate = 4 * np.sum(inside_circle) / n_samples
    
    return pi_estimate

# Test with increasing sample sizes
for n in [1000, 10000, 100000, 1000000]:
    estimate = estimate_pi(n)
    error = abs(estimate - np.pi)
    print(f"n={n:7d}: π ≈ {estimate:.6f}, error = {error:.6f}")

# Output example:
# n=   1000: π ≈ 3.144000, error = 0.002407
# n=  10000: π ≈ 3.150800, error = 0.009207
# n= 100000: π ≈ 3.142920, error = 0.001327
# n=1000000: π ≈ 3.141273, error = 0.000320
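
An estimate is more useful with an error bar. The sketch below is one way to attach a standard error to the same estimator, treating each point as a Bernoulli trial. The function name estimate_pi_with_error is hypothetical, and the Generator is passed in by the caller so that successive calls use fresh samples.

```python
import numpy as np

def estimate_pi_with_error(n_samples, rng):
    """Return (estimate, standard_error) for the quarter-circle method."""
    x = rng.random(n_samples)
    y = rng.random(n_samples)
    # Fraction of points inside the quarter circle
    p_hat = np.mean(x**2 + y**2 <= 1.0)
    estimate = 4.0 * p_hat
    # Binomial standard error of p_hat, scaled by the same factor of 4
    std_error = 4.0 * np.sqrt(p_hat * (1.0 - p_hat) / n_samples)
    return estimate, std_error

rng = np.random.default_rng(42)
for n in [1_000, 100_000]:
    est, se = estimate_pi_with_error(n, rng)
    print(f"n={n:7d}: π ≈ {est:.4f} ± {se:.4f}")
```

The standard error shrinks in proportion to 1/sqrt(n), which is why each extra digit of accuracy costs roughly 100 times more samples.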

Avoiding Common Pitfalls

Several mistakes frequently trip up developers working with random floats.

# WRONG: Reusing the same seed globally
np.random.seed(42)
data1 = np.random.rand(100)
np.random.seed(42)  # Don't reset the seed like this
data2 = np.random.rand(100)
# data1 and data2 are identical - probably not intended

# RIGHT: Use separate Generator instances
rng1 = np.random.default_rng(seed=42)
rng2 = np.random.default_rng(seed=43)
data1 = rng1.random(100)
data2 = rng2.random(100)

# WRONG: Inefficient generation in loops
results = []
for i in range(1000):
    results.append(np.random.rand())  # Slow
results = np.array(results)

# RIGHT: Vectorized generation
results = np.random.rand(1000)  # Much faster

# REMEMBER: The interval is half-open, so 1.0 is never returned
max_val = np.random.rand(1000000).max()
print(f"Max value: {max_val}")  # Never exactly 1.0
print(f"Equals 1.0: {max_val == 1.0}")  # Always False

Migration Path from Legacy to Modern

If you’re maintaining legacy code, migrate incrementally to the modern API.

# Legacy code
np.random.seed(42)
old_data = np.random.rand(5, 5)
old_normal = np.random.randn(5, 5)
old_range = np.random.uniform(0, 10, (5, 5))

# Modern equivalent (same distributions; values differ from the legacy stream even with the same seed)
rng = np.random.default_rng(seed=42)
new_data = rng.random((5, 5))
new_normal = rng.standard_normal((5, 5))
new_range = rng.uniform(0, 10, (5, 5))

# For drop-in replacement, create a global Generator
_rng = np.random.default_rng()

def rand(*args):
    return _rng.random(args if args else None)

# rand() now accepts the same arguments as legacy np.random.rand(), backed by a Generator
modern_result = rand(3, 3)
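
When migrating library code, a common pattern is to let each function accept an optional seed or Generator instead of relying on global state. The sketch below assumes a hypothetical helper named sample_unit_floats. It relies on np.random.default_rng() returning an existing Generator unchanged when one is passed in.

```python
import numpy as np

def sample_unit_floats(n, rng=None):
    """Hypothetical helper: return n uniform floats in [0.0, 1.0).

    rng may be None, an int seed, or an existing Generator.
    """
    # default_rng() passes an existing Generator through unchanged
    # and builds a new one from None or an int seed
    rng = np.random.default_rng(rng)
    return rng.random(n)

a = sample_unit_floats(3, rng=123)       # seeded by int
b = sample_unit_floats(3, rng=123)       # same seed, same values

shared = np.random.default_rng(123)
c = sample_unit_floats(3, rng=shared)    # caller-owned Generator
d = sample_unit_floats(3, rng=shared)    # continues the same stream

print(np.array_equal(a, b), np.array_equal(a, c), np.array_equal(c, d))
```

Callers who need reproducibility pass a seed or a shared Generator, while everyone else can omit the argument.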

The choice between np.random.rand() and np.random.random_sample() is purely stylistic: both draw from the same distribution and differ only in how the shape is passed. The choice between the legacy and modern APIs, however, affects code quality, performance, and maintainability. For new projects, prefer np.random.default_rng() and its Generator methods.