Random

Jan 11, 2026 Scala

Scala - Random Number Generation

Scala provides multiple approaches to random number generation through scala.util.Random, Java’s java.util.Random, and java.security.SecureRandom for cryptographically secure operations.

Dec 25, 2025 Engineering

Rendezvous Hashing: Highest Random Weight

Distributed systems face a fundamental challenge: how do you decide which node handles which piece of data? Naive approaches like hash(key) % n fall apart when nodes join or leave—suddenly almost…
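
As a rough illustration of the alternative named in the title, here is a minimal sketch of rendezvous (highest random weight) hashing. The node names, the key format, and the use of SHA-256 as the scoring function are assumptions made for this example, not details taken from the article.

```python
import hashlib


def score(node: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_node(nodes: list[str], key: str) -> str:
    """Return the node with the highest weight for this key."""
    return max(nodes, key=lambda node: score(node, key))


# Hypothetical cluster and keys, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
keys = [f"user:{i}" for i in range(1000)]

before = {k: pick_node(nodes, k) for k in keys}

# Remove one node and recompute the assignment.
remaining = [n for n in nodes if n != "node-c"]
after = {k: pick_node(remaining, k) for k in keys}

# Keys that changed owner after the removal.
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` previously lived on the removed node; keys owned by the surviving nodes keep their assignment, which is the property hash(key) % n lacks.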

Dec 25, 2025 Engineering

Reservoir Sampling: Random Selection from Stream

You’re processing a firehose of data—millions of log entries, a continuous social media feed, or network packets flying by at wire speed. You need a random sample of k items, but you can’t store…

Dec 25, 2025 Engineering

Reservoir Sampling: Random Selection from Streams

You’re processing a continuous stream of events—server logs, user clicks, sensor readings—and you need a random sample. The catch: you don’t know how many items will arrive, you can’t store…
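
One common way to handle a stream of unknown length is the classic single-pass reservoir algorithm (often called Algorithm R). The sketch below assumes that approach; the function name, the `events` generator, and the seed are hypothetical and only serve the example.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform random sample of up to k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def events(n: int) -> Iterator[str]:
    """Stand-in for a real event stream (hypothetical)."""
    for i in range(n):
        yield f"event-{i}"


sample = reservoir_sample(events(10_000), k=5, rng=random.Random(42))
```

Memory use stays at k items regardless of how long the stream runs, and a stream shorter than k simply returns everything it saw.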

Dec 21, 2025 Machine Learning

Random Forest: Complete Guide with Examples

Random forests leverage the ‘wisdom of crowds’ principle: aggregate predictions from many weak learners outperform any individual prediction. Instead of training one deep, complex decision tree that…

Oct 26, 2025 Python

PySpark - Sample DataFrame (Random Rows)

Sampling DataFrames is a fundamental operation in PySpark that you’ll use constantly—whether you’re testing transformations on a subset of production data, exploring unfamiliar datasets, or creating…

Oct 22, 2025 Machine Learning

PySpark - Random Forest Classifier with MLlib

PySpark’s MLlib provides a distributed implementation of Random Forest that scales across clusters. Start by initializing a SparkSession and importing the necessary components:

Oct 09, 2025 Engineering

Property-Based Testing: Generating Random Inputs

Traditional unit tests are essentially a list of examples. You pick inputs, compute expected outputs, and verify the function behaves correctly for those specific cases. This works, but it has a…

Sep 07, 2025 Python

NumPy - Random Seed for Reproducibility

Random number generation in NumPy produces pseudorandom numbers—sequences that appear random but are deterministic given an initial state. Without controlling this state, you’ll get different results…
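
To make the "deterministic given an initial state" point concrete, here is a small sketch. It assumes the Generator API (`np.random.default_rng`) and an arbitrary seed value of 12345; the excerpt itself does not say which seeding style the article uses.

```python
import numpy as np

# Two generators built from the same seed start in the same state...
rng_a = np.random.default_rng(12345)
rng_b = np.random.default_rng(12345)

first = rng_a.random(3)
second = rng_b.random(3)

# ...so they produce identical sequences.
same = np.array_equal(first, second)

# Without a seed, the state comes from OS entropy and differs between runs.
unseeded = np.random.default_rng().random(3)
```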

Sep 07, 2025 Python

NumPy - Random Shuffle and Permutation

NumPy provides two primary methods for randomizing array elements: shuffle() and permutation(). The fundamental difference lies in how they handle the original array.
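
A short sketch of that difference, assuming the Generator methods `rng.shuffle` and `rng.permutation` (the legacy `np.random.shuffle` and `np.random.permutation` treat the original array the same way). The seed and array contents are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
original = np.arange(5)

# permutation() returns a shuffled copy and leaves the input untouched.
shuffled_copy = rng.permutation(original)
unchanged = np.array_equal(original, np.arange(5))

# shuffle() reorders the array in place and returns None.
result = rng.shuffle(original)
```

Reach for `permutation()` when the original order still matters later, and `shuffle()` when avoiding the extra copy is more important.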

Sep 07, 2025 Python

NumPy - Random Uniform Distribution

A uniform distribution represents the simplest probability distribution where every value within a defined interval [a, b] has equal likelihood of occurring. The probability density function (PDF) is…

Sep 06, 2025 Python

NumPy - Random Choice from Array (np.random.choice)

import numpy as np

Sep 06, 2025 Python

NumPy - Random Exponential Distribution

The exponential distribution describes the time between events in a process where events occur continuously and independently at a constant average rate. In NumPy, you generate exponentially…

Sep 06, 2025 Python

NumPy - Random Float (np.random.rand, random_sample)

NumPy offers several approaches to generate random floating-point numbers. The most common methods—np.random.rand() and np.random.random_sample()—both produce uniformly distributed floats in the…

Sep 06, 2025 Python

NumPy - Random Generator (np.random.default_rng)

NumPy introduced default_rng() in version 1.17 as part of a complete overhaul of its random number generation infrastructure. The legacy RandomState and module-level functions…

Sep 06, 2025 Python

NumPy - Random Integer (np.random.randint)

The np.random.randint() function generates random integers within a specified range. The basic signature takes a low bound (inclusive), high bound (exclusive), and optional size parameter.
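
The signature described above can be sketched as follows. The seed, the bounds, and the variable names are illustrative choices, not values from the article.

```python
import numpy as np

np.random.seed(0)  # legacy global seed, used here only to keep the sketch repeatable

# low is inclusive, high is exclusive: values fall in 1..6, like a die.
rolls = np.random.randint(1, 7, size=10)

# With a single bound, the range is [0, bound).
single = np.random.randint(5)

# size may be a tuple to request a multi-dimensional array.
grid = np.random.randint(0, 100, size=(2, 3))
```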

Sep 06, 2025 Python

NumPy - Random Module (np.random) Complete Guide

NumPy’s random module provides two APIs: the legacy np.random functions and the modern Generator-based approach with np.random.default_rng(), which offers better statistical properties and…

Sep 06, 2025 Python

NumPy - Random Normal Distribution (np.random.randn/normal)

The np.random.randn() function generates samples from the standard normal distribution (Gaussian distribution with mean 0 and standard deviation 1). The function accepts dimensions as separate…

Sep 06, 2025 Python

NumPy - Random Poisson Distribution

The Poisson distribution describes the probability of a given number of events occurring in a fixed interval when these events happen independently at a constant average rate. The distribution is…

Sep 05, 2025 Python

NumPy - Random Binomial Distribution

The binomial distribution answers a fundamental question: ‘If I perform n independent trials, each with probability p of success, how many successes will I get?’ This applies directly to real-world…

Aug 29, 2025 Python

NumPy - Generate Random Boolean Array

The simplest approach to generate random boolean arrays uses numpy.random.choice() with boolean values. This method explicitly selects from True and False values:
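
A minimal sketch of that approach follows. The array sizes, the seed, and the 90/10 weighting in the second call are assumptions added for illustration.

```python
import numpy as np

np.random.seed(1)  # only to keep the sketch repeatable

# Pick each element from True/False with equal probability.
flags = np.random.choice([True, False], size=8)

# The optional p argument skews the draw, here toward True.
biased = np.random.choice([True, False], size=1000, p=[0.9, 0.1])
```

The result is a regular boolean array, so it can be used directly as a mask.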

Aug 28, 2025 Python

NumPy - Create Random Array (np.random)

NumPy offers two approaches for random number generation. The legacy np.random module functions remain widely used but are considered superseded by the Generator-based API introduced in NumPy 1.17.
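
Side by side, the two styles look roughly like this. The seed and shapes are arbitrary, and the two APIs are not expected to produce the same numbers from the same seed.

```python
import numpy as np

# Legacy style: module-level functions sharing one hidden global state.
np.random.seed(7)
legacy = np.random.rand(2, 3)

# Generator style: an explicit object that carries its own state.
rng = np.random.default_rng(7)
modern = rng.random((2, 3))
ints = rng.integers(0, 10, size=4)  # high is exclusive by default
```

Passing `rng` around explicitly keeps unrelated parts of a program from disturbing each other's random streams, which is one practical reason to prefer the newer API.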

Jul 04, 2025 Machine Learning

How to Use Random Forest for Feature Selection in R

Feature selection is critical for building interpretable, efficient machine learning models. Too many features lead to overfitting, increased computational costs, and models that are difficult to…

Jun 09, 2025 Python

How to Set Random Seed in NumPy

Random number generation sits at the heart of modern data science and machine learning. From shuffling datasets and initializing neural network weights to running Monte Carlo simulations, we rely on…

Jun 07, 2025 Pandas

How to Sample Random Rows in Pandas

Random sampling is fundamental to practical data work. You need it for exploratory data analysis when you can’t eyeball a million rows. You need it for creating train/test splits in machine learning…

May 27, 2025 Machine Learning

How to Perform Random Search in Python

Hyperparameter tuning is the process of finding optimal configuration values that govern your model’s learning process. Unlike model parameters learned during training, hyperparameters must be set…

May 10, 2025 Machine Learning

How to Implement Random Forest in Python

Random Forest is an ensemble learning algorithm that builds multiple decision trees and combines their predictions through voting (classification) or averaging (regression). Each tree is trained on a…

May 10, 2025 Machine Learning

How to Implement Random Forest in R

Random Forest is an ensemble learning method that constructs multiple decision trees during training and outputs the mode of classes (classification) or mean prediction (regression) of individual…

Apr 27, 2025 Statistics

How to Generate Random Numbers from a Poisson Distribution in Python

The Poisson distribution models the probability of a given number of events occurring in a fixed interval of time or space. The key assumption: these events occur independently at a constant average…

Apr 27, 2025 Python

How to Generate Random Numbers in NumPy

NumPy’s random module is the workhorse of random number generation in scientific Python. While Python’s built-in random module works fine for simple tasks, it falls short when you need to generate…

Apr 26, 2025 Statistics

How to Generate Random Numbers from a Normal Distribution in Python

The normal distribution (also called Gaussian distribution) is the backbone of statistical analysis. It’s that familiar bell-shaped curve where values cluster around a central mean, with probability…

Apr 19, 2025 Python

How to Create an Array of Random Numbers in NumPy

Random number generation is foundational to modern computing. Whether you’re running Monte Carlo simulations, initializing neural network weights, generating synthetic test data, or bootstrapping…

Mar 31, 2025 Statistics

How to Calculate Variance of a Random Variable

Variance quantifies how much a random variable’s values deviate from its expected value. While the mean tells you the center of a distribution, variance tells you how spread out the values are around…
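
As a worked example of that definition, the sketch below computes the mean and variance of a fair six-sided die exactly. The die is an assumption chosen for illustration; the excerpt does not name a specific distribution.

```python
from fractions import Fraction

# A fair six-sided die: each outcome has probability 1/6.
outcomes = range(1, 7)
p = Fraction(1, 6)

# Expected value: E[X] = sum of x * P(x).
mean = sum(x * p for x in outcomes)

# Variance: Var(X) = E[(X - E[X])^2].
variance = sum((x - mean) ** 2 * p for x in outcomes)

# Equivalent shortcut: Var(X) = E[X^2] - (E[X])^2.
shortcut = sum(x * x * p for x in outcomes) - mean**2
```

Here the mean is 7/2 and the variance is 35/12, and both formulas agree.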

Feb 19, 2025 Engineering

Fisher-Yates Shuffle: Unbiased Random Permutation

Shuffling an array seems trivial. Loop through, swap things around randomly, done. This intuition has led countless developers to write broken shuffle implementations that look correct but produce…
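
For contrast with the broken variants, here is a minimal sketch of the Fisher-Yates shuffle in its usual in-place form. The function name, the optional `rng` parameter, and the seed are choices made for this example.

```python
import random
from typing import List, Optional, TypeVar

T = TypeVar("T")


def fisher_yates_shuffle(items: List[T], rng: Optional[random.Random] = None) -> None:
    """Shuffle items in place so that every permutation is equally likely."""
    rng = rng or random.Random()
    # Walk from the last index down to 1.
    for i in range(len(items) - 1, 0, -1):
        # Pick j from the not-yet-fixed prefix, including i itself.
        j = rng.randint(0, i)  # 0 <= j <= i
        items[i], items[j] = items[j], items[i]


deck = list(range(10))
fisher_yates_shuffle(deck, random.Random(2025))
```

The detail that matters is the range of `j`: drawing it from the whole array on every step, instead of from `0..i`, is a typical way to end up with a biased shuffle.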

Jan 30, 2025 Security

Cryptographic Random Numbers: Secure Generation

In 2012, researchers discovered that 0.2% of all HTTPS certificates shared private keys due to weak random number generation during key creation. The PlayStation 3’s master signing key was extracted…

Jan 18, 2025 Engineering

Bogo Sort: Random Permutation Sort (Educational)

Every computer science curriculum teaches efficient sorting algorithms: Quicksort’s elegant divide-and-conquer, Merge Sort’s guaranteed O(n log n) performance, even the humble Bubble Sort that at…
