Functions

SQL

SQL - Window Functions Complete Guide

Window functions operate on a set of rows and return a single value for each row, unlike aggregate functions that collapse multiple rows into one. They’re called ‘window’ functions because they…

SQLite

SQL: Window Functions Explained

Window functions operate on a set of rows related to the current row, performing calculations while preserving individual row identity. Unlike aggregate functions that collapse multiple rows into a…
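As a rough illustration of the contrast this teaser draws, the sketch below runs the same sum twice through Python's built-in `sqlite3` module: once with `GROUP BY`, which collapses rows, and once with `OVER (PARTITION BY ...)`, which keeps every row. The `sales` table, its columns, and the `compare` helper are hypothetical names invented for this example, and it assumes the bundled SQLite is version 3.25 or newer, where window functions are available.

```python
import sqlite3


def compare(rows):
    """Return (windowed, grouped) results for hypothetical (region, amount) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    # Window function: one output row per input row, each carrying its region's total.
    windowed = conn.execute(
        """
        SELECT region, amount,
               SUM(amount) OVER (PARTITION BY region) AS region_total
        FROM sales
        ORDER BY region, amount
        """
    ).fetchall()

    # Aggregate with GROUP BY: one output row per region.
    grouped = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()

    conn.close()
    return windowed, grouped


windowed, grouped = compare([("east", 10), ("east", 30), ("west", 5)])
print(len(windowed), "rows kept by the window query")
print(len(grouped), "rows left after GROUP BY")
```

The window query returns as many rows as were inserted, while the grouped query returns one row per region.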

SQL

SQL - User-Defined Functions (UDF)

SQL Server supports three primary UDF types: scalar functions, inline table-valued functions (iTVF), and multi-statement table-valued functions (mTVF). Each type has specific performance…

SQL

SQL - ORDER BY in Window Functions

Window functions operate on a ‘window’ of rows related to the current row. The ORDER BY clause within the OVER() specification determines how rows are ordered within each partition for the window…

SQL

SQL - JSON Functions in SQL

Most modern relational databases support native JSON data types that validate and optimize JSON storage. PostgreSQL, MySQL 8.0+, SQL Server 2016+, and Oracle 12c+ all provide JSON capabilities with…

SQL

SQL - LEAD() and LAG() Functions

LEAD() and LAG() belong to the window function family, operating on a ‘window’ of rows related to the current row. Unlike aggregate functions that collapse multiple rows into one, window functions…

SQL

SQL - CAST() and CONVERT() Functions

Type conversion transforms data from one data type to another. SQL handles this through implicit (automatic) and explicit (manual) conversion. Implicit conversion works when SQL Server can safely…

SQL

Spark SQL - Window Functions Tutorial

Window functions perform calculations across a set of rows that are related to the current row. Unlike aggregate functions with GROUP BY that collapse multiple rows into one, window functions…

SQL

Spark SQL - JSON Functions

Spark SQL provides over 20 specialized JSON functions for parsing, extracting, and manipulating JSON data directly within DataFrames without requiring external libraries or UDFs.

SQL

Spark SQL - Map Functions

Map functions in Spark SQL enable manipulation of key-value pair structures through native SQL syntax, eliminating the need for complex UDFs or RDD operations in most scenarios.

SQL

Spark SQL - Aggregate Functions

Spark SQL provides comprehensive aggregate functions that operate on grouped data. The fundamental pattern involves grouping rows by one or more columns and applying aggregate functions to compute…

SQL

Spark SQL - Array Functions

Spark SQL provides 50+ array functions that enable complex data transformations without UDFs, significantly improving performance through Catalyst optimizer integration and whole-stage code…

Engineering

Spark Scala - Window Functions

Window functions solve a fundamental problem in data processing: how do you compute values across multiple rows while keeping each row intact? Standard aggregations with GROUP BY collapse rows into…

Scala

Scala - Partial Functions

A partial function in Scala is a function that is not defined for all possible input values of its domain. Unlike total functions that must handle every input, partial functions explicitly declare…

Scala

Scala - Higher-Order Functions

Higher-order functions in Scala accept functions as parameters or return functions as results, enabling powerful abstraction patterns that reduce code duplication and improve composability.

Scala

Scala - Anonymous/Lambda Functions

Anonymous functions, also called lambda functions or function literals, are unnamed functions defined inline. In Scala, these are instances of the FunctionN traits (where N is the number of…

R

R - Functions - Define and Call

R functions follow a straightforward structure using the function keyword. The basic anatomy includes parameters, a function body, and an optional explicit return statement.

R

R dplyr - lag() and lead() Functions

The lag() and lead() functions shift values within a vector by a specified number of positions, essential for time-series analysis, calculating differences between consecutive rows, and…

Python

Python - Nested Functions

Nested functions are functions defined inside other functions. The inner function has access to variables in the enclosing function’s scope, even after the outer function has finished executing. This…
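A minimal sketch of the behavior described here: the inner function keeps using a variable from the enclosing scope after the outer function has returned. The names `make_counter` and `increment` are hypothetical, chosen only for this illustration.

```python
def make_counter(start=0):
    """Return a function that remembers and updates its own running count."""
    count = start  # lives in the enclosing function's scope

    def increment(step=1):
        nonlocal count  # rebind the enclosing variable instead of creating a local one
        count += step
        return count

    return increment  # make_counter finishes here, but count stays reachable


counter = make_counter(10)
print(counter())   # first call adds the default step of 1
print(counter(5))  # later calls see the updated count
```

Each call to `make_counter` creates a separate `count`, so two counters built this way do not share state.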

Python

Python - First-Class Functions

In Python, functions are first-class citizens. This means they’re treated as objects that can be manipulated like any other value—integers, strings, or custom classes. You can assign them to…
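To make "treated as objects" concrete, here is a small sketch in which functions are bound to a new name, stored in a dictionary, and passed as an argument. The functions `shout`, `whisper`, and `apply` are hypothetical examples, and the operations shown are illustrative choices rather than a list taken from the article.

```python
def shout(text):
    return text.upper() + "!"


def whisper(text):
    return text.lower() + "..."


def apply(func, value):
    """Call whichever function object was passed in."""
    return func(value)


speak = shout                                # bind the function object to another name
styles = {"loud": shout, "quiet": whisper}   # store functions as dictionary values

print(speak("hello"))
print(apply(whisper, "Hello"))
print(styles["loud"]("ok"))
print(shout.__name__)                        # functions carry attributes like other objects
```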

Engineering

Python - chr() and ord() Functions

Every character you see on screen is stored as a number. The letter ‘A’ is 65. The digit ‘0’ is 48. The emoji ‘🐍’ is 128013. This mapping between characters and integers is called character encoding,…
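The three values quoted above can be checked directly with the built-ins `ord()` and `chr()`. The helper `code_points` is a hypothetical name used only for this sketch.

```python
def code_points(text):
    """Map each character in text to its integer code point."""
    return [ord(ch) for ch in text]


for ch in ("A", "0", "🐍"):
    number = ord(ch)         # character -> integer
    round_trip = chr(number)  # integer -> character
    print(f"{ch!r} -> {number} -> {round_trip!r}")
```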

Python

PySpark - SQL String Functions

String manipulation is one of the most common operations in data processing pipelines. Whether you’re cleaning messy CSV imports, parsing log files, or standardizing user input, you’ll spend…

Python

PySpark - SQL Window Functions

Window functions are one of PySpark’s most powerful features for analytical queries. Unlike traditional GROUP BY aggregations that collapse multiple rows into a single result, window functions…

Python

PySpark - SQL Date Functions

Date manipulation is the backbone of data engineering. Whether you’re building ETL pipelines, analyzing time-series data, or creating reporting dashboards, you’ll spend significant time working with…

Python

PySpark - SQL Aggregate Functions

PySpark aggregate functions are the workhorses of big data analytics. Unlike Pandas, which loads entire datasets into memory on a single machine, PySpark distributes data across multiple nodes and…

Python

PySpark - Lead and Lag Functions

Window functions operate on a subset of rows related to the current row, enabling calculations across row boundaries without collapsing the dataset like groupBy() does. Lead and lag functions are…

MySQL

How to Use Window Functions in MySQL

Window functions perform calculations across a set of rows that are related to the current row, but unlike aggregate functions with GROUP BY, they don’t collapse multiple rows into a single output…

MySQL

How to Use String Functions in MySQL

String manipulation in SQL isn’t just about prettifying output—it’s a critical tool for data cleaning, extraction, and transformation at the database level. When you’re dealing with messy real-world…

MySQL

How to Use Date Functions in MySQL

MySQL stores dates and times in five distinct data types (DATE, DATETIME, TIMESTAMP, TIME, YEAR), each optimized for different use cases and storage requirements. Choose DATETIME for most…

Go

Go Anonymous Functions and Closures

Anonymous functions, also called function literals, are functions defined without a name. In Go, they’re syntactically identical to regular functions except they omit the function name. You can…
