Pandas - Select Rows by Index (iloc)
Key Insights
- iloc uses integer-based indexing to select rows and columns by their positional location, making it essential for numerical slicing operations regardless of index labels
- Understanding the difference between single bracket [] and double bracket [[]] notation with iloc determines whether you get a Series or DataFrame return type
- Combining iloc with boolean arrays, slicing, and list-based selection enables precise row extraction for data analysis and transformation workflows
Understanding iloc Fundamentals
The iloc indexer provides purely integer-location based indexing for selection by position. Unlike loc which uses labels, iloc treats the DataFrame as a zero-indexed array where the first row is position 0.
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
    'price': [1200, 25, 75, 350, 80],
    'stock': [15, 150, 85, 42, 67]
}, index=['A', 'B', 'C', 'D', 'E'])
print(df)
product price stock
A Laptop 1200 15
B Mouse 25 150
C Keyboard 75 85
D Monitor 350 42
E Webcam 80 67
The key advantage: iloc works consistently regardless of your index type or whether you’ve performed operations that disrupted sequential indexing.
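To make the difference from loc concrete, here is a minimal sketch using a hypothetical frame (named shuffled, not part of the example above) whose integer labels are deliberately out of order. It shows that position 0 and label 0 can refer to different rows:

```python
import pandas as pd

# Hypothetical frame: integer labels that do not match row positions
shuffled = pd.DataFrame(
    {"product": ["Laptop", "Mouse", "Keyboard"]},
    index=[10, 0, 5],
)

by_position = shuffled.iloc[0]  # first row by position (label 10)
by_label = shuffled.loc[0]      # row whose label is 0 (second row)

print(by_position["product"])  # Laptop
print(by_label["product"])     # Mouse
```

With a default RangeIndex the two calls happen to agree, which is why the distinction tends to surface only after filtering, sorting, or concatenating.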
Selecting Single Rows
Access individual rows using integer positions. Single bracket notation returns a Series:
# Get first row (position 0)
first_row = df.iloc[0]
print(first_row)
print(type(first_row))
product Laptop
price 1200
stock 15
Name: A, dtype: object
<class 'pandas.core.series.Series'>
For DataFrame output, use double brackets:
# Returns DataFrame with one row
first_row_df = df.iloc[[0]]
print(first_row_df)
print(type(first_row_df))
product price stock
A Laptop 1200 15
<class 'pandas.core.frame.DataFrame'>
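A one-row slice is another way to keep the DataFrame type. The sketch below rebuilds the example frame and compares the shapes of the three forms; the variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

as_series = df.iloc[0]         # scalar position: Series
as_frame_list = df.iloc[[0]]   # one-element list: DataFrame
as_frame_slice = df.iloc[0:1]  # one-row slice: DataFrame

print(as_series.shape)       # (3,)
print(as_frame_list.shape)   # (1, 3)
print(as_frame_slice.shape)  # (1, 3)
```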
Access rows from the end using negative indexing:
# Last row
last_row = df.iloc[-1]
# Second to last row
second_last = df.iloc[-2]
print(second_last)
product Monitor
price 350
stock 42
Name: D, dtype: object
Selecting Multiple Rows with Lists
Pass a list of integer positions to extract specific rows:
# Select rows at positions 0, 2, and 4
selected_rows = df.iloc[[0, 2, 4]]
print(selected_rows)
product price stock
A Laptop 1200 15
C Keyboard 75 85
E Webcam 80 67
Mix positive and negative indices:
# First, third, and last rows
mixed_selection = df.iloc[[0, 2, -1]]
print(mixed_selection)
Order matters - results follow your list order:
# Reverse order selection
reversed_rows = df.iloc[[4, 3, 2, 1, 0]]
print(reversed_rows)
product price stock
E Webcam 80 67
D Monitor 350 42
C Keyboard 75 85
B Mouse 25 150
A Laptop 1200 15
Slicing Rows
Use Python’s slice notation start:stop:step for range-based selection. As with Python lists, the stop position is excluded:
# First three rows (0, 1, 2)
first_three = df.iloc[:3]
print(first_three)
product price stock
A Laptop 1200 15
B Mouse 25 150
C Keyboard 75 85
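The exclusive stop is worth contrasting with loc, whose label slices include the stop label. A small sketch, using a reduced version of the example frame:

```python
import pandas as pd

df = pd.DataFrame(
    {"price": [1200, 25, 75, 350, 80]},
    index=["A", "B", "C", "D", "E"],
)

positional = df.iloc[0:2]   # stop position 2 is excluded
labelled = df.loc["A":"C"]  # stop label "C" is included

print(list(positional.index))  # ['A', 'B']
print(list(labelled.index))    # ['A', 'B', 'C']
```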
Skip rows with step parameter:
# Every other row
every_other = df.iloc[::2]
print(every_other)
product price stock
A Laptop 1200 15
C Keyboard 75 85
E Webcam 80 67
Slice from middle to end:
# From third row onwards
from_third = df.iloc[2:]
print(from_third)
product price stock
C Keyboard 75 85
D Monitor 350 42
E Webcam 80 67
Negative slicing for tail selection:
# Last two rows
last_two = df.iloc[-2:]
print(last_two)
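Slices are also forgiving about bounds, while single positions are not. The following sketch, again on a reduced version of the example frame, shows a slice that runs past the end being truncated and a single out-of-range position raising IndexError:

```python
import pandas as pd

df = pd.DataFrame(
    {"price": [1200, 25, 75, 350, 80]},
    index=["A", "B", "C", "D", "E"],
)

# A slice that runs past the last row is truncated silently
tail = df.iloc[3:10]
print(len(tail))  # 2

# A single position outside the frame raises IndexError
try:
    df.iloc[10]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```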
Combining Row and Column Selection
iloc accepts two arguments: iloc[rows, columns]. Both use integer positions:
# First row, second column (price)
value = df.iloc[0, 1]
print(value) # 1200
# First three rows, first two columns
subset = df.iloc[:3, :2]
print(subset)
product price
A Laptop 1200
B Mouse 25
C Keyboard 75
Select specific rows and columns:
# Rows 0 and 2, columns 1 and 2
custom_subset = df.iloc[[0, 2], [1, 2]]
print(custom_subset)
price stock
A 1200 15
C 75 85
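Note that two lists select the full block of rows by columns, which differs from NumPy fancy indexing, where two lists are paired element by element. A brief sketch of the contrast:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

# iloc: every listed row crossed with every listed column (2 x 2 block)
block = df.iloc[[0, 2], [1, 2]]

# NumPy: pairs the lists, giving elements (0, 1) and (2, 2) only
paired = df.to_numpy()[[0, 2], [1, 2]]

print(block.shape)      # (2, 2)
print(paired.tolist())  # [1200, 85]
```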
Use : to select all rows or columns:
# All rows, last column only
last_column = df.iloc[:, -1]
print(last_column)
A 15
B 150
C 85
D 42
E 67
Name: stock, dtype: int64
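When rows should be chosen by position but a column is known only by name, one option is to translate the name into a position with Index.get_loc. A minimal sketch, assuming the column name is unique:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

# Translate the column label into its integer position
price_pos = df.columns.get_loc("price")

# First three rows of that column, purely by position
first_three_prices = df.iloc[:3, price_pos]

print(price_pos)                    # 1
print(first_three_prices.tolist())  # [1200, 25, 75]
```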
Boolean Array Selection
Combine iloc with boolean arrays for conditional row selection:
# Create boolean array
price_condition = df['price'] > 100
print(price_condition)
A True
B False
C False
D True
E False
Name: price, dtype: bool
iloc does not accept a boolean Series as a mask, because the Series carries index labels. Convert the mask to integer positions using np.where:
# Get positions where condition is True
positions = np.where(price_condition)[0]
print(positions) # [0 3]
# Select those rows
expensive_items = df.iloc[positions]
print(expensive_items)
product price stock
A Laptop 1200 15
D Monitor 350 42
Direct boolean indexing without iloc:
# More straightforward approach
expensive_direct = df[df['price'] > 100]
print(expensive_direct)
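If staying with iloc is preferred, another option is to strip the labels from the mask with to_numpy(), since iloc does accept a plain boolean array whose length matches the number of rows. A short sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

# Plain NumPy boolean array: no index labels attached
mask = (df["price"] > 100).to_numpy()

expensive = df.iloc[mask]
print(list(expensive.index))  # ['A', 'D']
```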
Practical Use Cases
Sampling random rows:
# Random 3 rows
random_positions = np.random.choice(len(df), size=3, replace=False)
random_sample = df.iloc[random_positions]
print(random_sample)
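For a sample that is repeatable across runs, one approach is to draw positions from a seeded NumPy Generator. The sketch below assumes an arbitrary seed of 42; the drawn rows depend on the seed and the NumPy version, so no specific output is shown:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

# Seeded generator (the seed value is an arbitrary choice)
rng = np.random.default_rng(seed=42)

# Draw 3 distinct row positions, then select them by position
positions = rng.choice(len(df), size=3, replace=False)
sample = df.iloc[positions]

print(sample)
```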
Train-test split by position:
# 80-20 split
split_point = int(len(df) * 0.8)
train = df.iloc[:split_point]
test = df.iloc[split_point:]
print(f"Train size: {len(train)}, Test size: {len(test)}")
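The split above keeps rows in their original order, which may be undesirable when the data is sorted. A hedged variant shuffles the positions first; the seed of 0 is an arbitrary assumption:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

# Shuffle the row positions with a seeded generator (arbitrary seed)
rng = np.random.default_rng(seed=0)
shuffled_positions = rng.permutation(len(df))

# 80-20 split over the shuffled positions
split_point = int(len(df) * 0.8)
train = df.iloc[shuffled_positions[:split_point]]
test = df.iloc[shuffled_positions[split_point:]]

print(f"Train size: {len(train)}, Test size: {len(test)}")  # Train size: 4, Test size: 1
```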
Iterating with position tracking:
for i in range(len(df)):
    row = df.iloc[i]
    print(f"Position {i}: {row['product']} costs ${row['price']}")
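When the loop only needs to read values, enumerate over itertuples is a possible alternative that still tracks the position without building a Series per row. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

lines = []
# enumerate supplies the position; itertuples yields one namedtuple per row
for position, row in enumerate(df.itertuples(index=False)):
    lines.append(f"Position {position}: {row.product} costs ${row.price}")

print(lines[0])  # Position 0: Laptop costs $1200
```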
Selecting rows after sorting:
# Sort by price, get top 3
sorted_df = df.sort_values('price', ascending=False)
top_3 = sorted_df.iloc[:3]
print(top_3)
product price stock
A Laptop 1200 15
D Monitor 350 42
E Webcam 80 67
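For this particular pattern, pandas also offers DataFrame.nlargest, which picks the same rows here without a separate sort step. The sketch below compares both approaches on the example data, which has no tied prices:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

# Sort, then take the first three rows by position
top_sorted = df.sort_values("price", ascending=False).iloc[:3]

# Same selection via nlargest
top_nlargest = df.nlargest(3, "price")

print(list(top_sorted.index))    # ['A', 'D', 'E']
print(list(top_nlargest.index))  # ['A', 'D', 'E']
```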
Performance Considerations
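Looking up a row by position does not require searching the index, so the cost of a single iloc lookup does not grow with the number of rows, and slicing is efficient. Each df.iloc[i] call still builds a new Series, though, so for large DataFrames prefer vectorized operations over iterative iloc calls: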
# Inefficient
result = []
for i in range(len(df)):
    if df.iloc[i]['price'] > 100:
        result.append(df.iloc[i])
# Efficient
result = df[df['price'] > 100]
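iloc already selects by position after filtering or sorting, whatever labels remain. Calling reset_index(drop=True) first is optional; it makes the labels match the positions, which keeps later loc and iloc results aligned: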
filtered = df[df['stock'] > 50].reset_index(drop=True)
first_filtered = filtered.iloc[0] # Clean positional access
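A quick check on the example data suggests that the reset changes only the label attached to the row, not which row iloc returns:

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam"],
    "price": [1200, 25, 75, 350, 80],
    "stock": [15, 150, 85, 42, 67],
}, index=["A", "B", "C", "D", "E"])

filtered = df[df["stock"] > 50]  # keeps labels B, C, E

first_without_reset = filtered.iloc[0]
first_with_reset = filtered.reset_index(drop=True).iloc[0]

print(first_without_reset["product"])  # Mouse
print(first_with_reset["product"])     # Mouse
print(first_without_reset.name, first_with_reset.name)  # B 0
```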
Use iloc when working with numerical algorithms requiring sequential access, integrating with NumPy operations, or when index labels are unreliable or non-unique.