Pandas - Select Rows by Index (iloc)

Key Insights

  • iloc selects rows and columns by integer position, so the same expression works whatever the index labels are
  • Passing a single integer (df.iloc[0]) returns a Series, while passing a list (df.iloc[[0]]) returns a DataFrame
  • iloc accepts integers, lists of integers, slices, and boolean arrays, which together cover most row extraction needs in data analysis and transformation workflows

Understanding iloc Fundamentals

The iloc indexer provides purely integer-location based indexing for selection by position. Unlike loc, which uses labels, iloc treats the DataFrame as a zero-indexed array where the first row is position 0.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
    'price': [1200, 25, 75, 350, 80],
    'stock': [15, 150, 85, 42, 67]
}, index=['A', 'B', 'C', 'D', 'E'])

print(df)
      product  price  stock
A      Laptop   1200     15
B       Mouse     25    150
C    Keyboard     75     85
D     Monitor    350     42
E      Webcam     80     67

The key advantage: iloc works consistently regardless of your index type or whether you’ve performed operations that disrupted sequential indexing.
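
The difference is easiest to see when integer labels do not line up with row positions. The sketch below uses a small hypothetical frame (not the df above) whose labels run in reverse:

```python
import pandas as pd

# Hypothetical frame whose integer labels do not match row positions
shuffled = pd.DataFrame(
    {'product': ['Laptop', 'Mouse', 'Keyboard'], 'price': [1200, 25, 75]},
    index=[20, 10, 0],
)

by_position = shuffled.iloc[0]  # first row, whatever its label is
by_label = shuffled.loc[0]      # the row whose label is 0

print(by_position['product'])  # Laptop
print(by_label['product'])     # Keyboard
```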

Selecting Single Rows

Access individual rows using integer positions. Single bracket notation returns a Series:

# Get first row (position 0)
first_row = df.iloc[0]
print(first_row)
print(type(first_row))
product    Laptop
price        1200
stock          15
Name: A, dtype: object
<class 'pandas.core.series.Series'>

For DataFrame output, wrap the position in a list (double brackets):

# Returns DataFrame with one row
first_row_df = df.iloc[[0]]
print(first_row_df)
print(type(first_row_df))
   product  price  stock
A   Laptop   1200     15
<class 'pandas.core.frame.DataFrame'>
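
The return type also affects dtypes. A minimal sketch, assuming a two-column frame with one text column and one integer column: the single-row Series has to hold both values in one array, while the one-row DataFrame keeps each column's dtype.

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse'],
    'price': [1200, 25],
}, index=['A', 'B'])

row_series = df.iloc[0]    # one Series holding a string and an integer
row_frame = df.iloc[[0]]   # one-row DataFrame, columns keep their dtypes

print(row_series.dtype)          # object
print(row_frame['price'].dtype)  # integer dtype, typically int64
```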

Access rows from the end using negative indexing:

# Last row
last_row = df.iloc[-1]

# Second to last row
second_last = df.iloc[-2]
print(second_last)
product    Monitor
price          350
stock           42
Name: D, dtype: object

Selecting Multiple Rows with Lists

Pass a list of integer positions to extract specific rows:

# Select rows at positions 0, 2, and 4
selected_rows = df.iloc[[0, 2, 4]]
print(selected_rows)
      product  price  stock
A      Laptop   1200     15
C    Keyboard     75     85
E      Webcam     80     67

Mix positive and negative indices:

# First, third, and last rows
mixed_selection = df.iloc[[0, 2, -1]]
print(mixed_selection)

Order matters - results follow your list order:

# Reverse order selection
reversed_rows = df.iloc[[4, 3, 2, 1, 0]]
print(reversed_rows)
      product  price  stock
E      Webcam     80     67
D     Monitor    350     42
C    Keyboard     75     85
B       Mouse     25    150
A      Laptop   1200     15
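
Two further behaviours of list-based selection are worth knowing. The sketch below, using a reduced one-column version of the frame, shows that a repeated position repeats the row and that a position past the end raises IndexError:

```python
import pandas as pd

df = pd.DataFrame({'price': [1200, 25, 75, 350, 80]},
                  index=['A', 'B', 'C', 'D', 'E'])

# A position may appear more than once; the row is repeated in the result
repeated = df.iloc[[0, 0, 2]]
print(repeated.index.tolist())  # ['A', 'A', 'C']

# A position past the end is rejected rather than silently skipped
try:
    df.iloc[[0, 10]]
    out_of_bounds_raised = False
except IndexError as exc:
    out_of_bounds_raised = True
    print(f"IndexError: {exc}")
```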

Slicing Rows

Use Python’s slice notation start:stop:step for range-based selection. As with Python lists, the stop position is excluded:

# First three rows (0, 1, 2)
first_three = df.iloc[:3]
print(first_three)
      product  price  stock
A      Laptop   1200     15
B       Mouse     25    150
C    Keyboard     75     85

Skip rows with step parameter:

# Every other row
every_other = df.iloc[::2]
print(every_other)
      product  price  stock
A      Laptop   1200     15
C    Keyboard     75     85
E      Webcam     80     67

Slice from middle to end:

# From third row onwards
from_third = df.iloc[2:]
print(from_third)
      product  price  stock
C    Keyboard     75     85
D     Monitor    350     42
E      Webcam     80     67

Negative slicing for tail selection:

# Last two rows
last_two = df.iloc[-2:]
print(last_two)
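
Positional slices differ from label slices in how they treat the end point. A short comparison, assuming the same labels A to E on a one-column frame:

```python
import pandas as pd

df = pd.DataFrame({'price': [1200, 25, 75, 350, 80]},
                  index=['A', 'B', 'C', 'D', 'E'])

by_position = df.iloc[0:2]   # positions 0 and 1; stop is excluded
by_label = df.loc['A':'C']   # labels A, B and C; stop is included

print(by_position.index.tolist())  # ['A', 'B']
print(by_label.index.tolist())     # ['A', 'B', 'C']

# A slice that runs past the end is truncated instead of raising
beyond_end = df.iloc[3:100]
print(beyond_end.index.tolist())   # ['D', 'E']
```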

Combining Row and Column Selection

iloc accepts two arguments: iloc[rows, columns]. Both use integer positions:

# First row, second column (price)
value = df.iloc[0, 1]
print(value)  # 1200

# First three rows, first two columns
subset = df.iloc[:3, :2]
print(subset)
      product  price
A      Laptop   1200
B       Mouse     25
C    Keyboard     75

Select specific rows and columns:

# Rows 0 and 2, columns 1 and 2
custom_subset = df.iloc[[0, 2], [1, 2]]
print(custom_subset)
   price  stock
A   1200     15
C     75     85
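
Readers coming from NumPy may expect two lists to be paired element-wise. With iloc they are crossed instead, as this sketch on a three-row frame illustrates:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard'],
    'price': [1200, 25, 75],
    'stock': [15, 150, 85],
})

# iloc: every listed row crossed with every listed column -> 2 x 2 block
block = df.iloc[[0, 2], [1, 2]]
print(block.shape)  # (2, 2)

# NumPy fancy indexing pairs the lists element-wise: cells (0, 1) and (2, 2)
cells = df.to_numpy()[[0, 2], [1, 2]]
print(cells.tolist())  # [1200, 85]
```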

Use : to select all rows or columns:

# All rows, last column only
last_column = df.iloc[:, -1]
print(last_column)
A     15
B    150
C     85
D     42
E     67
Name: stock, dtype: int64
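
When you know a column by name but want to stay with positional row selection, one option is to translate the name into a position first. The sketch below uses Index.get_loc and Index.get_indexer for that; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard'],
    'price': [1200, 25, 75],
    'stock': [15, 150, 85],
}, index=['A', 'B', 'C'])

price_pos = df.columns.get_loc('price')                  # 1
pair_pos = df.columns.get_indexer(['product', 'stock'])  # positions 0 and 2

first_two_prices = df.iloc[:2, price_pos]
print(first_two_prices.tolist())  # [1200, 25]

subset = df.iloc[:2, pair_pos]
print(subset.columns.tolist())    # ['product', 'stock']
```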

Boolean Array Selection

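iloc accepts a boolean NumPy array but not a boolean Series, so a condition built from a column needs one conversion step before iloc can use it. Start with the condition:

# Create boolean Series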
price_condition = df['price'] > 100
print(price_condition)
A     True
B    False
C    False
D     True
E    False
Name: price, dtype: bool

Convert boolean to integer positions using np.where:

# Get positions where condition is True
positions = np.where(price_condition)[0]
print(positions)  # [0 3]

# Select those rows
expensive_items = df.iloc[positions]
print(expensive_items)
     product  price  stock
A     Laptop   1200     15
D    Monitor    350     42

When you only need the filtered rows, direct boolean indexing without iloc is simpler:

# More straightforward approach
expensive_direct = df[df['price'] > 100]
print(expensive_direct)
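
If you want to keep the mask rather than convert it to positions, a possible route is to hand iloc the underlying NumPy array. The sketch below also shows that the labelled Series itself is rejected:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
    'price': [1200, 25, 75, 350, 80],
}, index=['A', 'B', 'C', 'D', 'E'])

mask = df['price'] > 100  # boolean Series, indexed by label

# Passing the Series itself is rejected because it carries an index
try:
    df.iloc[mask]
    series_mask_accepted = True
except (ValueError, NotImplementedError):
    series_mask_accepted = False

# The underlying NumPy array works as a positional mask
expensive = df.iloc[mask.to_numpy()]
print(expensive['product'].tolist())  # ['Laptop', 'Monitor']
```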

Practical Use Cases

Sampling random rows:

# Random 3 rows
random_positions = np.random.choice(len(df), size=3, replace=False)
random_sample = df.iloc[random_positions]
print(random_sample)
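
For repeatable samples, a seeded generator helps. This is a sketch with an arbitrary seed; it also shows DataFrame.sample, which draws rows without building the positions yourself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'price': [1200, 25, 75, 350, 80]},
                  index=['A', 'B', 'C', 'D', 'E'])

rng = np.random.default_rng(seed=42)  # seed value is arbitrary
positions = rng.choice(len(df), size=3, replace=False)
sample_a = df.iloc[positions]

# Built-in alternative: DataFrame.sample draws the rows directly
sample_b = df.sample(n=3, random_state=42)

print(sample_a)
print(sample_b)
```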

Train-test split by position:

# 80-20 split
split_point = int(len(df) * 0.8)
train = df.iloc[:split_point]
test = df.iloc[split_point:]

print(f"Train size: {len(train)}, Test size: {len(test)}")
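
The split above keeps rows in their original order, which may not be what you want if the data is sorted. One possible variation, assuming a fixed arbitrary seed, shuffles the positions first:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'price': [1200, 25, 75, 350, 80]},
                  index=['A', 'B', 'C', 'D', 'E'])

rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatability
order = rng.permutation(len(df))     # shuffled positions 0..4

split_point = int(len(df) * 0.8)
train = df.iloc[order[:split_point]]
test = df.iloc[order[split_point:]]

print(len(train), len(test))  # 4 1
```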

Iterating with position tracking:

for i in range(len(df)):
    row = df.iloc[i]
    print(f"Position {i}: {row['product']} costs ${row['price']}")
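
If all you need is the position alongside each row's values, enumerate over itertuples is a lighter alternative that avoids one iloc call per row. A minimal sketch on a three-row frame:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard'],
    'price': [1200, 25, 75],
}, index=['A', 'B', 'C'])

lines = []
# itertuples yields one namedtuple per row; enumerate supplies the position
for position, row in enumerate(df.itertuples(index=False)):
    lines.append(f"Position {position}: {row.product} costs ${row.price}")

print("\n".join(lines))
```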

Selecting rows after sorting:

# Sort by price, get top 3
sorted_df = df.sort_values('price', ascending=False)
top_3 = sorted_df.iloc[:3]
print(top_3)
     product  price  stock
A     Laptop   1200     15
D    Monitor    350     42
E     Webcam     80     67
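
After sorting, labels travel with their rows while positions are renumbered from the top. The sketch below checks that on a one-column frame and compares the result with DataFrame.nlargest, a built-in shortcut for the same top-n selection:

```python
import pandas as pd

df = pd.DataFrame({'price': [1200, 25, 75, 350, 80]},
                  index=['A', 'B', 'C', 'D', 'E'])

sorted_df = df.sort_values('price', ascending=False)
print(sorted_df.index.tolist())  # ['A', 'D', 'E', 'C', 'B']

top_3 = sorted_df.iloc[:3]            # positions 0..2 of the sorted frame
top_3_builtin = df.nlargest(3, 'price')

print(top_3.index.tolist())           # ['A', 'D', 'E']
print(top_3.equals(top_3_builtin))    # True
```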

Performance Considerations

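Selecting a single row or a slice by position needs no label lookup, so its cost is roughly independent of the number of rows. Each iloc call still pays for building the object it returns, so for large DataFrames prefer vectorized operations over calling iloc in a loop: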

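# Inefficient: two iloc lookups per row, each building a full Series
result = []
for i in range(len(df)):
    if df.iloc[i]['price'] > 100:
        result.append(df.iloc[i])

# Efficient: one vectorized comparison over the whole column
result = df[df['price'] > 100]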
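
To get a feel for the gap on your own machine, a rough timing sketch such as the following can help. The frame size and seed are arbitrary assumptions, and the measured times will vary between runs and systems.

```python
import time

import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)  # arbitrary seed
big = pd.DataFrame({'price': rng.integers(1, 2000, size=5_000)})

start = time.perf_counter()
loop_positions = [i for i in range(len(big)) if big.iloc[i]['price'] > 100]
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vectorized = big[big['price'] > 100]
vectorized_seconds = time.perf_counter() - start

# Both approaches pick the same rows; only the elapsed time differs
same_rows = loop_positions == vectorized.index.tolist()
print(f"same rows: {same_rows}")
print(f"loop: {loop_seconds:.4f}s, vectorized: {vectorized_seconds:.4f}s")
```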

iloc ignores labels, so positional access works directly after filtering or sorting. Adding reset_index(drop=True) is optional; it renumbers the labels from 0 so that labels and positions agree again:

filtered = df[df['stock'] > 50].reset_index(drop=True)
first_filtered = filtered.iloc[0]  # Position 0 and label 0 are now the same row
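
To see what reset_index actually changes for positional access, the sketch below compares the first filtered row with and without it, using a two-column version of the frame:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
    'stock': [15, 150, 85, 42, 67],
}, index=['A', 'B', 'C', 'D', 'E'])

kept_labels = df[df['stock'] > 50]
renumbered = kept_labels.reset_index(drop=True)

# iloc returns the same row either way; only the label (Series name) differs
print(kept_labels.iloc[0]['product'], kept_labels.iloc[0].name)  # Mouse B
print(renumbered.iloc[0]['product'], renumbered.iloc[0].name)    # Mouse 0
```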

Use iloc when working with numerical algorithms requiring sequential access, integrating with NumPy operations, or when index labels are unreliable or non-unique.
