How to Select Rows by Index in Pandas
Key Insights
- Use .iloc[] for position-based selection (like array indexing) and .loc[] for label-based selection (using actual index values); confusing these is the most common source of bugs in Pandas code.
- Integer indexes create ambiguity: when your index contains integers, .loc[5] selects the row with label 5, not the 6th row. Always be explicit about which method you need.
- Boolean indexing on the index itself is underutilized; filtering rows by index conditions like df.loc[df.index > '2023-01-01'] is both readable and performant.
Introduction
Row selection is fundamental to every Pandas workflow. Whether you’re extracting a subset for analysis, debugging data issues, or preparing training sets, you need precise control over which rows you’re working with.
Pandas provides two primary indexing methods: position-based (where is this row in the DataFrame?) and label-based (what is this row called?). The distinction matters because Pandas indexes aren’t always simple integers counting from zero. Your index might be customer IDs, timestamps, or product codes. Understanding how to select rows in both paradigms will save you hours of debugging and prevent subtle data errors.
Understanding Pandas Index Types
Every DataFrame has an index—a set of labels that identify each row. By default, Pandas assigns a RangeIndex starting at 0, but you can set any column or custom values as the index.
import pandas as pd
# Default integer index
df_default = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [28, 34, 45, 31]
})
print(df_default.index)
# RangeIndex(start=0, stop=4, step=1)
# String index
df_string = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [28, 34, 45, 31]
}, index=['emp_001', 'emp_002', 'emp_003', 'emp_004'])
print(df_string.index)
# Index(['emp_001', 'emp_002', 'emp_003', 'emp_004'], dtype='object')
# DateTime index
df_datetime = pd.DataFrame({
    'sales': [150, 200, 175, 220],
    'returns': [5, 8, 3, 12]
}, index=pd.date_range('2024-01-01', periods=4, freq='D'))
print(df_datetime.index)
# DatetimeIndex(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04'], dtype='datetime64[ns]', freq='D')
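Since any column can serve as the index, it's worth knowing the round trip between a column and the index. A minimal sketch (the column name employee_id is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'employee_id': ['emp_001', 'emp_002'],
    'age': [28, 34]
})

# Promote a column to the index; set_index returns a new DataFrame
df_indexed = df.set_index('employee_id')
print(df_indexed.index)
# Index(['emp_001', 'emp_002'], dtype='object', name='employee_id')

# reset_index moves it back to an ordinary column
df_back = df_indexed.reset_index()
```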
The index type determines which selection methods work intuitively. With a default integer index, .loc[0] and .iloc[0] return the same row. With a string index, .loc['emp_001'] works but .loc[0] raises a KeyError.
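To see that difference fail in practice, a quick sketch with a string index (names here are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'value': [10, 20, 30]}, index=['a', 'b', 'c'])

first_by_position = df.iloc[0]  # works: position 0 always exists
first_by_label = df.loc['a']    # works: 'a' is a real label

try:
    df.loc[0]                   # fails: no row is labeled 0
except KeyError:
    print("KeyError: no label 0 in a string index")
```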
Selecting Rows with .iloc[] (Position-Based)
The .iloc[] accessor uses integer positions, just like Python list indexing. It ignores the actual index labels entirely—row 0 is always the first row, regardless of what the index says.
Single Row Selection
df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
    'price': [999, 25, 75, 300, 80],
    'stock': [50, 200, 150, 75, 100]
}, index=['P001', 'P002', 'P003', 'P004', 'P005'])
# Get the first row (returns a Series)
first_row = df.iloc[0]
print(first_row)
# product Laptop
# price 999
# stock 50
# Name: P001, dtype: object
# Get the last row
last_row = df.iloc[-1]
print(last_row['product']) # Webcam
Slicing Multiple Rows
Slicing with .iloc[] follows Python conventions: the start is inclusive, the end is exclusive.
# Rows 1 through 3 (positions 1, 2, 3)
subset = df.iloc[1:4]
print(subset)
# product price stock
# P002 Mouse 25 200
# P003 Keyboard 75 150
# P004 Monitor 300 75
# Every other row
alternating = df.iloc[::2]
print(alternating)
# product price stock
# P001 Laptop 999 50
# P003 Keyboard 75 150
# P005 Webcam 80 100
Selecting Non-Contiguous Rows
Pass a list of positions to select specific rows that aren’t adjacent.
# Select rows at positions 0, 2, and 4
selected = df.iloc[[0, 2, 4]]
print(selected)
# product price stock
# P001 Laptop 999 50
# P003 Keyboard 75 150
# P005 Webcam 80 100
Selecting Rows with .loc[] (Label-Based)
The .loc[] accessor uses index labels. This is what you want when your index has meaningful values like IDs, dates, or categories.
Single Row Selection
# Using our product DataFrame with string index
row = df.loc['P003']
print(row)
# product Keyboard
# price 75
# stock 150
# Name: P003, dtype: object
# Access a specific value
keyboard_price = df.loc['P003', 'price']
print(keyboard_price) # 75
Label-Based Slicing
Here’s where .loc[] differs significantly from .iloc[]: label slices are inclusive on both ends.
# Select rows from P002 to P004 (inclusive!)
subset = df.loc['P002':'P004']
print(subset)
# product price stock
# P002 Mouse 25 200
# P003 Keyboard 75 150
# P004 Monitor 300 75
This inclusive behavior catches many developers off guard. With .iloc[1:4], you get 3 rows. With .loc['P002':'P004'], you also get 3 rows, but the logic is different—both boundary labels are included.
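A side-by-side check makes the rule easy to remember; this sketch rebuilds a small version of the product DataFrame so it runs on its own:

```python
import pandas as pd

df = pd.DataFrame(
    {'price': [999, 25, 75, 300, 80]},
    index=['P001', 'P002', 'P003', 'P004', 'P005']
)

by_position = df.iloc[1:4]        # end exclusive: positions 1, 2, 3
by_label = df.loc['P002':'P004']  # end inclusive: P002 through P004

print(len(by_position), len(by_label))  # 3 3
```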
Selecting Multiple Labels
# Select specific rows by label
selected = df.loc[['P001', 'P003', 'P005']]
print(selected)
# product price stock
# P001 Laptop 999 50
# P003 Keyboard 75 150
# P005 Webcam 80 100
DateTime Index Selection
Label-based selection shines with datetime indexes. Pandas supports partial string indexing for dates.
sales_data = pd.DataFrame({
    'revenue': [1000, 1200, 950, 1100, 1300, 1150, 1400],
    'transactions': [45, 52, 41, 48, 55, 50, 60]
}, index=pd.date_range('2024-01-01', periods=7, freq='D'))
# Select a specific date
jan_3 = sales_data.loc['2024-01-03']
print(jan_3)
# Select a range of dates
first_week = sales_data.loc['2024-01-01':'2024-01-05']
print(first_week)
# Partial string matching (all of January)
january = sales_data.loc['2024-01']
print(january)
Boolean Indexing with Index Conditions
You can filter rows based on conditions applied to the index itself. This is particularly useful for datetime indexes or when you need to select rows where the index meets certain criteria.
# Numeric index filtering
df_numeric = pd.DataFrame({
    'value': [10, 20, 30, 40, 50]
}, index=[100, 200, 300, 400, 500])
# Select rows where index > 250
high_index = df_numeric.loc[df_numeric.index > 250]
print(high_index)
# value
# 300 30
# 400 40
# 500 50
# Using isin() for multiple values
specific = df_numeric.loc[df_numeric.index.isin([100, 300, 500])]
print(specific)
For datetime indexes, this becomes especially powerful:
# Filter by date conditions
recent = sales_data.loc[sales_data.index >= '2024-01-04']
print(recent)
# Combine with column conditions
high_revenue_recent = sales_data.loc[
    (sales_data.index >= '2024-01-03') &
    (sales_data['revenue'] > 1000)
]
print(high_revenue_recent)
You can also use index methods directly:
# Select weekdays only (Monday=0, Sunday=6)
weekdays = sales_data.loc[sales_data.index.dayofweek < 5]
# Select rows where index string contains a pattern
df_string = pd.DataFrame({'val': [1, 2, 3]}, index=['alpha', 'beta', 'gamma'])
contains_a = df_string.loc[df_string.index.str.contains('a')]
print(contains_a)
Common Pitfalls and Best Practices
The Integer Index Trap
When your index contains integers (but isn’t a default RangeIndex), .loc[] and .iloc[] behave very differently:
# DataFrame with non-sequential integer index
df_int = pd.DataFrame({
    'value': ['A', 'B', 'C', 'D']
}, index=[10, 20, 30, 40])
# These are NOT the same!
print(df_int.iloc[1])  # Row at position 1 (value: B, index label: 20)
# df_int.loc[1]        # KeyError! No row has the label 1
# To get the row with label 20:
print(df_int.loc[20])  # value: B
Avoiding Chained Indexing
Chained indexing creates ambiguity about whether you’re viewing or copying data:
# Bad: chained indexing
df['price']['P001'] = 500  # May or may not modify df; raises SettingWithCopyWarning
# Good: single .loc[] call
df.loc['P001', 'price'] = 500  # Always modifies df
SettingWithCopyWarning
When you create a subset and then modify it, Pandas warns you that you might be modifying a view (affecting the original) or a copy (changes lost):
# This triggers a warning
subset = df[df['price'] > 100]
subset['price'] = 0 # SettingWithCopyWarning
# Correct approach: use .loc[] on the original
df.loc[df['price'] > 100, 'price'] = 0
# Or explicitly copy if you want independence
subset = df[df['price'] > 100].copy()
subset['price'] = 0 # Safe, doesn't affect original df
Use .at[] and .iat[] for Single Values
When accessing a single scalar value, .at[] and .iat[] are faster:
# Slower
value = df.loc['P001', 'price']
# Faster for single value access
value = df.at['P001', 'price']
value = df.iat[0, 1] # Position-based equivalent
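If you want to verify the speedup on your own machine, a rough timeit sketch is below; exact numbers vary by pandas version and hardware, so it prints rather than asserts a ratio:

```python
import timeit
import pandas as pd

df = pd.DataFrame(
    {'price': [999, 25, 75]},
    index=['P001', 'P002', 'P003']
)

# Both return the same scalar; .at skips .loc's more general indexing machinery
t_loc = timeit.timeit(lambda: df.loc['P001', 'price'], number=10_000)
t_at = timeit.timeit(lambda: df.at['P001', 'price'], number=10_000)
print(f".loc: {t_loc:.4f}s  .at: {t_at:.4f}s")
```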
Quick Reference Summary
| Method | Selection Type | Use Case | Slice Behavior |
|---|---|---|---|
| .iloc[] | Position-based | When you need rows by position (1st, 5th, last) | End exclusive |
| .loc[] | Label-based | When you have meaningful index values | End inclusive |
| .at[] | Label-based, single value | Fast scalar access | N/A |
| .iat[] | Position-based, single value | Fast scalar access | N/A |
| df[condition] | Boolean | Filtering by column values | N/A |
| df.loc[index_condition] | Boolean on index | Filtering by index values | N/A |
Decision guide:
- Know your row’s position? Use .iloc[]
- Know your row’s label? Use .loc[]
- Filtering by values? Use boolean indexing
- Modifying data? Always use .loc[] with explicit assignment
- Performance-critical single value access? Use .at[] or .iat[]
Master these patterns and you’ll write cleaner, faster, and more predictable Pandas code.