How to Select Rows & Columns by Name or Index in Pandas DataFrame - Using loc and iloc
Last Updated: 28 Nov, 2024
When working with labeled data or referencing specific positions in a DataFrame, you often need to select particular rows and columns. This article covers two pandas indexers, loc and iloc, which select rows and columns either by label (name) or by integer position.
Let's start with a basic example of both:
Python
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
# Using loc (label-based)
result_loc = df.loc[0, 'Name']  # Value at row label 0 and column label 'Name'
# Using iloc (position-based)
result_iloc = df.iloc[1, 2]  # Value at row position 1 and column position 2
print("Using loc:", result_loc)
print("Using iloc:", result_iloc)
Output:
Using loc: Alice
Using iloc: Los Angeles
Selecting Rows and Columns Using .loc[] (Label-Based Indexing)
The .loc[] indexer selects data by label (the names of rows or columns). It supports selecting a single row or column, multiple rows or columns, label slices, and subsets filtered by a condition.
Key Features of .loc[]:
- Label-based indexing.
- Can select both rows and columns simultaneously.
- Supports slicing and filtering.
Select a Single Row by Label:
Python
row = df.loc[0]  # Select the row whose index label is 0 (the first row here)
print(row)
Output:
Name       Alice
Age           25
City    New York
Name: 0, dtype: object
Select Multiple Rows by Labels:
Python
rows = df.loc[[0, 2]] # Select rows with index labels 0 and 2
print(rows)
Output:
      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago
Select Specific Rows and Columns:
Python
subset = df.loc[0:1, ['Name', 'City']]  # Rows labeled 0 through 1 (end label included) and two columns
print(subset)
Output:
    Name         City
0  Alice     New York
1    Bob  Los Angeles
Filter Rows Based on Conditions:
Python
filtered = df.loc[df['Age'] > 25] # Select rows where Age > 25
print(filtered)
Output:
      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
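A condition can also be combined with a column selection in the same .loc[] call. The following is a minimal sketch that rebuilds the example DataFrame; the names mask and filtered_cols are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Combine two conditions with & (each condition needs its own parentheses).
mask = (df['Age'] > 25) & (df['City'] != 'Chicago')

# First argument filters rows, second argument picks columns by label.
filtered_cols = df.loc[mask, ['Name', 'City']]
print(filtered_cols)
```

Only the row for Bob satisfies both conditions, and the result keeps its original index label.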
Selecting Rows and Columns Using .iloc[] (Integer-Position Based Indexing)
The .iloc[] indexer selects data by integer position. It is particularly useful when you know where the data sits in the DataFrame but not what its labels are.
Key Features of .iloc[]:
- Uses integer positions (0, 1, 2, ...) to index rows and columns.
- Like .loc[], it accepts a single value, a list, or a slice.
- Supports slicing, similar to Python lists.
- Unlike .loc[], slices exclude the end position: df.iloc[0:2] returns the rows at positions 0 and 1 only.
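The difference in slice endpoints is easiest to see side by side. This is a small sketch using the same example data; by_label and by_position are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# loc slices by label and includes the end label: rows labeled 0 and 1.
by_label = df.loc[0:1]

# iloc slices by position and excludes the end position: only position 0.
by_position = df.iloc[0:1]

print(len(by_label), len(by_position))
```

The same slice expression 0:1 therefore yields two rows with loc and one row with iloc.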
Select a Single Row by Position:
Python
row = df.iloc[1] # Select the second row (index position = 1)
print(row)
Output:
Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object
Select Multiple Rows by Positions:
Python
rows = df.iloc[[0, 2]] # Select first and third rows by position
print(rows)
Output:
      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago
Select Specific Rows and Columns:
Python
subset = df.iloc[[0, 2], [1]]  # Rows at positions 0 and 2, column at position 1 ('Age')
print(subset)
Output:
   Age
0   25
2   35
Key Differences Between loc and iloc
Although both loc and iloc allow row and column selection, they differ in how they handle indexes:
Feature | loc | iloc |
---|---|---|
Indexing Basis | Label-based (uses row/column labels) | Position-based (uses integer positions) |
Slice Endpoints | Includes both the start and end labels | Excludes the end position |
Row Selection | Works with index labels (can be strings) | Works with integer positions (0, 1, 2, ...) |
Column Selection | Works with column labels (can be strings) | Works with integer positions (0, 1, 2, ...) |
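With the default index (0, 1, 2, ...), labels and positions coincide, which can hide the distinction. The sketch below assumes a hypothetical DataFrame whose integer labels are deliberately out of order, so the two indexers return different rows:

```python
import pandas as pd

# Hypothetical data: integer labels that do not match row positions.
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=[2, 0, 1],
)

# loc looks up the row whose label is 0, which is the second row here.
by_label = df.loc[0, 'Name']

# iloc looks up the first row and first column, regardless of labels.
by_position = df.iloc[0, 0]

print(by_label, by_position)
```

Here loc returns the name stored under label 0 (Bob), while iloc returns the name in the first physical row (Alice).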
Use loc when:
- You have specific labels for rows or columns and want to work with them directly.
- Your dataset has non-numeric indexes (e.g., strings, datetime).
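As a rough illustration of these cases, the sketch below uses a string index built from the Name column and a small hypothetical sales table with a datetime index; the sales data and variable names are assumptions made for the example:

```python
import pandas as pd

# String index: use the 'Name' column as row labels.
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
}).set_index('Name')

age_bob = people.loc['Bob', 'Age']         # single value by row and column label
alice_to_bob = people.loc['Alice':'Bob']   # label slice, end label included

# Datetime index (hypothetical sales figures).
sales = pd.DataFrame(
    {'Units': [3, 5, 2]},
    index=pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03']),
)
first_two_days = sales.loc['2024-01-01':'2024-01-02']  # both dates included

print(age_bob)
print(alice_to_bob)
print(first_two_days)
```

In both cases the rows are addressed by what they are called rather than where they sit.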
Use iloc when:
- You are working with numeric positions and don’t need to reference row/column names directly.
- You want to select data based on positions within the structure.
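Typical position-based selections include the last row, the last column, or a block from the top-left corner. A minimal sketch with the example data follows; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

last_row = df.iloc[-1]       # negative positions count from the end
last_col = df.iloc[:, -1]    # every row, last column
top_left = df.iloc[:2, :2]   # first two rows and first two columns

print(last_row)
print(last_col)
print(top_left)
```

None of these selections depends on the row or column labels, so they keep working if the columns are renamed.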
Both loc and iloc are core tools for selecting specific data in a Pandas DataFrame. The key difference is whether you select by label (loc) or by integer position (iloc). Understanding how and when to use each is essential for efficient data manipulation.