April 2, 2026 | Colin Jaffe | 4 min read

Accessing DataFrame Rows and Columns

Master Pandas DataFrame Indexing and Selection Techniques

DataFrame Challenge Overview

Last rows to extract: 3
Last columns to extract: 3
Total columns in dataset: 16
Total rows in dataset: 157

iloc vs loc: Key Differences

Feature | iloc (Position-based) | loc (Label-based)
Indexing Type | Integer position | Row/column labels
Slicing Behavior | Exclusive end | Inclusive end
Column Access | Integer positions only | Column names supported
Negative Indexing | Supported (-3:) | Not positional; negative numbers are treated as labels

Recommended: Use iloc for position-based selection, loc for label-based selection

Common Off-by-One Error

Remember that Python, and therefore iloc, starts counting at zero. A DataFrame with 16 columns has positions 0 through 15, not 1 through 16. Miscounting here is one of the most frequent mistakes when working with iloc indexing.

Fixing the Column Selection Error

1. Identify the mistake: Expected 3 columns but got only 2 due to incorrect index calculation
2. Understand zero-based indexing: 16 columns means indices 0-15, so the last 3 are indices 13, 14, 15
3. Adjust the slice: Change from [14:17] to [13:16] to get columns 13, 14, and 15
4. Verify the result: Confirm that all 3 expected columns are now properly selected

Hardcoded vs Semantic Indexing

Pros of semantic indexing (-3:)
Adapts to data changes automatically
More readable and self-documenting code
Reduces maintenance when dataset size changes
Less prone to off-by-one errors
Cons of hardcoded indices (154:157)
Break when data changes
Require manual updates if rows/columns are added
Less obvious what the actual indices represent
More susceptible to calculation mistakes

Best Practices for DataFrame Slicing

Use Negative Indexing

Employ -3: syntax for last three items instead of hardcoding specific indices. This makes your code more robust and adaptable to changing data sizes.

Prefer Column Names with loc

When using loc, leverage column names like 'Fuel Efficiency' instead of numeric indices. This improves code readability and reduces errors.

Remember Slicing Rules

iloc uses exclusive end slicing while loc uses inclusive end slicing. Understanding this difference prevents common indexing mistakes.
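
The slicing rule is easiest to see on a tiny frame. The sketch below uses a made-up five-row DataFrame with string row labels (not the lesson's dataset) so that positions and labels cannot be confused:

```python
import pandas as pd

# Hypothetical five-row frame; the labels 'a'..'e' are illustrative only.
df = pd.DataFrame({"x": [10, 20, 30, 40, 50]}, index=list("abcde"))

# iloc: positions 1 and 2 are returned; end position 3 is excluded.
print(list(df.iloc[1:3].index))     # ['b', 'c']

# loc: labels 'b' through 'd' are returned; the end label is included.
print(list(df.loc["b":"d"].index))  # ['b', 'c', 'd']
```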

"Instead of hardcoding it, we can say: whatever the last three are, whatever those numbers are."
This highlights the importance of writing semantic, adaptable code that doesn't break when the underlying dataset changes size.

[Chart: DataFrame Indexing Methods Comparison. Scores shown: iloc flexibility 75, loc readability 90, iloc error-prone 60, loc adaptability 85; scale not given.]


This lesson is a preview from our Data Science & AI Certificate Online (includes software) and Python Certification Online (includes software & exam). Enroll in a course for detailed lessons, live instructor support, and project-based training.

Let's examine the solution for this DataFrame slicing challenge. We need to extract the last three rows and columns using iloc first. Start with cars.iloc—and here's a crucial point for Python developers: it's surprisingly easy to forget the .iloc or .loc accessor, especially when you're deep in regular Python programming patterns. Even experienced practitioners make this mistake regularly.

With .iloc, we need to specify positional indices for our last three rows: 154, 155, and 156. However, remember that iloc uses exclusive upper bounds—a fundamental aspect of Python's slicing behavior. To get rows 154-156, we must specify the range as 154:157, even though row 157 doesn't exist. This "up to but not including" logic ensures we capture exactly the rows we want.
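
As a minimal sketch of the row slice, assuming a stand-in frame with the same shape as the lesson's data (the real cars file is not reproduced here, so the values and column names are placeholders):

```python
import numpy as np
import pandas as pd

# Stand-in for the lesson's `cars` data (assumption): 157 rows, 16 columns.
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16))

# Positions 154, 155, and 156 are returned; the end position 157 is excluded.
last_rows = cars.iloc[154:157]
print(list(last_rows.index))  # [154, 155, 156]
```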

For the columns in this 16-column dataset, a tempting first guess is that the last three are columns 14, 15, and 16. Following the same exclusive upper bound rule, that guess becomes 14:17: include columns 14, 15, and 16, then stop before column 17.

But wait—let's test this approach and see what happens. The result reveals a classic programming error that even seasoned data scientists encounter regularly.

We successfully retrieved three rows, but only two columns appeared. This is the infamous "off-by-one error"—so common in programming that it has an official name and countless debugging hours attributed to it. The issue stems from zero-based indexing: while we have 16 columns total, they're indexed 0 through 15, not 1 through 16.

The correct last three columns are indexed 13, 14, and 15. To capture these with iloc's exclusive upper bound, we need the range 13:16. This fundamental indexing principle trips up developers regularly, particularly when switching between different programming contexts or working under pressure.
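
The mistake and its fix can be reproduced on a stand-in frame. In this sketch only 'Fuel Efficiency' and 'Power Perf Factor' come from the lesson; every other column name is a placeholder, and the values are arbitrary:

```python
import numpy as np
import pandas as pd

# Stand-in for `cars` (assumption): 157 rows, 16 columns, mostly placeholder names.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Off by one: position 16 does not exist, so iloc quietly returns only 14 and 15.
wrong = cars.iloc[154:157, 14:17]
print(wrong.shape)  # (3, 2)

# Corrected: positions 13, 14, and 15 are the last three of 16 columns.
right = cars.iloc[154:157, 13:16]
print(right.shape)  # (3, 3)
```

Note that iloc does not raise an error when a slice runs past the end, which is why this mistake is easy to miss without checking the shape.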

Testing the corrected syntax cars.iloc[154:157, 13:16] produces the expected result. However, this hardcoded approach introduces a significant maintainability problem that affects real-world data workflows.


Consider what happens when your dataset changes—a common scenario in production environments. Add one more row to your data, and suddenly indices 154-156 no longer represent the "last three" rows. The same applies to columns: add a new feature, and your hardcoded column indices become obsolete. This brittleness makes your code fragile and error-prone in dynamic data environments.
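
To make the brittleness concrete, here is a small sketch that appends one row to a stand-in frame (placeholder data, not the lesson's file) and reruns the hardcoded slice:

```python
import numpy as np
import pandas as pd

# Stand-in for `cars` (assumption): 157 rows, 16 placeholder columns.
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16))

# Simulate new data arriving: one extra row with label 157.
extra = pd.DataFrame(np.zeros((1, 16), dtype=int), index=[157])
grown = pd.concat([cars, extra])

# The hardcoded slice still returns rows 154-156, which are no longer the last three.
hardcoded = grown.iloc[154:157]
print(list(hardcoded.index))  # [154, 155, 156]
print(grown.index[-1])        # 157
```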

Python's negative indexing provides an elegant solution that makes your code more semantic and maintainable. Instead of hardcoding specific indices, use cars.iloc[-3:] for rows and cars.iloc[:, -3:] for columns. The syntax [-3:] means "from the third-to-last element onward," with the omitted end value defaulting to "until the end."

This approach offers multiple advantages: it clearly communicates intent (we want the last three elements), eliminates off-by-one errors, and automatically adapts to dataset changes. Whether your DataFrame has 100 or 1000 rows, [-3:] always captures the final three. The code becomes self-documenting and robust.
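
A short sketch of the negative-indexing version, combining the row and column slices into one call. The frame is a stand-in with placeholder values, and the smaller frame is included only to show that the same expression adapts to a different size:

```python
import numpy as np
import pandas as pd

# Stand-in for `cars` (assumption): 157 rows, 16 columns, mostly placeholder names.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Last three rows and last three columns, with no hardcoded sizes.
last = cars.iloc[-3:, -3:]
print(last.equals(cars.iloc[154:157, 13:16]))  # True

# The same expression on a smaller frame (100 rows, 10 columns) still works.
small = cars.iloc[:100, :10]
small_last = small.iloc[-3:, -3:]
print(list(small_last.index))  # [97, 98, 99]
```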

Now let's explore the equivalent operation using loc, which operates on labels rather than positions. I'll create a separate code block to preserve both examples for comparison—a best practice when demonstrating alternative approaches.

With cars.loc, we still use numeric indices for rows (154:156) since our DataFrame uses default integer row labels. However, loc uses inclusive bounds—a critical distinction from iloc. The range 154:156 includes both endpoints, capturing rows 154, 155, and 156 directly. No need for the 157 we required with iloc.

The real advantage of loc emerges with columns, where we can use meaningful labels instead of cryptic numbers. Instead of remembering that columns 13-15 represent our target data, we specify the actual column names: 'Fuel Efficiency':'Power Perf Factor'. This label-based approach makes code significantly more readable and maintainable.


Verifying our results confirms that both iloc and loc produce identical output, but the loc version communicates intent more clearly through descriptive column names.
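
The comparison can be sketched as follows. The frame is a stand-in: 'Fuel Efficiency' and 'Power Perf Factor' are the names used in the lesson, while the column between them and all values are placeholders:

```python
import numpy as np
import pandas as pd

# Stand-in for `cars` (assumption): 157 rows, 16 columns, mostly placeholder names.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Position-based: end positions (157 and 16) are excluded.
by_position = cars.iloc[154:157, 13:16]

# Label-based: end labels (156 and 'Power Perf Factor') are included.
by_label = cars.loc[154:156, "Fuel Efficiency":"Power Perf Factor"]

print(by_label.equals(by_position))  # True
```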

We can make the loc approach more dynamic by calculating row indices programmatically. Because loc treats numbers as row labels rather than positions, we can't simply use -3. Instead, calculate the starting label: len(cars.index) - 3 gives us the third-to-last row label, which works here because the default labels run from 0 to len(cars.index) - 1. Combined with Python's slice notation, len(cars.index) - 3: creates a range from that calculated label to the end.

This programmatic approach mirrors the flexibility we achieved with negative indexing in iloc, ensuring our code remains robust as datasets evolve. The final solution elegantly combines calculated row positioning with descriptive column labels, resulting in code that's both maintainable and readable.
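
A minimal sketch of the calculated start, again on a stand-in frame with placeholder data. The second variant, which reads the label from cars.index[-3], is not part of the lesson; it is shown as one possible alternative for frames whose row labels are not the default 0 to n-1:

```python
import numpy as np
import pandas as pd

# Stand-in for `cars` (assumption): 157 rows, 16 columns, mostly placeholder names.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Lesson's approach: compute the starting label from the row count.
# This relies on the default integer labels 0..156.
start = len(cars.index) - 3
dynamic = cars.loc[start:, "Fuel Efficiency":"Power Perf Factor"]
print(start)                     # 154
print(list(dynamic.index))       # [154, 155, 156]

# Alternative (not from the lesson): take the third-to-last label directly.
from_index = cars.loc[cars.index[-3]:, "Fuel Efficiency":"Power Perf Factor"]
print(from_index.equals(dynamic))  # True
```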

That completes our exploration of DataFrame slicing techniques. These patterns—avoiding hardcoded indices, leveraging negative indexing, and choosing between positional and label-based selection—form the foundation of robust pandas data manipulation. Master these concepts, and you'll write more resilient, maintainable code for your data science workflows.

Key Takeaways

1. iloc uses position-based indexing with exclusive end slicing, while loc uses label-based indexing with inclusive end slicing
2. Off-by-one errors are extremely common when working with DataFrame indexing due to zero-based counting
3. Semantic indexing using negative numbers (-3:) is more robust than hardcoded indices for selecting last n elements
4. When using loc, column names can be used instead of numeric indices, improving code readability
5. Adding or removing rows/columns breaks hardcoded index selections but not semantic ones
6. Both methods can achieve the same results, but loc is often more readable when working with named columns
7. Always verify your DataFrame slicing results, especially when learning or debugging indexing logic
8. Understanding DataFrame dimensions (total rows and columns) is crucial before performing complex slicing operations
