April 2, 2026 | Brian McClain | 4 min read

Bar Charts and Data Sorting with Matplotlib

Master Data Visualization with Matplotlib Bar Charts

Chart Fundamentals

Charts display data in an X-Y coordinate system where the y-axis typically represents numeric values and the x-axis represents categories or time series data.

Vertical vs Horizontal Bar Charts

Feature | Vertical Bars | Horizontal Bars
Function | plt.bar | plt.barh
Label Readability | Poor for long labels | Excellent readability
Best Use Case | Short category names | Long category names
Recommended: Use horizontal bars for better label readability with longer category names

Setting Up Bar Chart Data

1

Extract Count Values

Convert the count column to a list with list(edu_group_df['count']) so it can serve as the chart data

2

Get Index Labels

Extract category names from the DataFrame index using list() on the index values

3

Prepare Chart Variables

Create two datasets: bar labels and numeric values to set bar lengths

The enumerate() Function

Use enumerate() to unlock both the index and item when looping. Syntax: for index, item in enumerate(some_list). This is essential for positioning chart labels.

Loop Access Patterns

Regular For Loop

Provides access only to the item itself. Limited when you need positional information for chart labeling.

enumerate() Loop

Unlocks both index and item access. Essential for chart text positioning and consecutive data pairing.

Creating the Bar Chart

1

Initialize Chart

Use plt.barh() for horizontal bars, feeding in category list and count values

2

Reverse Order

Apply .reverse() to both lists before plotting so the largest bars display at the top

3

Add Data Labels

Use plt.text() with enumerate to position count values next to each bar

4

Format and Style

Set colors, title, axis labels, and adjust chart limits for proper spacing

Chart Color Coordination

Customize chart element colors including title, axis labels, and ticks using color parameters. Use hex codes like #555 or #237 for precise color control.

Single vs Multi-Level Sorting

Feature | Single Sort | Multi-Level Sort
Syntax | sort_values(by='column') | sort_values(by=['col1', 'col2'])
Use Case | Simple ranking | Tiebreaker scenarios
Result | May have unclear ties | Clear hierarchical ranking
Recommended: Use secondary sorting for tiebreaker scenarios with multiple ranking criteria

Advanced Data Sorting

1

Primary Sort

Sort by main criteria like average score using ascending=False for descending order

2

Secondary Sort

Add second column as tiebreaker, such as math_score for tied average scores

3

Slice Results

Use slicing to display top results, like top 40 students with highest scores

Data Visualization Journey Complete

You've progressed through core programming, NumPy, Pandas, and now Matplotlib visualization. This foundation enables endless possibilities for data analysis and presentation.

This lesson is a preview from our Data Science & AI Certificate Online (includes software) and Python Certification Online (includes software & exam).

Data visualization transforms raw numbers into compelling visual narratives, and Matplotlib serves as Python's premier charting library for this purpose. Charts operate within an X-Y coordinate system, and bars can run vertically or horizontally depending on your analytical needs and audience preferences.

The orientation choice matters more than you might think. Vertical bars work well for short category names, but horizontal bars excel with lengthy labels such as education levels, job titles, or product names, which would otherwise appear cramped or rotated beneath vertical columns. In a vertical chart, the y-axis typically represents your quantitative values (sales figures, counts, percentages), while the x-axis displays categories or, in time series analysis, chronological progression. A horizontal bar chart swaps these roles: categories run along the y-axis and values along the x-axis.

Let's build a professional bar chart from our grouped students DataFrame. We'll use plt.bar for vertical orientation and plt.barh for horizontal bars. Given our education level labels, horizontal bars provide superior readability—a critical consideration when presenting to stakeholders who need to quickly interpret your findings.

First, we'll extract our count data into a workable format. The count column supplies the numeric values that set the length of each bar:

counts_list = list(edu_group_df['count'])

This conversion from Series to list isn't strictly necessary for plotting, since Matplotlib handles Series objects gracefully, but a plain list lets us call list methods such as .reverse() later on.

Next, we need the category labels. These aren't stored as columns but as index values, requiring a different extraction approach:

edu_list = list(edu_group_df.index)

Now we possess both essential components: bar labels and the numerical data that determines bar length. This separation of concerns—data preparation followed by visualization—reflects professional data science workflows where clean, structured data enables compelling visual storytelling.
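The two extraction steps above can be sketched end to end. The lesson's students data isn't shown here, so the tiny students_df below, its parent_education column, and the grouping step are all hypothetical stand-ins; only the last two assignments mirror the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's students data (not the real dataset).
students_df = pd.DataFrame({
    "parent_education": [
        "high school", "some college", "high school",
        "bachelor's degree", "some college", "high school",
    ]
})

# Assumed grouping step: count students per education level,
# then sort so the most common level comes first.
edu_group_df = (
    students_df.groupby("parent_education")
    .size()
    .to_frame("count")
    .sort_values(by="count", ascending=False)
)

# The two extraction steps from the lesson.
counts_list = list(edu_group_df["count"])  # numeric values -> bar lengths
edu_list = list(edu_group_df.index)        # index values -> bar labels

print(edu_list)
print(counts_list)
```

Because the category names live in the index rather than in a column, edu_group_df['parent_education'] would raise a KeyError in this sketch; the index is the right place to look.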

Before diving into chart creation, let's explore a crucial programming concept you'll need for advanced chart labeling. Python's enumerate function unlocks both item values and their positional indices during iteration:

for index, item in enumerate(some_list):

This dual access proves invaluable when you need precise positioning control for chart annotations. Standard loops provide only the item itself, but enumerate grants access to the underlying index structure, enabling sophisticated labeling strategies.

Consider this practical example with a fruits list:

for index, fruit in enumerate(fruits):
    print(index + 1, fruit)

The index + 1 starts numbering from one rather than zero—a common requirement in business presentations where stakeholders expect human-readable numbering.
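As a small aside, the built-in enumerate() also accepts a start argument, which gives the same one-based numbering without the manual addition. The sketch below compares both forms; the fruits list is assumed sample data, since the lesson's own list isn't shown.

```python
# Assumed sample data; the lesson's actual fruits list is not shown.
fruits = ["apple", "banana", "orange"]

# Option 1 (as in the lesson): add 1 to the zero-based index.
numbered_a = [f"{index + 1}. {fruit}" for index, fruit in enumerate(fruits)]

# Option 2: tell enumerate() to start counting at 1.
numbered_b = [f"{number}. {fruit}" for number, fruit in enumerate(fruits, start=1)]

for line in numbered_b:
    print(line)
```

Keep in mind that with start=1 the counter no longer matches the list position, so the zero-based form is still the one to use when the number doubles as an index, as it does when positioning chart labels.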


Here's a more complex challenge: create smoothies from consecutive fruit pairs (apple-banana, banana-orange). This requires accessing the current item plus the item at the next index position. We'll store results in a dedicated list and use pretty printing for clean output:

import pprint as pp
smoothies = []

The solution involves conditional logic to prevent index overflow when reaching the list's end:

for i, fruit in enumerate(fruits):
    if i < len(fruits) - 1:
        smoothies.append(f"{fruit}-{fruits[i + 1]}")

This boundary checking prevents the common "list index out of range" error that occurs when trying to access elements beyond the list's scope.
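For comparison, here is a runnable version of the enumerate approach next to an alternative that pairs neighbors with zip(). The zip() variant is not part of the lesson, just another common idiom, and the fruits list is assumed sample data.

```python
from pprint import pprint

# Assumed sample data; the lesson's actual fruits list is not shown.
fruits = ["apple", "banana", "orange", "mango"]

# Approach from the lesson: enumerate() plus a boundary check.
smoothies = []
for i, fruit in enumerate(fruits):
    if i < len(fruits) - 1:  # the last item has no next neighbor
        smoothies.append(f"{fruit}-{fruits[i + 1]}")

# Alternative: zip() pairs each item with the one after it and stops
# as soon as the shorter sequence (fruits[1:]) is exhausted.
smoothies_zip = [f"{a}-{b}" for a, b in zip(fruits, fruits[1:])]

pprint(smoothies)
```

Both versions produce the same pairs. The enumerate form is worth practicing here because the chart-labeling loop later in the lesson relies on the index in the same way.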

Now, let's apply these concepts to create a professional bar chart. We'll use DodgerBlue for consistency—different colors within the same data category can confuse viewers and violate data visualization best practices:

plt.barh(edu_list, counts_list, color='DodgerBlue')

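To improve visual hierarchy, reverse the lists so the largest values appear at the top. plt.barh draws the first list item at the bottom of the chart, so this assumes the lists start with the largest value, and the reversal only affects the chart if it runs before the plt.barh call above: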

edu_list.reverse()
counts_list.reverse()

This ordering follows the natural reading pattern and emphasizes your most significant findings.
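One detail about .reverse() is easy to trip over: it changes the list in place and returns None. The short sketch below illustrates this with assumed sample values (largest first) and shows slicing as a way to get a reversed copy instead.

```python
# Assumed sample data, ordered largest-first as a grouped and sorted result might be.
edu_list = ["high school", "some college", "bachelor's degree"]
counts_list = [3, 2, 1]

# .reverse() mutates the list in place and returns None,
# so avoid writing edu_list = edu_list.reverse().
result = edu_list.reverse()
counts_list.reverse()

# Slicing with [::-1] builds a reversed copy and leaves the original untouched.
restored_counts = counts_list[::-1]

print(edu_list)
print(counts_list)
```

Reversing both lists together keeps each label lined up with its count, which is what the bar chart depends on.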

Professional charts require clear labeling. We'll add count values directly on each bar using plt.text, which positions text at specific X-Y coordinates:

for i, count in enumerate(counts_list):
    plt.text(count + 5, i, str(count), va='center')

The count + 5 creates breathing room between bars and labels, while va='center' ensures vertical alignment with each bar's center point.


To accommodate these labels, we'll expand the chart's horizontal limits:

plt.xlim(0, 250)

This prevents label truncation and maintains professional appearance standards.
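The labeling loop and the axis limit can be tied to the data rather than to fixed numbers. The following is a minimal sketch under assumed sample counts (not the lesson's real figures); the 2% padding and 10% headroom are arbitrary illustrative choices.

```python
import matplotlib.pyplot as plt

# Assumed sample data, ordered smallest-first so the largest bar lands on top.
edu_list = ["master's degree", "bachelor's degree", "high school", "some college"]
counts_list = [59, 118, 196, 226]

plt.figure()
bars = plt.barh(edu_list, counts_list, color="DodgerBlue")

# Gap between the end of each bar and its label, in data units.
padding = max(counts_list) * 0.02

labels = []
for i, count in enumerate(counts_list):
    # x: just past the bar's end; y: the bar's position (0, 1, 2, ...).
    labels.append(plt.text(count + padding, i, str(count), va="center"))

# Extend the x-axis about 10% past the longest bar so its label fits.
plt.xlim(0, max(counts_list) * 1.10)
```

Deriving the padding and limit from max(counts_list) means the chart keeps its spacing if the counts change, whereas a hard-coded limit such as 250 needs to be revisited whenever the data does. Call plt.show() afterwards to display the figure.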

Complete your visualization with descriptive elements that orient your audience:

plt.title("Student Distribution by Educational Background")
plt.xlabel("Number of Students")
plt.yticks(color='#555')

The subtle color adjustment for y-axis labels reduces visual noise while maintaining readability. Professional visualizations balance information density with aesthetic appeal.
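The same color keyword works for the title, axis label, and tick labels, so one shared variable can keep them coordinated. This is a rough sketch with placeholder bars; text_color is a hypothetical variable name introduced for illustration.

```python
import matplotlib.pyplot as plt

plt.figure()
plt.barh(["B", "A"], [10, 20], color="DodgerBlue")  # placeholder data

# One muted gray for all text keeps the bars as the focal point.
text_color = "#555"  # three-digit shorthand for "#555555"

title = plt.title("Student Distribution by Educational Background", color=text_color)
xlabel = plt.xlabel("Number of Students", color=text_color)
plt.xticks(color=text_color)  # tick labels along the x-axis
plt.yticks(color=text_color)  # category labels along the y-axis
```

Storing the hex code in one place makes it easy to try a different shade across the whole chart with a single edit.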

Finally, plt.show() renders your completed chart. This command signals plot completion and prepares the system for additional visualizations if needed.

Advanced data analysis often requires multi-level sorting to break ties and establish clear rankings. Consider scenarios where students share identical average scores—how do you prioritize them? Secondary sorting provides the solution:

students_df.sort_values(by=['average', 'math_score'], ascending=[False, False]).head(40)

This approach first ranks by average score, then uses math scores as tiebreakers. The list passed to ascending sets the direction for each column in turn, so both sorts run in descending order, and .head(40) keeps the top 40 rows.

This multi-criteria sorting proves essential in competitive analysis, performance reviews, and any scenario where precise ranking matters. Understanding these techniques positions you to handle complex data relationships with confidence and analytical rigor.
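To see the tiebreaker in action, here is a small sketch with made-up scores. The names and values are hypothetical; the column names follow the sort_values call shown earlier.

```python
import pandas as pd

# Hypothetical scores; Ana and Ben are tied on average.
students_df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "math_score": [88, 95, 70, 90],
    "average": [90.0, 90.0, 85.0, 92.0],
})

# Sort by average (high to low); among equal averages, by math_score (high to low).
ranked_df = students_df.sort_values(
    by=["average", "math_score"], ascending=[False, False]
)

# Keep only the top rows, as .head(40) does in the lesson.
top_three = ranked_df.head(3)
print(top_three)
```

Ana and Ben share an average of 90.0, so the secondary column decides their order and Ben's higher math score places him first. When every column sorts in the same direction, a single ascending=False has the same effect as the list.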

Mastering these visualization fundamentals, from basic bar charts to multi-level sorting, establishes the foundation for advanced data science work. Each concept builds upon previous knowledge, creating a comprehensive toolkit for transforming raw data into actionable business insights.

Key Takeaways

1. Horizontal bar charts (plt.barh) provide better readability for long category labels compared to vertical bars
2. The enumerate() function unlocks both index and item access in loops, essential for positioning chart text elements
3. Chart data preparation requires extracting both numeric values and category labels from DataFrames
4. Professional charts need consistent styling including colors, titles, axis labels, and proper spacing
5. Use plt.text() with coordinate positioning to add data labels directly on charts for better readability
6. Multi-level sorting with secondary criteria resolves ties and provides clearer data rankings
7. Chart limits (plt.xlim) and text alignment (va='center') are crucial for professional presentation
8. Data visualization builds upon foundational skills in programming, NumPy, and Pandas for comprehensive analysis
