Colin Jaffe/3 min read

Model Confusion: Insights from the Confusion Matrix

Confusion Matrix Quadrants

True Positive (TP)

Predicted positive, actually positive — correct.

False Positive (FP)

Predicted positive, actually negative — Type I error.

False Negative (FN)

Predicted negative, actually positive — Type II error.

True Negative (TN)

Predicted negative, actually negative — correct.
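To make the four quadrants concrete, here is a minimal sketch that counts them by hand. The label lists are invented toy data, and treating 1 as the positive class is an assumption made only for this illustration.

```python
# Toy labels, invented for illustration: 1 = positive class, 0 = negative class.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 0, 0]

pairs = list(zip(actual, predicted))

# Each quadrant is just a count of (actual, predicted) combinations.
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # predicted positive, actually positive
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # predicted positive, actually negative
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # predicted negative, actually positive
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # predicted negative, actually negative

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
```

Every prediction lands in exactly one quadrant, so the four counts always add up to the number of rows.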

This lesson is a preview from our Data Science & AI Certificate Online (includes software) and Python Certification Online (includes software & exam). Enroll in a course for detailed lessons, live instructor support, and project-based training.

Evaluate the model's confusion matrix to identify prediction errors and class imbalances. Watch this tutorial to learn the key concepts and techniques.

Let's take a look at what kind of categories we got right or wrong. We're going to make a confusion matrix, which is one of my favorite names. What a good name.

So a confusion matrix shows where we got confused, essentially. Where our model got confused, rather. We're not confused—we've got this.

But our model got pretty confused. Despite this 77% accuracy, we're going to see that there was a specific category that really confused it. So let's make that confusion matrix.

We're going to call it cm, which is a standard name for a confusion matrix. We build it with confusion_matrix, a function provided by sklearn.metrics. We pass it our actual correct answers from the test data and our predictions.

Now we make a DataFrame from that matrix so we can take a look at it. cm_df is a new pandas DataFrame, and we'll pass it the confusion matrix as the data.

We'll name the columns "Predicted Stayed" and "Predicted Left." And then we'll also pass it the row names, the index: "Actually Stayed" and "Actually Left."
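The steps described so far might look something like the sketch below. The lists y_test and predictions are tiny invented stand-ins for the lesson's real test labels and model output, and the variable names are assumptions rather than the lesson's exact code. It also assumes 0 means stayed and 1 means left, so the rows and columns come out in the order the labels expect.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Invented toy data (not the lesson's dataset): 0 = stayed, 1 = left.
y_test      = [0, 0, 0, 1, 1, 0, 1, 0, 1, 0]
predictions = [0, 0, 1, 1, 0, 0, 0, 0, 1, 0]

# Rows are the actual classes, columns are the predicted classes.
cm = confusion_matrix(y_test, predictions)

# Wrap the raw array in a DataFrame so each cell has a readable label.
cm_df = pd.DataFrame(
    cm,
    columns=["Predicted Stayed", "Predicted Left"],
    index=["Actually Stayed", "Actually Left"],
)

print(cm_df)
```

With sorted 0/1 labels, the upper-left cell counts people predicted to stay who stayed, and the lower-right cell counts people predicted to leave who left.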

Now let's take a look at that DataFrame. All right. We got quite a lot right, but also quite a lot wrong.

Let's take a look. Now, we predicted that 8,538 stayed and they did. That's great.

We predicted that 734 left and they actually left. These upper-left and lower-right entries are where we got things right: predicted stayed, they stayed; predicted left, they left. That's pretty good.

If we look at our "Predicted Stayed" column, out of the 10,000 or so that we predicted stayed, we got about 80% right, maybe a little more.

But when we predicted left, we were right only slightly more often than we were wrong. For the people we predicted left, we were about 55% to 60% correct. I'm not doing the exact math here.
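The column-by-column reading above can be sketched as a small helper. The matrix here is invented toy data, not the lesson's numbers, and share_correct_by_prediction is a hypothetical name used only for this illustration.

```python
# Toy confusion matrix (invented numbers, not the lesson's data).
# Rows are actual classes, columns are predicted classes.
labels = ["Stayed", "Left"]
cm = [
    [50, 10],  # actually stayed: predicted stayed, predicted left
    [30, 10],  # actually left:   predicted stayed, predicted left
]

def share_correct_by_prediction(matrix):
    """For each predicted class (column), return the fraction that was right.

    Assumes every column has at least one prediction; an empty column
    would cause a division by zero.
    """
    shares = []
    for col in range(len(matrix)):
        column_total = sum(row[col] for row in matrix)
        # The diagonal entry is the count where prediction and reality agree.
        shares.append(matrix[col][col] / column_total)
    return shares

shares = share_correct_by_prediction(cm)
for label, share in zip(labels, shares):
    print(f"Predicted {label}: {share:.1%} correct")
```

Dividing each diagonal entry by its column total answers the question "when we made this prediction, how often were we right?"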

The computer is going to do that math for us anyway. Now let's look at it differently and consider how many people actually left. Here's the "Actually Left" group: that's around 2,800 people.

Of the people who actually left, we only got about 25% right. That's very incorrect. We predicted they left, and they actually left: great.

But the other entry in that row is where we predicted they stayed, and they actually left. That's a huge number.

Of the people who actually left, we got about 2,100 wrong and only about 700 right. So that's pretty bad. We definitely have some errors here, and there's a way to measure that, which we'll look at next.
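As a quick check on that row-wise reading, here is the arithmetic using the counts quoted in the lesson. The 2,100 is the rounded figure from the text, so the result is approximate. The variable names are invented for this sketch. This share is the quantity commonly called recall for the "left" class.

```python
# Counts quoted in the lesson for the "Actually Left" row.
predicted_left_actually_left = 734        # we said left, and they left
predicted_stayed_actually_left = 2_100    # we said stayed, but they left (rounded in the lesson)

# Everyone who actually left, regardless of what we predicted.
actually_left = predicted_left_actually_left + predicted_stayed_actually_left

# Fraction of real leavers that the model caught.
share_of_leavers_caught = predicted_left_actually_left / actually_left

print(f"Actually left: about {actually_left:,}")
print(f"Share of leavers we caught: {share_of_leavers_caught:.0%}")
```

That lands at roughly a quarter, in line with the estimate above: the model misses about three out of every four people who actually leave.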