April 2, 2026 · Colin Jaffe · 4 min read

Prediction Accuracy: Analyzing Model Performance

Master Classification Metrics and Model Evaluation Techniques

Model Evaluation Context

We're analyzing classification model performance using the same evaluation framework applied to linear regression, but focusing on categorical predictions rather than continuous values.

Sample Prediction Accuracy Comparison

First 20 Predictions: 90% accuracy
Predictions 20-40: 75% accuracy
Overall Model Score: 77% accuracy
Small Sample Variance

The accuracy varied significantly between small samples (90% vs 75%), highlighting why we need to evaluate the full dataset of 3,000 predictions rather than relying on small subsets.

Classification Prediction Types

True Positive

Model predicted employee would leave (1) and they actually left (1). Correct prediction of the positive class.

True Negative

Model predicted employee would stay (0) and they actually stayed (0). Correct prediction of the negative class.

False Positive

Model predicted employee would leave (1) but they actually stayed (0). Incorrect positive prediction.

False Negative

Model predicted employee would stay (0) but they actually left (1). Missed positive case.

Prediction Error Types Impact

| | False Positive | False Negative |
| --- | --- | --- |
| Prediction | Predicted leave, but stayed | Predicted stay, but left |
| Business Impact | Unnecessary retention efforts | Lost valuable employees |
| Cost Type | Wasted resources | Replacement costs |

Recommended: Consider the relative costs of each error type when optimizing model thresholds.
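One way to act on that recommendation is to score each candidate threshold by a weighted error cost. This is a minimal sketch: the cost figures, probability scores, and labels below are illustrative assumptions, not values from this lesson.

```python
def expected_cost(actual, scores, threshold, fp_cost=1.0, fn_cost=5.0):
    """Total cost of errors at a given threshold (cost weights are illustrative)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    fp = sum(1 for a, p in zip(actual, preds) if a == 0 and p == 1)  # predicted leave, stayed
    fn = sum(1 for a, p in zip(actual, preds) if a == 1 and p == 0)  # predicted stay, left
    return fp * fp_cost + fn * fn_cost

# Toy true labels and predicted leave-probabilities
actual = [0, 0, 1, 1, 0, 1]
scores = [0.2, 0.6, 0.7, 0.4, 0.1, 0.9]

# Pick the candidate threshold with the lowest expected cost
best = min([0.3, 0.5, 0.7], key=lambda t: expected_cost(actual, scores, t))
print(best)  # 0.3 -- costly false negatives push the threshold down
```

Because a missed departure is weighted five times a wasted retention effort here, the search favors a lower threshold that flags more employees as at risk.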
Beyond Overall Accuracy

While 77% accuracy seems good, analyzing the specific types of errors reveals patterns in model performance that overall accuracy alone cannot capture.

This lesson is a preview from our Data Science & AI Certificate Online (includes software) and Python Certification Online (includes software & exam). Enroll in a course for detailed lessons, live instructor support, and project-based training.

Now let's apply the same rigorous evaluation methodology we used for our linear regression model to assess the performance of our classification algorithm. This comparative analysis will reveal how effectively our model distinguishes between employees who stay versus those who leave the organization.

First, we'll generate predictions on our test dataset using our trained model. I'll store these results as `predictions` for subsequent analysis. Given that we're working with approximately 3,000 test cases, we'll examine representative samples rather than overwhelming ourselves with the complete dataset output.
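A minimal sketch of this step, trained on synthetic stand-in data since the lesson's HR dataset isn't reproduced here; the names `model` and `X_test` are assumptions, while `predictions` matches the narration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the trained classifier and the ~3,000-row test set
rng = np.random.default_rng(42)
X_train = rng.normal(size=(200, 3))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
X_test = rng.normal(size=(30, 3))

model = LogisticRegression().fit(X_train, y_train)
predictions = model.predict(X_test)  # one 0/1 label per test row
print(len(predictions))  # 30
```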

Let's start by comparing the first 20 actual outcomes with our model's predictions. We'll convert both Y_test and our predictions to lists and examine these initial cases to get an immediate sense of our model's accuracy.
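The comparison can be sketched with plain Python lists. The label values below are illustrative stand-ins (the real `Y_test` and `predictions` hold roughly 3,000 values), arranged so that the first and third cases are the two misses described next.

```python
# Toy stand-in labels: 0 = stayed, 1 = left
actual    = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
predicted = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# Side-by-side view of the first 20 cases
for i, (a, p) in enumerate(zip(actual[:20], predicted[:20]), start=1):
    flag = "" if a == p else "  <-- miss"
    print(f"case {i:2d}: actual={a} predicted={p}{flag}")

correct = sum(a == p for a, p in zip(actual[:20], predicted[:20]))
print(f"sample accuracy: {correct}/20 = {correct / 20:.0%}")  # 18/20 = 90%
```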

The results reveal an interesting pattern. While the predictions aren't perfectly aligned with reality, our model demonstrates strong performance overall. The zeros represent employees who remained with the company, while ones indicate departures. In this first sample, we can identify specific misclassifications: the third employee actually left but our model predicted they would stay, and conversely, the first employee remained but we predicted departure.

This gives us two incorrect predictions out of 20 cases—a 90% accuracy rate for this sample. However, let's expand our analysis to ensure we're not drawing conclusions from a potentially favorable subset of predictions.

Examining predictions 20 through 40 reveals more challenging cases where our model's performance varies. In this second batch, we identified five incorrect predictions, several of them departures that went completely undetected by our algorithm.


This translates to five errors out of 20 predictions, yielding a 75% accuracy rate for this particular subset. The variance between these small samples underscores why we need comprehensive evaluation metrics rather than relying on limited anecdotal evidence from tiny subsets of our 3,000-case test dataset.

For a definitive assessment, let's calculate our model's overall accuracy score. Unlike regression metrics that measure proximity to target values, classification accuracy simply measures the percentage of correct binary predictions—a straightforward but crucial performance indicator.
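Accuracy reduces to a one-line calculation, sketched below with toy labels; for scikit-learn classifiers, `model.score(X_test, Y_test)` computes this same fraction of correct predictions.

```python
def accuracy(actual, predicted):
    """Share of predictions that exactly match the true 0/1 labels."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must be the same length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# 3 of 4 toy predictions match the true labels
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```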

Using our model's built-in scoring function with the complete test dataset and corresponding ground truth labels, we achieve an overall accuracy of 77%. This represents solid performance for employee retention prediction, a notoriously complex classification challenge involving numerous human factors and organizational variables.

While 77% accuracy provides a strong foundation, the real insights emerge when we analyze the specific types of errors our model makes. Understanding these patterns will help us identify potential biases and areas for improvement in future iterations.

Let's categorize our predictions using the standard classification framework. When our model correctly predicts an employee will stay (predicting 0 when the actual outcome is 0), we have a **true negative**. When we correctly predict departure (predicting 1 when the actual outcome is 1), that's a **true positive**. These represent our successful predictions.


However, our misclassifications fall into two distinct categories, each with different business implications. A **false negative** occurs when we predict an employee will stay (0) but they actually leave (1). This type of error means we failed to identify at-risk employees who subsequently departed—potentially missing opportunities for retention interventions.

Conversely, a **false positive** happens when we predict departure (1) but the employee actually stays (0). While less operationally disruptive than false negatives, these errors could lead to unnecessary retention efforts or misallocated resources.
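These four outcomes can be tallied with a small helper, sketched here on toy labels; scikit-learn's `confusion_matrix` in `sklearn.metrics` provides the same counts as a 2×2 array.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Tally the four outcome types for binary labels (1 = employee left)."""
    names = {(1, 1): "TP", (0, 0): "TN", (0, 1): "FP", (1, 0): "FN"}
    return Counter(names[(a, p)] for a, p in zip(actual, predicted))

counts = confusion_counts(actual=[0, 0, 1, 1, 0, 1],
                          predicted=[0, 1, 1, 0, 0, 1])
print(counts)  # TP=2, TN=2, FP=1, FN=1
```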

The distinction between these error types is crucial for HR strategy. False negatives represent missed opportunities to retain valuable talent, while false positives might result in over-investing in retention efforts for employees who weren't actually at risk. In the next section, we'll dive deeper into advanced evaluation metrics that illuminate these nuanced performance characteristics and guide our model optimization efforts.

Key Takeaways

1. Small sample accuracy can vary significantly (75% to 90%), emphasizing the importance of evaluating models on the complete test dataset
2. Overall model accuracy of 77% indicates reasonably good performance for employee retention prediction
3. Classification errors fall into four categories: true positives, true negatives, false positives, and false negatives
4. False negatives occur when the model predicts an employee will stay but they actually leave, missing important retention cases
5. False positives happen when the model predicts departure but the employee stays, potentially leading to unnecessary interventions
6. Accuracy score measures the percentage of correct predictions out of total predictions made
7. Error analysis reveals patterns in model performance that overall accuracy metrics cannot capture
8. Advanced evaluation methods beyond simple accuracy provide deeper insights into model strengths and weaknesses
