Train-Test Split for Predictive Modeling in Python
Train/Test Split Workflow
Import
from sklearn.model_selection import train_test_split
Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2), which holds out 20% of the rows for testing.
Train Model
Call model.fit(X_train, y_train) on the training set only.
Evaluate
score = model.score(X_test, y_test). Evaluate on the held-out test set, never on the training data.
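The four steps above can be sketched end to end as follows. This is a minimal illustration, not the lesson's own code: the synthetic data, the choice of LinearRegression, and the random_state values are all assumptions made so the example runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 100 rows, 3 features,
# and a target that depends linearly on them plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([50.0, 20.0, -10.0]) + 300.0 + rng.normal(scale=5.0, size=100)

# Hold out 20% of the rows for testing. random_state is added here
# only to make the split reproducible; it is not in the lesson's call.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)

# Score on the held-out rows. For scikit-learn regressors,
# score() returns the R^2 coefficient of determination.
score = model.score(X_test, y_test)

print(X_train.shape, X_test.shape)  # (80, 3) (20, 3)
print(f"Test R^2: {score:.3f}")
```

With 100 rows and test_size=0.2, 80 rows go to training and 20 to testing. Scoring on X_test rather than X_train gives an estimate of how the model handles data it has not seen.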
This lesson is a preview from our Data Science & AI Certificate Online (includes software) and Python Certification Online (includes software & exam).
We've split our data into X, which holds our inputs (the features), and y, which is our target: price in thousands. Now that we've got those, we need to talk about training and testing.
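As a rough sketch of what that separation might look like, assume the data is a NumPy array whose last column is the price in thousands. The column layout and every value below are made up for illustration; the lesson's actual dataset is not shown here.

```python
import numpy as np

# Hypothetical table (all values invented for illustration):
# columns are [rooms, area_sqft, age_years, price_thousands].
data = np.array([
    [3, 1400, 20, 310.0],
    [4, 2000, 5, 455.0],
    [2, 900, 35, 198.0],
    [5, 2600, 2, 610.0],
    [3, 1250, 15, 289.0],
])

X = data[:, :-1]  # every column except the last: the features
y = data[:, -1]   # the last column: price in thousands, the target

print(X.shape)  # (5, 3)
print(y.shape)  # (5,)
```

X is kept two-dimensional (rows by features) and y one-dimensional (one target per row), which is the shape scikit-learn estimators generally expect.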