Linear Regression with PyTorch
In this tutorial, we will implement linear regression using PyTorch. Its dynamic computational graph and flexible API let us write both the model and the training loop as ordinary Python code.
As in the previous examples, we will use the California Housing dataset to predict house prices based on features such as median income, house age, and other demographic information.
Objective:
We aim to build and train a linear regression model in PyTorch, then evaluate and use it to predict new house prices.
Steps in This Practical Example:
- Load and Explore the Dataset: Prepare the dataset for training.
- Create the Model: Define the linear regression model using PyTorch.
- Train the Model: Train the model on the dataset.
- Evaluate the Model: Measure the performance of the model.
- Make Predictions: Use the trained model to predict house prices for new data.
Step 1: Load and Explore the Dataset
Let’s load the California Housing dataset, convert it into a PyTorch-compatible format, and prepare it for training.
import torch
import torch.nn as nn
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_california_housing
# Load California Housing dataset
california_housing = fetch_california_housing()
# Convert to Pandas DataFrame for easier exploration
X = pd.DataFrame(california_housing.data, columns=california_housing.feature_names)
y = pd.Series(california_housing.target, name='Price')
# Split data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Normalize the features for better performance
mean = X_train.mean()
std = X_train.std()
X_train_norm = (X_train - mean) / std
X_test_norm = (X_test - mean) / std
# Convert Pandas data to PyTorch tensors
X_train_tensor = torch.tensor(X_train_norm.values, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test_norm.values, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values.reshape(-1, 1), dtype=torch.float32)
y_test_tensor = torch.tensor(y_test.values.reshape(-1, 1), dtype=torch.float32)
Explanation:
- We standardize the features using the mean and standard deviation of the training set only, so the features are on a similar scale and no information from the test set leaks into training. This helps the model converge faster.
- The dataset is split into training and test sets, and we convert the Pandas data to PyTorch tensors for use in the PyTorch model.
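To make the standardization step concrete, here is a minimal sketch on a tiny made-up table (the column names and values are illustrative assumptions, not rows from the real dataset). It shows that statistics computed on the training split are reused unchanged for the test split.

```python
import pandas as pd
import torch

# Toy stand-in for the housing features (values are made up for illustration)
train = pd.DataFrame({"income": [2.0, 4.0, 6.0, 8.0], "age": [10.0, 20.0, 30.0, 40.0]})
test = pd.DataFrame({"income": [5.0], "age": [25.0]})

# Statistics come from the training split only
mean = train.mean()
std = train.std()  # pandas uses the sample standard deviation (ddof=1)

train_norm = (train - mean) / std
test_norm = (test - mean) / std  # reuse the training statistics on the test split

train_tensor = torch.tensor(train_norm.values, dtype=torch.float32)
test_tensor = torch.tensor(test_norm.values, dtype=torch.float32)

print(train_tensor.mean(dim=0))  # approximately zero per column
print(train_tensor.std(dim=0))   # approximately one per column
print(test_tensor)               # the test row sits exactly at the training mean
```

Because the single test row equals the training mean in both columns, it maps to zeros after standardization.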
Step 2: Create the PyTorch Linear Regression Model
Now, let's define our linear regression model in PyTorch. We'll use torch.nn.Module to create the model, which will consist of a single layer (linear transformation).
# Define the linear regression model using nn.Module
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(X_train_tensor.shape[1], 1)  # One output (house price)

    def forward(self, x):
        return self.linear(x)
# Initialize the model
model = LinearRegressionModel()
# Print model architecture
print(model)
Explanation:
- nn.Linear(in_features, out_features): A linear layer that applies a linear transformation to its input. Here in_features is the number of input features (8 for this dataset) and out_features is 1, the predicted house price.
- Model: This model consists of one layer, which represents the equation y = XW + b, where W is the weight matrix and b is the bias.
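As a quick check of the equation above, the following sketch compares the output of an nn.Linear layer with the same computation written out by hand. The input is random data rather than the housing features. Note that PyTorch stores the weight with shape (out_features, in_features), so a transpose appears in the manual version.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A layer with the same shape as the tutorial's model: 8 inputs, 1 output
layer = nn.Linear(in_features=8, out_features=1)
x = torch.randn(4, 8)  # a batch of 4 random rows

out_layer = layer(x)
out_manual = x @ layer.weight.T + layer.bias  # y = xW^T + b

print(layer.weight.shape)  # torch.Size([1, 8])
print(layer.bias.shape)    # torch.Size([1])
print(torch.allclose(out_layer, out_manual, atol=1e-6))
```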
Step 3: Define the Loss Function and Optimizer
We'll use Mean Squared Error (MSE) as our loss function and Stochastic Gradient Descent (SGD) as the optimizer, similar to our TensorFlow example.
# Define the loss function (Mean Squared Error)
criterion = nn.MSELoss()
# Define the optimizer (Stochastic Gradient Descent)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
Explanation:
- nn.MSELoss: Computes the Mean Squared Error between predicted values and actual values.
- torch.optim.SGD: Implements Stochastic Gradient Descent for updating the weights and bias during training.
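To see what these two objects compute, the sketch below works through a hand-checkable case. The numbers and the single parameter w are illustrative assumptions chosen so the arithmetic is easy to follow.

```python
import torch
import torch.nn as nn

# MSE is the mean of squared differences: (1^2 + (-2)^2) / 2 = 2.5
pred = torch.tensor([[2.0], [4.0]])
target = torch.tensor([[1.0], [6.0]])
mse_builtin = nn.MSELoss()(pred, target)
mse_manual = ((pred - target) ** 2).mean()

# One SGD step on a single parameter w with loss = (w - 3)^2
w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)
loss = ((w - 3.0) ** 2).sum()
loss.backward()              # d(loss)/dw = 2 * (1 - 3) = -4
grad = w.grad.clone()
optimizer.step()             # w <- w - lr * grad = 1 - 0.1 * (-4) = 1.4

print(mse_builtin.item(), mse_manual.item())
print(grad.item(), w.item())
```

Plain SGD without momentum applies exactly the update w ← w − lr · grad, which is why the parameter moves from 1.0 to 1.4 in one step.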
Step 4: Train the Model
We will now train the model over a fixed number of epochs. During each epoch, we'll perform a forward pass, compute the loss, perform backpropagation to calculate gradients, and update the weights.
# Set training parameters
epochs = 500
batch_size = 32
steps_per_epoch = len(X_train_tensor) // batch_size
# Training loop
for epoch in range(epochs):
    # Shuffle the training data
    permutation = torch.randperm(X_train_tensor.size(0))

    for i in range(steps_per_epoch):
        # Get a mini-batch of data
        indices = permutation[i * batch_size: (i + 1) * batch_size]
        X_batch = X_train_tensor[indices]
        y_batch = y_train_tensor[indices]

        # Zero the gradients (important to prevent accumulation of gradients)
        optimizer.zero_grad()

        # Forward pass: Compute predicted y
        y_pred = model(X_batch)

        # Compute loss
        loss = criterion(y_pred, y_batch)

        # Backpropagation to compute gradients
        loss.backward()

        # Update weights
        optimizer.step()

    # Print the loss of the last mini-batch every 50 epochs
    if epoch % 50 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item():.4f}")
Explanation:
- Shuffling: At the beginning of each epoch, the data is shuffled so that the mini-batches differ from epoch to epoch and the updates do not depend on the original row order.
- Mini-batch Gradient Descent: We use a batch size of 32 to perform updates on smaller subsets of data.
- Zero Gradients: The gradients from the previous batch are cleared before computing new ones to prevent accumulation.
- Backward Pass: loss.backward() calculates the gradients using the chain rule; optimizer.step() then uses those gradients to update the model parameters.
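The manual shuffling and slicing above can also be delegated to PyTorch's DataLoader. The following is a minimal sketch of that alternative on synthetic data (the target function y = 2·x1 − 3·x2 + 1, the learning rate, and the epoch count are assumptions made for illustration, not the tutorial's settings). It also tracks the average loss per epoch rather than the loss of the last batch.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic data: y = 2*x1 - 3*x2 + 1 plus a little noise
X = torch.randn(256, 2)
y = X @ torch.tensor([[2.0], [-3.0]]) + 1.0 + 0.01 * torch.randn(256, 1)

model = nn.Linear(2, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# shuffle=True reshuffles the data at the start of every epoch
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

epoch_losses = []
for epoch in range(20):
    running = 0.0
    for X_batch, y_batch in loader:
        optimizer.zero_grad()                      # clear gradients from the previous batch
        loss = criterion(model(X_batch), y_batch)  # forward pass and loss
        loss.backward()                            # compute gradients
        optimizer.step()                           # update weights and bias
        running += loss.item() * X_batch.size(0)
    epoch_losses.append(running / len(loader.dataset))  # average loss over the epoch

print(f"first epoch loss: {epoch_losses[0]:.4f}")
print(f"last epoch loss:  {epoch_losses[-1]:.4f}")
print(model.weight.data, model.bias.data)
```

One difference worth noting: by default DataLoader keeps the final, smaller batch, whereas the integer division used for steps_per_epoch in the loop above leaves out any leftover samples in each epoch.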
Step 5: Evaluate the Model
Once the model is trained, we can evaluate its performance using the test set. We will calculate the R² score and Root Mean Squared Error (RMSE).
from sklearn.metrics import r2_score
# Put model in evaluation mode
model.eval()
# Predict house prices on the test set
with torch.no_grad():  # No need to compute gradients during evaluation
    y_pred_tensor = model(X_test_tensor)
# Convert predictions to NumPy for metric calculations
y_pred = y_pred_tensor.numpy()
# Calculate R² score
r2 = r2_score(y_test, y_pred)
print(f"R² score: {r2}")
# Calculate RMSE
rmse = torch.sqrt(criterion(y_pred_tensor, y_test_tensor)).item()
print(f"Root Mean Squared Error (RMSE): {rmse}")
Explanation:
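- model.eval(): Switches the model to evaluation mode, which changes the behavior of layers such as dropout and batch normalization (if used). It does not disable gradient tracking, and this model has no such layers, so the call is included here as good practice.
- torch.no_grad(): Disables gradient tracking during prediction, which reduces memory use and speeds up evaluation.
- R² and RMSE: Standard evaluation metrics. R² is the proportion of variance in the target explained by the model, and RMSE is the typical prediction error in the units of the target.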
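For reference, both metrics can be written directly from their definitions. The sketch below uses four made-up target and prediction values so the arithmetic can be verified by hand; it is an illustration of the formulas, not output from the trained model.

```python
import torch

# Made-up targets and predictions for a hand-checkable example
y_true = torch.tensor([[3.0], [5.0], [7.0], [9.0]])
y_hat = torch.tensor([[2.5], [5.5], [7.0], [8.0]])

# RMSE: square root of the mean squared residual
rmse = torch.sqrt(torch.mean((y_true - y_hat) ** 2)).item()

# R^2 = 1 - SS_res / SS_tot
ss_res = torch.sum((y_true - y_hat) ** 2)          # 0.25 + 0.25 + 0 + 1 = 1.5
ss_tot = torch.sum((y_true - y_true.mean()) ** 2)  # 9 + 1 + 1 + 9 = 20
r2_manual = (1 - ss_res / ss_tot).item()           # 1 - 1.5 / 20 = 0.925

print(f"RMSE: {rmse:.4f}")
print(f"R^2:  {r2_manual:.4f}")
```

For a single target column, this definition of R² is the same one that sklearn's r2_score uses, so the two should agree on the same inputs.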
Step 6: Make Predictions on New Data
We can now use our trained model to predict house prices on new data.
# Example of new data for prediction
new_data = pd.DataFrame({
    'MedInc': [8.3252],
    'HouseAge': [41.0],
    'AveRooms': [6.9841],
    'AveBedrms': [1.0238],
    'Population': [322.0],
    'AveOccup': [2.5556],
    'Latitude': [37.88],
    'Longitude': [-122.23]
})
# Normalize new data
new_data_norm = (new_data - mean) / std
# Convert to tensor and make prediction
new_data_tensor = torch.tensor(new_data_norm.values, dtype=torch.float32)
with torch.no_grad():  # Gradients are not needed for inference
    predicted_price = model(new_data_tensor).item()
print(f"Predicted House Price: {predicted_price:.2f} (in units of $100,000)")
This allows us to predict house prices for custom input data.
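When predicting for several rows at once, it can help to wrap standardization and inference in one function. The sketch below assumes a hypothetical helper named predict_prices, which is not part of the tutorial or of PyTorch. It is demonstrated with a small stand-in model whose weights are set by hand, so the result can be checked without training.

```python
import pandas as pd
import torch
import torch.nn as nn

def predict_prices(model, new_rows, mean, std):
    """Hypothetical helper: standardize rows with training statistics, then predict."""
    features = (new_rows[mean.index] - mean) / std  # enforce the training column order
    inputs = torch.tensor(features.values, dtype=torch.float32)
    model.eval()
    with torch.no_grad():
        return model(inputs).squeeze(1).tolist()

# Stand-in model with hand-set parameters: price = 1.0*MedInc + 0.5*HouseAge + 2.0
model = nn.Linear(2, 1)
with torch.no_grad():
    model.weight.copy_(torch.tensor([[1.0, 0.5]]))
    model.bias.fill_(2.0)

# Made-up training statistics for the two illustrative features
mean = pd.Series({"MedInc": 4.0, "HouseAge": 30.0})
std = pd.Series({"MedInc": 2.0, "HouseAge": 10.0})

# Columns are deliberately in a different order than the statistics
new_rows = pd.DataFrame({"HouseAge": [40.0, 20.0], "MedInc": [8.0, 4.0]})

prices = predict_prices(model, new_rows, mean, std)
print(prices)  # first row: 1.0*2 + 0.5*1 + 2 = 4.5; second row: 1.0*0 + 0.5*(-1) + 2 = 1.5
```

Selecting new_rows[mean.index] guards against a common mistake: passing columns in a different order than the one the model was trained on.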
Summary and Key Takeaways:
- We implemented linear regression in PyTorch by defining the model as an nn.Module with a single nn.Linear layer.
- We trained the model using mini-batch stochastic gradient descent and evaluated it using the R² score and RMSE.
- Finally, we made predictions for new data points using the trained model.
In the next section, we will compare PyTorch with TensorFlow to highlight their differences and strengths when implementing machine learning models.