
Logistic Regression with PyTorch

In this section, we will implement Logistic Regression using PyTorch. As in the previous examples, we will use the Pima Indians Diabetes dataset to predict whether a person has diabetes.


Steps Covered:

  1. Loading and preparing the dataset.
  2. Building the logistic regression model.
  3. Defining the loss function and optimizer.
  4. Training the model with mini-batch gradient descent.
  5. Evaluating the model’s performance.
  6. Making predictions on new data.

1. Load and Prepare the Dataset

We begin by loading the Pima Indians Diabetes dataset using pandas, then converting the data into PyTorch tensors.

# Import required libraries
import torch
import torch.nn as nn
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
                'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

data = pd.read_csv(url, names=column_names)

# Split into features (X) and target (y)
X = data.drop('Outcome', axis=1)
y = data['Outcome']

# Split the data into 80% training and 20% testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features for better training performance
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Convert to PyTorch tensors
X_train_tensor = torch.tensor(X_train_scaled, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test_scaled, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)  # shape (n, 1)
y_test_tensor = torch.tensor(y_test.values, dtype=torch.float32).unsqueeze(1)    # shape (n, 1)

Explanation:

  • We load the dataset and split it into features (X) and target (y).
  • Standardization: the features are rescaled to zero mean and unit variance, which helps gradient descent converge reliably.
  • The data is then converted into PyTorch tensors; the targets are reshaped with unsqueeze(1) into (n, 1) column vectors to match the model’s output, as the quick check below confirms.
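
As a quick sanity check, it helps to confirm the tensor shapes and the class balance before building the model; the sizes in the comments assume the 768-row dataset with the 80/20 split above.

# Sanity check: tensor shapes and training-set class balance
print(X_train_tensor.shape, y_train_tensor.shape)  # torch.Size([614, 8]) torch.Size([614, 1])
print(X_test_tensor.shape, y_test_tensor.shape)    # torch.Size([154, 8]) torch.Size([154, 1])
print(f"Positive rate (train): {y_train_tensor.mean().item():.2f}")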

2. Build the Logistic Regression Model

We’ll define the logistic regression model in PyTorch using nn.Module, which consists of a linear transformation followed by the sigmoid activation function.

# Define the logistic regression model using nn.Module
class LogisticRegressionModel(nn.Module):
    def __init__(self):
        super(LogisticRegressionModel, self).__init__()
        self.linear = nn.Linear(X_train_tensor.shape[1], 1)  # one output unit

    def forward(self, x):
        return torch.sigmoid(self.linear(x))

# Initialize the model
model = LogisticRegressionModel()

# Print model architecture
print(model)

Explanation:

  • The LogisticRegressionModel class inherits from nn.Module; it defines a single linear layer and applies the sigmoid activation function to squash the output into a probability between 0 and 1.
  • We initialize the model; the input size is taken from the number of feature columns in the training tensor, so the model adapts to the dataset automatically. (An equivalent, more compact definition is sketched below.)
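
For comparison, the same architecture can be written more compactly with nn.Sequential. This sketch is functionally equivalent to the class above and is shown only as an alternative style:

# Equivalent definition: a linear layer followed by a sigmoid
model_alt = nn.Sequential(
    nn.Linear(X_train_tensor.shape[1], 1),
    nn.Sigmoid()
)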

3. Define the Loss Function and Optimizer

For logistic regression, we use binary cross-entropy as the loss function and Stochastic Gradient Descent (SGD) as the optimizer.

# Define the binary cross-entropy loss function
criterion = nn.BCELoss()

# Define the optimizer (Stochastic Gradient Descent)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

Explanation:

  • BCELoss (Binary Cross-Entropy Loss) measures the difference between the predicted probabilities and the actual labels.
  • SGD is used to minimize the loss function by updating the model parameters during training.
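
To make the loss concrete, the following minimal sketch reproduces nn.BCELoss by hand on a toy batch; binary cross-entropy averages -(y·log(p) + (1-y)·log(1-p)) over the samples. In practice, nn.BCEWithLogitsLoss, which fuses the sigmoid into the loss for better numerical stability, is often preferred (it would require removing the sigmoid from forward()).

# Verify BCELoss against the hand-computed formula on a toy batch
p = torch.tensor([[0.9], [0.2], [0.6]])  # predicted probabilities
t = torch.tensor([[1.0], [0.0], [1.0]])  # true labels
manual = -(t * torch.log(p) + (1 - t) * torch.log(1 - p)).mean()
print(manual.item(), criterion(p, t).item())  # both ≈ 0.2798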

4. Train the Model

We now train the logistic regression model using gradient descent for a fixed number of epochs. During each epoch, we compute the loss, backpropagate the gradients, and update the model’s weights.

# Set the number of epochs and batch size
epochs = 1000
batch_size = 32
n_batches = X_train_tensor.shape[0] // batch_size  # note: any final partial batch is dropped

# Training loop
for epoch in range(epochs):
    for i in range(n_batches):
        # Get the current batch of data
        start = i * batch_size
        end = start + batch_size
        X_batch = X_train_tensor[start:end]
        y_batch = y_train_tensor[start:end]

        # Zero the gradients accumulated from the previous step
        optimizer.zero_grad()

        # Forward pass: compute predicted y by passing X_batch to the model
        y_pred = model(X_batch)

        # Compute the loss
        loss = criterion(y_pred, y_batch)

        # Backward pass: compute gradients of the loss with respect to the model parameters
        loss.backward()

        # Update the weights
        optimizer.step()

    # Print the last batch's loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

Explanation:

  • Mini-batch Gradient Descent: The training loop processes small batches of data at a time to update the model parameters.
  • Backward Propagation: We calculate the gradients of the loss with respect to the model parameters using loss.backward().
  • Optimizer Step: The optimizer updates the parameters using the computed gradients.
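
Note that the loop above visits the batches in a fixed order and silently drops any final partial batch. A common alternative, sketched below under the same variable names, uses TensorDataset and DataLoader to shuffle the data each epoch:

from torch.utils.data import TensorDataset, DataLoader

# Shuffled mini-batches via DataLoader (alternative to manual slicing)
train_loader = DataLoader(TensorDataset(X_train_tensor, y_train_tensor),
                          batch_size=32, shuffle=True)

for epoch in range(epochs):
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(X_batch), y_batch)
        loss.backward()
        optimizer.step()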

5. Evaluate the Model

After training, we evaluate the model on the test set by calculating the accuracy.

from sklearn.metrics import accuracy_score

# Put the model in evaluation mode
model.eval()

# Predict probabilities on the test set
with torch.no_grad():  # no need to compute gradients during evaluation
    y_pred_probs = model(X_test_tensor)
    y_pred = (y_pred_probs >= 0.5).float()  # convert probabilities to binary outputs

# Convert tensors to numpy arrays
y_pred_np = y_pred.numpy()
y_test_np = y_test_tensor.numpy()

# Calculate accuracy
accuracy = accuracy_score(y_test_np, y_pred_np)
print(f"Test Accuracy: {accuracy * 100:.2f}%")

Explanation:

  • Evaluation Mode: We switch the model to evaluation mode with model.eval(). It has no effect on this simple model, but it is good practice because it disables training-specific behavior such as dropout and batch-normalization updates.
  • We predict the probabilities on the test set and then convert these probabilities to binary outputs by applying a 0.5 threshold.
  • The accuracy is calculated using the accuracy_score function from scikit-learn.
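
Because roughly 35% of the samples in this dataset are positive, accuracy alone can hide poor performance on the minority class. A quick sketch using scikit-learn's confusion_matrix and classification_report gives a fuller picture:

from sklearn.metrics import confusion_matrix, classification_report

# Break the accuracy down into per-class behavior
print(confusion_matrix(y_test_np, y_pred_np))
print(classification_report(y_test_np, y_pred_np,
                            target_names=['Not Diabetic', 'Diabetic']))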

6. Make Predictions on New Data

We use the trained model to make predictions on new, unseen data.

# Example of new data for prediction (raw, unscaled input values)
new_data = pd.DataFrame([[6, 148, 72, 35, 0, 33.6, 0.627, 50]],
                        columns=column_names[:-1])  # same feature columns as training
new_data_scaled = scaler.transform(new_data)  # apply the scaling fitted on the training data

# Convert to tensor
new_data_tensor = torch.tensor(new_data_scaled, dtype=torch.float32)

# Make prediction
with torch.no_grad():
    predicted_proba = model(new_data_tensor)
    predicted_class = (predicted_proba >= 0.5).float()

print(f"Predicted Class: {int(predicted_class.item())}")
print(f"Probability of being Diabetic: {predicted_proba.item() * 100:.2f}%")

Explanation:

  • New Data: We provide a new example input and predict the probability of being diabetic using the trained model.
  • We apply a threshold of 0.5 to classify the prediction as either 0 (not diabetic) or 1 (diabetic).
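
To reuse the trained model later without retraining, its weights can be saved with state_dict; the filename below is just an illustrative example.

# Save the trained weights (filename is arbitrary)
torch.save(model.state_dict(), 'logreg_pima.pt')

# Later: rebuild the architecture and load the weights back
restored = LogisticRegressionModel()
restored.load_state_dict(torch.load('logreg_pima.pt'))
restored.eval()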

Summary

In this section, we successfully implemented Logistic Regression using PyTorch by following these steps:

  • Loading and preparing the dataset: We standardized the features and converted the data into PyTorch tensors.
  • Building the model: We defined a logistic regression model using a linear layer followed by a sigmoid activation function.
  • Training the model: We trained the model using mini-batch gradient descent and monitored the training loss.
  • Evaluating the model: We calculated the accuracy on the test set to evaluate the model’s performance.
  • Making predictions: We used the trained model to predict whether a new input is diabetic.

This example demonstrates how to implement logistic regression in PyTorch for binary classification tasks.