Support Vector Machines with TensorFlow

In this article, we will walk through a practical example of implementing Support Vector Machines (SVM) using TensorFlow. While TensorFlow does not have a native SVM implementation, we can create an SVM using TensorFlow’s low-level API and train it through gradient-based optimization.


Steps Covered:

  1. Loading and preparing the dataset.
  2. Building a custom SVM model in TensorFlow.
  3. Defining the hinge loss and optimizer.
  4. Training the model using gradient descent.
  5. Evaluating the model's performance.
  6. Making predictions on new data.

1. Load and Prepare the Dataset

We'll use the Iris dataset, similar to the previous scikit-learn example. To keep things simple, we will focus on a binary classification task (classifying the Setosa and Versicolor species), remapping the labels to -1 and +1 because that is what the hinge loss expects.

# Import required libraries
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset
iris = datasets.load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

# Select only two classes for binary classification (Setosa and Versicolor)
data = data[data['target'] != 2]

# Split the data into features and target.
# The hinge loss expects labels in {-1, +1}, so remap class 0 to -1
X = data.drop('target', axis=1)
y = np.where(data['target'] == 0, -1, 1).astype(np.float32)

# Split the data into training and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (SVM is sensitive to feature scaling)
# and cast to float32 to match the model's weights
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train).astype(np.float32)
X_test_scaled = scaler.transform(X_test).astype(np.float32)

Explanation:

  • We loaded the Iris dataset and reduced it to a binary classification problem by keeping only two classes, Setosa and Versicolor, and remapped the labels to -1 and +1 for the hinge loss.
  • We split the dataset into training and testing sets and standardized the features using StandardScaler, since SVMs are sensitive to feature scales (a quick sanity check follows).
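Before moving on, an optional sanity check confirms the preprocessing did what we expect (this snippet is our own addition, not required for training):

# Optional sanity check: standardized features, labels in {-1, +1}
print(X_train_scaled.mean(axis=0).round(2))  # approximately all zeros
print(X_train_scaled.std(axis=0).round(2))   # approximately all ones
print(np.unique(y_train))                    # [-1.  1.]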

2. Build the Custom SVM Model in TensorFlow

Now, we will build a custom SVM model using TensorFlow. SVM optimizes for the following objective:

\min_{w, b} \frac{1}{2} \|w\|^2 + C \sum_i \max(0, 1 - y_i (w^T x_i + b))

This objective combines a hinge-loss term, which penalizes points that are misclassified or fall inside the margin, with an L2 regularization term that keeps the margin wide; the constant C controls the trade-off between the two.
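For intuition, consider single points: a point with label y_i = +1 and score w^T x_i + b = 2.3 lies safely beyond the margin and contributes \max(0, 1 - 2.3) = 0 to the sum, while a point with the same label but score 0.4 falls inside the margin and contributes \max(0, 1 - 0.4) = 0.6.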

# Create a custom SVM model using TensorFlow
class SVM(tf.keras.Model):
    def __init__(self):
        super(SVM, self).__init__()
        # Learnable weights and bias
        self.w = tf.Variable(tf.random.normal([X_train_scaled.shape[1], 1]), dtype=tf.float32)
        self.b = tf.Variable(tf.random.normal([1]), dtype=tf.float32)

    def call(self, inputs):
        # Linear combination: w^T x + b
        return tf.matmul(inputs, self.w) + self.b

# Initialize the SVM model
model = SVM()

Explanation:

  • We defined a custom SVM class in TensorFlow that has learnable weights w and a bias b.
  • The forward pass (call method) computes the linear combination of the inputs: w^T x + b.
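As a quick optional check (our own addition), the freshly initialized model should already return one raw decision score per input row:

# Optional shape check: one raw decision score per input row
sample_scores = model(X_train_scaled[:5])
print(sample_scores.shape)  # (5, 1)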

3. Define the Hinge Loss and Optimizer

We now define the hinge loss function, which is commonly used for SVM, and choose Stochastic Gradient Descent (SGD) as the optimizer to minimize this loss.

# Define the hinge loss function
def hinge_loss(y_true, y_pred):
    return tf.reduce_mean(tf.maximum(0., 1. - y_true * y_pred))

# Define the optimizer (Stochastic Gradient Descent)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

Explanation:

  • Hinge Loss: The hinge loss penalizes every point that is misclassified or falls inside the margin, pushing the optimizer toward a maximum-margin hyperplane.
  • SGD: We use the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.01 to update the weights and bias.
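As a quick sanity check of the loss function, two hand-picked points behave as expected (the values here are purely illustrative):

# Quick check: one point inside the margin (loss 0.5),
# one confidently wrong point (loss 3.0); mean = 1.75
demo_true = tf.constant([[1.], [-1.]])
demo_pred = tf.constant([[0.5], [2.0]])
print(hinge_loss(demo_true, demo_pred).numpy())  # 1.75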

4. Train the Model

We will now define the training loop to update the weights and bias of the SVM model based on the hinge loss. We will also include the L2 regularization term to control the size of the weights.

# Training loop
epochs = 100
C = 1.0  # Regularization parameter

for epoch in range(epochs):
    with tf.GradientTape() as tape:
        # Forward pass: compute raw decision scores
        y_pred = model(X_train_scaled)

        # Hinge loss (weighted by C) plus the L2 term (1/2) * ||w||^2
        loss_value = C * hinge_loss(y_train.reshape(-1, 1), y_pred) \
                     + 0.5 * tf.reduce_sum(model.w ** 2)

    # Backpropagation: compute gradients of the loss w.r.t. w and b
    gradients = tape.gradient(loss_value, [model.w, model.b])

    # Update weights and bias
    optimizer.apply_gradients(zip(gradients, [model.w, model.b]))

    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss_value.numpy():.4f}")

Explanation:

  • L2 Regularization: The \frac{1}{2} \|w\|^2 term keeps the weight vector small (equivalently, the margin wide), while C scales the hinge loss and so controls how heavily margin violations are penalized, matching the objective above.
  • Training Loop: The model is trained for 100 epochs, with the full training set used in each gradient step; the current loss is printed every 10 epochs. A mini-batch variant is sketched below.
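Note that the loop above computes gradients on the full training set each step, which is really batch gradient descent. If you want the stochastic updates the optimizer's name suggests, a minimal mini-batch variant might look like this (the batch size of 16 and the shuffle buffer are arbitrary choices, not from the original example):

# Sketch: mini-batch training with tf.data
dataset = tf.data.Dataset.from_tensor_slices(
    (X_train_scaled, y_train.reshape(-1, 1))).shuffle(100).batch(16)

for epoch in range(epochs):
    for X_batch, y_batch in dataset:
        with tf.GradientTape() as tape:
            batch_pred = model(X_batch)
            batch_loss = C * hinge_loss(y_batch, batch_pred) \
                         + 0.5 * tf.reduce_sum(model.w ** 2)
        gradients = tape.gradient(batch_loss, [model.w, model.b])
        optimizer.apply_gradients(zip(gradients, [model.w, model.b]))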

5. Evaluate the Model

After training, we can evaluate the model's performance on the test set.

# Make predictions on the test set
y_pred_test = model(X_test_scaled)

# Convert the raw scores to {-1, +1} class labels
y_pred_labels = tf.where(y_pred_test >= 0, 1, -1)

# Calculate accuracy
accuracy = np.mean(y_pred_labels.numpy().reshape(-1) == y_test)
print(f"Test Accuracy: {accuracy * 100:.2f}%")

Explanation:

  • Predictions: The model outputs raw decision scores, so we threshold them at 0: non-negative scores map to +1 (Versicolor) and negative scores to -1 (Setosa).
  • Accuracy: We calculate accuracy by comparing the predicted labels with the true test labels; an optional, more detailed breakdown follows.
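Accuracy alone can be misleading on imbalanced data. As an optional extra, scikit-learn's metrics work directly on the label arrays (this uses sklearn.metrics, which was not imported above):

# Optional: confusion matrix and per-class metrics
from sklearn.metrics import confusion_matrix, classification_report

y_hat = y_pred_labels.numpy().reshape(-1)
print(confusion_matrix(y_test, y_hat))
print(classification_report(y_test, y_hat))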

6. Make Predictions on New Data

Finally, we will use the trained model to make predictions on new data.

# Example of new data (raw feature values, in the same units as the training set)
new_data = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=X.columns)

# Scale with the scaler fitted on the training data; cast to float32 for the model
new_data_scaled = scaler.transform(new_data).astype(np.float32)

# Make a prediction
y_new_pred = model(new_data_scaled)
predicted_class = tf.where(y_new_pred >= 0, 1, -1)

print(f"Predicted Class: {int(predicted_class.numpy()[0][0])}")

Explanation:

  • New Data: We scaled the new input with the scaler fitted on the training data and passed it to the trained model, which predicts class -1 (Setosa) or +1 (Versicolor).
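Because we remapped the labels earlier, -1 corresponds to Setosa and +1 to Versicolor. A small convenience mapping makes the output readable (the label_names dictionary is our own helper, not part of the dataset):

# Optional: translate the {-1, +1} prediction into a species name
label_names = {-1: 'setosa', 1: 'versicolor'}
print(f"Predicted Species: {label_names[int(predicted_class.numpy()[0][0])]}")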

Summary

In this article, we implemented a linear Support Vector Machine (SVM) using TensorFlow. We:

  • Built a custom SVM model using TensorFlow.
  • Defined the hinge loss and added L2 regularization to control model complexity.
  • Trained the model with a custom gradient-descent training loop driven by the SGD optimizer.
  • Evaluated the model's performance and made predictions on new data.

This example demonstrates how to implement an SVM model from scratch in TensorFlow using custom training loops. In the next section, we will explore how to implement SVM using PyTorch.