Support Vector Machines with Scikit-Learn

In this article, we will walk through a practical example of implementing Support Vector Machines (SVM) using scikit-learn. We will apply SVM for classification on a popular dataset, using different kernels, and evaluate the model’s performance.


Steps Covered:

  1. Loading and preparing the dataset.
  2. Training a linear SVM model.
  3. Training an SVM with a different kernel (RBF).
  4. Evaluating the model’s performance.
  5. Hyperparameter tuning with GridSearchCV.

1. Load and Prepare the Dataset

For this example, we will use the Iris dataset, a well-known dataset for classification tasks, which is available directly from scikit-learn.

# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset
iris = datasets.load_iris()

# Convert to pandas DataFrame for easier manipulation
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

# Display the first few rows of the dataset
print(data.head())

Dataset Information:

  • The Iris dataset consists of 150 samples in total, 50 from each of three species of Iris flowers.
  • The features include:
    • Sepal length and sepal width.
    • Petal length and petal width.
  • The target variable represents the species (0: Setosa, 1: Versicolor, 2: Virginica).
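
As a quick sanity check on the class balance described above, the per-class sample counts can be read directly from iris.target. This is only a minimal sketch; the variable name class_counts is illustrative and not part of the original walkthrough.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# Count how many samples carry each class label (0, 1, 2)
class_counts = np.bincount(iris.target)

for name, count in zip(iris.target_names, class_counts):
    print(f"{name}: {count} samples")

print(f"Total samples: {iris.data.shape[0]}")
```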

We will focus on classifying two of the species, so we will reduce the dataset to a binary classification problem.

# Select only two classes for binary classification (Setosa and Versicolor)
data = data[data['target'] != 2]

# Split the data into features and target
X = data.drop('target', axis=1)
y = data['target']

# Split the data into training and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (SVM is sensitive to feature scaling)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Explanation:

  • We reduced the dataset to a binary classification problem by selecting only the classes 0 (Setosa) and 1 (Versicolor).
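  • The data was split into training and testing sets, and the features were standardized using StandardScaler. The scaler is fit on the training set only and then applied to the test set, so no test-set statistics leak into training. Standardizing matters because SVMs rely on distances between samples, and features on larger scales would otherwise dominate.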
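
To make the fit/transform distinction concrete, here is a small sketch on made-up toy data (the arrays and the *_toy names are assumptions for illustration, not part of the Iris example). It shows that the training columns end up with zero mean and unit standard deviation, while the test row is scaled with the training statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (made up for illustration): two features on very different scales
X_train_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test_toy = np.array([[2.0, 40.0]])

scaler_toy = StandardScaler()
# fit_transform learns the per-column mean and std from the training data only
X_train_toy_scaled = scaler_toy.fit_transform(X_train_toy)
# transform reuses those training statistics on the test data
X_test_toy_scaled = scaler_toy.transform(X_test_toy)

print("Train means after scaling:", X_train_toy_scaled.mean(axis=0))
print("Train std devs after scaling:", X_train_toy_scaled.std(axis=0))
print("Scaled test row:", X_test_toy_scaled)
```

Note that the scaled test row is not forced to zero mean or unit variance; it is simply expressed in the units learned from the training set.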

2. Train a Linear SVM Model

Next, we will train an SVM model with a linear kernel using the SVC class from scikit-learn.

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Initialize the linear SVM model
model_linear = SVC(kernel='linear', C=1.0, random_state=42)

# Train the model on the training data
model_linear.fit(X_train_scaled, y_train)

# Make predictions on the test set
y_pred = model_linear.predict(X_test_scaled)

# Evaluate the model performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy (Linear Kernel): {accuracy * 100:.2f}%")

Explanation:

  • We initialized an SVM model with a linear kernel.
  • The regularization parameter C is set to 1.0. Smaller values favor a wider margin and tolerate more misclassified training points; larger values penalize misclassifications more heavily and yield a narrower margin.
  • We trained the model on the scaled training data and evaluated its accuracy on the test set.
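
One way to get a feel for the role of C is to refit the linear model with several values and compare the number of support vectors and the test accuracy. The sketch below rebuilds the same binary problem so that it runs on its own; the list of C values and the results dictionary are arbitrary choices made for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Rebuild the binary Setosa-vs-Versicolor problem so the block runs on its own
iris = datasets.load_iris()
mask = iris.target != 2
X, y = iris.data[mask], iris.target[mask]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_tr_s = scaler.fit_transform(X_tr)
X_te_s = scaler.transform(X_te)

# Candidate values of C (arbitrary choices for illustration)
results = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    clf = SVC(kernel='linear', C=C).fit(X_tr_s, y_tr)
    results[C] = {
        # n_support_ holds the number of support vectors per class
        'n_support_vectors': int(clf.n_support_.sum()),
        'test_accuracy': clf.score(X_te_s, y_te),
    }
    print(f"C={C}: {results[C]}")
```

A softer margin (small C) typically keeps more training points as support vectors, though the exact counts depend on the data and the split.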

3. Train an SVM Model with an RBF Kernel

Next, we will use an RBF kernel to capture potential non-linear relationships in the data.

# Initialize the SVM model with an RBF kernel
model_rbf = SVC(kernel='rbf', C=1.0, gamma='scale', random_state=42)

# Train the model on the training data
model_rbf.fit(X_train_scaled, y_train)

# Make predictions on the test set
y_pred_rbf = model_rbf.predict(X_test_scaled)

# Evaluate the model performance
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)
print(f"Test Accuracy (RBF Kernel): {accuracy_rbf * 100:.2f}%")

Explanation:

  • The RBF kernel is a commonly used kernel for non-linear data.
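  • The parameter γ (gamma) controls how far the influence of a single training example reaches: low values mean a far reach, high values a close one. Setting it to 'scale' makes scikit-learn compute it as 1 / (n_features * X.var()), so it depends on both the number of features and the variance of the training data.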
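
The value behind gamma='scale' can be reproduced by hand. The following sketch computes 1 / (n_features * X.var()) and checks that a model given this explicit value behaves like one using 'scale'. For brevity it scales the whole two-class subset without a train/test split, which is an assumption of this illustration rather than a recommended workflow.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
mask = iris.target != 2
X = StandardScaler().fit_transform(iris.data[mask])
y = iris.target[mask]

# Manual computation of the value that gamma='scale' resolves to
gamma_manual = 1.0 / (X.shape[1] * X.var())

svc_scale = SVC(kernel='rbf', C=1.0, gamma='scale').fit(X, y)
svc_manual = SVC(kernel='rbf', C=1.0, gamma=gamma_manual).fit(X, y)

# If both models use the same gamma, their decision functions should agree
same = np.allclose(svc_scale.decision_function(X), svc_manual.decision_function(X))
print(f"gamma computed by hand: {gamma_manual:.4f}")
print(f"Decision functions match: {same}")
```

Because the features are standardized, X.var() is 1 here, so 'scale' reduces to 1 / n_features.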

4. Model Evaluation

In addition to accuracy, we can use other metrics like precision, recall, and the confusion matrix to evaluate the model's performance.

from sklearn.metrics import classification_report, confusion_matrix

# Generate the classification report for the RBF kernel model
print("Classification Report (RBF Kernel):")
print(classification_report(y_test, y_pred_rbf))

# Generate the confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred_rbf)
print("Confusion Matrix (RBF Kernel):")
print(conf_matrix)

Explanation:

  • The classification report provides metrics like precision, recall, F1-score, and support for each class.
  • The confusion matrix helps in understanding the number of true positives, true negatives, false positives, and false negatives.
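
For a binary problem, the four counts can be unpacked from the confusion matrix and used to compute precision and recall by hand. The labels below are made up for illustration and are not the Iris predictions; the *_demo names are hypothetical.

```python
from sklearn.metrics import confusion_matrix

# Hand-made labels for illustration (not taken from the Iris results)
y_true_demo = [0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 1, 1, 1, 0, 0]

# Rows are true classes and columns are predicted classes, so for labels
# {0, 1} ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found

print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")
print(f"Precision={precision:.2f}, Recall={recall:.2f}")
```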

5. Hyperparameter Tuning with GridSearchCV

To find the best values for the parameters C and γ (gamma), we can use GridSearchCV to perform an exhaustive search over a parameter grid.

from sklearn.model_selection import GridSearchCV

# Define the parameter grid for C and gamma
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 0.001, 0.01, 0.1, 1],
    'kernel': ['rbf']
}

# Initialize the GridSearchCV object with 5-fold cross-validation
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy', verbose=1)

# Fit the model on the training data
grid_search.fit(X_train_scaled, y_train)

# Get the best parameters and best score
best_params = grid_search.best_params_
best_score = grid_search.best_score_

print(f"Best Parameters: {best_params}")
print(f"Best Cross-Validation Accuracy: {best_score * 100:.2f}%")

Explanation:

  • GridSearchCV performs cross-validation and evaluates multiple combinations of hyperparameters.
  • In this example, we tune the C and gamma parameters for the RBF kernel to find the best configuration for the model.
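
A possible refinement, sketched here as a variation rather than as the original method, is to place the scaler and the classifier in a Pipeline so that scaling is re-fit inside each cross-validation fold, and then to score the best configuration on the held-out test set. The step names 'scaler' and 'svc' and the variable names are choices made for this illustration.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Rebuild the binary problem from raw (unscaled) features
iris = datasets.load_iris()
mask = iris.target != 2
X, y = iris.data[mask], iris.target[mask]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Scaling lives inside the pipeline, so it is re-fit on each training fold
pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC(kernel='rbf'))])

# Pipeline parameters are addressed as <step name>__<parameter>
pipe_grid = {
    'svc__C': [0.1, 1, 10, 100],
    'svc__gamma': ['scale', 0.001, 0.01, 0.1, 1],
}

search = GridSearchCV(pipe, pipe_grid, cv=5, scoring='accuracy')
search.fit(X_tr, y_tr)

# With the default refit=True, the best configuration is refit on all training data
test_accuracy = search.score(X_te, y_te)
print(f"Best parameters: {search.best_params_}")
print(f"Held-out test accuracy: {test_accuracy * 100:.2f}%")
```

Reporting the test score alongside the cross-validation score gives a check on data that played no part in selecting the hyperparameters.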

Summary

In this example, we walked through how to implement Support Vector Machines (SVM) using scikit-learn. We:

  • Loaded and prepared the Iris dataset.
  • Trained SVM models using both linear and RBF kernels.
  • Evaluated the models using accuracy, classification reports, and confusion matrices.
  • Applied GridSearchCV to find the best hyperparameters for the RBF kernel.

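This tutorial shows how SVM can be used for classification tasks and how the choice of kernel and hyperparameters can affect model performance, although on an easily separable pair of classes such as Setosa and Versicolor the differences are likely to be small.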

In the next sections, we will explore how to implement SVMs using TensorFlow and PyTorch.