Theory Behind Affinity Propagation
Affinity Propagation is a unique clustering algorithm that automatically identifies representative data points, known as exemplars, which serve as the center of clusters. Unlike traditional clustering methods, Affinity Propagation does not require the number of clusters to be specified in advance. Instead, it determines the optimal number of clusters based on the data's internal structure.
1. Key Concepts in Affinity Propagation
1.1 Similarity
The Affinity Propagation algorithm starts by defining a similarity measure between pairs of data points. The similarity represents how well-suited data point is to be the exemplar for data point . The similarity can be calculated using various distance metrics, such as Euclidean distance or cosine similarity. In the case of Euclidean distance, the similarity is typically defined as:
Here, and are the feature vectors of data points and , respectively, and denotes the Euclidean distance between them. The negative sign ensures that points closer to each other have higher similarity.
1.2 Responsibility and Availability
Affinity Propagation relies on two types of messages exchanged between data points: responsibility and availability. These messages are iteratively updated to determine which data points should be exemplars.
-
Responsibility : The responsibility message reflects how well-suited point is to be the exemplar for point , considering other potential exemplars for . Mathematically, it is defined as:
This equation implies that the responsibility is the similarity minus the maximum of the sum of availability and similarity for any other candidate exemplar . This helps to focus on the best candidate exemplar.
-
Availability : The availability message reflects how appropriate it would be for point to choose point as its exemplar, considering how many other points want to select as their exemplar. It is computed as:
Here, is the self-responsibility, indicating the extent to which is suitable as its own exemplar. The sum captures how many other points consider to be their exemplar.
1.3 Message Passing
The key to Affinity Propagation lies in iteratively updating these responsibility and availability messages until convergence. Initially, all availability values are set to zero:
The responsibility and availability messages are then updated iteratively according to the following rules:
-
Update Responsibility:
-
Update Availability:
-
Self-Responsibility:
For each point to consider itself as an exemplar:
These updates continue until the changes in responsibility and availability are below a certain threshold, indicating convergence.
1.4 Selection of Exemplars
After the message-passing process converges, the exemplars are selected. For each data point , the exemplar is chosen based on the maximum of the sum of responsibility and availability:
This means that the exemplar for point is the point that maximizes the sum of responsibility and availability.
2. Preference Parameter
The preference parameter plays a crucial role in Affinity Propagation. It determines the likelihood of each point being chosen as an exemplar. Typically, is set to the median or minimum of the input similarities to influence the number of clusters. A higher preference value increases the number of clusters, while a lower value reduces it.
3. Convergence and Stability
Affinity Propagation can be sensitive to initialization and parameter choices, particularly the preference parameter. Convergence is not always guaranteed, and the algorithm may oscillate between configurations. To mitigate this, damping factors are often introduced in the update equations to stabilize the iterations.
3.1 Damping Factor
The damping factor is introduced to control the update of messages: