The Self-Organizing Map (SOM) is an unsupervised learning algorithm used for nonlinear dimensionality reduction and data clustering. In this lecture, we will explain the basic concepts of SOM, how it works, and how to implement it using PyTorch.
What is a Self-Organizing Map (SOM)?
The Self-Organizing Map is a neural network originally developed by Teuvo Kohonen. SOM maps high-dimensional data onto a lower-dimensional space (usually a 2D grid). In this process, the data is organized onto a map in which neighboring nodes come to represent similar characteristics.
Main Features of SOM
- Unsupervised Learning: It can handle unlabeled data.
- Dimensionality Reduction: Reduces high-dimensional data to lower dimensions while preserving important features of the data.
- Clustering: Similar data points are grouped in the same region.
How SOM Works
SOM learns by calculating the distance between the input vector and the node vectors. Here are the typical learning steps of SOM:
1. Initialization
All nodes are initialized randomly. Each node has a weight vector with the same dimension as the input data.
2. Input Data Selection
Randomly select a training sample. Each sample becomes an input to the SOM.
3. Finding the Nearest Node
Find the node that is most similar to the selected input data. This node is called the Best Matching Unit (BMU).
4. Weight Update
Update the weights of the BMU and its neighboring nodes so that they move closer to the input vector (a short PyTorch sketch of this step follows the list). The update rule is:
w_{i}(t+1) = w_{i}(t) + α(t) * h_{i,j}(t) * (x(t) - w_{i}(t))
Where:
- w_{i}(t): weight vector of node i
- α(t): learning rate
- h_{i,j}(t): neighborhood function of node i with respect to the BMU j
- x(t): input vector
5. Iteration
Repeat steps 2-4 for a sufficient number of epochs to gradually update the weights.
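To make steps 3 and 4 concrete, here is a minimal PyTorch sketch of a single BMU search and weight update. The grid size, learning rate, neighborhood radius, and variable names (weights, bmu_row, lr, sigma) are illustrative assumptions; the full class implementation follows in the next section.

import torch

# Assumed setup: a 10x10 grid of 3-dimensional weight vectors and one input sample
weights = torch.rand(10, 10, 3)
x = torch.rand(3)
lr, sigma = 0.5, 5.0  # learning rate α(t) and neighborhood radius

# Step 3: find the BMU (the node whose weight vector is closest to x)
dists = torch.sqrt(((weights - x) ** 2).sum(dim=2))
flat = torch.argmin(dists).item()
bmu_row, bmu_col = flat // 10, flat % 10

# Step 4: update every node, scaled by a Gaussian neighborhood h_{i,j}(t)
rows = torch.arange(10).view(-1, 1).float()
cols = torch.arange(10).view(1, -1).float()
grid_dist_sq = (rows - bmu_row) ** 2 + (cols - bmu_col) ** 2
h = torch.exp(-grid_dist_sq / (2 * sigma ** 2))  # shape (10, 10)
weights += lr * h.unsqueeze(2) * (x - weights)  # w(t+1) = w(t) + α·h·(x - w)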
Implementing SOM with PyTorch
Now let’s implement SOM using PyTorch. Here we will show you how to build and visualize a basic SOM.
Installing Required Libraries
First, install the required libraries.
!pip install torch numpy matplotlib
Defining the Model Class
Next, we define the SOM class. This class includes methods for weight initialization, finding the BMU, and updating weights.
import numpy as np
import torch

class SelfOrganizingMap:
    def __init__(self, m, n, input_dim, learning_rate=0.5, sigma=None):
        self.m = m  # grid rows
        self.n = n  # grid columns
        self.input_dim = input_dim
        self.learning_rate = learning_rate
        self.sigma = sigma if sigma else max(m, n) / 2
        # Initialize one random weight vector per grid node
        self.weights = torch.rand(m, n, input_dim)

    def find_bmu(self, x):
        # Euclidean distance between the input and every node's weight vector
        distances = torch.sqrt(torch.sum((self.weights - x) ** 2, dim=2))
        bmu_index = torch.argmin(distances).item()
        return bmu_index // self.n, bmu_index % self.n  # return (row, column)

    def update_weights(self, x, bmu, iteration):
        # Decay the learning rate and neighborhood radius over time
        learning_rate = self.learning_rate * np.exp(-iteration / 100)
        sigma = self.sigma * np.exp(-iteration / 100)
        for i in range(self.m):
            for j in range(self.n):
                h = self.neighbourhood(bmu, (i, j), sigma)
                self.weights[i, j] += learning_rate * h * (x - self.weights[i, j])

    def neighbourhood(self, bmu, point, sigma):
        # Gaussian neighborhood function centered on the BMU
        distance = np.sqrt((bmu[0] - point[0]) ** 2 + (bmu[1] - point[1]) ** 2)
        return np.exp(-distance ** 2 / (2 * sigma ** 2))

    def train(self, data, num_iterations):
        for i in range(num_iterations):
            for x in data:
                bmu = self.find_bmu(x)
                self.update_weights(x, bmu, i)
Preparing Data and Training the Model
We will prepare appropriate data and train the SOM model. Here we will use randomly generated data.
# Generate random data
data = torch.rand(200, 3) # 200 samples, 3 dimensions
# Create and train SOM
som = SelfOrganizingMap(10, 10, 3)
som.train(data, 100)
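After training, each sample can be assigned to the grid coordinates of its BMU, which is how the SOM is used for clustering. The following is a small illustrative sketch that reuses the som and data objects defined above.

# Map every sample to the (row, column) of its BMU; each grid cell acts as a cluster label
assignments = [som.find_bmu(x) for x in data]
print(assignments[:5])  # first five (row, column) assignments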
Visualizing the Results
We will visualize the weights of the trained SOM to check the distribution of the data.
import matplotlib.pyplot as plt

def plot_som(som):
    plt.figure(figsize=(8, 8))
    for i in range(som.m):
        for j in range(som.n):
            # Plot only the first two of the three weight dimensions
            plt.scatter(som.weights[i, j, 0].item(), som.weights[i, j, 1].item(), c='blue')
    plt.title('Self Organizing Map')
    plt.xlabel('Dimension 1')
    plt.ylabel('Dimension 2')
    plt.show()

plot_som(som)
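Because the input data here is 3-dimensional with values in [0, 1], another option is to interpret each node's weight vector as an RGB color and display the whole grid as an image; after training, neighboring cells should show similar colors. This is an optional sketch that reuses the som object and matplotlib import from above.

# Optional: render each 3-dimensional weight vector as an RGB color
plt.figure(figsize=(6, 6))
plt.imshow(som.weights.numpy())  # an (m, n, 3) array of values in [0, 1] is shown as an RGB image
plt.title('SOM weights as colors')
plt.axis('off')
plt.show()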
Conclusion
In this lecture, we explored the basic principles of Self-Organizing Maps (SOM) and how to implement one using PyTorch. SOM is an effective unsupervised learning technique for identifying patterns in data and performing clustering. As a next step, you can experiment with applying SOM to more complex datasets or apply optimization techniques to enhance learning performance.
I hope this article has helped you explore the world of deep learning! If you have any questions or feedback, please leave a comment.