Deep Learning PyTorch Course, Model Definition

Deep learning is a branch of machine learning, itself a subfield of artificial intelligence, built on artificial neural networks that are loosely inspired by the human brain.
With large datasets and powerful hardware now widely available, deep learning receives far more attention than it did in the past.
PyTorch is one of the most widely used deep learning frameworks, preferred by many researchers and developers for its ease of use and flexibility.
This post takes a close look at how to define models in PyTorch.

1. Understanding Deep Learning Models

A deep learning model uses multiple layers of a neural network to make predictions from input data.
The model consists of an input layer, one or more hidden layers, and an output layer, each made up of nodes (neurons).

1.1 Basic Structure of Neural Networks

A basic neural network is composed of the following elements:

  • Input Layer: The layer that receives the data entering the model.
  • Hidden Layer: The layer that processes the input information, which can be stacked into multiple layers.
  • Output Layer: The layer that outputs the final prediction results.
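
As a rough illustration of the three layer types listed above, the sketch below stacks them with `nn.Sequential`. The sizes (4 input features, 8 hidden units, 3 outputs) and the name `tiny_net` are arbitrary choices made for this example and do not come from the rest of this post. It assumes PyTorch is already installed (see section 2).

```python
import torch
import torch.nn as nn

# Illustrative sizes only: 4 input features, 8 hidden units, 3 outputs.
tiny_net = nn.Sequential(
    nn.Linear(4, 8),   # maps the input features to the hidden layer
    nn.ReLU(),         # non-linearity applied to the hidden layer
    nn.Linear(8, 3),   # maps the hidden layer to the output layer
)

batch = torch.randn(2, 4)      # a batch of 2 samples with 4 features each
prediction = tiny_net(batch)
print(prediction.shape)        # torch.Size([2, 3])
```

Each sample in the batch yields one value per output node, so the result has shape (batch size, number of outputs).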

2. Installing PyTorch

To use PyTorch, you must first install it. You can do so with pip, the Python package manager.
Enter the following command in your terminal (the exact command can vary by operating system and CUDA version; the official installation page lists the variants):

pip install torch torchvision
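
To confirm that the installation worked, a quick check like the following can be run in a Python interpreter. It only prints the installed version and whether a CUDA GPU is visible, so the output depends on your machine.

```python
import torch

print(torch.__version__)          # installed PyTorch version string
print(torch.cuda.is_available())  # True only if a CUDA-capable GPU is usable
```

If the import fails, the package was most likely installed into a different Python environment than the one being run.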

3. Defining Models

Defining a model in PyTorch is straightforward. You typically create a custom module by subclassing
the torch.nn.Module class, declaring the layers in __init__ and describing the computation in forward.
The torch.nn module provides the layers and functions to build from.

3.1 Simple Model Example

The example below defines a simple multi-layer perceptron (MLP). The model takes
a 784-dimensional vector as input (a flattened 28×28 image) and outputs a 10-dimensional vector, one score for each digit 0-9.


import torch
import torch.nn as nn
import torch.optim as optim

# Model Definition
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)  # Input -> hidden layer 1
        self.fc2 = nn.Linear(128, 64)   # Hidden layer 1 -> hidden layer 2
        self.fc3 = nn.Linear(64, 10)    # Hidden layer 2 -> output (one score per digit)
        self.relu = nn.ReLU()           # ReLU activation function

    def forward(self, x):
        x = self.fc1(x)          # Input -> Hidden layer 1
        x = self.relu(x)         # Activation
        x = self.fc2(x)          # Hidden layer 1 -> Hidden layer 2
        x = self.relu(x)         # Activation
        x = self.fc3(x)          # Hidden layer 2 -> Output layer
        # Return raw scores (logits). nn.CrossEntropyLoss applies log-softmax itself,
        # so adding nn.Softmax here would apply softmax twice during training.
        return x
    

In the code above, the MLP class defines the neural network model. It contains three linear layers, with a ReLU activation applied after each of the first two. The forward method defines how data flows through the network and returns raw class scores (logits). To obtain probabilities at inference time, apply torch.softmax(logits, dim=1) to the output.
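
A quick way to sanity-check a model definition is to pass a dummy batch through it and count its parameters. The sketch below uses `mlp_stand_in`, a hypothetical `nn.Sequential` stand-in with the same layer sizes as the MLP class (784 → 128 → 64 → 10), so that the snippet runs on its own. The batch size of 32 is arbitrary.

```python
import torch
import torch.nn as nn

# Stand-in with the same layer sizes as the MLP above (784 -> 128 -> 64 -> 10).
mlp_stand_in = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

dummy_batch = torch.randn(32, 784)   # 32 fake flattened 28x28 images
out = mlp_stand_in(dummy_batch)
print(out.shape)                     # torch.Size([32, 10])

# Count trainable parameters: weights plus biases of the three linear layers.
n_params = sum(p.numel() for p in mlp_stand_in.parameters() if p.requires_grad)
print(n_params)                      # 109386
```

The count follows directly from the layer sizes: (784·128 + 128) + (128·64 + 64) + (64·10 + 10) = 109,386.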

3.2 Training the Model

After defining the model, it must be trained on data, which requires a loss function and an optimization algorithm.
For multi-class classification, the cross-entropy loss and the Adam optimizer are common choices.


# Setting the loss function and optimizer
model = MLP()
criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer
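
One detail worth knowing about `nn.CrossEntropyLoss` is that it expects raw, unnormalized scores (logits) together with integer class labels, and applies log-softmax internally. The sketch below checks this on random data by comparing it with the explicit two-step form. The tensors `logits` and `targets` are made-up values for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # raw scores for 4 samples, 10 classes
targets = torch.tensor([3, 0, 9, 1])   # integer class labels

ce = nn.CrossEntropyLoss()(logits, targets)
# Equivalent two-step form: log-softmax followed by negative log-likelihood.
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, nll))         # True
```

A model trained with this loss should therefore output logits rather than softmax probabilities.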
    

Next, the model is trained over many iterations. The code below shows an example of the entire training process.
It trains for 5 epochs on the MNIST dataset. With 60,000 training images and a batch size of 64, that is 938 mini-batches per epoch, or 4,690 weight updates in total.
Note that the data is fed to the model in mini-batches rather than all at once.


from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Dataset and loader settings
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

# Model Training
for epoch in range(5):  # Training for 5 epochs
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()    # Reset gradients left over from the previous step
        outputs = model(images.view(-1, 784))  # Flattening the images into a 784-dimensional vector
        loss = criterion(outputs, labels)  # Loss calculation
        loss.backward()         # Backpropagation
        optimizer.step()        # Weight update
        
        if (i+1) % 100 == 0:  # Logging every 100 batches
            print(f'Epoch [{epoch+1}/5], Batch [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}')
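
The number of weight updates in a loop like this follows from the dataset size, the batch size, and the number of epochs. The sketch below reproduces the arithmetic with a placeholder dataset of 60,000 samples (the size of the MNIST training set) instead of downloading the real data. The names `placeholder` and `loader` are hypothetical.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset with the same number of samples as the MNIST training set.
n_samples, batch_size, n_epochs = 60_000, 64, 5
placeholder = TensorDataset(torch.zeros(n_samples, 1))
loader = DataLoader(placeholder, batch_size=batch_size)

# By default the last, smaller batch is kept, so the count is rounded up.
batches_per_epoch = len(loader)
assert batches_per_epoch == math.ceil(n_samples / batch_size)

total_updates = batches_per_epoch * n_epochs
print(batches_per_epoch, total_updates)      # 938 4690
```

Passing `drop_last=True` to `DataLoader` would discard the final partial batch and give 937 batches per epoch instead.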
    

4. Various Model Architectures

The example above defined a simple multi-layer perceptron, but real deep learning work calls for a variety of model architectures.
As a second example, we will look at how to define a convolutional neural network (CNN).

4.1 Defining Convolutional Neural Networks (CNN)

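Convolutional neural networks are widely used for image data. The example below takes 1-channel 28×28 inputs such as MNIST images: the 3×3 convolutions with padding 1 preserve the spatial size, and each of the two 2×2 max-pooling steps halves it (28 → 14 → 7), which is why the first linear layer expects 64 * 7 * 7 input features.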


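import torch.nn.functional as F  # Needed for F.relu in forward

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)  # Convolutional layer: 1 -> 32 channels
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)  # Convolutional layer: 32 -> 64 channels
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # Max pooling layer (halves height and width)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)  # Linear layer: 64 channels x 7 x 7 after two poolings
        self.fc2 = nn.Linear(128, 10)  # Output layer: one score per digit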

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # First convolution + activation + pooling
        x = self.pool(F.relu(self.conv2(x)))  # Second convolution + activation + pooling
        x = x.view(-1, 64 * 7 * 7)  # Flattening
        x = F.relu(self.fc1(x))  # Linear layer + activation
        x = self.fc2(x)  # Output layer
        return x
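
To see where the 64 * 7 * 7 in the first linear layer comes from, the sketch below traces tensor shapes through the same convolution and pooling layers using a dummy batch. The batch size of 8 and the variable names are arbitrary, and the layers are created standalone so the snippet runs by itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer settings as in the CNN class above.
conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)

x = torch.randn(8, 1, 28, 28)            # 8 fake grayscale 28x28 images
x = pool(F.relu(conv1(x)))
shape_after_block1 = tuple(x.shape)      # (8, 32, 14, 14)
x = pool(F.relu(conv2(x)))
shape_after_block2 = tuple(x.shape)      # (8, 64, 7, 7)

flat = torch.flatten(x, start_dim=1)     # keep the batch dimension, flatten the rest
print(shape_after_block1, shape_after_block2, tuple(flat.shape))
```

The flattened feature size is 64 · 7 · 7 = 3136. A different input resolution would change this number, and the first linear layer would need to be resized to match.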
    

5. Conclusion

In this post, we explored how to define models in PyTorch.
The same pattern, subclassing nn.Module and implementing forward, applies to everything from basic multi-layer perceptrons to CNNs.
A solid grasp of deep learning basics and of model definition lays the groundwork for solving more complex problems.
I encourage you to keep learning and practicing with various deep learning techniques to build up experience.

6. References

– Official PyTorch Documentation: https://pytorch.org/docs/stable/index.html
– Introductory Book on Deep Learning: “Deep Learning” by Ian Goodfellow, Yoshua Bengio, Aaron Courville