Deep Learning PyTorch Course, Model Training

Deep learning has received significant attention in data science and machine learning in recent years. In this article, we will walk through the process of training a model with the deep learning framework PyTorch, covering not only the theory but also Python code examples that readers can use to implement and train deep learning models themselves.

1. Basics of Deep Learning

Deep learning is a subset of machine learning based on artificial neural networks (ANNs). It is the process of building models that make decisions or predictions from input data: the model learns from training data and can then make predictions on new data.

1.1 Artificial Neural Networks

Artificial neural networks are data processing systems composed of an input layer, one or more hidden layers, and an output layer. Each node applies weights to its input signals and produces an output by passing the weighted sum through an activation function. As signals pass through successive layers, the network learns increasingly complex and abstract features.
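
To make this concrete, a single neuron computes a weighted sum of its inputs plus a bias and passes the result through an activation function. A minimal sketch (the input and weight values here are made up):

import torch

x = torch.tensor([0.5, -1.2, 3.0])   # Input signals
w = torch.tensor([0.8, 0.1, 0.4])    # Weights assigned to this neuron
b = torch.tensor(0.2)                # Bias

z = torch.dot(w, x) + b              # Weighted sum of inputs
out = torch.relu(z)                  # Activation function (ReLU)
print(out)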

2. What is PyTorch?

PyTorch is an open-source machine learning library developed by Facebook’s AI Research group. PyTorch is particularly useful for deep learning research and prototype development. It provides tensor operations and automatic differentiation features, making it easy to implement the training process of models.

2.1 Advantages of PyTorch

  • Dynamic computation graph: You can create graphs during code execution, allowing for more flexible model configuration (see the sketch after this list).
  • Multiple GPU support: PyTorch operates effectively even when using multiple GPUs.
  • Active community: There is extensive documentation and various tutorials available to facilitate learning.
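
As a quick illustration of the dynamic graph, the forward pass below uses ordinary Python control flow, and the graph is simply whatever code happens to run. This is a contrived sketch, not a useful model:

import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    def __init__(self):
        super(DynamicNet, self).__init__()
        self.fc = nn.Linear(10, 10)

    def forward(self, x):
        # The number of layers applied is decided at run time
        for _ in range(torch.randint(1, 4, (1,)).item()):
            x = torch.relu(self.fc(x))
        return x

out = DynamicNet()(torch.randn(2, 10))  # A fresh graph is built on every call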

3. Overview of Model Training

The model training process consists of the following steps:

  1. Data Preparation: Collect and preprocess the data.
  2. Model Definition: Define the structure of the neural network model to be used.
  3. Set Loss Function and Optimization Algorithm: Define a loss function to calculate the difference between predictions and actual values and choose an optimization algorithm to update the model’s weights.
  4. Training Loop: Train the model by iterating through the entire dataset.
  5. Model Evaluation: Evaluate the model’s performance using new datasets.

4. Practice: Training a Simple Classification Model

Now let’s actually train a simple image classification model using PyTorch. In this example, we will use the MNIST dataset (a dataset of handwritten digits).

4.1 Installing Required Libraries

First, you need to install the required libraries. Use the following command to install:

pip install torch torchvision

4.2 Loading the Dataset

You can load the MNIST dataset using PyTorch’s torchvision library. First, set up the data loader.

import torch
import torchvision.transforms as transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),  # Convert image to tensor
    transforms.Normalize((0.5,), (0.5,))  # Normalize
])

# Download and load MNIST dataset
train_dataset = MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = MNIST(root='./data', train=False, download=True, transform=transform)

# Set up data loader
train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=64, shuffle=False)

4.3 Defining the Model

Next, we define the neural network model. We will build a simple Fully Connected Neural Network (FCNN).

import torch.nn as nn
import torch.nn.functional as F

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)  # Input layer
        self.fc2 = nn.Linear(128, 64)        # Hidden layer
        self.fc3 = nn.Linear(64, 10)         # Output layer

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # Flatten image to 1D
        x = F.relu(self.fc1(x))  # ReLU activation function
        x = F.relu(self.fc2(x))  # ReLU activation function
        x = self.fc3(x)          # Output
        return x

4.4 Setting the Loss Function and Optimization Algorithm

We will use Cross Entropy Loss as the loss function and set Stochastic Gradient Descent (SGD) as the optimization algorithm.

model = SimpleNN()  # Create a model instance
criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # Optimization algorithm

4.5 Implementing the Training Loop

Implement the training loop to train the model. You can train it over multiple epochs.

num_epochs = 5  # Number of epochs

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Zero the gradients accumulated in the previous step
        optimizer.zero_grad()

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}')

4.6 Model Evaluation

After training, evaluate the model using the test dataset.

model.eval()  # Switch to evaluation mode
with torch.no_grad():  # Do not compute gradients
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # Predicted class indices
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the model on the test images: {100 * correct / total:.2f}%')

5. Conclusion

In this article, we detailed the process of training deep learning models and demonstrated how to train a simple classification model using PyTorch. You should now have a better understanding of how to structure and train deep learning models with PyTorch. As you progress, I encourage you to tackle more complex models and work with various datasets to deepen your understanding of deep learning.

Through this tutorial, I hope you expand your understanding of deep learning and gain practical experience. If you have any questions or comments, please leave them in the comments!

Deep Learning PyTorch Course, Definition of Model Parameters

Deep learning is a technology that learns and predicts data through artificial neural networks. In this article, we will take a closer look at how to define model parameters using PyTorch. PyTorch is a very useful library that provides dynamic computation graphs, making it great for research and prototype development. The parameters of the model are updated during the learning process and directly affect the performance of the neural network.

Structure of a Deep Learning Model

A deep learning model typically consists of an input layer, hidden layers, and an output layer. Each layer is made up of several nodes (or neurons), and each node is connected to the nodes of the previous layer. The strengths of these connections are the model’s parameters. Generally, we define the following parameters:

  • Weights: Responsible for linear transformations between input and output.
  • Biases: A constant value added to each neuron, which increases the flexibility of the model (both are inspected in the example after this list).
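
For a single nn.Linear layer, the weights and biases are directly accessible as attributes. A quick inspection (the values themselves are randomly initialized):

import torch.nn as nn

layer = nn.Linear(4, 3)    # 4 inputs, 3 outputs
print(layer.weight.shape)  # torch.Size([3, 4]) -- one row of weights per output neuron
print(layer.bias.shape)    # torch.Size([3])    -- one bias per output neuron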

Defining Model Parameters in PyTorch

When defining a model in PyTorch, you need to inherit from the torch.nn.Module class. By inheriting this class and creating a custom model, you can implement the forward pass of the model by defining the forward method.

Example: Implementing a Simple Neural Network Model

The code below is an example of defining a simple multi-layer perceptron (MLP) model using PyTorch. In this example, we implement a model with an input layer, two hidden layers, and an output layer.

import torch
import torch.nn as nn
import torch.optim as optim

class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size1, hidden_size2, output_size):
        super(SimpleNN, self).__init__()
        # Define the model's parameters
        self.fc1 = nn.Linear(input_size, hidden_size1)  # First hidden layer
        self.fc2 = nn.Linear(hidden_size1, hidden_size2)  # Second hidden layer
        self.fc3 = nn.Linear(hidden_size2, output_size)  # Output layer

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # Activation function for the first hidden layer
        x = torch.relu(self.fc2(x))  # Activation function for the second hidden layer
        x = self.fc3(x)  # Output layer
        return x

# Create model
input_size = 10
hidden_size1 = 20
hidden_size2 = 10
output_size = 1
model = SimpleNN(input_size, hidden_size1, hidden_size2, output_size)

# Check model parameters
print("Model parameters:")
for param in model.parameters():
    print(param.shape)

In the code above, nn.Linear automatically creates and initializes the weights and biases for each layer. All model parameters can be iterated over via model.parameters(); each parameter's .shape is a torch.Size object, which lets you check the dimensions of the weights and biases.
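
If you also want the parameter names alongside their shapes, model.named_parameters() is convenient:

# Print each parameter's name together with its shape
for name, param in model.named_parameters():
    print(name, param.shape)  # e.g. fc1.weight torch.Size([20, 10]), fc1.bias torch.Size([20]), ...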

Parameter Initialization of the Model

Model parameters must be initialized before training. By default, nn.Linear initializes its weights from a scaled uniform distribution (Kaiming uniform), but other initialization schemes can be used as well, such as He initialization and Xavier initialization.

Initialization Example

def initialize_weights(model):
    for m in model.modules():
        if isinstance(m, nn.Linear):
            nn.init.kaiming_normal_(m.weight)  # He initialization
            nn.init.zeros_(m.bias)  # Initialize bias to 0

initialize_weights(model)

Proper initialization is important for achieving good performance: the choice of initialization scheme can significantly affect training and often determines how quickly the loss decreases.
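
The snippet above applies He initialization; Xavier initialization, which the text also mentions, looks almost identical (a sketch using the same model):

def initialize_weights_xavier(model):
    for m in model.modules():
        if isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)  # Xavier (Glorot) initialization
            nn.init.zeros_(m.bias)             # Initialize bias to 0

initialize_weights_xavier(model)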

Parameter Updates During Model Training

During training, parameters are updated through the backpropagation algorithm. After calculating the gradient of the loss function, the optimizer uses it to update the weights and biases.
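
For plain SGD, each weight update takes the form:

w ← w − η · ∂L/∂w

where η is the learning rate and ∂L/∂w is the gradient of the loss with respect to that weight. Adaptive optimizers such as Adam follow the same principle but scale each step per parameter.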

Training Code Example

# Define loss function and optimizer
criterion = nn.MSELoss()  # Mean Squared Error Loss
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer

# Generate dummy data
x_train = torch.randn(100, input_size)  # Input data
y_train = torch.randn(100, output_size)  # Target output

# Train model
num_epochs = 100
for epoch in range(num_epochs):
    model.train()  # Switch model to training mode

    # Forward pass
    outputs = model(x_train)
    loss = criterion(outputs, y_train)

    # Update parameters
    optimizer.zero_grad()  # Zero the gradients
    loss.backward()  # Backpropagation
    optimizer.step()  # Update parameters

    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

As training progresses, you can observe that the value of the loss function decreases. This indicates that the model is learning the parameters to fit the given data.

Conclusion

In this article, we explored how to define the parameters of a neural network model using PyTorch. We learned how to define the model structure and set the weights and biases. We also discussed the importance of initialization methods and parameter updates during the training process. Defining and updating these parameters is essential for maximizing the performance of deep learning models. We recommend practicing with Python and PyTorch to enhance your understanding and experiment with various models.

Deep Learning PyTorch Course, Model Definition

Deep learning is a field of artificial intelligence and machine learning based on artificial neural networks that mimic the human brain. Thanks to large datasets and powerful computing hardware, deep learning now receives far more attention than in the past. PyTorch in particular is a deep learning framework preferred by many researchers and developers for its ease of use and flexibility. This post will delve deeply into model definition with PyTorch.

1. Understanding Deep Learning Models

A deep learning model is an algorithm that uses multiple layers of neural networks to make predictions on input data. The model consists of an input layer, hidden layers, and an output layer, each composed of nodes (neurons).

1.1 Basic Structure of Neural Networks

A basic neural network is composed of the following elements:

  • Input Layer: The layer that receives the data entering the model.
  • Hidden Layer: The layer that processes the input information, which can be stacked into multiple layers.
  • Output Layer: The layer that outputs the final prediction results.

2. Installing PyTorch

To use PyTorch, it must first be installed. You can install it with pip, the Python package manager, by entering the following command in your terminal:

pip install torch torchvision

3. Defining Models

Defining a model in PyTorch is very intuitive. The typical approach is to create a custom module by inheriting from the torch.nn.Module class; the various layers and functions it needs are available through the torch.nn module.

3.1 Simple Model Example

The example below shows code for defining a simple multi-layer perceptron (MLP) model. This model takes a 784-dimensional vector as input and outputs a 10-dimensional vector (classifying the digits 0-9).


import torch
import torch.nn as nn
import torch.optim as optim

# Model Definition
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(784, 128)  # Input layer -> hidden layer 1
        self.fc2 = nn.Linear(128, 64)   # Hidden layer 1 -> hidden layer 2
        self.fc3 = nn.Linear(64, 10)    # Hidden layer 2 -> output layer
        self.relu = nn.ReLU()           # ReLU activation function

    def forward(self, x):
        x = self.relu(self.fc1(x))  # Input layer -> hidden layer 1, activated
        x = self.relu(self.fc2(x))  # Hidden layer 1 -> hidden layer 2, activated
        x = self.fc3(x)             # Hidden layer 2 -> output layer
        return x                    # Raw logits: CrossEntropyLoss applies softmax internally
    

In the code above, the MLP class defines the neural network model. It contains three linear layers with ReLU activations between them, and its forward method defines how data flows through the network. Note that the model returns raw logits rather than probabilities: the cross-entropy loss used below applies softmax internally, so adding a Softmax layer in forward would be redundant and would hurt training.

3.2 Training the Model

After defining the model, it must be trained on data. For this, a loss function and an optimization algorithm need to be set up; for multi-class classification, the cross-entropy loss and the Adam optimizer are commonly used.


# Setting the loss function and optimizer
model = MLP()
criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer
    

Subsequently, the model is trained over many iterations. The code below shows the entire training process: it trains for 5 epochs on the MNIST dataset. Note that the data is processed in mini-batches.


from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Dataset and loader settings
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

# Model Training
for epoch in range(5):  # Training for 5 epochs
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()    # Gradient initialization
        outputs = model(images.view(-1, 784))  # Flattening the images into a 784-dimensional vector
        loss = criterion(outputs, labels)  # Loss calculation
        loss.backward()         # Backpropagation
        optimizer.step()        # Weight update
        
        if (i+1) % 100 == 0:  # Logging every 100 batches
            print(f'Epoch [{epoch+1}/5], Batch [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}')
    

4. Various Model Architectures

The example above defined a simple multi-layer perceptron model, but real deep learning work requires a variety of model architectures. As a second example, we will look at how to define a convolutional neural network (CNN).

4.1 Defining Convolutional Neural Networks (CNN)

Convolutional neural networks, widely used for processing image data, are defined with the following structure.


import torch.nn.functional as F  # F.relu is used in forward below

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)  # Convolutional layer
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # Max pooling layer
        self.fc1 = nn.Linear(64 * 7 * 7, 128)  # Linear layer
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # First convolution + activation + pooling
        x = self.pool(F.relu(self.conv2(x)))  # Second convolution + activation + pooling
        x = x.view(-1, 64 * 7 * 7)  # Flattening
        x = F.relu(self.fc1(x))  # Linear layer + activation
        x = self.fc2(x)  # Output layer
        return x
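
A quick way to sanity-check the flattened size of 64 * 7 * 7 is to pass a dummy MNIST-sized batch through the model: each 2x2 max-pooling halves the spatial size, taking 28 to 14 and then to 7.

cnn = CNN()
dummy = torch.randn(1, 1, 28, 28)  # One fake MNIST-sized image
print(cnn(dummy).shape)            # torch.Size([1, 10])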
    

5. Conclusion

In this post, we explored how to define models in PyTorch. The code above ranges from a basic neural network to a CNN and can be adapted widely. Understanding the basics of deep learning and deepening your grasp of model definition lays the groundwork for solving more complex problems. I encourage you to keep learning and practicing various deep learning techniques to build broad experience.

6. References

– Official PyTorch Documentation: https://pytorch.org/docs/stable/index.html
– Introductory Book on Deep Learning: “Deep Learning” by Ian Goodfellow, Yoshua Bengio, Aaron Courville

Deep Learning PyTorch Course, What is Machine Learning

1. Definition of Machine Learning

Machine Learning is a subfield of artificial intelligence that enables computers to learn from data and perform specific tasks. Typically, machine learning is characterized by the use of algorithms that can learn without being explicitly programmed. This is very useful for recognizing patterns in data, making predictions, and automating decision-making.

2. Basic Principles of Machine Learning

Machine learning models generally operate through the following process:

  1. Data Collection: Collect the data to be used for learning.
  2. Data Preprocessing: Perform tasks such as handling missing values and normalization to improve the quality of the data.
  3. Model Selection: Choose a machine learning model that is suitable for the problem.
  4. Training: Train the selected model using the data.
  5. Evaluation: Assess the model’s performance and adjust it if necessary.
  6. Prediction: Use the trained model to make predictions on new data.

3. Types of Machine Learning

Machine learning can primarily be divided into three types:

  • Supervised Learning: Learns the relationship between input and output when given input and output data. This mainly includes regression and classification problems.
  • Unsupervised Learning: Focuses on finding the structure or patterns in data when there is no output data available. Clustering is a representative example.
  • Reinforcement Learning: An agent learns strategies to maximize rewards through interaction with the environment.

4. What is PyTorch?

PyTorch is an open-source machine learning library developed by Facebook, primarily used as a framework for deep learning. PyTorch provides dynamic computation graphs, enabling flexible and intuitive coding. This is one of the reasons it is popular among researchers and developers.

Main Features of PyTorch

  • Dynamic Computation Graph: The computation graph is generated as soon as the code is executed, allowing easy modification of the model structure.
  • Diverse Tensor Operations: Enables tensor operations similar to NumPy, making it easy to preprocess training data (see the short example after this list).
  • GPU Support: Allows fast execution of large-scale operations by utilizing GPUs.
  • Scalability: Custom layers and models can be easily defined, making it applicable for various deep learning research.
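
To illustrate the tensor-operation and GPU points above, tensors behave much like NumPy arrays, and the same code can run on either device:

import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)
print(a + b)   # Element-wise addition, as in NumPy
print(a @ b)   # Matrix multiplication

device = 'cuda' if torch.cuda.is_available() else 'cpu'
a = a.to(device)  # Move the tensor to the GPU when one is available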

5. Hands-on Machine Learning with PyTorch

Now we will build a simple machine learning model using PyTorch. We will be using the Iris dataset to create a model that classifies the types of flowers.

5.1. Loading the Dataset

First, we import the required libraries.


import torch
import torch.nn as nn
import torch.optim as optim
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

    

5.2. Data Preprocessing

After loading the Iris dataset, we separate the features and labels and carry out data preprocessing.


# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Normalize the Data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert to Tensors
X_train_tensor = torch.FloatTensor(X_train)
y_train_tensor = torch.LongTensor(y_train)
X_test_tensor = torch.FloatTensor(X_test)
y_test_tensor = torch.LongTensor(y_test)

    

5.3. Defining the Model

We define a simple neural network model consisting of an input layer, a hidden layer, and an output layer.


class IrisModel(nn.Module):
    def __init__(self):
        super(IrisModel, self).__init__()
        self.fc1 = nn.Linear(4, 10)  # 4 input features and 10 hidden nodes
        self.fc2 = nn.Linear(10, 3)   # 10 hidden nodes and 3 output nodes (types of flowers)

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # Using ReLU as the activation function
        x = self.fc2(x)
        return x

model = IrisModel()

    

5.4. Training the Model

After defining the loss function and optimization technique, we train the model.


# Define Loss Function and Optimization Technique
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Train the Model
num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    
    # Forward Pass
    outputs = model(X_train_tensor)
    loss = criterion(outputs, y_train_tensor)
    
    # Backward Pass and Optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

    

5.5. Evaluating the Model

Using the trained model, we perform predictions on the test data and evaluate the accuracy.


# Evaluate the Model
model.eval()
with torch.no_grad():
    test_outputs = model(X_test_tensor)
    _, predicted = torch.max(test_outputs, 1)
    accuracy = (predicted == y_test_tensor).sum().item() / y_test_tensor.size(0)
    print(f'Accuracy: {accuracy:.2f}')
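
As a quick usage example, you can classify a single new measurement; it must be scaled with the same scaler fitted on the training data. The feature values below are made up:

sample = scaler.transform([[5.1, 3.5, 1.4, 0.2]])  # Hypothetical sepal/petal measurements
with torch.no_grad():
    logits = model(torch.FloatTensor(sample))
    print(iris.target_names[logits.argmax(dim=1).item()])  # e.g. 'setosa'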

    

6. Conclusion

In this tutorial, we explored the basic concepts of machine learning and the process of building a simple machine learning model using PyTorch. Machine learning is utilized in various fields, and PyTorch serves as a powerful tool for this purpose. We hope you will conduct in-depth research on a wider range of topics in the future.

We wish the advancements in deep learning and machine learning will aid your research!

Deep Learning PyTorch Course, Machine Learning Algorithms

Today, artificial intelligence (AI) and machine learning (ML) play a crucial role in various industries and research fields. In particular, deep learning has established itself as a powerful tool for learning and predicting complex data patterns. PyTorch is an open-source deep learning library that helps build these deep learning models easily and intuitively. In this course, we will closely examine the basic concepts and implementation methods of machine learning algorithms using PyTorch.

1. Overview of Machine Learning

Machine learning is a collection of algorithms that analyze and learn from data to make predictions or decisions. One important category of machine learning is supervised learning, in which inputs and the corresponding correct answers (labels) are provided to train the model. Here, we will explain linear regression, a representative machine learning algorithm, as an example.

2. Linear Regression

Linear regression is a method for modeling the linear relationship between input features and outputs. Mathematically, it is expressed as follows:

y = wx + b

Here, y is the predicted value, w the weight, x the input, and b the bias. The goal of training is to find the optimal w and b; to do this, a loss function is defined and minimized. Mean squared error (MSE) is generally used.
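
For n samples, the MSE is the average squared difference between predictions and targets:

MSE = (1/n) · Σᵢ (yᵢ − ŷᵢ)²

where ŷᵢ is the model’s prediction for the i-th sample and yᵢ is the true value.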

2.1. Implementing Linear Regression with PyTorch


import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

# Data generation
np.random.seed(42)
x_numpy = np.random.rand(100, 1) * 10  # Random numbers from 0 to 10
y_numpy = 2.5 * x_numpy + np.random.randn(100, 1)  # y = 2.5x + noise

# Convert NumPy arrays to PyTorch tensors
x_train = torch.FloatTensor(x_numpy)
y_train = torch.FloatTensor(y_numpy)

# Define linear regression model
model = nn.Linear(1, 1)

# Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Learning process
num_epochs = 100
for epoch in range(num_epochs):
    model.train()

    # Calculate predicted values
    y_pred = model(x_train)

    # Calculate loss
    loss = criterion(y_pred, y_train)

    # Print the loss periodically
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

    # Initialize gradients, backpropagate and update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Visualizing predictions
plt.scatter(x_numpy, y_numpy, label='Data')
plt.plot(x_numpy, model(x_train).detach().numpy(), color='red', label='Prediction')
plt.legend()
plt.show()
    

The code above creates a linear regression model, trains it on the data, and finally visualizes the predictions. As training progresses the loss decreases, which indicates that the model is successfully learning the pattern in the data.
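
Since the data were generated as y = 2.5x + noise, you can also check how close the learned parameters come to the true values (the weight should approach 2.5 and the bias 0):

print(f'Learned weight: {model.weight.item():.3f}, learned bias: {model.bias.item():.3f}')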

3. Deep Neural Networks

In deep learning, multiple layers of artificial neural networks are used to learn more complex data patterns. Such deep learning models can be implemented using a simple Multi-Layer Perceptron (MLP) structure. An MLP consists of an input layer, hidden layers, and an output layer, with each layer made up of nodes. Each node is connected to the nodes of the previous layer, and nonlinearity is introduced through an activation function.

3.1. Implementing an MLP Model


class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)  # First hidden layer
        self.fc2 = nn.Linear(hidden_size, output_size)  # Output layer
        self.relu = nn.ReLU()  # Activation function

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

# Prepare dataset
from sklearn.datasets import make_moons
x, y = make_moons(n_samples=1000, noise=0.2)

# Convert NumPy arrays to PyTorch tensors
x_train = torch.FloatTensor(x)
y_train = torch.FloatTensor(y).view(-1, 1)

# Define model, loss function, and optimizer
input_size = 2
hidden_size = 10
output_size = 1

model = NeuralNetwork(input_size, hidden_size, output_size)
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Learning process
num_epochs = 1000
for epoch in range(num_epochs):
    model.train()

    # Calculate predicted values
    y_pred = model(x_train)

    # Calculate loss
    loss = criterion(y_pred, y_train)

    # Initialize gradients, backpropagate and update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss periodically
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')
    

The code above defines a basic multi-layer perceptron model and trains it on a make_moons dataset of 1,000 samples. BCEWithLogitsLoss is a commonly used loss function for binary classification: it combines a sigmoid with binary cross-entropy, which is why the model outputs raw logits. You can observe the loss decreasing as the model learns.
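
Since the model outputs raw logits, converting them into class predictions requires a sigmoid followed by a 0.5 threshold. A minimal accuracy check on the training data (a sketch reusing x_train and y_train from above):

model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(x_train))  # Logits -> probabilities
    preds = (probs > 0.5).float()          # Threshold at 0.5
    accuracy = (preds == y_train).float().mean().item()
print(f'Training accuracy: {accuracy:.2f}')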

4. Convolutional Neural Networks (CNN)

CNNs are primarily used for 2D data such as images. They are composed of convolutional layers and pooling layers, which together extract features effectively: convolutional layers capture local patterns in the image, while pooling layers reduce the spatial size of the feature maps to cut down computation.

4.1. Implementing a CNN Model


import torch.nn.functional as F  # F.relu is used in forward below

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)  # First convolutional layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # Pooling layer
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)  # Second convolutional layer
        self.fc1 = nn.Linear(64 * 7 * 7, 128)  # First fully connected layer
        self.fc2 = nn.Linear(128, 10)  # Output layer
        
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # First convolution + pooling
        x = self.pool(F.relu(self.conv2(x)))  # Second convolution + pooling
        x = x.view(-1, 64 * 7 * 7)  # Flatten
        x = F.relu(self.fc1(x))  # First fully connected layer
        x = self.fc2(x)  # Output layer
        return x

# Load example data (MNIST)
import torchvision.transforms as transforms
from torchvision import datasets

transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

# Define model, loss function, and optimizer
model = CNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Learning process
num_epochs = 5
for epoch in range(num_epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()  # Initialize gradients
        outputs = model(images)  # Calculate predicted values
        loss = criterion(outputs, labels)  # Calculate loss
        loss.backward()  # Backpropagate
        optimizer.step()  # Update weights

    print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')
    

The code above constructs a simple CNN model and demonstrates training it on the MNIST dataset. CNNs effectively learn features of images through convolution and pooling operations.

5. Conclusion

In this course, we have implemented linear regression in machine learning and various deep learning models using PyTorch. Deep learning is very useful for learning complex data, and PyTorch is a valuable tool in that process. I hope you experiment with various models using PyTorch and gain a deeper understanding of the data.
