Deep learning is a field of artificial intelligence (AI) and machine learning (ML) that uses multi-layered neural networks, loosely inspired by the structure of the human brain, to learn features directly from data. The goal is to enable computers to recognize patterns and make judgments in ways that resemble human perception.
1. History of Deep Learning
The concept of deep learning dates back to the 1940s and 1950s. During this period, a simple trainable neural network model called the Perceptron was proposed. However, because a single-layer perceptron cannot solve problems that are not linearly separable (the XOR problem being the classic example), neural networks received little attention for a while.
As the 1990s approached, advances were made in multi-layer perceptrons and the backpropagation algorithm. After 2000, deep learning began to attract attention once again as the amount of available data exploded and GPUs advanced. Its popularity surged in particular when AlexNet won the ImageNet competition in 2012.
2. Basic Concepts of Deep Learning
Deep learning uses artificial neural networks composed of multiple layers. The nodes in each layer transform the features of the input data and pass them to the next layer. The output from the final output layer is used as the prediction.
2.1 Structure of Artificial Neural Networks
Artificial neural networks have the following basic structure (a minimal PyTorch sketch follows the list):
- Input Layer: The layer where the model receives data.
- Hidden Layer: Located between the input layer and the output layer, it transforms the input into progressively more abstract feature representations.
- Output Layer: Generates the final results of the model.
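For a concrete picture of this structure, here is a minimal sketch using PyTorch (introduced in Section 3). The layer sizes are arbitrary choices for illustration; they happen to match the MNIST example later in this course.

import torch.nn as nn

# Input layer -> hidden layer -> output layer
network = nn.Sequential(
    nn.Linear(784, 128),  # input layer: 784 input features to 128 hidden units
    nn.ReLU(),            # non-linearity between layers (see Section 2.2)
    nn.Linear(128, 10)    # output layer: 128 hidden units to 10 outputs
)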
2.2 Activation Function
An activation function introduces non-linearity into the value computed at each node before it is passed to the next layer. Common activation functions include the following (a short sketch follows the list):
- Sigmoid: The output range is between 0 and 1.
- ReLU (Rectified Linear Unit): Values less than 0 are converted to 0, and the remaining values are output as they are.
- Softmax: Primarily used for multi-class classification problems.
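To see how these behave numerically, here is a short sketch using PyTorch's built-in implementations (torch.sigmoid, torch.relu, torch.softmax); the input values are arbitrary.

import torch

x = torch.tensor([-2.0, 0.0, 3.0])
print(torch.sigmoid(x))         # tensor([0.1192, 0.5000, 0.9526]) -- squashed into (0, 1)
print(torch.relu(x))            # tensor([0., 0., 3.]) -- negatives clipped to 0
print(torch.softmax(x, dim=0))  # non-negative values summing to 1, usable as class probabilities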
3. Introduction to PyTorch
PyTorch is a widely used open-source library for implementing deep learning models. It is suitable for both research and production, offering great flexibility through its dynamic computation graph. Thanks to its natural fit with Python, it is favored by many researchers and developers.
3.1 Advantages of PyTorch
- Dynamic Computation Graph: The graph is built on the fly at every forward pass, so the network structure can change between iterations, making experimentation and debugging easier.
- Flexible Tensor Operations: Tensors can be manipulated much like NumPy arrays (a short sketch of both points follows this list).
- Rich Community: Many users and a variety of tutorials and examples are available.
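As a small illustration of the first two points (a sketch, not a full tutorial): tensors support NumPy-style operations, and because the graph is recorded as the code runs, ordinary Python control flow can change what gets computed in each iteration.

import torch

# NumPy-style tensor operations
a = torch.randn(3, 3)
b = torch.ones(3, 3)
c = a @ b + 2  # matrix multiply plus broadcast add, as in NumPy

# Dynamic graph: the recorded operations depend on runtime values
x = torch.randn(3, requires_grad=True)
y = x * 2 if x.sum() > 0 else x * 3  # branch chosen at run time
y.sum().backward()  # gradients flow through whichever branch actually ran
print(x.grad)  # tensor of 2s or 3s, depending on the branch taken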
4. Example of Image Classification using Deep Learning
Now let’s implement a deep learning model using PyTorch through a simple example. In this example, we will create a model to classify handwritten digits using the MNIST dataset.
4.1 Installing Required Libraries
pip install torch torchvision
4.2 Preparing the Dataset
The MNIST dataset consists of 28×28 grayscale images of handwritten digits from 0 to 9. The following code loads the training set.
import torch
from torchvision import datasets, transforms

# Convert images to tensors and normalize pixel values to roughly [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

trainset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
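To sanity-check the loader, you can pull one batch and inspect its shapes. This is just a quick verification step, not part of the training pipeline.

images, labels = next(iter(trainloader))
print(images.shape)  # torch.Size([64, 1, 28, 28]) -- 64 grayscale 28x28 images
print(labels.shape)  # torch.Size([64]) -- one integer label (0-9) per image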
4.3 Defining the Model
Next, we define a simple artificial neural network model.
import torch.nn as nn
import torch.nn.functional as F

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)  # input -> first hidden layer
        self.fc2 = nn.Linear(128, 64)       # first -> second hidden layer
        self.fc3 = nn.Linear(64, 10)        # second hidden layer -> 10 class scores

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # Flatten the input
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # Raw logits; softmax is applied inside the loss
        return x

model = SimpleNN()
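Before training, it can help to verify that the model runs end to end on a dummy batch of the expected shape; the batch size of 64 here simply mirrors the DataLoader above.

dummy = torch.randn(64, 1, 28, 28)  # fake batch shaped like MNIST images
out = model(dummy)
print(out.shape)  # torch.Size([64, 10]) -- one raw score (logit) per class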
4.4 Defining the Loss Function and Optimizer
To compute the loss and update the model's parameters, we define a loss function and an optimizer. Note that CrossEntropyLoss applies softmax internally, so the model should output raw logits.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
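As a quick check of how the loss behaves (with random data, purely for illustration): CrossEntropyLoss takes raw logits and integer class labels, and for uniformly random scores over 10 classes it should come out near ln(10) ≈ 2.30.

logits = torch.randn(64, 10)           # raw model outputs, no softmax applied
targets = torch.randint(0, 10, (64,))  # integer class labels in [0, 9]
print(criterion(logits, targets))      # scalar loss, roughly 2.3 for random scores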
4.5 Training the Model
To train the model, we define the training loop.
epochs = 5
for epoch in range(epochs):
    for images, labels in trainloader:
        optimizer.zero_grad()  # Zero the gradients
        output = model(images)  # Forward pass
        loss = criterion(output, labels)  # Calculate loss
        loss.backward()  # Backward pass
        optimizer.step()  # Update weights
    print(f'Epoch {epoch+1}/{epochs}, Loss: {loss.item()}')
4.6 Evaluating the Model
To evaluate the performance of the model, we can use the test dataset.
testset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

correct = 0
total = 0
with torch.no_grad():  # Disable gradient tracking during evaluation
    for images, labels in testloader:
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # Index of the highest score per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy: {100 * correct / total}%')
5. Conclusion
Deep learning is driving innovation across many fields, and PyTorch is a very powerful tool for implementing it. In this course, we covered the basic concepts of deep learning and implemented a simple model using PyTorch. I hope you will continue to build your skills through more diverse projects.