Deep learning is a subfield of machine learning, itself a branch of artificial intelligence (AI), in which artificial neural networks learn from data to perform tasks such as prediction and classification. Advances in deep learning over the past few years have driven major changes and achievements across AI. In this lecture, we will explore the fundamental structure of deep learning in detail using PyTorch.
1. Basic Concepts of Deep Learning
In deep learning, data enters as input, passes through multiple layers, and is transformed into the final output. This processing is carried out by artificial neural networks (ANNs). A neural network is composed of many connected units called nodes (or neurons); each neuron receives inputs, multiplies them by weights, sums the results, adds a bias, and applies a nonlinear activation function.
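The computation performed by a single neuron can be sketched in a few lines. The input, weight, and bias values below are arbitrary numbers chosen for illustration, and ReLU is assumed as the activation function.

```python
import torch

# Illustrative values only (not taken from any dataset)
x = torch.tensor([1.0, 2.0])     # inputs
w = torch.tensor([0.5, -0.25])   # weights, one per input
b = torch.tensor(0.1)            # bias

z = torch.dot(w, x) + b          # weighted sum plus bias
output = torch.relu(z)           # nonlinear activation (ReLU assumed here)

print(output.item())
```

A layer repeats this computation for many neurons at once, which is why it is usually written as a matrix multiplication.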
1.1 Basic Structure of Neural Networks
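The basic structure of a neural network consists of an input layer, one or more hidden layers, and an output layer. The neurons in each layer are connected to the neurons in the next layer; the input layer accepts data, and the output layer provides results. The following class defines a network with 2 inputs, one hidden layer of 3 neurons, and 1 output.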
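import torch.nn as nn
import torch.nn.functional as F

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 3)  # hidden layer: 2 inputs, 3 outputs
        self.fc2 = nn.Linear(3, 1)  # output layer: 3 inputs, 1 output

    def forward(self, x):
        x = F.relu(self.fc1(x))  # ReLU activation on the hidden layer
        x = self.fc2(x)          # raw output, no activation
        return x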
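To see how data flows through a network of this size, the sketch below builds a stand-in with the same 2 -> 3 -> 1 layer sizes using nn.Sequential (an assumption made to keep the example self-contained), passes a random batch through it, and counts the learnable parameters.

```python
import torch
import torch.nn as nn

# Stand-in with the same layer sizes as the class above (2 -> 3 -> 1).
# nn.Sequential is used here only for brevity.
net = nn.Sequential(
    nn.Linear(2, 3),
    nn.ReLU(),
    nn.Linear(3, 1),
)

batch = torch.randn(4, 2)   # 4 samples with 2 features each
output = net(batch)         # one output value per sample

# Each Linear layer holds (inputs * outputs) weights plus one bias per output
num_params = sum(p.numel() for p in net.parameters())

print(output.shape)
print(num_params)
```

The first layer contributes 2 * 3 + 3 = 9 parameters and the second 3 * 1 + 1 = 4, for 13 in total.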
2. Introduction to PyTorch
PyTorch is a popular deep learning framework originally developed by Facebook AI Research (now Meta AI). It is flexible and easy to use: tensor operations can be moved to a GPU with little code, and computation graphs are built dynamically as the code runs.
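As a rough illustration of what a dynamic computation graph means in practice, the sketch below uses ordinary Python control flow to decide which operations are recorded, then asks autograd for the gradient. The function name branch_fn is hypothetical and exists only for this example; the last lines show one common way to pick a GPU when available.

```python
import torch

def branch_fn(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical function: the graph is built as this code runs,
    # so a plain Python `if` selects which operations are recorded.
    if x.item() > 0:
        return x ** 2
    return -3 * x

x_pos = torch.tensor(2.0, requires_grad=True)
branch_fn(x_pos).backward()
grad_positive = x_pos.grad.item()   # derivative of x**2 at x = 2

x_neg = torch.tensor(-1.0, requires_grad=True)
branch_fn(x_neg).backward()
grad_negative = x_neg.grad.item()   # derivative of -3*x

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
on_device = torch.ones(2, 2, device=device)

print(grad_positive, grad_negative, on_device.device.type)
```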
2.1 Basic Tensor
In deep learning, a tensor is the fundamental structure for representing data. A 1D tensor can be thought of as a vector, a 2D tensor as a matrix, and a 3D tensor as a stack of matrices; in general, a tensor is an array with any number of dimensions.
import torch
# 1D tensor
tensor_1d = torch.tensor([1, 2, 3])
# 2D tensor
tensor_2d = torch.tensor([[1, 2], [3, 4]])
# 3D tensor
tensor_3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
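The number of dimensions, the shape, and the element type of a tensor can be inspected directly. The short sketch below recreates the same three tensors so that it runs on its own.

```python
import torch

tensor_1d = torch.tensor([1, 2, 3])
tensor_2d = torch.tensor([[1, 2], [3, 4]])
tensor_3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for name, t in [("1D", tensor_1d), ("2D", tensor_2d), ("3D", tensor_3d)]:
    # ndim is the number of dimensions; shape gives the size along each one
    print(name, t.ndim, tuple(t.shape), t.dtype)
```

Python integers produce torch.int64 tensors by default, which is why the training code later converts data to torch.float32 explicitly.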
3. Building a Deep Learning Model
Now, let’s build a simple deep learning model. We will create a basic neural network model using various APIs provided by PyTorch.
3.1 Data Preprocessing
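Data preprocessing plays an important role in deep learning: the dataset must be prepared and transformed into a format suitable for training. Here we generate a two-class toy dataset with make_moons, split it into training and test sets, and standardize the features. The scaler is fitted on the training data only and then applied to the test data, so no information from the test set leaks into training.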
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
X, y = make_moons(n_samples=1000, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Data standardization
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
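For intuition, standardization subtracts each feature's mean and divides by its standard deviation. The sketch below reproduces that idea in plain PyTorch on synthetic data (random numbers generated only for this example); it is an illustration of the concept, not a replacement for StandardScaler.

```python
import torch

torch.manual_seed(0)

# Synthetic stand-ins for a training and a test split (illustrative only)
train = torch.randn(80, 2) * 5.0 + 10.0
test = torch.randn(20, 2) * 5.0 + 10.0

# Statistics come from the training split only
mean = train.mean(dim=0)
std = train.std(dim=0, unbiased=False)  # population standard deviation

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std       # reuse the training statistics

print(train_scaled.mean(dim=0), train_scaled.std(dim=0, unbiased=False))
```

After scaling, each training feature has mean close to 0 and standard deviation close to 1; the test features are only approximately so, because they were scaled with the training statistics.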
3.2 Model Definition
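As mentioned earlier, the model is defined by inheriting from nn.Module. This time, let’s use the sigmoid activation function in the hidden layer instead of ReLU. The output layer still returns a raw score (a logit) with no activation applied.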
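import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 3)  # hidden layer: 2 inputs, 3 outputs
        self.fc2 = nn.Linear(3, 1)  # output layer: 3 inputs, 1 output

    def forward(self, x):
        # torch.sigmoid replaces the deprecated F.sigmoid
        x = torch.sigmoid(self.fc1(x))
        x = self.fc2(x)  # raw logit; the loss function applies the final sigmoid
        return x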
3.3 Model Training
To train the model, we need to define a loss function and an optimization algorithm. We use binary cross-entropy (BCE) as the loss, through nn.BCEWithLogitsLoss, which applies the sigmoid to the model's raw outputs internally, and Adam as the optimizer.
import torch.optim as optim

model = SimpleNN()
criterion = nn.BCEWithLogitsLoss()  # expects raw logits, not probabilities
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Convert the NumPy arrays to tensors; targets are reshaped to (N, 1) to match the outputs
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32).view(-1, 1)

for epoch in range(1000):
    model.train()
    optimizer.zero_grad()                       # clear gradients from the previous step
    outputs = model(X_train_tensor)             # forward pass on the full training set
    loss = criterion(outputs, y_train_tensor)
    loss.backward()                             # backpropagation
    optimizer.step()                            # weight update
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch + 1}/1000], Loss: {loss.item():.4f}')
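The reason the model can return raw logits is that nn.BCEWithLogitsLoss combines the sigmoid and the binary cross-entropy in one step. The sketch below checks this on a few made-up logits and targets (values chosen only for illustration) by comparing it with an explicit sigmoid followed by nn.BCELoss.

```python
import torch
import torch.nn as nn

# Made-up logits and targets, for illustration only
logits = torch.tensor([[-2.0], [0.0], [3.0]])
targets = torch.tensor([[0.0], [1.0], [1.0]])

# Loss computed directly from logits
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Same loss computed in two steps: sigmoid first, then plain BCE
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_from_logits.item(), loss_from_probs.item())
```

The two values agree up to floating-point error; the combined version is generally preferred because it is more numerically stable for large-magnitude logits.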
3.4 Model Evaluation
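Once training is complete, we evaluate the model’s performance on the test data. Here, we measure accuracy. Because the model outputs logits, a sample is predicted as class 1 when its logit is greater than 0, which corresponds to a sigmoid probability above 0.5.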
model.eval()
with torch.no_grad():  # gradients are not needed for evaluation
    X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
    y_test_tensor = torch.tensor(y_test, dtype=torch.float32)
    y_pred = model(X_test_tensor)
    y_pred = (y_pred > 0).float()  # logit > 0 is equivalent to probability > 0.5
    accuracy = (y_pred.view(-1) == y_test_tensor).float().mean().item()
print(f'Accuracy: {accuracy:.4f}')
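The thresholding step can be checked in isolation. The sketch below uses a handful of made-up logits and labels (not model outputs) to show that thresholding logits at 0 gives the same predictions as thresholding sigmoid probabilities at 0.5, and then computes accuracy the same way as above.

```python
import torch

# Made-up logits and labels, for illustration only
logits = torch.tensor([-1.5, -0.1, 0.2, 4.0])
labels = torch.tensor([0.0, 1.0, 1.0, 1.0])

pred_from_logits = (logits > 0).float()
pred_from_probs = (torch.sigmoid(logits) > 0.5).float()

# Fraction of predictions that match the labels
accuracy = (pred_from_logits == labels).float().mean().item()

print(pred_from_logits, pred_from_probs, accuracy)
```

Here the second sample is misclassified, so three of the four predictions are correct.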
4. Conclusion
In this lecture, we examined the basic concepts of deep learning and the process of building a simple neural network model using PyTorch. Deep learning can be applied to various fields, and more complex models require deeper structures and diverse techniques. In the next lecture, we will learn about more complex deep learning architectures such as CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks).