Advances in artificial intelligence and machine learning have brought innovation to nearly every area of our lives. Among the techniques driving this progress, GANs (Generative Adversarial Networks) and RNNs (Recurrent Neural Networks) stand out as particularly powerful deep learning methods.
In this article, we will implement a GAN model using PyTorch and look in detail at how to collect and prepare training data for an RNN.
1. What is GAN?
A GAN is a training framework in which two neural networks, a Generator and a Discriminator, compete with each other.
The Generator produces data that resembles real data, while the Discriminator judges whether a given sample is real or generated (fake).
GANs are used in many fields, such as image generation, video creation, and music generation.
2. Structure of GAN
GAN consists of two parts:
- Generator: Generates new data based on a given random vector.
- Discriminator: Distinguishes between real data and fake data generated by the Generator.
The two networks compete to improve each other’s performance, and through this process the Generator learns to produce higher-quality data.
3. Learning Process of GAN
The learning process of GAN generally includes the following steps:
- (1) Generate random noise and input it into the Generator.
- (2) The Generator generates fake data.
- (3) The Discriminator receives real and fake data and outputs predictions for each.
- (4) The Discriminator’s weights are updated to better separate real from fake data, and the Generator’s weights are updated so that its fake data is more likely to be judged real (see the objective below).
- (5) Repeat this process until training is complete.
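Formally, the Generator G and Discriminator D play a minimax game over the value function from the original GAN formulation:

\[
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
\]

In the implementation below, the Generator is trained with the common non-saturating variant: rather than minimizing log(1 − D(G(z))), it minimizes the binary cross-entropy of its fake samples against “real” labels, which gives stronger gradients early in training.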
4. PyTorch Implementation of GAN
Environment Setup
First, you need to install the PyTorch library. Run the command below to install it.
pip install torch torchvision
GAN Code Example Using PyTorch
Below is a simple implementation example of GAN. We will create a model that generates handwritten digits using the MNIST dataset.
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
from torchvision import datasets
from torch.utils.data import DataLoader
# Hyperparameters
latent_size = 64
batch_size = 100
learning_rate = 0.0002
num_epochs = 200
# Load dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
mnist = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
data_loader = DataLoader(dataset=mnist, batch_size=batch_size, shuffle=True)
# Define Generator class
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(latent_size, 256),
            nn.ReLU(True),
            nn.Linear(256, 512),
            nn.ReLU(True),
            nn.Linear(512, 1024),
            nn.ReLU(True),
            nn.Linear(1024, 784),
            nn.Tanh()
        )

    def forward(self, x):
        return self.main(x)
# Define Discriminator class
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(784, 1024),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(1024, 512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.main(x)
# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
generator = Generator().to(device)
discriminator = Discriminator().to(device)
criterion = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=learning_rate)
optimizer_D = optim.Adam(discriminator.parameters(), lr=learning_rate)
# Start training
for epoch in range(num_epochs):
    for i, (images, _) in enumerate(data_loader):
        # Real data and labels
        real_images = images.view(-1, 28*28).to(device)
        real_labels = torch.ones(batch_size, 1).to(device)

        # Fake data and labels
        noise = torch.randn(batch_size, latent_size).to(device)
        fake_images = generator(noise)
        fake_labels = torch.zeros(batch_size, 1).to(device)

        # Discriminator training
        optimizer_D.zero_grad()
        outputs_real = discriminator(real_images)
        outputs_fake = discriminator(fake_images.detach())
        loss_D_real = criterion(outputs_real, real_labels)
        loss_D_fake = criterion(outputs_fake, fake_labels)
        loss_D = loss_D_real + loss_D_fake
        loss_D.backward()
        optimizer_D.step()

        # Generator training
        optimizer_G.zero_grad()
        outputs = discriminator(fake_images)
        loss_G = criterion(outputs, real_labels)
        loss_G.backward()
        optimizer_G.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss D: {loss_D.item():.4f}, Loss G: {loss_G.item():.4f}")

    if (epoch+1) % 10 == 0:
        # Code to save results can be added here
        pass
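One way to fill in that placeholder is to save a grid of generated samples every 10 epochs with torchvision.utils.save_image. The snippet below is a minimal sketch, not part of the original example; the 8×8 grid size and the output file name are arbitrary choices, and the code is meant to sit inside the if (epoch+1) % 10 == 0: block.

from torchvision.utils import save_image

with torch.no_grad():
    # Sample 64 images from the Generator and reshape to (N, 1, 28, 28)
    sample_noise = torch.randn(64, latent_size).to(device)
    samples = generator(sample_noise).view(-1, 1, 28, 28)
    # Undo the Normalize((0.5,), (0.5,)) transform so pixel values are back in [0, 1]
    samples = samples * 0.5 + 0.5
    save_image(samples, f"samples_epoch_{epoch+1}.png", nrow=8)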
5. Introduction to RNN (Recurrent Neural Network)
An RNN is a neural network architecture suited to ordered data, i.e., sequence data such as text, music, and time series.
It works by carrying a hidden state that summarizes previous inputs and updating that state at each step based on the current input.
Structure of RNN
RNN consists of the following components:
- Input Layer: The first layer of the model that receives sequence data.
- Hidden Layer: Remembers previous states and combines them with the current input to produce outputs.
- Output Layer: The layer that generates the final output.
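Concretely, the Hidden Layer and Output Layer above implement the following recurrence, where x_t is the input at step t, h_t is the hidden state, and tanh is the default nonlinearity used by PyTorch's nn.RNN:

\[
h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h), \qquad y_t = W_{hy} h_t + b_y
\]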
6. Collecting Training Data for RNN
To train an RNN, appropriate training data is required. Here, we will explain the process of collecting and preprocessing text data.
6.1 Data Collection
A wide variety of data can be used to train an RNN; for text, this includes movie reviews, novels, and news articles.
Such data can be collected with web scraping tools (e.g., requests and BeautifulSoup), as in the example below.
import requests
from bs4 import BeautifulSoup

url = 'https://example.com/articles'  # Change to the desired URL
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

articles = []
# Note: the 'article', 'h2', and 'p' selectors are examples; adjust them to the target site's HTML.
for item in soup.find_all('article'):
    title_tag = item.find('h2')
    content_tag = item.find('p')
    if title_tag is None or content_tag is None:
        continue  # Skip items that do not match the expected structure
    articles.append(f"{title_tag.text}\n{content_tag.text}")

with open('data.txt', 'w', encoding='utf-8') as f:
    for article in articles:
        f.write(article + "\n\n")
6.2 Data Preprocessing
The collected data must be preprocessed before it can be used as input to the RNN model. A typical preprocessing pipeline includes:
- Lowercasing
- Removing special characters and numbers
- Removing stop words
import re
import nltk
from nltk.corpus import stopwords

# Download NLTK's list of stopwords
nltk.download('stopwords')
stop_words = set(stopwords.words('english'))

def preprocess_text(text):
    # Lowercasing
    text = text.lower()
    # Remove special characters and numbers
    text = re.sub(r'[^a-z\s]', '', text)
    # Remove stop words
    text = ' '.join([word for word in text.split() if word not in stop_words])
    return text

# Apply preprocessing
preprocessed_articles = [preprocess_text(article) for article in articles]
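The RNN example in the next section expects each text as a fixed-length sequence of integer word indices rather than a raw string. The following is a minimal sketch of that encoding step; the vocabulary size of 1000 matches the input_size used below, while the sequence length of 20 and the encode() helper are illustrative choices, not part of the original example.

from collections import Counter

# Build a vocabulary of the most frequent words (index 0 is reserved for padding/unknown words)
vocab_size = 1000
max_len = 20
counter = Counter(word for article in preprocessed_articles for word in article.split())
vocab = {word: i + 1 for i, (word, _) in enumerate(counter.most_common(vocab_size - 1))}

def encode(text):
    # Map words to indices, then pad or truncate to a fixed length
    ids = [vocab.get(word, 0) for word in text.split()][:max_len]
    return ids + [0] * (max_len - len(ids))

encoded_articles = [encode(article) for article in preprocessed_articles]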
7. RNN Model Implementation Example
Environment Setup
pip install torch torchvision nltk
RNN Code Example Using PyTorch
Below is a simple RNN model implementation example. It processes text data using word embedding.
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

# Define RNN model
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (batch, seq_len) integer word indices
        x = self.embedding(x)               # (batch, seq_len, hidden_size)
        output, hidden = self.rnn(x)        # output: (batch, seq_len, hidden_size)
        output = self.fc(output[:, -1, :])  # classify from the last time step
        return output

# Create training dataset
class TextDataset(Dataset):
    def __init__(self, texts, labels):
        self.texts = texts
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return torch.tensor(self.texts[idx]), torch.tensor(self.labels[idx])

# Set hyperparameters
input_size = 1000   # Vocabulary size (number of words)
hidden_size = 128
output_size = 2     # Number of classes to classify (e.g., positive/negative)
num_epochs = 20
learning_rate = 0.001

# Load and preprocess data
# Here replaced with dummy data.
texts = [...]   # Preprocessed text data (lists of word indices of equal length)
labels = [...]  # Corresponding class labels

dataset = TextDataset(texts, labels)
data_loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Initialize model
model = RNN(input_size, hidden_size, output_size)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Start training
for epoch in range(num_epochs):
    for batch_texts, batch_labels in data_loader:
        optimizer.zero_grad()
        outputs = model(batch_texts)
        loss = criterion(outputs, batch_labels)
        loss.backward()
        optimizer.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")
8. Conclusion
In this article, we learned the basic principles and implementation examples of GAN and RNN using PyTorch.
We walked through generating image data with a GAN, and collecting, preprocessing, and classifying text data with an RNN.
These technologies will continue to evolve and be used in more fields.
I encourage you to start new projects using these technologies.