Deep Learning GAN using PyTorch, Question-Answer Generator

In recent years, rapid progress in artificial intelligence (AI) has greatly advanced the field of Natural Language Processing (NLP). Generative Adversarial Networks (GANs) in particular are a powerful technique for creating new data samples. In this post, we discuss how to implement a GAN in PyTorch and walk through the process of building a question-answer generator.

1. Overview of GAN

Generative Adversarial Networks (GANs) are a machine learning framework introduced by Ian Goodfellow and colleagues in 2014, in which two neural networks, the Generator and the Discriminator, are trained in competition with each other.

  • Generator: Responsible for generating fake data. It takes random noise as input and generates samples that resemble real data.
  • Discriminator: Responsible for determining whether the given data is real data or fake data created by the generator.

These two networks compete with each other to achieve their respective goals, ultimately leading the generator to produce more sophisticated data and the discriminator to differentiate it more accurately.

2. Mathematical Principles of GAN

Training a GAN involves defining and optimizing a loss function for each of the two networks. The discriminator minimizes L(D) and the generator minimizes L(G):

        L(D) = -E[log(D(x))] - E[log(1 - D(G(z)))]
        L(G) = -E[log(D(G(z)))]
    

Where:

  • D(x): The discriminator's estimated probability that the input x is real data
  • G(z): The fake data generated by the generator from the random vector z
  • E[…]: Expected value
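
As a minimal sketch of how these two losses map onto code, the snippet below evaluates them on made-up discriminator outputs and checks that they agree with PyTorch's binary cross-entropy using targets of 1 (real) and 0 (fake). The tensors `d_real` and `d_fake` are hypothetical stand-ins for D(x) and D(G(z)), not outputs of a trained model.

```python
import torch
import torch.nn.functional as F

# Hypothetical discriminator outputs (probabilities), standing in for D(x) and D(G(z))
d_real = torch.tensor([0.9, 0.8, 0.7])
d_fake = torch.tensor([0.2, 0.1, 0.4])

# L(D) = -E[log(D(x))] - E[log(1 - D(G(z)))]
loss_d = -torch.log(d_real).mean() - torch.log(1 - d_fake).mean()

# L(G) = -E[log(D(G(z)))]
loss_g = -torch.log(d_fake).mean()

# The same quantities expressed through binary cross-entropy
bce_d = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
         + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
bce_g = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))

print(loss_d.item(), bce_d.item())
print(loss_g.item(), bce_g.item())
```

This equivalence is why a binary cross-entropy criterion with "real" labels can serve as the generator loss: the generator is rewarded when the discriminator assigns a high probability to its samples.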

3. Overview of Question-Answer Generator

Using the GAN model, we can implement a question-answer generator in the field of natural language processing. The goal of this system is to generate questions and answers based on given context.

Now, we will explore how to create a question-answer generator using the basic structure of GAN.

4. Setting Up the PyTorch Environment

First, we need to install the PyTorch library. You can install PyTorch using the command below.

pip install torch torchvision
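
A quick way to confirm that the installation worked is to import the library and print its version. The sketch below also selects a device, assuming you may want to run on a GPU when one is visible; the `device` variable is an illustrative name and is not used elsewhere in this post.

```python
import torch

# Print the installed version and whether a CUDA-capable GPU is visible
print(torch.__version__)
print(torch.cuda.is_available())

# Fall back to the CPU when no GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```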

5. Preparing the Dataset

To create a question-answer generator, we first need to prepare a dataset. In this example, we will utilize a simple public dataset. We will use data that consists of pairs of questions and answers.

Example of the dataset:

  • Question: “What is Python?”
  • Answer: “Python is a high-level programming language.”
  • Question: “What is deep learning?”
  • Answer: “Deep learning is a machine learning technique based on artificial neural networks.”
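
Text has to be turned into numbers before a neural network can consume it. The following is a minimal sketch of one way to do that, assuming simple whitespace tokenization and fixed-length padding; `tokenize`, `build_vocab`, and `encode` are hypothetical helper names introduced here for illustration and do not come from any library.

```python
# Example pairs taken from the dataset description above
qa_pairs = [
    ("What is Python?", "Python is a high-level programming language."),
    ("What is deep learning?",
     "Deep learning is a machine learning technique based on artificial neural networks."),
]

PAD, UNK = "<pad>", "<unk>"

def tokenize(text):
    # Detach basic punctuation, lowercase, and split on whitespace
    for mark in "?.,":
        text = text.replace(mark, f" {mark} ")
    return text.lower().split()

def build_vocab(pairs):
    # Assign each distinct token an integer id; ids 0 and 1 are reserved
    vocab = {PAD: 0, UNK: 1}
    for question, answer in pairs:
        for token in tokenize(question) + tokenize(answer):
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab, max_len):
    # Map tokens to ids, truncate to max_len, then pad with the PAD id
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokenize(text)][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))

vocab = build_vocab(qa_pairs)
encoded = [(encode(q, vocab, 8), encode(a, vocab, 16)) for q, a in qa_pairs]
print(encoded[0])
```

Note that the networks defined later in this post operate on very small numeric vectors, so integer sequences like these would still need to be mapped to a fixed-size representation that matches the models' input and output widths.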

6. Implementing the GAN Model

Now, let’s define the GAN architecture. In this simplified implementation, the generator maps a random noise vector to a generated answer representation, and the discriminator estimates whether a given answer representation is real or generated.

6.1 Implementing the Generator


import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(100, 128),
            nn.ReLU(),
            nn.Linear(128, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 2048),
            nn.ReLU(),
            nn.Linear(2048, 1)  # Output layer: 1-dimensional representation of the generated answer
        )
        
    def forward(self, z):
        return self.net(z)
    

6.2 Implementing the Discriminator


class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
            nn.Sigmoid()  # Output layer: probability between 0 and 1 that the input is real
        )
        
    def forward(self, x):
        return self.net(x)
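
The two networks above are unconditional: the generator sees only noise and the discriminator sees only an answer. For a question-answer setting, one common option is to condition both networks on the question. The sketch below shows that idea under stated assumptions: the class names, the layer sizes, and the 32-dimensional question and answer vectors are all hypothetical choices for illustration, not part of the implementation above.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Hypothetical variant: builds an answer vector from noise plus a question vector."""
    def __init__(self, noise_dim=100, question_dim=32, answer_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + question_dim, 128),
            nn.ReLU(),
            nn.Linear(128, answer_dim),
        )

    def forward(self, z, question):
        # Concatenate noise and question along the feature dimension
        return self.net(torch.cat([z, question], dim=1))

class ConditionalDiscriminator(nn.Module):
    """Hypothetical variant: scores a (question, answer) pair as real or generated."""
    def __init__(self, question_dim=32, answer_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(question_dim + answer_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
            nn.Sigmoid(),
        )

    def forward(self, question, answer):
        return self.net(torch.cat([question, answer], dim=1))

# Shape check with random stand-in data (not real encoded questions)
questions = torch.randn(4, 32)
z = torch.randn(4, 100)
fake_answers = ConditionalGenerator()(z, questions)
scores = ConditionalDiscriminator()(questions, fake_answers)
print(fake_answers.shape, scores.shape)
```

Because the discriminator judges the pair rather than the answer alone, the generator is pushed toward answers that fit the specific question it was given.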
    

7. Training Process of GAN

Now we are ready to train the GAN model. We will use the question-answer pairs as training data, where the generator receives random noise as input to generate answers, and the discriminator differentiates between real answers and generated answers.
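
The training loop relies on a `dataloader` that yields batches of questions and answers. As a minimal sketch, assuming you only want to check that the loop runs end to end, the snippet below builds one from synthetic placeholder tensors. The data here is random, not real question-answer pairs, and the 8-dimensional question vectors are an arbitrary choice; each answer is a single float so that it matches the Discriminator's 1-dimensional input.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic placeholder data, NOT real question-answer pairs
num_samples = 256
questions = torch.randn(num_samples, 8)
answers = torch.randn(num_samples, 1) * 0.5 + 2.0

dataset = TensorDataset(questions, answers)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

# Each iteration yields a (questions, answers) batch
batch_questions, batch_answers = next(iter(dataloader))
print(batch_questions.shape, batch_answers.shape)
```

If the number of samples is not a multiple of the batch size, the final batch is smaller than the others; passing `drop_last=True` to `DataLoader` is one way to avoid shape mismatches with fixed-size label tensors.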


import torch.optim as optim

# Hyperparameters
num_epochs = 100
batch_size = 64
learning_rate = 0.0002

# Initialize models
generator = Generator()
discriminator = Discriminator()

# Loss and Optimizers
criterion = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=learning_rate)
optimizer_D = optim.Adam(discriminator.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
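    # `dataloader` is assumed to yield (questions, answers) batches,
    # with answers as float tensors of shape (batch, 1)
    for i, (questions, answers) in enumerate(dataloader):
        real_answers = answers
        current_batch = real_answers.size(0)  # the last batch may be smaller than batch_size

        # Generate random noise
        z = torch.randn(current_batch, 100)

        # Generate fake answers
        fake_answers = generator(z)

        # Create labels
        real_labels = torch.ones(current_batch, 1)
        fake_labels = torch.zeros(current_batch, 1)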

        # Train Discriminator
        optimizer_D.zero_grad()
        outputs = discriminator(real_answers)
        d_loss_real = criterion(outputs, real_labels)
        
        outputs = discriminator(fake_answers.detach())
        d_loss_fake = criterion(outputs, fake_labels)
        d_loss = d_loss_real + d_loss_fake
        d_loss.backward()
        optimizer_D.step()

        # Train Generator
        optimizer_G.zero_grad()
        outputs = discriminator(fake_answers)
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        optimizer_G.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], d_loss: {d_loss.item()}, g_loss: {g_loss.item()}')
    

8. Results and Performance Evaluation

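Once training is complete, the generator has learned to produce samples that resemble the real answers. Note that because the generator above receives only random noise, its output is not conditioned on a particular question; learning the conditional distribution of answers given a question requires feeding the question to both networks as well. To evaluate the results, compare the generated texts with real question-answer pairs. Various metrics can be employed, such as the BLEU score, which is widely used to evaluate machine translation and other text generation tasks.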
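
To make the idea of BLEU concrete, here is a simplified sentence-level version written in plain Python. It is a sketch only: it assumes a single reference, whitespace tokenization, n-grams up to length 2, and no smoothing, so it is not a replacement for an established implementation. The function name `simple_bleu` is hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(reference, candidate, max_n=2):
    """Simplified sentence-level BLEU: one reference, no smoothing (illustrative only)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty
    return bp * math.exp(sum(log_precisions) / max_n)

reference = "Python is a high-level programming language."
print(simple_bleu(reference, reference))
print(simple_bleu(reference, "Python is a programming language."))
print(simple_bleu(reference, "Completely unrelated words here"))
```

An exact match scores 1, a candidate with no overlapping words scores 0, and a partially matching answer falls in between.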

9. Conclusion

In this post, we explored how to implement a GAN-based question-answer generator using PyTorch. GANs are a powerful tool for generating new data that resembles real-world samples, including simple data pairs such as questions and answers. Further research into improving GANs and applying them across different applications remains important.
