Using PyTorch for GAN Deep Learning, First GAN

Generative Adversarial Networks (GANs) are an innovative deep learning model proposed by Ian Goodfellow in 2014, where two neural networks learn in opposition to each other. GANs are widely used in various fields such as image generation, text generation, and video generation. In this post, we will explain the basic concepts and implementation methods of GANs step by step using PyTorch.

1. Basic Concepts of GAN

GAN consists of two neural networks: the Generator and the Discriminator. The role of the Generator is to create data that looks real, and the Discriminator’s role is to determine whether the given data is real or fake data produced by the Generator. These two networks learn simultaneously; the Generator evolves to create increasingly sophisticated data to fool the Discriminator, while the Discriminator evolves to identify fake data more accurately.

1.1 Structure of GAN

The structure of GAN can be described simply as follows:

  • Generator: Accepts random noise as input and generates data that looks real.
  • Discriminator: Accepts real data and generated fake data as input and predicts whether each piece of data is real or not.

1.2 Learning Process of GAN

The learning process of GAN proceeds as follows:

  1. The Generator (G) takes random noise as input and creates fake data.
  2. The generated data and real data are input to the Discriminator (D), and predictions for each data are obtained.
  3. The loss function of the Generator is set to maximize the probability that the Discriminator judges fake data as real.
  4. The loss function of the Discriminator is set to maximize the probability that it judges real data as real and fake data as fake.
  5. This process is repeated so that both networks compete with each other, improving their performance.
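
Formally, this adversarial game corresponds to the minimax objective from the original GAN paper:

    min_G max_D V(D, G) = E[log(D(x))] + E[log(1 - D(G(z)))]

where D(x) is the Discriminator's output for real data x and G(z) is the fake data generated from noise z.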

2. Implementing GAN using PyTorch

Now, let’s implement a simple GAN using PyTorch. Here, we will work on creating a GAN that generates handwritten digit images using the MNIST dataset.

2.1 Environment Setup

First, we install and import the necessary libraries. We will use PyTorch and torchvision to load the dataset and build the model.

    
    !pip install torch torchvision matplotlib
    
    

2.2 Preparing the Dataset

We will load the MNIST dataset and perform data preprocessing. This will normalize the image data to the range -1 to 1 (matching the Tanh output of the Generator defined below) and divide it into batches.

    
    import torch
    from torchvision import datasets, transforms
    from torch.utils.data import DataLoader

    # Load and preprocess the data
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))
    ])

    dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
    dataloader = DataLoader(dataset, batch_size=64, shuffle=True)
    
    

2.3 Defining the Generator and Discriminator

Next, we will define the two key components of GAN: the Generator and the Discriminator. Here, the Generator takes random noise as input to generate images, and the Discriminator takes images as input to determine whether they are real or fake.

    
    import torch.nn as nn

    class Generator(nn.Module):
        def __init__(self):
            super(Generator, self).__init__()
            self.model = nn.Sequential(
                nn.Linear(100, 256),
                nn.ReLU(),
                nn.Linear(256, 512),
                nn.ReLU(),
                nn.Linear(512, 1024),
                nn.ReLU(),
                nn.Linear(1024, 28 * 28),
                nn.Tanh()
            )

        def forward(self, z):
            return self.model(z).view(-1, 1, 28, 28)

    class Discriminator(nn.Module):
        def __init__(self):
            super(Discriminator, self).__init__()
            self.model = nn.Sequential(
                nn.Flatten(),
                nn.Linear(28 * 28, 512),
                nn.LeakyReLU(0.2),
                nn.Linear(512, 256),
                nn.LeakyReLU(0.2),
                nn.Linear(256, 1),
                nn.Sigmoid()
            )

        def forward(self, img):
            return self.model(img)
    
    

2.4 Initializing the Model and Setting Loss Function, Optimizer

We will initialize the Generator and Discriminator and specify the loss function and optimizers. We will use binary cross-entropy (BCELoss) and the Adam optimizer.

    
    generator = Generator()
    discriminator = Discriminator()

    ad = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))  # Discriminator optimizer
    ag = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))  # Generator optimizer

    criterion = nn.BCELoss()
    
    

2.5 Training the GAN

Now, let’s train the GAN. During each epoch, we train the Generator and Discriminator, and we can see the generated images.

    
    import matplotlib.pyplot as plt
    import numpy as np

    def train_gan(generator, discriminator, criterion, ag, ad, dataloader, epochs=50):
        for epoch in range(epochs):
            for real_imgs, _ in dataloader:
                batch_size = real_imgs.size(0)

                # Create labels and generate fake images from noise
                real_labels = torch.ones(batch_size, 1)
                noise = torch.randn(batch_size, 100)
                fake_imgs = generator(noise)
                fake_labels = torch.zeros(batch_size, 1)

                # Train the Discriminator
                discriminator.zero_grad()
                real_loss = criterion(discriminator(real_imgs), real_labels)
                fake_loss = criterion(discriminator(fake_imgs.detach()), fake_labels)
                d_loss = real_loss + fake_loss
                d_loss.backward()
                ad.step()

                # Train the Generator
                generator.zero_grad()
                g_loss = criterion(discriminator(fake_imgs), real_labels)
                g_loss.backward()
                ag.step()

            print(f'Epoch [{epoch + 1}/{epochs}], D Loss: {d_loss.item():.4f}, G Loss: {g_loss.item():.4f}')

            # Save generated images
            if (epoch + 1) % 10 == 0:
                save_generated_images(generator, epoch + 1)

    def save_generated_images(generator, epoch):
        noise = torch.randn(64, 100)
        generated_imgs = generator(noise)
        generated_imgs = generated_imgs.detach().numpy()
        generated_imgs = (generated_imgs + 1) / 2  # Rescale to [0, 1]

        fig, axs = plt.subplots(8, 8, figsize=(8, 8))
        for i, ax in enumerate(axs.flat):
            ax.imshow(generated_imgs[i][0], cmap='gray')
            ax.axis('off')
        plt.savefig(f'generated_images_epoch_{epoch}.png')
        plt.close()

    train_gan(generator, discriminator, criterion, ag, ad, dataloader, epochs=50)
    
    

2.6 Checking the Results

After training is completed, check the generated images. As training iterates, the GAN becomes capable of generating images that increasingly resemble real ones. Ultimately, the performance of a GAN is judged by the quality of the generated images; if training goes well, the generated samples will look like plausible handwritten digits.
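
For a final visual check, the save_generated_images helper from section 2.5 can be reused after training. Passing a string tag as the epoch argument is a small assumption here; it only affects the output filename.

    
    save_generated_images(generator, 'final')  # writes generated_images_epoch_final.png
    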

3. Conclusion

In this post, we explained how to implement GANs using PyTorch. I hope you were able to experience creating your own GAN along with the basic concepts of GANs and actual code. GANs are powerful tools, but building a robust model requires diverse and in-depth research. We invite you into the world of GANs that create beautiful and creative results!

Using PyTorch for GAN Deep Learning, First CycleGAN

One of the innovative advancements in artificial intelligence is the emergence of Generative Adversarial Networks (GANs). GAN consists of a structure where two neural networks compete with each other, comprising a Generator and a Discriminator. This article will explore a variant of GAN called CycleGAN and provide a detailed explanation of how to implement it using PyTorch.

1. Basic Concept of GAN

GAN is a model proposed by Ian Goodfellow in 2014 that operates by having two networks learn adversarially. The generator creates data, while the discriminator determines whether the data is real or fake. In this process, the generator progressively improves to produce data that can deceive the discriminator.

2. Introduction to CycleGAN

CycleGAN is a variant of GAN that learns image transformation between two domains. For example, it can perform tasks such as converting summer landscape images into winter landscape images. CycleGAN has a significant advantage in that it can learn mappings between two domains without paired training data.

Its main feature is a structure consisting of two generators and two discriminators: each generator transforms images from one domain to the other, and cycle consistency is maintained by transforming a generated image back to its original domain.

3. Basic Idea of CycleGAN

The basic idea of CycleGAN is as follows:

  • Assume there are two domains, A and B, each containing images with different characteristics.
  • Generator G transforms images from domain A to domain B.
  • Generator F transforms images from domain B to domain A.
  • To maintain cycle consistency, when an image from A is transformed to B and back to A, it should be similar to the original image. This principle is referred to as “Cycle Consistency Loss”; a code sketch follows this list.
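
The sketch below shows one way to compute this loss. It is a minimal illustration, assuming G and F are the two generator networks; following the original paper, the L1 distance is used, weighted by a coefficient lambda_cyc (the paper uses 10).

import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G, F, real_A, real_B, lambda_cyc=10.0):
    # A -> B -> A should reconstruct the original image from domain A
    recovered_A = F(G(real_A))
    # B -> A -> B should reconstruct the original image from domain B
    recovered_B = G(F(real_B))
    return lambda_cyc * (l1(recovered_A, real_A) + l1(recovered_B, real_B))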

4. Loss Function for CycleGAN

The loss function for CycleGAN is structured as follows:

  • Main Loss
    • Adversarial Loss: A loss used to determine whether the generated image is real or fake.
    • Cycle Consistency Loss: A loss used to verify if the transformed image can return to the original image.

Using the generators and discriminators for the two domains, the loss function is calculated. The final loss function is defined as a weighted sum of Adversarial Loss and Cycle Consistency Loss.
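
Written out, the full objective from the CycleGAN paper takes the form:

    L(G, F, D_A, D_B) = L_GAN(G, D_B, A, B) + L_GAN(F, D_A, B, A) + lambda * L_cyc(G, F)

where lambda controls the relative weight of the cycle consistency term.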

5. Implementing CycleGAN: PyTorch Example

Now, let’s implement CycleGAN using PyTorch. The project structure will be organized as follows:

  • data/
    • trainA/ (Images from domain A)
    • trainB/ (Images from domain B)
  • models.py
  • train.py
  • utils.py

5.1 Data Loading

To train CycleGAN, we will first write the code to load the data. We will prepare the data using PyTorch’s Dataset and DataLoader.
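
A minimal sketch of such a loader is shown below, assuming the data/trainA and data/trainB folders from the project structure above. The class name UnpairedImageDataset, the 256-pixel resize, and the random sampling of domain B images are illustrative choices, not fixed parts of CycleGAN.

import os
import random
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class UnpairedImageDataset(Dataset):
    def __init__(self, root='data', transform=None):
        self.root = root
        self.files_A = sorted(os.listdir(os.path.join(root, 'trainA')))
        self.files_B = sorted(os.listdir(os.path.join(root, 'trainB')))
        self.transform = transform or transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ])

    def __len__(self):
        return max(len(self.files_A), len(self.files_B))

    def __getitem__(self, idx):
        # Index into A cyclically, and sample B at random so A/B pairs stay unaligned
        img_A = Image.open(os.path.join(self.root, 'trainA', self.files_A[idx % len(self.files_A)])).convert('RGB')
        img_B = Image.open(os.path.join(self.root, 'trainB', random.choice(self.files_B))).convert('RGB')
        return self.transform(img_A), self.transform(img_B)

dataloader = DataLoader(UnpairedImageDataset('data'), batch_size=1, shuffle=True)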

Deep Learning GAN using PyTorch, Question-Answer Generator

In recent years, the rapid development of artificial intelligence (AI) technology has greatly improved the field of Natural Language Processing (NLP). In particular, Generative Adversarial Networks (GAN) are a powerful technique used to create new data samples. In this post, we will discuss how to implement GAN using PyTorch and the process of creating a question-answer generator.

1. Overview of GAN

Generative Adversarial Networks (GAN) are a machine learning framework introduced by Ian Goodfellow in 2014, where two neural networks, the Generator and the Discriminator, are trained in a competitive manner.

  • Generator: Responsible for generating fake data. It takes random noise as input and generates samples that resemble real data.
  • Discriminator: Responsible for determining whether the given data is real data or fake data created by the generator.

These two networks compete with each other to achieve their respective goals, ultimately leading the generator to produce more sophisticated data and the discriminator to differentiate it more accurately.

2. Mathematical Principles of GAN

The training process of GAN involves defining and optimizing the loss functions of the two networks. Each network has the following objective functions:

        L(D) = -E[log(D(x))] - E[log(1 - D(G(z)))]
        L(G) = -E[log(D(G(z)))]
    

Where:

  • D(x): The probability that the discriminator correctly classifies the real data x
  • G(z): The fake data generated by the generator from the random vector z
  • E[…]: Expected value

3. Overview of Question-Answer Generator

Using the GAN model, we can implement a question-answer generator in the field of natural language processing. The goal of this system is to generate questions and answers based on given context.

Now, we will explore how to create a question-answer generator using the basic structure of GAN.

4. Setting Up the PyTorch Environment

First, we need to install the PyTorch library. You can install PyTorch using the command below.

pip install torch torchvision

5. Preparing the Dataset

To create a question-answer generator, we first need to prepare a dataset. In this example, we will utilize a simple public dataset. We will use data that consists of pairs of questions and answers.

Example of the dataset:

  • Question: “What is Python?”
  • Answer: “Python is a high-level programming language.”
  • Question: “What is deep learning?”
  • Answer: “Deep learning is a machine learning technique based on artificial neural networks.”
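
Since the training loop in section 7 expects a dataloader of numeric tensors, below is a minimal sketch of how such pairs might be turned into one. The encode helper is a hypothetical placeholder; a real system would use a tokenizer and embedding layer instead.

import torch
from torch.utils.data import TensorDataset, DataLoader

pairs = [
    ("What is Python?", "Python is a high-level programming language."),
    ("What is deep learning?", "Deep learning is a machine learning technique based on artificial neural networks."),
]

def encode(text):
    # Hypothetical stand-in encoder: mean character code, scaled down to roughly [0, 1]
    codes = torch.tensor([ord(c) for c in text], dtype=torch.float32)
    return (codes.mean() / 128.0).unsqueeze(0)

questions = torch.stack([encode(q) for q, _ in pairs])
answers = torch.stack([encode(a) for _, a in pairs])
dataloader = DataLoader(TensorDataset(questions, answers), batch_size=64, shuffle=True)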

6. Implementing the GAN Model

Now, let’s define the GAN architecture. The generator takes questions as input to generate answers, and the discriminator determines whether the generated answers are real data.

6.1 Implementing the Generator


import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(100, 128),
            nn.ReLU(),
            nn.Linear(128, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 2048),
            nn.ReLU(),
            nn.Linear(2048, 1)  # Output layer: a single scalar standing in for an encoded answer in this toy setup
        )
        
    def forward(self, z):
        return self.net(z)
    

6.2 Implementing the Discriminator


class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
            nn.Sigmoid()  # Output layer: probability (0 or 1)
        )
        
    def forward(self, x):
        return self.net(x)
    

7. Training Process of GAN

Now we are ready to train the GAN model. We will use the question-answer pairs as training data, where the generator receives random noise as input to generate answers, and the discriminator differentiates between real answers and generated answers.


import torch.optim as optim

# Hyperparameters
num_epochs = 100
batch_size = 64
learning_rate = 0.0002

# Initialize models
generator = Generator()
discriminator = Discriminator()

# Loss and Optimizers
criterion = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=learning_rate)
optimizer_D = optim.Adam(discriminator.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    for questions, answers in dataloader:
        # dataloader yields batches of numerically encoded (question, answer) pairs
        real_answers = answers.view(-1, 1).float()
        batch_size = real_answers.size(0)  # handle a possibly smaller final batch

        # Generate random noise
        z = torch.randn(batch_size, 100)

        # Generate fake answers
        fake_answers = generator(z)

        # Create labels
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # Train Discriminator
        optimizer_D.zero_grad()
        outputs = discriminator(real_answers)
        d_loss_real = criterion(outputs, real_labels)
        
        outputs = discriminator(fake_answers.detach())
        d_loss_fake = criterion(outputs, fake_labels)
        d_loss = d_loss_real + d_loss_fake
        d_loss.backward()
        optimizer_D.step()

        # Train Generator
        optimizer_G.zero_grad()
        outputs = discriminator(fake_answers)
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        optimizer_G.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], d_loss: {d_loss.item()}, g_loss: {g_loss.item()}')
    

8. Results and Performance Evaluation

Once training is complete, the generator learns a conditional distribution from which answers for a given question can be generated. To evaluate the results, we need to compare the generated texts with real question-answer pairs. Various metrics can be employed, such as the BLEU score commonly used to evaluate natural language generation.
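
As a concrete example, a BLEU score can be computed with NLTK (assuming it is installed via pip install nltk); smoothing is applied because short sentences otherwise tend to score zero:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["python is a high-level programming language".split()]
candidate = "python is a programming language".split()
score = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")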

9. Conclusion

In this post, we explored how to implement a GAN-based question-answer generator using PyTorch. GANs are a powerful tool for generating new data samples, and it is important to continue advancing them and researching ways to apply them to various applications in the future.

Deep Learning with GAN using PyTorch, Structured Data and Unstructured Data

1. Overview of GAN

GAN (Generative Adversarial Networks) is an innovative deep learning model proposed by Ian Goodfellow in 2014, operating by having a generator model and a discriminator model compete against each other during training. The basic components of GAN are the generator and the discriminator. The generator tries to create data that is similar to real data, while the discriminator determines whether the given data is real or generated. Through the competition between these two models, the generator increasingly produces data that resembles real data.

2. Key Components of GAN

2.1 Generator

The generator is a neural network that takes a random noise vector as input and generates data similar to real data. This network typically uses either a multilayer perceptron or a convolutional neural network.

2.2 Discriminator

The discriminator is a neural network that judges whether the input data is real or generated. Its ability to effectively distinguish between fake data created by the generator and real data is crucial.

2.3 Loss Function

The loss function of GAN is divided into the loss of the generator and the loss of the discriminator. The goal of the generator is to deceive the discriminator, while the goal of the discriminator is to accurately distinguish the generated data.
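
Concretely, these two losses come from the standard minimax objective:

    min_G max_D V(D, G) = E[log(D(x))] + E[log(1 - D(G(z)))]

The discriminator maximizes this value, while the generator minimizes it.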

3. Training Process of GAN

The training process of GAN is iterative and consists of the following steps:

  1. Generate fake data through the generator from random noise.
  2. Input the fake data and real data into the discriminator.
  3. Calculate how well the discriminator distinguished between fake and real data and update the loss.
  4. Update the generator to deceive the discriminator.

4. Applications of GAN

GAN is used in various fields, including:

  • Image generation
  • Video generation
  • Voice synthesis
  • Text generation
  • Data augmentation

5. Difference Between Structured Data and Unstructured Data

Structured data is organized in a format that can easily be represented in a relational database. On the other hand, unstructured data refers to data without a unique form or structure, such as text, images, and videos. GAN is primarily used for unstructured data but can also be utilized for structured data.
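
To make the structured-data case concrete, below is a minimal sketch of a generator whose output is one row of a table. The four numeric columns and the layer sizes are illustrative assumptions, not part of a standard recipe.

import torch
import torch.nn as nn

# Illustrative: a generator that emits rows of a four-column numeric table
tabular_generator = nn.Sequential(
    nn.Linear(100, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 4),  # one output per table column
)

rows = tabular_generator(torch.randn(16, 100))  # a batch of 16 synthetic rows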

6. Example of GAN Implementation Using PyTorch

Below is a simple implementation example of GAN using PyTorch. In this example, we will generate handwritten digits using the MNIST dataset.

6.1 Setting Up the Environment

First, install and import the necessary libraries.

!pip install torch torchvision matplotlib

6.2 Preparing the Dataset

Load the MNIST dataset and transform it to a tensor.

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])

train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

6.3 Model Definition

Define the generator and discriminator models.

import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(100, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 784), # 28x28 images like MNIST
            nn.Tanh(),
        )

    def forward(self, x):
        return self.model(x)

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.model(x)

6.4 Setting Loss Function and Optimizer

Use BCELoss as the loss function and the Adam optimizer for training.

criterion = nn.BCELoss()
generator = Generator()
discriminator = Discriminator()

optimizer_G = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))

6.5 Model Training

Perform model training. At the end of each epoch, generated samples can be checked.

import matplotlib.pyplot as plt

num_epochs = 100
for epoch in range(num_epochs):
    for i, (real_images, _) in enumerate(train_loader):
        batch_size = real_images.size(0)
        
        # Label generation
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # Discriminator training
        optimizer_D.zero_grad()
        outputs = discriminator(real_images.view(batch_size, -1))
        d_loss_real = criterion(outputs, real_labels)
        d_loss_real.backward()

        noise = torch.randn(batch_size, 100)
        fake_images = generator(noise)
        outputs = discriminator(fake_images.detach())
        d_loss_fake = criterion(outputs, fake_labels)
        d_loss_fake.backward()
        optimizer_D.step()

        # Generator training
        optimizer_G.zero_grad()
        outputs = discriminator(fake_images)
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        optimizer_G.step()

    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], d_loss: {d_loss_real.item() + d_loss_fake.item():.4f}, g_loss: {g_loss.item():.4f}')
        # Check generated images
        with torch.no_grad():
            fake_images = generator(noise).view(-1, 1, 28, 28)
            grid = torchvision.utils.make_grid(fake_images, nrow=8, normalize=True)
            plt.imshow(grid.permute(1, 2, 0).numpy(), cmap='gray')
            plt.show()

6.6 Checking Results

As training progresses, the generator increasingly produces realistic handwritten digits, demonstrating the performance of GAN.

7. Conclusion

This article covered the overview of GAN, its components, training process, and implementation methods using PyTorch. GAN exhibits excellent performance in generating unstructured data and can be applied in various fields. With advancements in technology, the future possibilities are endless.

8. References

1. Ian Goodfellow et al., “Generative Adversarial Networks”, NeurIPS 2014.

2. PyTorch Documentation – https://pytorch.org/docs/stable/index.html

3. torchvision Documentation – https://pytorch.org/vision/stable/index.html

PyTorch-based GAN Deep Learning, Encoder-Decoder Model

Today, we will take a deep dive into the concepts of Generative Adversarial Networks (GAN) and Encoder-Decoder models. We will implement these two models using the PyTorch framework. GAN is a deep learning technique for generating data using two neural networks, while the Encoder-Decoder model is used to transform the structure of the data.

1. GAN (Generative Adversarial Networks)

GAN is a generative model proposed by Ian Goodfellow in 2014, primarily used for generation-related tasks. GAN consists of two main components: the Generator and the Discriminator. The Generator creates fake data, and the Discriminator determines whether the data is real or fake.

1.1 How GAN Works

The working principle of GAN can be summarized as follows:

  1. The Generator receives a random noise vector as input and generates fake data.
  2. The Discriminator compares the real data with the generated data to decide whether it’s real or fake.
  3. The Generator is continuously improved to fool the Discriminator.
  4. The Discriminator enhances its ability in response to the Generator’s improvements.

1.2 Mathematical Definition of GAN

The goal of GAN is to optimize the following two neural networks:

min_G max_D V(D, G) = E[log(D(x))] + E[log(1 - D(G(z)))].

Here, D(x) is the output of the Discriminator for real data, and G(z) is the fake data generated by the Generator.

2. Implementing GAN in PyTorch

2.1 Setting Up the Environment

!pip install torch torchvision

2.2 Preparing the Dataset

We will use the MNIST dataset to generate handwritten digits.

import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

2.3 Defining the GAN Model

import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(100, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 784),
            nn.Tanh()  # Pixel values for MNIST range from -1 to 1
        )

    def forward(self, z):
        return self.model(z)

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, img):
        return self.model(img)

2.4 Implementing the Training Loop

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

generator = Generator().to(device)
discriminator = Discriminator().to(device)

criterion = nn.BCELoss()
optimizer_G = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))

for epoch in range(50):
    for i, (imgs, _) in enumerate(train_loader):
        imgs = imgs.view(imgs.size(0), -1).to(device)
        z = torch.randn(imgs.size(0), 100).to(device)

        real_labels = torch.ones(imgs.size(0), 1).to(device)
        fake_labels = torch.zeros(imgs.size(0), 1).to(device)

        # Training the Discriminator
        optimizer_D.zero_grad()
        outputs = discriminator(imgs)
        d_loss_real = criterion(outputs, real_labels)
        d_loss_real.backward()

        fake_imgs = generator(z)
        outputs = discriminator(fake_imgs.detach())
        d_loss_fake = criterion(outputs, fake_labels)
        d_loss_fake.backward()

        optimizer_D.step()

        # Training the Generator
        optimizer_G.zero_grad()
        outputs = discriminator(fake_imgs)
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        optimizer_G.step()

    print(f'Epoch [{epoch+1}/50], d_loss: {d_loss_real.item() + d_loss_fake.item():.4f}, g_loss: {g_loss.item():.4f}')

3. Encoder-Decoder Model

The Encoder-Decoder model consists of two neural networks: one compresses the input data, and the other reconstructs the original data from the compressed representation. This model is primarily used in tasks such as natural language processing (NLP) and image transformation.

3.1 Encoder-Decoder Structure

The Encoder converts the input data into a latent space, while the Decoder restores it back to the original data from the latent space. This structure is particularly useful in applications like machine translation and image captioning.

3.2 Model Implementation

class Encoder(nn.Module):
    def __init__(self):
        super(Encoder, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64)
        )

    def forward(self, x):
        return self.model(x)

class Decoder(nn.Module):
    def __init__(self):
        super(Decoder, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()
        )

    def forward(self, z):
        return self.model(z)

3.3 Training Loop

encoder = Encoder().to(device)
decoder = Decoder().to(device)

optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=0.001)
criterion = nn.BCELoss()

for epoch in range(50):
    for imgs, _ in train_loader:
        imgs = imgs.view(imgs.size(0), -1).to(device)
        imgs = (imgs + 1) / 2  # rescale from [-1, 1] to [0, 1] so the Sigmoid/BCELoss targets are valid
        z = encoder(imgs)

        optimizer.zero_grad()
        reconstructed = decoder(z)
        loss = criterion(reconstructed, imgs)
        loss.backward()
        optimizer.step()

    print(f'Epoch [{epoch+1}/50], Loss: {loss.item():.4f}') 

Conclusion

In this article, we explored detailed explanations of GAN and Encoder-Decoder models and how to implement them in PyTorch. We understood the structure and working principles of GANs, enabling us to perform image generation tasks. Additionally, we learned how to efficiently process input data using the Encoder-Decoder model. These models can be applied in various fields of deep learning and have great potential for future advancements.

I hope this course helps readers deepen their understanding of advanced topics in deep learning.