Generative Adversarial Networks (GANs) are a class of generative model that learns to produce data resembling real data through competition between two neural networks: a generator and a discriminator.
In this article, we will explore how to generate images of apples and oranges using a GAN. We will implement the GAN with the PyTorch framework and provide Python code you can run for practice.
1. What is GAN?
A GAN is a model proposed by Ian Goodfellow in 2014 in which two neural networks learn by competing against each other.
This structure can be divided into the following two parts:
- Generator: It takes random noise as input and generates data similar to real data.
- Discriminator: It determines whether the input data is real data or fake data generated by the generator.
The training process of a GAN repeats the following steps:
- The generator produces data from random noise.
- The discriminator examines both real data and generated data and judges each as real or fake.
- The generator is updated, based on the discriminator's judgment, to produce more realistic data.
- The discriminator is updated to distinguish real from fake data more accurately.
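These alternating updates implement a two-player minimax game. In Goodfellow's original formulation, the generator G and discriminator D optimize the objective

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$

The binary cross-entropy losses in the training code of Section 4 are the sampled version of this objective. As is common in practice, the generator there is trained to maximize log D(G(z)) by labeling its fakes as real, rather than to minimize log(1 - D(G(z))), which gives stronger gradients early in training.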
2. Preparing the Dataset
To train the GAN, we need a dataset of apple and orange images. For this example, we assume the images have been collected from Kaggle or another open dataset.
Each image is resized to a common size, normalized, and converted to a tensor. It is common to resize images to (64, 64) and normalize pixel values to the range [-1, 1].
2.1. Image Preprocessing
Below is the Python code that implements this preprocessing:
import os
import numpy as np
import cv2
from torchvision import transforms
import torch

def load_images_from_folder(folder):
    """Load every readable image in a folder as a 64x64 RGB array."""
    images = []
    for filename in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, filename))
        if img is not None:
            img = cv2.resize(img, (64, 64))
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert to RGB
            images.append(img)
    return np.array(images)

folder = 'path_to_your_dataset'
dataset = load_images_from_folder(folder)

transform = transforms.Compose([
    transforms.ToPILImage(),                                 # uint8 HWC array -> PIL image
    transforms.ToTensor(),                                   # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # [0, 1] -> [-1, 1]
])

# ToPILImage expects an ndarray (or tensor), so pass each raw array directly
tensor_images = [transform(img).unsqueeze(0) for img in dataset]
images_tensor = torch.cat(tensor_images)
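The training loop in Section 4 iterates over a dataloader with a fixed batch_size, neither of which is defined by the preprocessing code above. One minimal way to build them is sketched below; the batch size of 64 is an arbitrary choice. drop_last=True discards any final, smaller batch so that every batch matches the fixed-size label tensors used during training.

from torch.utils.data import DataLoader

batch_size = 64  # assumed value; any reasonable batch size works
# A tensor supports indexing and len(), so it can be passed to DataLoader directly
dataloader = DataLoader(images_tensor, batch_size=batch_size,
                        shuffle=True, drop_last=True)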
3. Implementing the GAN Structure
To implement GAN, we first need to define the generator and discriminator.
The generator typically uses Fully Connected Layers and Convolutional Layers to generate images.
The discriminator uses Convolutional Layers to judge the authenticity of images.
Below is a simple GAN model written in PyTorch.
3.1. Generator Model
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 256),
            nn.ReLU(),
            nn.Linear(256, output_dim),
            nn.Tanh()  # Output range to [-1, 1], matching the normalized images
        )

    def forward(self, z):
        img = self.model(z)
        return img.view(img.size(0), 3, 64, 64)  # Reshape flat output for image output
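As a quick sanity check, the generator maps a batch of noise vectors to a batch of 64x64 RGB images. The sketch below uses the same 100-dimensional noise and 3 * 64 * 64 output size as the training code in Section 4:

# Illustrative shape check: 16 noise vectors -> 16 images
g = Generator(input_dim=100, output_dim=3 * 64 * 64)
z = torch.randn(16, 100)
print(g(z).shape)  # torch.Size([16, 3, 64, 64])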
3.2. Discriminator Model
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 1),
            nn.Sigmoid()  # Output range to [0, 1]: probability the image is real
        )

    def forward(self, img):
        return self.model(img)
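The same kind of shape check works for the discriminator. Each stride-2 convolution halves the spatial size (64 -> 32 -> 16), which is why the final linear layer expects 64 * 16 * 16 input features:

# Illustrative shape check: 16 RGB images -> 16 real/fake probabilities
d = Discriminator()
x = torch.randn(16, 3, 64, 64)
print(d(x).shape)  # torch.Size([16, 1]), each value in [0, 1]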
4. Training the GAN
To train the GAN, the following steps are repeated.
The generator generates images using random noise,
and the discriminator distinguishes between the generated images and real images.
Then, each model is updated based on the loss function.
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hyperparameters
input_dim = 100            # Dimension of the noise vector
output_dim = 3 * 64 * 64   # Flattened size of a 64x64 RGB image
lr = 0.0002
num_epochs = 200

# Models and optimizers
generator = Generator(input_dim, output_dim).to(device)
discriminator = Discriminator().to(device)
criterion = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=lr)
optimizer_D = optim.Adam(discriminator.parameters(), lr=lr)

# Labels for real and fake images (batch_size and dataloader come from Section 2;
# drop_last=True there keeps every batch at exactly batch_size)
real_labels = torch.ones(batch_size, 1).to(device)
fake_labels = torch.zeros(batch_size, 1).to(device)

for epoch in range(num_epochs):
    for i, imgs in enumerate(dataloader):
        # Train Discriminator: score real images as real, generated ones as fake
        optimizer_D.zero_grad()
        real_imgs = imgs.to(device)
        real_loss = criterion(discriminator(real_imgs), real_labels)
        z = torch.randn(batch_size, input_dim).to(device)
        fake_imgs = generator(z)
        # detach() so the discriminator update does not backpropagate into the generator
        fake_loss = criterion(discriminator(fake_imgs.detach()), fake_labels)
        d_loss = real_loss + fake_loss
        d_loss.backward()
        optimizer_D.step()

        # Train Generator: push the discriminator to label fakes as real
        optimizer_G.zero_grad()
        g_loss = criterion(discriminator(fake_imgs), real_labels)
        g_loss.backward()
        optimizer_G.step()

        if (i + 1) % 100 == 0:
            print(f'Epoch [{epoch + 1}/{num_epochs}], Step [{i + 1}/{len(dataloader)}], '
                  f'D Loss: {d_loss.item():.4f}, G Loss: {g_loss.item():.4f}')
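After training, you will likely want to keep the trained weights. Below is a minimal sketch using PyTorch's standard state_dict API; the file name is an arbitrary choice:

# Save the trained generator, then reload it later for inference
torch.save(generator.state_dict(), 'generator.pth')

generator_loaded = Generator(input_dim, output_dim).to(device)
generator_loaded.load_state_dict(torch.load('generator.pth', map_location=device))
generator_loaded.eval()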
5. Results and Visualization
Once the training is complete, generated images can be visualized to evaluate performance.
Below is the Python code to display the generated images in a grid format.
import matplotlib.pyplot as plt
import torchvision

def show_generated_images(generator, num_images):
    z = torch.randn(num_images, input_dim).to(device)
    generated_images = generator(z)
    # make_grid tiles the batch into one image; normalize=True maps [-1, 1] back to [0, 1]
    grid = torchvision.utils.make_grid(generated_images.cpu().detach(), nrow=5, normalize=True)
    plt.imshow(grid.permute(1, 2, 0))  # CHW -> HWC for matplotlib
    plt.axis('off')
    plt.show()

show_generated_images(generator, 25)
6. Conclusion
We explored the process of building and training a generative model for apples and oranges using GAN. We learned how to implement a model
using the powerful capabilities of the PyTorch framework with a practical dataset. Having experienced the potential of GANs that can be used in various fields,
we encourage you to create more advanced models in the future.
If you want to learn more, studying various variants of GAN, such as CycleGAN or StyleGAN, would also be a good idea.
Through these advanced topics, we hope you expand your knowledge of deep learning technology.