Deep Learning with PyTorch: GAN and Autoencoder (AE)

1. GAN (Generative Adversarial Network)

A GAN is a model proposed by Ian Goodfellow in 2014. It consists of two neural networks, a generator and a discriminator, that compete with each other; through this competition, the generator learns to produce data that looks real.

1.1 Structure of GAN

A GAN consists of two neural networks. The generator takes a random noise vector as input and produces fake data, while the discriminator classifies its input as real or generated. The two networks are trained against each other, each with its own objective.
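
A minimal sketch of this structure (single-layer toy networks for illustration only; the full models appear in the example in section 1.3):

# Sketch: the two GAN components and their data flow (toy sizes, not the real models)
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 28*28), nn.Tanh())     # generator: noise -> fake image
D = nn.Sequential(nn.Linear(28*28, 1), nn.Sigmoid())    # discriminator: image -> P(real)

z = torch.randn(16, 100)       # a batch of random noise vectors
fake = G(z)                    # fake data produced by the generator
p_real = D(fake)               # discriminator's estimate that each sample is real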

1.2 Loss Function of GAN

The loss functions of a GAN evaluate the performance of the generator and the discriminator. The generator tries to fool the discriminator, while the discriminator tries to distinguish real data from generated data.
\[
\text{Loss}_D = -\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] - \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
\]
\[
\text{Loss}_G = -\mathbb{E}_{z \sim p_z(z)}[\log D(G(z))]
\]
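
In code, both losses are usually computed as binary cross-entropy against "real" (1) and "fake" (0) labels. A minimal sketch of the correspondence, where d_real and d_fake stand in for the discriminator outputs D(x) and D(G(z)) (random tensors are used purely for illustration):

# Sketch: the GAN losses expressed with BCELoss (d_real, d_fake stand for D(x) and D(G(z)))
import torch
import torch.nn as nn

bce = nn.BCELoss()
d_real = torch.rand(16, 1)     # stand-in for D(x) on real data
d_fake = torch.rand(16, 1)     # stand-in for D(G(z)) on generated data
ones, zeros = torch.ones(16, 1), torch.zeros(16, 1)

loss_D = bce(d_real, ones) + bce(d_fake, zeros)  # -E[log D(x)] - E[log(1 - D(G(z)))]
loss_G = bce(d_fake, ones)                       # -E[log D(G(z))], the non-saturating form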

1.3 GAN Example Code

The following is a simple GAN implemented using PyTorch:

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Define hyperparameters
latent_size = 100
batch_size = 64
num_epochs = 200
learning_rate = 0.0002

# Load dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Define generator
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(latent_size, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 28*28),
            nn.Tanh()
        )

    def forward(self, z):
        return self.model(z).view(-1, 1, 28, 28)

# Define discriminator
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(28*28, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, img):
        return self.model(img.view(-1, 28*28))

# Initialize models
generator = Generator()
discriminator = Discriminator()

# Define loss function and optimizers
criterion = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=learning_rate)
optimizer_D = optim.Adam(discriminator.parameters(), lr=learning_rate)

# Training process
for epoch in range(num_epochs):
    for i, (imgs, _) in enumerate(data_loader):
        # Real and fake image labels
        real_imgs = imgs
        real_labels = torch.ones(imgs.size(0), 1)  # Real labels
        fake_labels = torch.zeros(imgs.size(0), 1)  # Fake labels

        # Train discriminator
        optimizer_D.zero_grad()
        outputs = discriminator(real_imgs)
        d_loss_real = criterion(outputs, real_labels)

        z = torch.randn(imgs.size(0), latent_size)
        fake_imgs = generator(z)
        # detach() keeps this pass from sending gradients back into the generator
        outputs = discriminator(fake_imgs.detach())
        d_loss_fake = criterion(outputs, fake_labels)

        d_loss = d_loss_real + d_loss_fake
        d_loss.backward()
        optimizer_D.step()

        # Train generator: it succeeds when the discriminator classifies its fakes as real
        optimizer_G.zero_grad()
        outputs = discriminator(fake_imgs)  # no detach here, so gradients reach the generator
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        optimizer_G.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], d_loss: {d_loss.item():.4f}, g_loss: {g_loss.item():.4f}")
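
After training, new digits can be sampled by feeding fresh noise to the generator. A short sketch that continues from the script above (the rescaling assumes the Tanh output range of [-1, 1]):

# Continuing from the script above: sample new images from the trained generator
with torch.no_grad():
    z = torch.randn(16, latent_size)
    samples = generator(z)        # shape (16, 1, 28, 28), values in [-1, 1] from Tanh
    samples = (samples + 1) / 2   # rescale to [0, 1] for saving or plotting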

2. Autoencoder

An autoencoder is an unsupervised learning method that compresses and then reconstructs its input. The model is trained to produce an output that matches the input, and in doing so it learns a compact feature representation of the data.

2.1 Structure of Autoencoder

An autoencoder is divided into two parts: an encoder and a decoder. The encoder transforms the input into a low-dimensional latent representation, while the decoder uses this latent representation to reconstruct the original input.
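
A minimal sketch of this split (toy layer sizes for illustration only; the full model appears in the example in section 2.3):

# Sketch: encoder/decoder split of an autoencoder (toy sizes, not the real model)
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(28*28, 64), nn.ReLU())     # input -> 64-dim latent code
decoder = nn.Sequential(nn.Linear(64, 28*28), nn.Sigmoid())  # latent code -> reconstruction

x = torch.rand(16, 28*28)      # a batch of flattened inputs in [0, 1]
latent = encoder(x)            # low-dimensional latent representation, shape (16, 64)
x_hat = decoder(latent)        # reconstruction of the input, shape (16, 784)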

2.2 Loss Function of Autoencoder

Autoencoders mainly use Mean Squared Error (MSE) as the loss function to minimize the difference between the inputs and outputs.
\[
\text{Loss} = \frac{1}{N} \sum_{i=1}^N (x_i - \hat{x}_i)^2
\]
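
The same quantity can be computed directly from the formula or with PyTorch's built-in loss. A small sketch (random tensors are used purely for illustration):

# Sketch: the MSE formula above vs. PyTorch's built-in loss (random data for illustration)
import torch
import torch.nn as nn

x = torch.rand(16, 28*28)          # original inputs
x_hat = torch.rand(16, 28*28)      # reconstructions
manual = ((x - x_hat) ** 2).mean()
builtin = nn.MSELoss()(x_hat, x)   # same value as the manual computation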

2.3 Autoencoder Example Code

The following is a simple implementation of an autoencoder using PyTorch:

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Define hyperparameters
batch_size = 64
num_epochs = 20
learning_rate = 0.001

# Load dataset
# ToTensor() scales pixels to [0, 1], matching the Sigmoid output of the decoder below
transform = transforms.ToTensor()

dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Define autoencoder
class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(28*28, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU()
        )
        self.decoder = nn.Sequential(
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 28*28),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = x.view(-1, 28*28)
        encoded = self.encoder(x)
        reconstructed = self.decoder(encoded)
        return reconstructed.view(-1, 1, 28, 28)

# Initialize model
autoencoder = Autoencoder()

# Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(autoencoder.parameters(), lr=learning_rate)

# Training process
for epoch in range(num_epochs):
    for imgs, _ in data_loader:
        optimizer.zero_grad()
        outputs = autoencoder(imgs)
        loss = criterion(outputs, imgs)  # reconstruction error between output and input
        loss.backward()
        optimizer.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")
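
After training, the encoder can also be used on its own to obtain the compressed representation. A short sketch that continues from the script above:

# Continuing from the script above: compress a batch and reconstruct it
with torch.no_grad():
    imgs, _ = next(iter(data_loader))
    codes = autoencoder.encoder(imgs.view(-1, 28*28))  # latent codes, shape (batch, 64)
    recon = autoencoder(imgs)                          # reconstructions, shape (batch, 1, 28, 28)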

3. Conclusion

GANs and autoencoders are powerful deep learning techniques for image generation, data representation, and compression. Understanding and practicing their structures and training procedures builds a solid foundation for more advanced deep learning work.
Both models can be applied to many fields and can yield better results with architectures tailored to the task.