Convolutional Neural Networks (CNNs) excel at learning patterns from data such as images and time series, thanks to the way they process their input through convolution operations. CNNs are most commonly used for 2D image data, but some tasks call for processing 1D or 3D data instead. In this article, we will explore 1D, 2D, and 3D convolutions in detail, including how to implement each of them using PyTorch.
1D Convolution
1D convolution is mainly used for processing time-series or sequential data, such as audio signals, stock prices, and sensor readings. In 1D convolution, the filter slides along a single dimension of the input data to compute its outputs.
1D Convolution Example
Here is an example of how to implement 1D convolution using PyTorch.
import torch
import torch.nn as nn
# Define 1D convolution layer
class Simple1DConv(nn.Module):
    def __init__(self):
        super(Simple1DConv, self).__init__()
        self.conv1 = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3)

    def forward(self, x):
        return self.conv1(x)
# Example input data (1 x 1 x 5) shape
input_data = torch.tensor([[[1.0, 2.0, 3.0, 4.0, 5.0]]]) # Batch size 1, Channel 1
model = Simple1DConv()
output = model(input_data)
print(output)
In the code above, the Simple1DConv class has a single 1D convolution layer, and the input data has the shape (batch size, number of channels, length). With the default stride of 1 and no padding, the output size after the convolution operation is calculated as follows:
Output length = (Input length – Kernel size) + 1
Here, the input length is 5 and the kernel size is 3, so the output length becomes 3.
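As a quick check of this formula, here is a minimal sketch (separate from the example above) that prints the output shape and shows how stride and padding, which the example leaves at their defaults, enter the more general formula:

import torch
import torch.nn as nn

# With input length 5 and kernel size 3, the output length is 3
conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3)
x = torch.randn(1, 1, 5)     # (batch size, channels, length)
print(conv(x).shape)         # torch.Size([1, 1, 3])

# With padding and stride, the general formula becomes:
# output length = (input length + 2 * padding - kernel size) // stride + 1
conv_padded = nn.Conv1d(1, 1, kernel_size=3, stride=1, padding=1)
print(conv_padded(x).shape)  # torch.Size([1, 1, 5]), the length is preserved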
2D Convolution
2D convolution is applied to 2-dimensional data such as images. The filter slides across both dimensions of the input, which makes it well suited to extracting spatial features from images.
2D Convolution Example
Here is a simple example of implementing 2D convolution using PyTorch.
import torch
import torch.nn as nn
# Define 2D convolution layer
class Simple2DConv(nn.Module):
    def __init__(self):
        super(Simple2DConv, self).__init__()
        self.conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

    def forward(self, x):
        return self.conv2d(x)
# Example input data (Batch size 1, Channel 1, Height 5, Width 5)
input_data = torch.tensor([[[[1.0, 2.0, 3.0, 4.0, 5.0],
[6.0, 7.0, 8.0, 9.0, 10.0],
[11.0, 12.0, 13.0, 14.0, 15.0],
[16.0, 17.0, 18.0, 19.0, 20.0],
[21.0, 22.0, 23.0, 24.0, 25.0]]]]) # Batch size 1, Channel 1
model = Simple2DConv()
output = model(input_data)
print(output)
In the code above, the Simple2DConv class defines a 2D convolution layer. The input data has the shape (batch size, number of channels, height, width), and with the default stride of 1 and no padding, the output size is calculated as follows:
Output height = (Input height – Kernel height) + 1
Output width = (Input width – Kernel width) + 1
Since both the input height and width are 5 and the kernel size is 3, the output will have the shape (1, 1, 3, 3).
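As a side note, here is a minimal sketch (not part of the original example) showing that adding padding=1 to the same kind of layer preserves the 5 x 5 spatial size:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 5, 5)   # (batch size, channels, height, width)

# No padding: (5 - 3) + 1 = 3 in each spatial dimension
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
print(conv(x).shape)          # torch.Size([1, 1, 3, 3])

# padding=1: (5 + 2 * 1 - 3) + 1 = 5, so the spatial size is preserved
conv_padded = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
print(conv_padded(x).shape)   # torch.Size([1, 1, 5, 5])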
3D Convolution
3D convolution is applied to 3-dimensional data such as video or volumetric data. The filter moves across all three dimensions of the input, which makes it useful for analyzing 3D structures such as data that changes over time or medical images.
3D Convolution Example
Here is an example of implementing 3D convolution using PyTorch.
import torch
import torch.nn as nn
# Define 3D convolution layer
class Simple3DConv(nn.Module):
    def __init__(self):
        super(Simple3DConv, self).__init__()
        self.conv3d = nn.Conv3d(in_channels=1, out_channels=1, kernel_size=3)

    def forward(self, x):
        return self.conv3d(x)
# Example input data (Batch size 1, Channel 1, Depth 5, Height 5, Width 5)
input_data = torch.tensor([[[[[1.0, 2.0, 3.0, 4.0, 5.0],
[6.0, 7.0, 8.0, 9.0, 10.0],
[11.0, 12.0, 13.0, 14.0, 15.0],
[16.0, 17.0, 18.0, 19.0, 20.0],
[21.0, 22.0, 23.0, 24.0, 25.0]],
[[1.0, 2.0, 3.0, 4.0, 5.0],
[6.0, 7.0, 8.0, 9.0, 10.0],
[11.0, 12.0, 13.0, 14.0, 15.0],
[16.0, 17.0, 18.0, 19.0, 20.0],
[21.0, 22.0, 23.0, 24.0, 25.0]],
[[1.0, 2.0, 3.0, 4.0, 5.0],
[6.0, 7.0, 8.0, 9.0, 10.0],
[11.0, 12.0, 13.0, 14.0, 15.0],
[16.0, 17.0, 18.0, 19.0, 20.0],
[21.0, 22.0, 23.0, 24.0, 25.0]]]]]) # Batch size 1, Channel 1
model = Simple3DConv()
output = model(input_data)
print(output)
The above code defines the Simple3DConv class, which contains a 3D convolution layer. The input data has the shape (batch size, number of channels, depth, height, width). With the default stride of 1 and no padding, the output size is calculated as follows:
Output depth = (Input depth – Kernel depth) + 1
Output height = (Input height – Kernel height) + 1
Output width = (Input width – Kernel width) + 1
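Since the depth, height, and width are all 5 and the kernel size is 3, the output of the example above has the shape (1, 1, 3, 3, 3). A minimal sketch to verify this shape:

import torch
import torch.nn as nn

conv = nn.Conv3d(in_channels=1, out_channels=1, kernel_size=3)
x = torch.randn(1, 1, 5, 5, 5)  # (batch size, channels, depth, height, width)
print(conv(x).shape)            # torch.Size([1, 1, 3, 3, 3])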
Conclusion
In this tutorial, we explored how to implement 1D, 2D, and 3D convolutions using PyTorch. Choosing the convolution layer that matches the structure of your data is the first step toward designing an effective model. To build deeper models, you can stack multiple convolution layers and apply various activation functions and optimization techniques, as sketched below. Future exercises will cover how to design advanced model architectures using these techniques.
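As a preview of that direction, here is a minimal, hypothetical sketch (the Stacked2DConv class name and channel sizes are illustrative assumptions, not something defined earlier in this article) that stacks two 2D convolution layers with ReLU activations:

import torch
import torch.nn as nn

class Stacked2DConv(nn.Module):  # hypothetical example architecture
    def __init__(self):
        super(Stacked2DConv, self).__init__()
        # padding=1 keeps the spatial size unchanged with kernel_size=3
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        return x

model = Stacked2DConv()
x = torch.randn(1, 1, 5, 5)
print(model(x).shape)  # torch.Size([1, 16, 5, 5])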