{"id":36533,"date":"2024-11-01T09:49:19","date_gmt":"2024-11-01T09:49:19","guid":{"rendered":"http:\/\/atmokpo.com\/w\/?p=36533"},"modified":"2024-11-01T11:52:48","modified_gmt":"2024-11-01T11:52:48","slug":"deep-learning-pytorch-course-data-preparation","status":"publish","type":"post","link":"https:\/\/atmokpo.com\/w\/36533\/","title":{"rendered":"Deep Learning PyTorch Course, Data Preparation"},"content":{"rendered":"<div>\n<p>In order to build a deep learning model, the data preparation step is essential. If the correct dataset is not prepared, the model&#8217;s performance may degrade, which can ultimately have a negative impact on the quality of real applications. Therefore, this course will explain the data preparation methods using PyTorch step by step, and we will practice through example code.<\/p>\n<h2>1. Importance of Data Preparation<\/h2>\n<p>The success of deep learning often depends on the quality and quantity of data. Therefore, the data preparation and preprocessing processes have the following key purposes:<\/p>\n<ul>\n<li><strong>Accuracy:<\/strong> Ensures the accuracy of the data to prevent the model from being fed incorrect information during training.<\/li>\n<li><strong>Consistency:<\/strong> Maintains a consistent data format so that the model can easily understand it.<\/li>\n<li><strong>Balance:<\/strong> In classification problems, it&#8217;s important to balance the classes.<\/li>\n<li><strong>Data Augmentation:<\/strong> In case of insufficient data, data augmentation techniques can be used to increase the training data.<\/li>\n<\/ul>\n<h2>2. Data Preparation Using PyTorch<\/h2>\n<p>PyTorch provides the <strong>torch.utils.data<\/strong> module for data preparation. This module helps to easily create datasets and data loaders. Here are the basic steps for data preparation:<\/p>\n<h3>2.1 Creating a Dataset<\/h3>\n<p>A dataset includes the images needed for the model to learn. 
To create a custom dataset, subclass <strong>torch.utils.data.Dataset<\/strong> and override two methods: __len__, which returns the number of samples, and __getitem__, which returns the sample at a given index. Here is a simple example:<\/p>\n<pre><code>import torch\nfrom torch.utils.data import Dataset\n\nclass CustomDataset(Dataset):\n    def __init__(self, data, labels):\n        self.data = data\n        self.labels = labels\n\n    def __len__(self):\n        return len(self.data)\n\n    def __getitem__(self, idx):\n        return self.data[idx], self.labels[idx]\n\n# Example data\ndata = torch.randn(100, 3, 32, 32)  # 100 32x32 RGB images\nlabels = torch.randint(0, 10, (100,))  # 100 random labels (0~9)\n\n# Creating the dataset\ndataset = CustomDataset(data, labels)\nprint(f\"Dataset size: {len(dataset)}\")  # 100\n    <\/code><\/pre>\n<h3>2.2 Creating a Data Loader<\/h3>\n<p>A data loader fetches data from a dataset in batches. It splits the dataset into mini-batches to pass to the model and, with shuffle=True, reshuffles the samples at every epoch. Here\u2019s how to create a data loader:<\/p>\n<pre><code>from torch.utils.data import DataLoader\n\n# Creating the data loader\ndata_loader = DataLoader(dataset, batch_size=16, shuffle=True)\n\n# Outputting batch data\nfor batch_data, batch_labels in data_loader:\n    print(f\"Batch data size: {batch_data.size()}\")  # [16, 3, 32, 32]\n    print(f\"Batch label size: {batch_labels.size()}\")  # [16]\n    break  # Output only the first batch\n    <\/code><\/pre>\n<h2>3. Data Preprocessing<\/h2>\n<p>The data preprocessing step is crucial in deep learning. 
Taking image data as an example, common preprocessing tasks include:<\/p>\n<ul>\n<li><strong>Normalization:<\/strong> Scaling the inputs to a consistent range, which tends to speed up and stabilize training.<\/li>\n<li><strong>Resizing:<\/strong> Adjusting the image size to match the input size the model expects.<\/li>\n<li><strong>Data Augmentation:<\/strong> Applying random variations to the training data to reduce overfitting.<\/li>\n<\/ul>\n<h3>3.1 Image Data Preprocessing Example<\/h3>\n<p>The following is an example of image data preprocessing using torchvision.transforms. Because the example data are already tensors, ToTensor() is omitted; it is only needed when the samples are PIL images or NumPy arrays, and it raises a TypeError when given a tensor:<\/p>\n<pre><code>from torchvision import transforms\n\n# Define preprocessing steps (the inputs are already tensors, so ToTensor() is not used)\ntransform = transforms.Compose([\n    transforms.Resize((32, 32)),  # Resizing the image\n    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # Per-channel normalization: (x - 0.5) \/ 0.5\n])\n\n# Modifying the dataset class\nclass CustomDatasetWithTransform(Dataset):\n    def __init__(self, data, labels, transform=None):\n        self.data = data\n        self.labels = labels\n        self.transform = transform\n\n    def __len__(self):\n        return len(self.data)\n\n    def __getitem__(self, idx):\n        image = self.data[idx]\n        label = self.labels[idx]\n        \n        if self.transform:\n            image = self.transform(image)  # Apply transformations\n        \n        return image, label\n\n# Creating the modified dataset\ndataset_with_transform = CustomDatasetWithTransform(data, labels, transform=transform)\ndata_loader_with_transform = DataLoader(dataset_with_transform, batch_size=16, shuffle=True)\n\n# Outputting batch data\nfor batch_data, batch_labels in data_loader_with_transform:\n    print(f\"Batch data size: {batch_data.size()}\")\n    print(f\"Batch label size: {batch_labels.size()}\")\n    break\n    <\/code><\/pre>\n<h2>4. 
Data Augmentation<\/h2>\n<p>Data augmentation helps a deep learning model generalize better by showing it randomly varied versions of the training samples. Here are some data augmentation techniques:<\/p>\n<ul>\n<li><strong>Rotation:<\/strong> Rotating the image by a random angle.<\/li>\n<li><strong>Cropping:<\/strong> Cropping random parts of the image.<\/li>\n<li><strong>Flipping:<\/strong> Mirroring the image horizontally or vertically.<\/li>\n<\/ul>\n<h3>4.1 Data Augmentation Example<\/h3>\n<p>The following is an example of data augmentation using torchvision. As in the preprocessing example, the data are already tensors, so ToTensor() is omitted:<\/p>\n<pre><code>from torchvision import transforms\n\n# Define data augmentation steps (the inputs are already tensors, so ToTensor() is not used)\naugment = transforms.Compose([\n    transforms.RandomHorizontalFlip(),  # Random horizontal flip (probability 0.5 by default)\n    transforms.RandomRotation(20),  # Random rotation between -20 and +20 degrees\n    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # Per-channel normalization: (x - 0.5) \/ 0.5\n])\n\n# Applying augmentation steps to the dataset\ndataset_with_augmentation = CustomDatasetWithTransform(data, labels, transform=augment)\ndata_loader_with_augmentation = DataLoader(dataset_with_augmentation, batch_size=16, shuffle=True)\n\n# Outputting batch data\nfor batch_data, batch_labels in data_loader_with_augmentation:\n    print(f\"Batch data size: {batch_data.size()}\")\n    print(f\"Batch label size: {batch_labels.size()}\")\n    break\n    <\/code><\/pre>\n<h2>5. Conclusion<\/h2>\n<p>Data preparation is a very important step in deep learning. It is essential to build an appropriate dataset, use a data loader to fetch data in batches, and perform the necessary data preprocessing and augmentation. In this lecture, we covered the basic processes of data preparation using PyTorch.<\/p>\n<p>Apply these principles to maximize your model&#8217;s performance in your deep learning projects. Data is the most critical asset for a deep learning model. 
Therefore, proper data preparation is the cornerstone of a successful deep learning project.<\/p>\n<h2>References<\/h2>\n<ul>\n<li><a href=\"https:\/\/pytorch.org\/docs\/stable\/index.html\">PyTorch Documentation<\/a><\/li>\n<li><a href=\"https:\/\/pytorch.org\/tutorials\/\">PyTorch Tutorials<\/a><\/li>\n<li><a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a><\/li>\n<\/ul>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In order to build a deep learning model, the data preparation step is essential. If the correct dataset is not prepared, the model&#8217;s performance may degrade, which can ultimately have a negative impact on the quality of real applications. Therefore, this course will explain the data preparation methods using PyTorch step by step, and we &hellip; <a href=\"https:\/\/atmokpo.com\/w\/36533\/\" class=\"more-link\">\ub354 \ubcf4\uae30<span class=\"screen-reader-text\"> &#8220;Deep Learning PyTorch Course, Data Preparation&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[149],"tags":[],"class_list":["post-36533","post","type-post","status-publish","format-standard","hentry","category-pytorch-study"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Deep Learning PyTorch Course, Data Preparation - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/atmokpo.com\/w\/36533\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Deep Learning PyTorch Course, Data Preparation - 
\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\" \/>\n<meta property=\"og:description\" content=\"In order to build a deep learning model, the data preparation step is essential. If the correct dataset is not prepared, the model&#8217;s performance may degrade, which can ultimately have a negative impact on the quality of real applications. Therefore, this course will explain the data preparation methods using PyTorch step by step, and we &hellip; \ub354 \ubcf4\uae30 &quot;Deep Learning PyTorch Course, Data Preparation&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/atmokpo.com\/w\/36533\/\" \/>\n<meta property=\"og:site_name\" content=\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\" \/>\n<meta property=\"article:published_time\" content=\"2024-11-01T09:49:19+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-11-01T11:52:48+00:00\" \/>\n<meta name=\"author\" content=\"root\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@bebubo4\" \/>\n<meta name=\"twitter:site\" content=\"@bebubo4\" \/>\n<meta name=\"twitter:label1\" content=\"\uae00\uc4f4\uc774\" \/>\n\t<meta name=\"twitter:data1\" content=\"root\" \/>\n\t<meta name=\"twitter:label2\" content=\"\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04\" \/>\n\t<meta name=\"twitter:data2\" content=\"4\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/atmokpo.com\/w\/36533\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36533\/\"},\"author\":{\"name\":\"root\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7\"},\"headline\":\"Deep Learning PyTorch Course, Data 
Preparation\",\"datePublished\":\"2024-11-01T09:49:19+00:00\",\"dateModified\":\"2024-11-01T11:52:48+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36533\/\"},\"wordCount\":501,\"publisher\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\"},\"articleSection\":[\"PyTorch Study\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/atmokpo.com\/w\/36533\/\",\"url\":\"https:\/\/atmokpo.com\/w\/36533\/\",\"name\":\"Deep Learning PyTorch Course, Data Preparation - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"isPartOf\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#website\"},\"datePublished\":\"2024-11-01T09:49:19+00:00\",\"dateModified\":\"2024-11-01T11:52:48+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36533\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/atmokpo.com\/w\/36533\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/atmokpo.com\/w\/36533\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\ud648\",\"item\":\"https:\/\/atmokpo.com\/w\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Deep Learning PyTorch Course, Data 
Preparation\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/atmokpo.com\/w\/#website\",\"url\":\"https:\/\/atmokpo.com\/w\/\",\"name\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/atmokpo.com\/w\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\",\"name\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"url\":\"https:\/\/atmokpo.com\/w\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png\",\"contentUrl\":\"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png\",\"width\":400,\"height\":400,\"caption\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\"},\"image\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/bebubo4\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7\",\"name\":\"root\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g\",\"caption\":\"root\"},\"sameAs\":[\"http:\/\/atmokpo.com\/w\"],\"url\":\"https:\/\/atmokpo.com\/w\/author\/root\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Deep Learning PyTorch Course, Data Preparation - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/atmokpo.com\/w\/36533\/","og_locale":"ko_KR","og_type":"article","og_title":"Deep Learning PyTorch Course, Data Preparation - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","og_description":"In order to build a deep learning model, the data preparation step is essential. If the correct dataset is not prepared, the model&#8217;s performance may degrade, which can ultimately have a negative impact on the quality of real applications. Therefore, this course will explain the data preparation methods using PyTorch step by step, and we &hellip; \ub354 \ubcf4\uae30 \"Deep Learning PyTorch Course, Data Preparation\"","og_url":"https:\/\/atmokpo.com\/w\/36533\/","og_site_name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","article_published_time":"2024-11-01T09:49:19+00:00","article_modified_time":"2024-11-01T11:52:48+00:00","author":"root","twitter_card":"summary_large_image","twitter_creator":"@bebubo4","twitter_site":"@bebubo4","twitter_misc":{"\uae00\uc4f4\uc774":"root","\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04":"4\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/atmokpo.com\/w\/36533\/#article","isPartOf":{"@id":"https:\/\/atmokpo.com\/w\/36533\/"},"author":{"name":"root","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7"},"headline":"Deep Learning PyTorch Course, Data Preparation","datePublished":"2024-11-01T09:49:19+00:00","dateModified":"2024-11-01T11:52:48+00:00","mainEntityOfPage":{"@id":"https:\/\/atmokpo.com\/w\/36533\/"},"wordCount":501,"publisher":{"@id":"https:\/\/atmokpo.com\/w\/#organization"},"articleSection":["PyTorch 
Study"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/atmokpo.com\/w\/36533\/","url":"https:\/\/atmokpo.com\/w\/36533\/","name":"Deep Learning PyTorch Course, Data Preparation - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","isPartOf":{"@id":"https:\/\/atmokpo.com\/w\/#website"},"datePublished":"2024-11-01T09:49:19+00:00","dateModified":"2024-11-01T11:52:48+00:00","breadcrumb":{"@id":"https:\/\/atmokpo.com\/w\/36533\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/atmokpo.com\/w\/36533\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/atmokpo.com\/w\/36533\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\ud648","item":"https:\/\/atmokpo.com\/w\/en\/"},{"@type":"ListItem","position":2,"name":"Deep Learning PyTorch Course, Data Preparation"}]},{"@type":"WebSite","@id":"https:\/\/atmokpo.com\/w\/#website","url":"https:\/\/atmokpo.com\/w\/","name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","description":"","publisher":{"@id":"https:\/\/atmokpo.com\/w\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/atmokpo.com\/w\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/atmokpo.com\/w\/#organization","name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","url":"https:\/\/atmokpo.com\/w\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/","url":"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png","contentUrl":"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png","width":400,"height":400,"caption":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8"},"image":{"@id":"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/bebubo4"]},{"@type":"Person","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6
b3b138fbba0efb4ae64b1abd81d7","name":"root","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g","caption":"root"},"sameAs":["http:\/\/atmokpo.com\/w"],"url":"https:\/\/atmokpo.com\/w\/author\/root\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36533","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/comments?post=36533"}],"version-history":[{"count":1,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36533\/revisions"}],"predecessor-version":[{"id":36534,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36533\/revisions\/36534"}],"wp:attachment":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/media?parent=36533"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/categories?post=36533"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/tags?post=36533"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}