{"id":36171,"date":"2024-11-01T09:46:21","date_gmt":"2024-11-01T09:46:21","guid":{"rendered":"http:\/\/atmokpo.com\/w\/?p=36171"},"modified":"2024-11-01T09:46:21","modified_gmt":"2024-11-01T09:46:21","slug":"using-hugging-face-transformers-setting-trainingarguments","status":"publish","type":"post","link":"https:\/\/atmokpo.com\/w\/36171\/","title":{"rendered":"Using Hugging Face Transformers, Setting TrainingArguments"},"content":{"rendered":"<p><body><\/p>\n<p>In deep learning and natural language processing (NLP), <strong>Hugging Face<\/strong>&#8217;s <code>Transformers<\/code> library is a very useful tool. This tutorial explains the <code>TrainingArguments<\/code> class used by Hugging Face&#8217;s <code>Trainer<\/code> API, shows how to configure it, and provides code examples.<\/p>\n<h2>What is TrainingArguments?<\/h2>\n<p>The <code>TrainingArguments<\/code> class defines the hyperparameters and settings for model training. It groups the arguments that control training, evaluation, logging, and checkpoint saving in a single object.<\/p>\n<h3>Main Parameters of TrainingArguments<\/h3>\n<ul>\n<li><code>output_dir<\/code>: The directory path where model checkpoints will be saved.<\/li>\n<li><code>num_train_epochs<\/code>: Sets how many times to iterate through the entire training dataset.<\/li>\n<li><code>per_device_train_batch_size<\/code>: The training batch size to use per device (e.g., GPU).<\/li>\n<li><code>learning_rate<\/code>: Sets the initial learning rate for the optimizer.<\/li>\n<li><code>evaluation_strategy<\/code>: Sets when evaluation runs during training. 
Supported values are &#8220;no&#8221;, &#8220;steps&#8221;, and &#8220;epoch&#8221;. Newer Transformers releases rename this argument to <code>eval_strategy<\/code> and deprecate the old name.<\/li>\n<li><code>logging_dir<\/code>: The directory path where log files will be saved.<\/li>\n<li><code>weight_decay<\/code>: Sets the weight decay coefficient used for regularization.<\/li>\n<li><code>save_total_limit<\/code>: Limits how many checkpoints are kept in <code>output_dir<\/code>; older checkpoints are deleted once the limit is exceeded.<\/li>\n<\/ul>\n<h2>Setting Up TrainingArguments<\/h2>\n<p>Now let\u2019s set up the parameters needed for training using <code>TrainingArguments<\/code>. The example code below shows how to use this class and the role of each parameter.<\/p>\n<h3>Python Example Code<\/h3>\n<pre><code>from transformers import TrainingArguments\n\n# Create TrainingArguments object\ntraining_args = TrainingArguments(\n    output_dir='.\/results',                       # Directory path to save checkpoints\n    num_train_epochs=3,                           # Number of epochs to train\n    per_device_train_batch_size=16,               # Batch size to use on each device\n    per_device_eval_batch_size=64,                # Batch size to use for evaluation\n    learning_rate=2e-5,                           # Initial learning rate\n    evaluation_strategy=\"epoch\",                  # Evaluate at the end of each epoch\n    logging_dir='.\/logs',                         # Directory to save log files\n    weight_decay=0.01,                            # Weight decay coefficient\n    save_total_limit=2                            # Maximum number of saved checkpoints\n)\n\nprint(training_args)\n<\/code><\/pre>\n<h3>Code Explanation<\/h3>\n<p>The code above creates a <code>TrainingArguments<\/code> object. 
Let\u2019s take a closer look at each parameter:<\/p>\n<ul>\n<li><code>output_dir='.\/results'<\/code>: Specifies the folder where model checkpoints are saved during training.<\/li>\n<li><code>num_train_epochs=3<\/code>: Trains the model for 3 full passes over the training dataset.<\/li>\n<li><code>per_device_train_batch_size=16<\/code>: Uses a batch of 16 samples for training on each device.<\/li>\n<li><code>per_device_eval_batch_size=64<\/code>: Uses a batch of 64 samples for evaluation on each device.<\/li>\n<li><code>learning_rate=2e-5<\/code>: Sets the initial learning rate, which the learning rate scheduler then adjusts during training.<\/li>\n<li><code>evaluation_strategy=\"epoch\"<\/code>: Evaluates the model at the end of each epoch.<\/li>\n<li><code>logging_dir='.\/logs'<\/code>: Specifies the directory where training logs are written.<\/li>\n<li><code>weight_decay=0.01<\/code>: Applies weight decay with a coefficient of 0.01 as regularization, which can help reduce overfitting.<\/li>\n<li><code>save_total_limit=2<\/code>: Keeps at most 2 checkpoints and deletes older ones.<\/li>\n<\/ul>\n<h2>Integrating TrainingArguments with the Trainer API<\/h2>\n<p>Once the training arguments are set, you can pass them to the <code>Trainer<\/code> API to train your model. 
Below is an example showing how to use the <code>Trainer<\/code> class together with <code>TrainingArguments<\/code>.<\/p>\n<pre><code>from transformers import Trainer, TrainingArguments, AutoModelForSequenceClassification, AutoTokenizer\n\n# Load model and tokenizer\nmodel = AutoModelForSequenceClassification.from_pretrained(\"bert-base-uncased\", num_labels=2)\ntokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n\n# Prepare training and evaluation datasets (omitted here; assign real datasets before running)\ntrain_dataset = ...\neval_dataset = ...\n\n# Create Trainer object (training_args is the TrainingArguments object created earlier)\ntrainer = Trainer(\n    model=model,\n    args=training_args,\n    train_dataset=train_dataset,\n    eval_dataset=eval_dataset\n)\n\n# Train the model\ntrainer.train()\n<\/code><\/pre>\n<h3>Code Explanation<\/h3>\n<p>The code above performs the following steps:<\/p>\n<ul>\n<li>Loads a BERT model with a 2-label classification head using <code>AutoModelForSequenceClassification<\/code>.<\/li>\n<li>Loads the matching tokenizer using <code>AutoTokenizer<\/code>. The tokenizer is what you would use to preprocess the text in your datasets.<\/li>\n<li>Leaves <code>train_dataset<\/code> and <code>eval_dataset<\/code> as placeholders. Actual datasets must be prepared and assigned before the code will run.<\/li>\n<li>Creates a <code>Trainer<\/code> object, which takes the model, training arguments, training dataset, and evaluation dataset.<\/li>\n<li>Finally, calls <code>trainer.train()<\/code> to start the model training.<\/li>\n<\/ul>\n<h2>Common Configurations for TrainingArguments<\/h2>\n<p><code>TrainingArguments<\/code> accepts many arguments; let\u2019s look at a few commonly used configurations:<\/p>\n<h3>1. Gradient Accumulation<\/h3>\n<p>If memory limits make it difficult to train with large batches, you can use gradient accumulation. 
For example, if the per-device batch size is set to 8 and you accumulate gradients over 4 batches, the effective batch size per device is 8 &times; 4 = 32, because the optimizer updates the weights only once every 4 batches.<\/p>\n<pre><code>training_args = TrainingArguments(\n    output_dir='.\/results',\n    per_device_train_batch_size=8,\n    gradient_accumulation_steps=4,  # Accumulate gradients over 4 batches (effective batch size 32 per device)\n)\n<\/code><\/pre>\n<h3>2. Mixed Precision Training<\/h3>\n<p>If your GPU supports mixed precision training, enabling it can accelerate training and reduce memory usage. To turn it on, add the <code>fp16=True<\/code> setting.<\/p>\n<pre><code>training_args = TrainingArguments(\n    output_dir='.\/results',\n    fp16=True,  # Mixed precision training\n)\n<\/code><\/pre>\n<h3>3. Early Stopping<\/h3>\n<p>You can configure early stopping to prevent unnecessary training when performance stops improving. This is done by passing <code>EarlyStoppingCallback<\/code> to the <code>Trainer<\/code>. The callback depends on <code>load_best_model_at_end=True<\/code>, a <code>metric_for_best_model<\/code>, and an evaluation strategy other than &#8220;no&#8221; being set in <code>TrainingArguments<\/code>. Patience is counted in evaluation calls, so it equals a number of epochs only when evaluation runs once per epoch.<\/p>\n<pre><code>from transformers import EarlyStoppingCallback\n\ntrainer = Trainer(\n    ...\n    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # Stop after 3 consecutive evaluations without improvement\n)\n<\/code><\/pre>\n<h2>Conclusion<\/h2>\n<p>This tutorial covered how to set up the <code>TrainingArguments<\/code> class in Hugging Face&#8217;s Transformers library and how to pass it to the <code>Trainer<\/code>.<\/p>\n<p>To train deep learning models more effectively, it is important to make good use of the various parameters in <code>TrainingArguments<\/code>. We hope you find good hyperparameters through experimentation and continue to improve your model&#8217;s performance.<\/p>\n<p>If you have any further questions or would like to know more, please leave a comment, and we will be happy to respond.<\/p>\n<footer>\n<p>\u00a9 2023 Hugging Face Transformers Course<\/p>\n<\/footer>\n<p><\/body><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the field of deep learning and natural language processing (NLP), the Hugging Face&#8216;s Transformers library is a very useful tool. 
In this course, we will explain in detail the TrainingArguments class used in Hugging Face&#8217;s Trainer API, how to configure it, and provide actual code examples. What is TrainingArguments? The TrainingArguments class is used &hellip; <a href=\"https:\/\/atmokpo.com\/w\/36171\/\" class=\"more-link\">\ub354 \ubcf4\uae30<span class=\"screen-reader-text\"> &#8220;Using Hugging Face Transformers, Setting TrainingArguments&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[108],"tags":[],"class_list":["post-36171","post","type-post","status-publish","format-standard","hentry","category---en"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Using Hugging Face Transformers, Setting TrainingArguments - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/atmokpo.com\/w\/36171\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Using Hugging Face Transformers, Setting TrainingArguments - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\" \/>\n<meta property=\"og:description\" content=\"In the field of deep learning and natural language processing (NLP), the Hugging Face&#8216;s Transformers library is a very useful tool. In this course, we will explain in detail the TrainingArguments class used in Hugging Face&#8217;s Trainer API, how to configure it, and provide actual code examples. What is TrainingArguments? 
The TrainingArguments class is used &hellip; \ub354 \ubcf4\uae30 &quot;Using Hugging Face Transformers, Setting TrainingArguments&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/atmokpo.com\/w\/36171\/\" \/>\n<meta property=\"og:site_name\" content=\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\" \/>\n<meta property=\"article:published_time\" content=\"2024-11-01T09:46:21+00:00\" \/>\n<meta name=\"author\" content=\"root\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@bebubo4\" \/>\n<meta name=\"twitter:site\" content=\"@bebubo4\" \/>\n<meta name=\"twitter:label1\" content=\"\uae00\uc4f4\uc774\" \/>\n\t<meta name=\"twitter:data1\" content=\"root\" \/>\n\t<meta name=\"twitter:label2\" content=\"\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04\" \/>\n\t<meta name=\"twitter:data2\" content=\"4\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/atmokpo.com\/w\/36171\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36171\/\"},\"author\":{\"name\":\"root\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7\"},\"headline\":\"Using Hugging Face Transformers, Setting TrainingArguments\",\"datePublished\":\"2024-11-01T09:46:21+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36171\/\"},\"wordCount\":591,\"publisher\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\"},\"articleSection\":[\"Using Hugging Face\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/atmokpo.com\/w\/36171\/\",\"url\":\"https:\/\/atmokpo.com\/w\/36171\/\",\"name\":\"Using Hugging Face Transformers, Setting TrainingArguments - 
\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"isPartOf\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#website\"},\"datePublished\":\"2024-11-01T09:46:21+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36171\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/atmokpo.com\/w\/36171\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/atmokpo.com\/w\/36171\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\ud648\",\"item\":\"https:\/\/atmokpo.com\/w\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Using Hugging Face Transformers, Setting TrainingArguments\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/atmokpo.com\/w\/#website\",\"url\":\"https:\/\/atmokpo.com\/w\/\",\"name\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/atmokpo.com\/w\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\",\"name\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"url\":\"https:\/\/atmokpo.com\/w\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png\",\"contentUrl\":\"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png\",\"width\":400,\"height\":400,\"caption\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\"},\"image\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/bebubo4\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7\",\"na
me\":\"root\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g\",\"caption\":\"root\"},\"sameAs\":[\"http:\/\/atmokpo.com\/w\"],\"url\":\"https:\/\/atmokpo.com\/w\/author\/root\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Using Hugging Face Transformers, Setting TrainingArguments - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/atmokpo.com\/w\/36171\/","og_locale":"ko_KR","og_type":"article","og_title":"Using Hugging Face Transformers, Setting TrainingArguments - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","og_description":"In the field of deep learning and natural language processing (NLP), the Hugging Face&#8216;s Transformers library is a very useful tool. In this course, we will explain in detail the TrainingArguments class used in Hugging Face&#8217;s Trainer API, how to configure it, and provide actual code examples. What is TrainingArguments? 
The TrainingArguments class is used &hellip; \ub354 \ubcf4\uae30 \"Using Hugging Face Transformers, Setting TrainingArguments\"","og_url":"https:\/\/atmokpo.com\/w\/36171\/","og_site_name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","article_published_time":"2024-11-01T09:46:21+00:00","author":"root","twitter_card":"summary_large_image","twitter_creator":"@bebubo4","twitter_site":"@bebubo4","twitter_misc":{"\uae00\uc4f4\uc774":"root","\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04":"4\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/atmokpo.com\/w\/36171\/#article","isPartOf":{"@id":"https:\/\/atmokpo.com\/w\/36171\/"},"author":{"name":"root","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7"},"headline":"Using Hugging Face Transformers, Setting TrainingArguments","datePublished":"2024-11-01T09:46:21+00:00","mainEntityOfPage":{"@id":"https:\/\/atmokpo.com\/w\/36171\/"},"wordCount":591,"publisher":{"@id":"https:\/\/atmokpo.com\/w\/#organization"},"articleSection":["Using Hugging Face"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/atmokpo.com\/w\/36171\/","url":"https:\/\/atmokpo.com\/w\/36171\/","name":"Using Hugging Face Transformers, Setting TrainingArguments - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","isPartOf":{"@id":"https:\/\/atmokpo.com\/w\/#website"},"datePublished":"2024-11-01T09:46:21+00:00","breadcrumb":{"@id":"https:\/\/atmokpo.com\/w\/36171\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/atmokpo.com\/w\/36171\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/atmokpo.com\/w\/36171\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\ud648","item":"https:\/\/atmokpo.com\/w\/en\/"},{"@type":"ListItem","position":2,"name":"Using Hugging Face Transformers, Setting 
TrainingArguments"}]},{"@type":"WebSite","@id":"https:\/\/atmokpo.com\/w\/#website","url":"https:\/\/atmokpo.com\/w\/","name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","description":"","publisher":{"@id":"https:\/\/atmokpo.com\/w\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/atmokpo.com\/w\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/atmokpo.com\/w\/#organization","name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","url":"https:\/\/atmokpo.com\/w\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/","url":"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png","contentUrl":"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png","width":400,"height":400,"caption":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8"},"image":{"@id":"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/bebubo4"]},{"@type":"Person","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7","name":"root","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g","caption":"root"},"sameAs":["http:\/\/atmokpo.com\/w"],"url":"https:\/\/atmokpo.com\/w\/author\/root\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36171","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts"}],"a
bout":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/comments?post=36171"}],"version-history":[{"count":1,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36171\/revisions"}],"predecessor-version":[{"id":36172,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36171\/revisions\/36172"}],"wp:attachment":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/media?parent=36171"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/categories?post=36171"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/tags?post=36171"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}