{"id":36201,"date":"2024-11-01T09:46:35","date_gmt":"2024-11-01T09:46:35","guid":{"rendered":"http:\/\/atmokpo.com\/w\/?p=36201"},"modified":"2024-11-01T09:46:35","modified_gmt":"2024-11-01T09:46:35","slug":"huggingface-transformers-tutorial-classification-accuracy","status":"publish","type":"post","link":"https:\/\/atmokpo.com\/w\/36201\/","title":{"rendered":"huggingface transformers tutorial, classification accuracy"},"content":{"rendered":"<p><body><\/p>\n<p>The advancement of deep learning and natural language processing (NLP) is one of the key elements driving today\u2019s technological innovation. While there are several libraries and frameworks available, the <strong>Hugging Face<\/strong> <strong>Transformers<\/strong> library is particularly designed to be intuitive and user-friendly. This article will discuss how to build a document classification model using Hugging Face&#8217;s Transformers and evaluate the model&#8217;s performance.<\/p>\n<h2>1. What is Hugging Face Transformers?<\/h2>\n<p>The Hugging Face Transformers library supports various model architectures and makes it easy to use pre-trained models. Transformers are models built on the <strong>attention mechanism<\/strong> that have revolutionized natural language processing. These models are pre-trained on large datasets and can be fine-tuned for specific tasks.<\/p>\n<h2>2. Installing Required Libraries<\/h2>\n<p>We will install the libraries needed to use Hugging Face Transformers. The primary libraries we will use are <strong>transformers<\/strong>, <strong>torch<\/strong>, and <strong>datasets<\/strong>, plus <strong>scikit-learn<\/strong> for computing accuracy in the evaluation step. Use the following command to install them (the leading <code>!<\/code> is notebook syntax; drop it when running in a terminal):<\/p>\n<pre><code>!pip install transformers torch datasets scikit-learn<\/code><\/pre>\n<h2>3. Preparing the Dataset<\/h2>\n<p>We will prepare the dataset for document classification. Here, we will use the <strong>AG News<\/strong> dataset. 
AG News is a news article classification dataset with four classes (labels 0 to 3, in this order):<\/p>\n<ul>\n<li>World<\/li>\n<li>Sports<\/li>\n<li>Business<\/li>\n<li>Science\/Technology<\/li>\n<\/ul>\n<p>Running the following code will download the dataset, which comes already split into training and test sets.<\/p>\n<pre><code>from datasets import load_dataset\n\n# Returns a DatasetDict with \"train\" and \"test\" splits\ndataset = load_dataset(\"ag_news\")<\/code><\/pre>\n<h2>4. Data Preprocessing<\/h2>\n<p>After loading the data, we separate the texts from the labels. The following code extracts them and prints one sample article with its label.<\/p>\n<pre><code>train_texts = dataset['train']['text']\ntrain_labels = dataset['train']['label']\n\ntest_texts = dataset['test']['text']\ntest_labels = dataset['test']['label']\n\n# Inspect the first training example\nprint(\"Sample news article:\", train_texts[0])\nprint(\"Label:\", train_labels[0])<\/code><\/pre>\n<h2>5. Preparing the Model and Tokenizer<\/h2>\n<p>Now, we will load the pre-trained model and tokenizer using the <strong>transformers<\/strong> library. Here, we will use the <strong>BertForSequenceClassification<\/strong> model.<\/p>\n<pre><code>from transformers import BertTokenizer, BertForSequenceClassification\n\nmodel_name = \"bert-base-uncased\"\ntokenizer = BertTokenizer.from_pretrained(model_name)\n# num_labels=4 matches the four AG News classes\nmodel = BertForSequenceClassification.from_pretrained(model_name, num_labels=4)<\/code><\/pre>\n<h2>6. Data Tokenization<\/h2>\n<p>We tokenize each text with the BERT tokenizer. The following code pads every sequence to the model&#8217;s maximum length and truncates longer texts so that examples can be batched.<\/p>\n<pre><code>def tokenize_function(examples):\n    return tokenizer(examples[\"text\"], padding=\"max_length\", truncation=True)\n\n# batched=True tokenizes many examples per call\ntokenized_datasets = dataset.map(tokenize_function, batched=True)<\/code><\/pre>\n<h2>7. Training the Model<\/h2>\n<p>We use the <strong>Trainer<\/strong> class to train the model. The Trainer automatically handles training and evaluation. 
The following code configures the training arguments, builds the Trainer, and starts training. Note that recent Transformers releases rename <code>evaluation_strategy<\/code> to <code>eval_strategy<\/code>; use whichever name your installed version accepts.<\/p>\n<pre><code>from transformers import Trainer, TrainingArguments\n\ntraining_args = TrainingArguments(\n    output_dir=\".\/results\",\n    evaluation_strategy=\"epoch\",\n    learning_rate=2e-5,\n    per_device_train_batch_size=16,\n    per_device_eval_batch_size=16,\n    num_train_epochs=3,\n    weight_decay=0.01,\n)\n\ntrainer = Trainer(\n    model=model,\n    args=training_args,\n    train_dataset=tokenized_datasets['train'],\n    eval_dataset=tokenized_datasets['test']\n)\n\ntrainer.train()<\/code><\/pre>\n<h2>8. Evaluating the Model<\/h2>\n<p>After training the model, we can measure its performance on the test set. We will use the <strong>accuracy_score<\/strong> function from scikit-learn&#8217;s <strong>sklearn.metrics<\/strong> module to calculate accuracy.<\/p>\n<pre><code>import numpy as np\nfrom sklearn.metrics import accuracy_score\n\n# predictions holds the raw logits, one row per test example\npredictions, label_ids, _ = trainer.predict(tokenized_datasets['test'])\n# The predicted class is the index of the largest logit\npreds = np.argmax(predictions, axis=1)\n\naccuracy = accuracy_score(label_ids, preds)\nprint(\"Classification accuracy:\", accuracy)<\/code><\/pre>\n<h2>9. Conclusion<\/h2>\n<p>We learned how to load a dataset and perform text classification using a pre-trained model with Hugging Face Transformers. Through this process, we saw the usefulness of transformer models in natural language processing tasks. Further performance improvements may be possible through hyperparameter tuning or by trying other models.<\/p>\n<h2>10. 
References<\/h2>\n<p>For readers looking for more information and examples, the following resources are recommended:<\/p>\n<ul>\n<li><a href=\"https:\/\/huggingface.co\/docs\/transformers\/index\">Hugging Face Transformers Documentation<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/huggingface\/transformers\">Transformers GitHub Repository<\/a><\/li>\n<li><a href=\"https:\/\/www.analyticsvidhya.com\/blog\/2021\/06\/a-beginners-guide-to-using-huggingface-transformers-with-bart\/\">A Beginner&#8217;s Guide to Using HuggingFace Transformers<\/a><\/li>\n<\/ul>\n<p><\/body><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The advancement of deep learning and natural language processing (NLP) is one of the key elements driving today\u2019s technological innovation. While there are several libraries and frameworks available, the Hugging Face Transformers library is particularly designed to be intuitive and user-friendly. This article will discuss how to build a document classification model using Hugging Face&#8217;s &hellip; <a href=\"https:\/\/atmokpo.com\/w\/36201\/\" class=\"more-link\">\ub354 \ubcf4\uae30<span class=\"screen-reader-text\"> &#8220;huggingface transformers tutorial, classification accuracy&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[108],"tags":[],"class_list":["post-36201","post","type-post","status-publish","format-standard","hentry","category---en"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>huggingface transformers tutorial, classification accuracy - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/atmokpo.com\/w\/36201\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"huggingface transformers tutorial, classification accuracy - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\" \/>\n<meta property=\"og:description\" content=\"The advancement of deep learning and natural language processing (NLP) is one of the key elements driving today\u2019s technological innovation. While there are several libraries and frameworks available, the Hugging Face Transformers library is particularly designed to be intuitive and user-friendly. This article will discuss how to build a document classification model using Hugging Face&#8217;s &hellip; \ub354 \ubcf4\uae30 &quot;huggingface transformers tutorial, classification accuracy&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/atmokpo.com\/w\/36201\/\" \/>\n<meta property=\"og:site_name\" content=\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\" \/>\n<meta property=\"article:published_time\" content=\"2024-11-01T09:46:35+00:00\" \/>\n<meta name=\"author\" content=\"root\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@bebubo4\" \/>\n<meta name=\"twitter:site\" content=\"@bebubo4\" \/>\n<meta name=\"twitter:label1\" content=\"\uae00\uc4f4\uc774\" \/>\n\t<meta name=\"twitter:data1\" content=\"root\" \/>\n\t<meta name=\"twitter:label2\" content=\"\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04\" \/>\n\t<meta name=\"twitter:data2\" content=\"3\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/atmokpo.com\/w\/36201\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36201\/\"},\"author\":{\"name\":\"root\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7\"},\"headline\":\"huggingface 
transformers tutorial, classification accuracy\",\"datePublished\":\"2024-11-01T09:46:35+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36201\/\"},\"wordCount\":416,\"publisher\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\"},\"articleSection\":[\"Using Hugging Face\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/atmokpo.com\/w\/36201\/\",\"url\":\"https:\/\/atmokpo.com\/w\/36201\/\",\"name\":\"huggingface transformers tutorial, classification accuracy - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"isPartOf\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#website\"},\"datePublished\":\"2024-11-01T09:46:35+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/atmokpo.com\/w\/36201\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/atmokpo.com\/w\/36201\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/atmokpo.com\/w\/36201\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\ud648\",\"item\":\"https:\/\/atmokpo.com\/w\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"huggingface transformers tutorial, classification 
accuracy\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/atmokpo.com\/w\/#website\",\"url\":\"https:\/\/atmokpo.com\/w\/\",\"name\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/atmokpo.com\/w\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/atmokpo.com\/w\/#organization\",\"name\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\",\"url\":\"https:\/\/atmokpo.com\/w\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png\",\"contentUrl\":\"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png\",\"width\":400,\"height\":400,\"caption\":\"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8\"},\"image\":{\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/bebubo4\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7\",\"name\":\"root\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/atmokpo.com\/w\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g\",\"caption\":\"root\"},\"sameAs\":[\"http:\/\/atmokpo.com\/w\"],\"url\":\"https:\/\/atmokpo.com\/w\/author\/root\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"huggingface transformers tutorial, classification accuracy - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/atmokpo.com\/w\/36201\/","og_locale":"ko_KR","og_type":"article","og_title":"huggingface transformers tutorial, classification accuracy - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","og_description":"The advancement of deep learning and natural language processing (NLP) is one of the key elements driving today\u2019s technological innovation. While there are several libraries and frameworks available, the Hugging Face Transformers library is particularly designed to be intuitive and user-friendly. This article will discuss how to build a document classification model using Hugging Face&#8217;s &hellip; \ub354 \ubcf4\uae30 \"huggingface transformers tutorial, classification accuracy\"","og_url":"https:\/\/atmokpo.com\/w\/36201\/","og_site_name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","article_published_time":"2024-11-01T09:46:35+00:00","author":"root","twitter_card":"summary_large_image","twitter_creator":"@bebubo4","twitter_site":"@bebubo4","twitter_misc":{"\uae00\uc4f4\uc774":"root","\uc608\uc0c1 \ub418\ub294 \ud310\ub3c5 \uc2dc\uac04":"3\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/atmokpo.com\/w\/36201\/#article","isPartOf":{"@id":"https:\/\/atmokpo.com\/w\/36201\/"},"author":{"name":"root","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b1abd81d7"},"headline":"huggingface transformers tutorial, classification accuracy","datePublished":"2024-11-01T09:46:35+00:00","mainEntityOfPage":{"@id":"https:\/\/atmokpo.com\/w\/36201\/"},"wordCount":416,"publisher":{"@id":"https:\/\/atmokpo.com\/w\/#organization"},"articleSection":["Using Hugging 
Face"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/atmokpo.com\/w\/36201\/","url":"https:\/\/atmokpo.com\/w\/36201\/","name":"huggingface transformers tutorial, classification accuracy - \ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","isPartOf":{"@id":"https:\/\/atmokpo.com\/w\/#website"},"datePublished":"2024-11-01T09:46:35+00:00","breadcrumb":{"@id":"https:\/\/atmokpo.com\/w\/36201\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/atmokpo.com\/w\/36201\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/atmokpo.com\/w\/36201\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\ud648","item":"https:\/\/atmokpo.com\/w\/en\/"},{"@type":"ListItem","position":2,"name":"huggingface transformers tutorial, classification accuracy"}]},{"@type":"WebSite","@id":"https:\/\/atmokpo.com\/w\/#website","url":"https:\/\/atmokpo.com\/w\/","name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","description":"","publisher":{"@id":"https:\/\/atmokpo.com\/w\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/atmokpo.com\/w\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/atmokpo.com\/w\/#organization","name":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8","url":"https:\/\/atmokpo.com\/w\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/","url":"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png","contentUrl":"https:\/\/atmokpo.com\/w\/wp-content\/uploads\/2024\/11\/logo.png","width":400,"height":400,"caption":"\ub77c\uc774\ube0c\uc2a4\ub9c8\ud2b8"},"image":{"@id":"https:\/\/atmokpo.com\/w\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/bebubo4"]},{"@type":"Person","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/91b6b3b138fbba0efb4ae64b
1abd81d7","name":"root","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/atmokpo.com\/w\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/708197b41fc6435a7ce22d951b25d4a47e9e904270cb1f04682d4f025066f80c?s=96&d=mm&r=g","caption":"root"},"sameAs":["http:\/\/atmokpo.com\/w"],"url":"https:\/\/atmokpo.com\/w\/author\/root\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36201","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/comments?post=36201"}],"version-history":[{"count":1,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36201\/revisions"}],"predecessor-version":[{"id":36202,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/posts\/36201\/revisions\/36202"}],"wp:attachment":[{"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/media?parent=36201"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/categories?post=36201"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/atmokpo.com\/w\/wp-json\/wp\/v2\/tags?post=36201"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}