{"id":36061,"date":"2024-11-01T09:45:22","date_gmt":"2024-11-01T09:45:22","guid":{"rendered":"http:\/\/atmokpo.com\/w\/?p=36061"},"modified":"2024-11-01T09:45:22","modified_gmt":"2024-11-01T09:45:22","slug":"huggingface-transformers-tutorial-convert-bart-tokenization-results-to-numpy-array","status":"publish","type":"post","link":"https:\/\/atmokpo.com\/w\/36061\/","title":{"rendered":"huggingface transformers tutorial, convert BART tokenization results to numpy array"},"content":{"rendered":"<p><body><\/p>\n<p>Recent advancements in deep learning models have been remarkable in the fields of artificial intelligence and natural language processing. In particular, the <strong>Hugging Face<\/strong> <strong>Transformers<\/strong> library allows easy access to various natural language processing (NLP) models. In this practical session, we will explore how to use the BART (Bidirectional and Auto-Regressive Transformers) model to process text and convert its results into a NumPy array.<\/p>\n<h2>Introduction to BART Model<\/h2>\n<p>BART is a model developed by Facebook AI Research that demonstrates excellent performance in text generation and summarization tasks. BART adopts an Encoder-Decoder structure, which is advantageous for understanding input text and generating new text based on it. BART is particularly effective at handling complex structures of text, such as graphical nodes, and performs well across various NLP tasks.<\/p>\n<h2>Installing Hugging Face Transformers<\/h2>\n<p>To use the Hugging Face Transformers library, you need to install it. You can easily install it using the command below.<\/p>\n<pre><code>pip install transformers<\/code><\/pre>\n<h2>Using BART Tokenizer<\/h2>\n<p>To use BART, let&#8217;s initialize the tokenizer and explore the process of tokenizing text. 
The following code tokenizes a basic sentence using the BART tokenizer.<\/p>\n<pre><code>from transformers import BartTokenizer\n\n# Initialize the BART tokenizer\ntokenizer = BartTokenizer.from_pretrained('facebook\/bart-large')\n\n# Input text\ntext = \"Deep learning is the technology of the future.\"\n\n# Tokenize the text\ntokens = tokenizer(text)\nprint(tokens)<\/code><\/pre>\n<h3>Tokenization Result<\/h3>\n<p>In the code above, the BART tokenizer converts the input text into token IDs. The call returns a dictionary-like object whose key entries are input_ids (the token IDs, including the special start and end tokens) and attention_mask (1 for every real token). The output looks similar to the following; the exact IDs depend on the tokenizer&#8217;s vocabulary.<\/p>\n<pre><code>Output: \n{'input_ids': [0, 10024, 327, 1311, 1346, 231, 1620, 2], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1]}<\/code><\/pre>\n<h2>Converting to a NumPy Array<\/h2>\n<p>Next, we convert the tokenization results into NumPy arrays, a convenient numerical format for further processing. NumPy is installed automatically as a dependency of Transformers, so we only need to import it. (Alternatively, you can pass return_tensors='np' to the tokenizer to receive NumPy arrays directly; here we convert manually to make each step explicit.)<\/p>\n<pre><code>import numpy as np<\/code><\/pre>\n<h2>Code Example<\/h2>\n<p>The following example code puts the entire process together:<\/p>\n<pre><code>import numpy as np\nfrom transformers import BartTokenizer\n\n# Initialize the BART tokenizer\ntokenizer = BartTokenizer.from_pretrained('facebook\/bart-large')\n\n# Input text\ntext = \"Deep learning is the technology of the future.\"\n\n# Tokenize the text\ntokens = tokenizer(text)\n\n# Convert input_ids and attention_mask to NumPy arrays\ninput_ids_np = np.array(tokens['input_ids'])\nattention_mask_np = np.array(tokens['attention_mask'])\n\n# Output\nprint(\"Input IDs:\", input_ids_np)\nprint(\"Attention Mask:\", attention_mask_np)<\/code><\/pre>\n<h3>Checking Results<\/h3>\n<p>When you run the code above, you will see that the tokenization results have been converted into NumPy arrays. 
These arrays are easy to inspect, save, or feed into further preprocessing.<\/p>\n<h2>Feeding the Arrays into the Model<\/h2>\n<p>You can now pass the token IDs to the model to perform text generation or other NLP tasks, for example sentence summarization. Two details matter here: the PyTorch BART model expects torch tensors rather than NumPy arrays, so we convert them back and add a batch dimension before calling generate(); and the base facebook\/bart-large checkpoint is not fine-tuned for summarization, so for higher-quality summaries you would typically load facebook\/bart-large-cnn instead.<\/p>\n<pre><code>import torch\nfrom transformers import BartForConditionalGeneration\n\n# Load the BART model\nmodel = BartForConditionalGeneration.from_pretrained('facebook\/bart-large')\n\n# generate() expects PyTorch tensors, so convert the NumPy arrays\n# to int64 tensors and add a batch dimension\ninput_ids = torch.tensor(input_ids_np, dtype=torch.long).unsqueeze(0)\nattention_mask = torch.tensor(attention_mask_np, dtype=torch.long).unsqueeze(0)\n\n# Run generation\noutput_sequences = model.generate(input_ids, attention_mask=attention_mask, max_length=50)\n\n# Decode the result\nsummary = tokenizer.decode(output_sequences[0], skip_special_tokens=True)\nprint(\"Generated Summary:\", summary)<\/code><\/pre>\n<h3>Model Output<\/h3>\n<p>Running the code above prints the generated text: fluent output conditioned on the input sentence.<\/p>\n<h2>Conclusion<\/h2>\n<p>In this tutorial, we learned how to load the BART tokenizer from the Hugging Face Transformers library, tokenize text, convert the results into NumPy arrays, and feed them back into the model. This process builds a foundation for natural language processing and basic skills for working with pretrained models. We hope you apply models like BART to a wide variety of text processing tasks.<\/p>\n<p>Based on the knowledge gained from this tutorial, we hope you attempt more projects and deepen your understanding of deep learning. Thank you.<\/p>\n","protected":false}}