<h1>Transformers Course Using Hugging Face: Decoding M2M100 Translation Results</h1>
<p><em>Published 2024-11-01</em></p>
<p>
        Advances in artificial intelligence and natural language processing (NLP) are happening at an astonishing pace, and machine translation is one of the areas receiving the most attention. Hugging Face's Transformers library gives researchers and developers easy access to state-of-the-art models. In this article, we run a translation task with the M2M100 model and take an in-depth look at decoding its output, with explanations and example code.
    </p>
<h2>1. What are Hugging Face Transformers?</h2>
<p>
        Hugging Face Transformers is a library that provides a wide range of pre-trained natural language processing models and makes them easy to use. It includes models such as BERT, GPT, and T5, as well as multilingual models such as M2M100 that support translation between many languages.
    </p>
<h2>2. Introduction to the M2M100 Model</h2>
<p>
        M2M100 (Many-to-Many 100) is a multilingual machine translation model developed by Facebook AI Research that supports direct translation between 100 languages. 
Earlier translation systems were typically trained for specific language directions, but M2M100 can translate directly between any pair of its supported languages.<br />
        The advantages of this model include:</p>
<ul>
<li>Direct translation between many language pairs</li>
<li>Improved machine translation quality</li>
<li>Strong generalization, thanks to training on vast amounts of data</li>
</ul>
<h2>3. Installing the Library and Setting Up the Environment</h2>
<p>
        To use the M2M100 model, first install the required libraries in a Python environment:</p>
<pre><code>pip install transformers torch</code></pre>
<h2>4. Using the M2M100 Model</h2>
<h3>4.1 Loading the Model</h3>
<p>
        Let's load the M2M100 model and prepare to carry out translation tasks. The following code loads the tokenizer and model.
    </p>
<pre><code>
from transformers import M2M100Tokenizer, M2M100ForConditionalGeneration

# Load the tokenizer and model (the 418M-parameter checkpoint)
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
    </code></pre>
<h3>4.2 Defining the Translation Function</h3>
<p>
        Next, we will write a simple function that translates a given input sentence into a target language. 
In this example, we translate an English sentence into Korean.
    </p>
<pre><code>
def translate_text(text, target_lang="ko"):
    # Set the source language and tokenize the input sentence
    tokenizer.src_lang = "en"
    encoded_input = tokenizer(text, return_tensors="pt")

    # Generate the translation, forcing the decoder to start in the target language
    generated_tokens = model.generate(**encoded_input, forced_bos_token_id=tokenizer.get_lang_id(target_lang))

    # Decode the tokens and return the resulting string
    return tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
    </code></pre>
<h3>4.3 Translation Example</h3>
<p>
        Now let's use the translation function. The example below translates the sentence "Hello, how are you?" into Korean.
    </p>
<pre><code>
source_text = "Hello, how are you?"
translated_text = translate_text(source_text, target_lang="ko")
print(translated_text)  # e.g., "안녕하세요, 잘 지내세요?"
    </code></pre>
<h2>5. Decoding the Translation Output</h2>
<p>
        Decoding converts the token IDs generated by the model back into natural-language text. Because M2M100 shares one vocabulary across all of its languages, the same decoding step works for output in any target language.<br />
        Let's look at this more closely with an example.
    </p>
<h3>5.1 Implementing the Decoding Function</h3>
<p>
        A separate decoding function lets us handle the tokens returned by the model more carefully. 
This makes it easier to verify the format of the model's output and to improve translation quality with additional post-processing.
    </p>
<pre><code>
def decode_output(generated_tokens, skip_special_tokens=True):
    # Decode the token IDs and return the resulting strings
    return tokenizer.batch_decode(generated_tokens, skip_special_tokens=skip_special_tokens)
    </code></pre>
<h3>5.2 Example of Decoding Results</h3>
<p>
        Let's decode a list of generated tokens to check the translation result. The example below tokenizes a sentence, generates a translation, and decodes it.
    </p>
<pre><code>
# Tokenize a source sentence (the source language is still English)
tokenizer.src_lang = "en"
encoded_input = tokenizer("Hello, how are you?", return_tensors="pt")

# Generate the tokens
generated_tokens = model.generate(**encoded_input, forced_bos_token_id=tokenizer.get_lang_id("ko"))

# Decode and print the result
decoded_output = decode_output(generated_tokens)
print(decoded_output)  # e.g., ["안녕하세요, 잘 지내세요?"]
    </code></pre>
<h2>6. Optimizing Results</h2>
<p>
        Translation quality can vary with context and with subtle differences in meaning. To optimize results, you can tune the generation parameters or fine-tune the model itself. 
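<p>
        Before tuning generation settings, it helps to build intuition for what the decoding step in section 5 actually does. Below is a toy, self-contained sketch of the operations <code>batch_decode</code> performs: mapping IDs to subword pieces, dropping special tokens, and joining the pieces back into text. The five-entry vocabulary is made up for illustration; M2M100's real vocabulary is far larger and shared across all of its languages.
    </p>

```python
# Toy illustration of what tokenizer.batch_decode does under the hood.
# The vocabulary below is hypothetical, invented for this sketch only.
TOY_VOCAB = {
    0: "<s>",     # special token: sequence start (language token slot)
    1: "</s>",    # special token: sequence end
    2: "\u2581Hello",  # "▁" marks a word boundary in SentencePiece-style vocabularies
    3: ",",
    4: "\u2581world",
}
SPECIAL_IDS = {0, 1}

def toy_batch_decode(batches, skip_special_tokens=True):
    """Return one decoded string per list of token IDs."""
    texts = []
    for ids in batches:
        if skip_special_tokens:
            ids = [i for i in ids if i not in SPECIAL_IDS]
        pieces = [TOY_VOCAB[i] for i in ids]
        # Join pieces, turning the "▁" word-boundary marker back into a space
        text = "".join(pieces).replace("\u2581", " ").strip()
        texts.append(text)
    return texts

print(toy_batch_decode([[0, 2, 3, 4, 1]]))  # ['Hello, world']
```

<p>
        Passing <code>skip_special_tokens=False</code> keeps the <code>&lt;s&gt;</code> and <code>&lt;/s&gt;</code> markers in the output, which is occasionally useful for debugging generation behavior.
    </p>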
Adjusting the maximum output length or the sampling settings can also change the character of the results.
    </p>
<h3>6.1 Optional Parameter Adjustments</h3>
<p>
        The model's <code>generate</code> method accepts a number of tuning parameters:</p>
<ul>
<li><strong>max_length</strong>: maximum number of tokens to generate</li>
<li><strong>num_beams</strong>: number of beams for beam search (wider beams explore more candidate translations)</li>
<li><strong>temperature</strong>: randomness of generation when sampling (lower values are more deterministic; it only takes effect together with <code>do_sample=True</code>)</li>
</ul>
<pre><code>
# Example with additional generation parameters
generated_tokens = model.generate(
    **encoded_input,
    forced_bos_token_id=tokenizer.get_lang_id("ko"),
    max_length=40,
    num_beams=5,
    # temperature=0.7 would only take effect together with do_sample=True
)
    </code></pre>
<h3>6.2 Comparing Results Before and After Optimization</h3>
<p>
        Comparing translations produced before and after tuning is a simple way to evaluate the effect of these settings. Choose the combination that works best for your application.
    </p>
<h2>7. Summary and Conclusion</h2>
<p>
        In this article, we looked at how to perform machine translation with Hugging Face's M2M100 model and how to decode its output. Thanks to advances in deep learning and NLP, it has never been easier to communicate across languages.
    </p>
<p>
        These technologies and tools will power a wide range of applications and fundamentally change how we work. We encourage you to put them to use in your own projects.
    </p>
<h2>8. References</h2>
<ul>
<li>Hugging Face Transformers documentation: <a href="https://huggingface.co/docs/transformers/index" target="_blank" rel="noopener">https://huggingface.co/docs/transformers/index</a></li>
<li>M2M100 paper (Fan et al., "Beyond English-Centric Multilingual Machine Translation"): <a href="https://arxiv.org/abs/2010.11125" target="_blank" rel="noopener">https://arxiv.org/abs/2010.11125</a></li>
</ul>
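<p>
        Appendix: section 6.2 suggests comparing translations before and after tuning. The helper below is a minimal, pure-Python sketch of such a comparison; the function name and the sample sentences are illustrative, not part of the Transformers API.
    </p>

```python
# Line up translations produced with two different generation settings
# and report the pairs that differ. The sample data is illustrative only.
def compare_translations(sources, before, after):
    """Return (index, source, before, after) tuples where the outputs differ."""
    diffs = []
    for i, (src, b, a) in enumerate(zip(sources, before, after)):
        if b != a:
            diffs.append((i, src, b, a))
    return diffs

sources  = ["Hello, how are you?", "Good morning."]
baseline = ["안녕, 어떻게 지내?", "좋은 아침."]        # e.g., greedy decoding
tuned    = ["안녕하세요, 잘 지내세요?", "좋은 아침."]  # e.g., num_beams=5
for i, src, b, a in compare_translations(sources, baseline, tuned):
    print(f"[{i}] {src!r}: {b!r} -> {a!r}")
```

<p>
        For a systematic evaluation, the same loop can feed a metric such as BLEU instead of a plain equality check.
    </p>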