{"id":36501,"date":"2024-11-01T09:48:59","date_gmt":"2024-11-01T09:48:59","guid":{"rendered":"http:\/\/atmokpo.com\/w\/?p=36501"},"modified":"2024-11-01T11:52:56","modified_gmt":"2024-11-01T11:52:56","slug":"deep-learning-pytorch-course-principles-of-monte-carlo-tree-search","status":"publish","type":"post","link":"https:\/\/atmokpo.com\/w\/36501\/","title":{"rendered":"Deep Learning PyTorch Course, Principles of Monte Carlo Tree Search"},"content":{"rendered":"<p><body><\/p>\n<p>Deep learning and artificial intelligence offer many algorithms for problem solving. One of them, Monte Carlo Tree Search (MCTS), is widely used for decision-making in uncertain environments. In this article, we explain the principles of MCTS in depth and walk through a Tic-Tac-Toe implementation in Python, the starting point for combining MCTS with PyTorch models.<\/p>\n<h2>Overview of Monte Carlo Tree Search<\/h2>\n<p>MCTS is used in fields such as game playing, optimization, and robotics. It simulates possible futures and decides based on the outcomes. The core idea is to grow a search tree through random sampling: starting from a given state, it tries the available actions, estimates how good each one is from the simulated results, and picks the action with the best estimate.<\/p>\n<h3>Four Stages of MCTS<\/h3>\n<ol>\n<li><strong>Selection<\/strong>: Starting at the root, repeatedly move to the child that maximizes a selection criterion such as UCB1, until reaching a node that still has untried actions or ends the game.<\/li>\n<li><strong>Expansion<\/strong>: Add a new child node to the selected node. The child represents the state reached by performing one untried action.<\/li>\n<li><strong>Simulation<\/strong>: From the new node, choose actions at random until the game ends, and record the result.<\/li>\n<li><strong>Backpropagation<\/strong>: Propagate the simulation result from the new node back up to the root. 
In doing so, update the visit count and win count of each node on the path.<\/li>\n<\/ol>\n<h2>Combining with Deep Learning<\/h2>\n<p>MCTS works with simple random or rule-based rollouts, but it becomes considerably stronger when combined with deep learning. For example, a neural network can predict which actions are promising, or estimate the value of a state more accurately than random playouts. This is particularly effective in complex environments where exhaustive search is infeasible.<\/p>\n<h2>Implementing MCTS with PyTorch<\/h2>\n<p>Now, let&#8217;s implement Monte Carlo Tree Search. We will use a simple Tic-Tac-Toe game as an example. The search itself needs only NumPy and the Python standard library; PyTorch comes into play once a neural network is used to evaluate states.<\/p>\n<h3>Setting Up the Environment<\/h3>\n<p>First, install the required libraries:<\/p>\n<pre><code>pip install torch numpy<\/code><\/pre>\n<h3>Building the Game Environment<\/h3>\n<p>We will build a basic environment for the Tic-Tac-Toe game:<\/p>\n<pre><code>import numpy as np\n\nclass TicTacToe:\n    def __init__(self):\n        # 0 = empty cell, 1 = player 1, 2 = player 2\n        self.board = np.zeros((3, 3), dtype=int)\n        self.current_player = 1\n\n    def reset(self):\n        self.board.fill(0)\n        self.current_player = 1\n\n    def available_actions(self):\n        # Array of (row, col) index pairs, one per empty cell\n        return np.argwhere(self.board == 0)\n\n    def take_action(self, action):\n        self.board[action[0], action[1]] = self.current_player\n        self.current_player = 3 - self.current_player  # Switch between players 1 and 2\n\n    def is_winner(self, player):\n        # Check every row, every column, and both diagonals\n        return any(np.all(self.board[i, :] == player) for i in range(3)) or \\\n               any(np.all(self.board[:, j] == player) for j in range(3)) or \\\n               np.all(np.diag(self.board) == player) or \\\n               np.all(np.diag(np.fliplr(self.board)) == player)\n\n    def is_full(self):\n        return np.all(self.board != 0)\n\n    def get_state(self):\n        # Return a copy so callers cannot modify the live board\n        return self.board.copy()\n<\/code><\/pre>\n<h3>Implementing MCTS<\/h3>\n<p>Now we will implement the MCTS algorithm. 
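The code below is a basic implementation. Each node stores the move that led to it and the player who made that move, and wins are counted from that player&#8217;s point of view.<\/p>\n<pre><code>import math\nimport random\n\nimport numpy as np\n\nclass MCTSNode:\n    def __init__(self, untried, action=None, player=None, parent=None):\n        self.untried = untried  # Legal moves not yet expanded into children\n        self.action = action    # Move that led to this node (None for the root)\n        self.player = player    # Player who made that move\n        self.parent = parent\n        self.children = []\n        self.visits = 0\n        self.wins = 0.0         # Counted from the point of view of self.player\n\n    def ucb1(self, exploration_constant=1.41):\n        if self.visits == 0:\n            return float(\"inf\")\n        return self.wins \/ self.visits + exploration_constant * math.sqrt(math.log(self.parent.visits) \/ self.visits)\n\ndef legal_moves(state):\n    # No moves are left once either player has won\n    if state.is_winner(1) or state.is_winner(2):\n        return []\n    return [tuple(a) for a in state.available_actions().tolist()]\n\ndef mcts(root_state, iterations):\n    # root_state is the board array of a game that is not over yet.\n    # Player 1 moves first, so equal piece counts mean it is player 1 to move.\n    board = np.array(root_state)\n    to_move = 1 if np.sum(board == 1) == np.sum(board == 2) else 2\n\n    def new_game():\n        game = TicTacToe()\n        game.board = board.copy()\n        game.current_player = to_move\n        return game\n\n    root_node = MCTSNode(legal_moves(new_game()))\n\n    for _ in range(iterations):\n        node = root_node\n        state = new_game()\n\n        # Selection: follow UCB1 while the node is fully expanded\n        while not node.untried and node.children:\n            node = max(node.children, key=lambda n: n.ucb1())\n            state.take_action(node.action)\n\n        # Expansion: add one child for a move that has not been tried yet\n        if node.untried:\n            action = node.untried.pop(random.randrange(len(node.untried)))\n            player = state.current_player\n            state.take_action(action)\n            node.children.append(MCTSNode(legal_moves(state), action, player, parent=node))\n            node = node.children[-1]\n\n        # Simulation: play random moves until the game ends\n        moves = legal_moves(state)\n        while moves:\n            state.take_action(random.choice(moves))\n            moves = legal_moves(state)\n        winner = 1 if state.is_winner(1) else 2 if state.is_winner(2) else 0\n\n        # Backpropagation: update every node on the path back to the root\n        while node is not None:\n            node.visits += 1\n            if winner == 0:\n                node.wins += 0.5  # A draw counts as half a win\n            elif winner == node.player:\n                node.wins += 1\n            node = node.parent\n\n    # The most visited move is the most reliable choice\n    return max(root_node.children, key=lambda n: n.visits).action\n<\/code><\/pre>\n<h3>Running the Game<\/h3>\n<p>Finally, let\u2019s play an actual game using MCTS. Player 1 chooses its moves with MCTS, and player 2 moves at random.<\/p>\n<pre><code>def play_game():\n    game = TicTacToe()\n    game.reset()\n\n    while not game.is_full():\n        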
if game.current_player == 1:\n            action = mcts(game.get_state(), iterations=1000)\n        else:\n            # Player 2 picks a random empty cell\n            available_actions = game.available_actions()\n            action = available_actions[random.randrange(len(available_actions))]\n\n        game.take_action(action)\n        print(game.get_state())\n        \n        if game.is_winner(1):\n            print(\"Player 1 wins!\")\n            return\n        elif game.is_winner(2):\n            print(\"Player 2 wins!\")\n            return\n    \n    print(\"Draw!\")\n\nplay_game()\n<\/code><\/pre>\n<h2>Conclusion<\/h2>\n<p>In this article, we examined the principles of Monte Carlo Tree Search and implemented it for Tic-Tac-Toe. MCTS is a powerful tool for decision-making, particularly in uncertain environments. We hope this simple example helped you understand the basic flow of MCTS. As a next step, we encourage you to apply MCTS to more complex games or problems, where a PyTorch model can guide the search.<\/p>\n<p><\/body><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the field of deep learning and artificial intelligence, various algorithms exist for problem solving. One of them, Monte Carlo Tree Search (MCTS), is a widely used algorithm for decision-making in uncertain environments. In this article, we will explain the principles of MCTS in depth and provide an implementation example using PyTorch. 
Overview of Monte Carlo &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[149],"tags":[],"class_list":["post-36501","post","type-post","status-publish","format-standard","hentry","category-pytorch-study"]}