Strojno učenje računalniškega igralca v igri Havannah

undergraduate thesis

Nino Serec (Author), Damjan Strnad (Mentor)

Abstract

In recent years, reinforcement learning of neural networks has produced a breakthrough in artificial intelligence: computers can now play board games such as Go, in which humans had until then been the stronger opponents. In this thesis we examine the game-playing algorithm AlphaZero, which combines Monte Carlo tree search with reinforcement learning of neural networks. The algorithm starts without any special prior knowledge of good strategies, but its playing strength increases steadily through an iteratively repeated learning process. We describe and implement a basic form of AlphaZero for playing the game Havannah. We train several versions of the neural network model, where each successor defeats its predecessor and becomes the champion. We thereby show that a computer player can learn to play Havannah from the rules of the game alone, to the point where it is able to defeat an average human player.

Keywords

Havannah game;Monte Carlo tree search;neural networks;reinforcement learning;undergraduate theses;

Data

Language:	Slovenian
Year of publishing:	2020
Typology:	2.11 - Undergraduate Thesis
Organization:	UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher:	[N. Serec]
UDC:	004.388.4:004.85(043.2)
COBISS:	45050627

Other data

Secondary language:	English
Secondary title:	Machine learning of computer player in Havannah game
Secondary abstract:	In recent years, reinforcement learning of neural networks has been used in the field of artificial intelligence to achieve a breakthrough in the ability of computer players to play board games such as Go, in which humans had previously been the stronger opponents. In this thesis, we explore the AlphaZero algorithm, which combines Monte Carlo tree search and reinforcement learning of neural networks. The algorithm starts without any special prior knowledge of good strategies, but it becomes stronger through an iteratively repeated learning process. We implement the basic form of AlphaZero for playing the game Havannah. Several versions of the neural network model are trained to play the game, where each successor defeats its predecessor and becomes the champion. This shows that a computer player can learn to play Havannah and win against a human player simply by being given the rules of the game, without any special prior knowledge of good strategies.
Secondary keywords:	Havannah;Monte Carlo tree search;neural networks;reinforcement learning;tabula rasa;
Type (COBISS):	Bachelor thesis/paper
Thesis comment:	University of Maribor, Faculty of Electrical Engineering and Computer Science, Computer Science and Information Technologies
Pages:	VIII, 42 f.
ID:	12074975