Globoko okrepitveno učenje za igranje iger na podlagi video vhoda

Master's thesis

Monika Bozhinova (Author), Damjan Strnad (Mentor)

Abstract

In this master's thesis, we studied reinforcement learning of agents that play computer games. To this end, we implemented three agent models that use a neural network to approximate the action-value function, and proposed our own improved architecture of the dueling double Q-network. Training was carried out on the games Pong and Beamrider from the Atari 2600 game set. We found that in Pong our approach achieves better agent performance than the deep Q-network, the double deep Q-network, and the double deep Q-network with dueling architecture, whereas in Beamrider the agent learns more slowly, presumably because of noise in the different state representation used by the proposed model.

Keywords

deep reinforcement learning;neural networks;deep Q-network;dueling architecture;Atari games;master's theses;

Data

Language:	Slovenian
Year of publishing:	2021
Typology:	2.09 - Master's Thesis
Organization:	UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher:	[M. Bozhinova]
UDC:	004.85:004.96(043.2)
COBISS:	83074563

Other data

Secondary language:	English
Secondary title:	Deep reinforcement learning for playing games based on video input
Secondary abstract:	In this master's thesis, we studied reinforcement learning of agents that play computer games. To this end, we implemented three agent models that use neural networks as action-value function approximators, and proposed our own improved architecture of the dueling double Q-network. We trained the agents on the games Pong and Beamrider from the Atari 2600 game set. We found that in Pong our approach achieves better agent performance than the deep Q-network, the double deep Q-network, and the double deep Q-network with dueling architecture, whereas in Beamrider the agent learns more slowly, presumably because of noise in the different state representation used by the proposed model.
Secondary keywords:	deep reinforcement learning;neural networks;deep Q-network;dueling architecture;Atari games;Pong;Beamrider;
Type (COBISS):	Master's thesis/paper
Thesis comment:	Univ. v Mariboru, Fak. za elektrotehniko, računalništvo in informatiko, Računalništvo in informacijske tehnologije
Pages:	XII, 52 pp.
ID:	13394388