Master's thesis
Abstract
In this master's thesis we examine reinforcement learning algorithms, using computer game playing as a case study. The aim of the thesis is to implement a game in the Unity environment and to analyze how effectively reinforcement learning algorithms train a computer player. The theoretical foundations of reinforcement learning are described, and the algorithms used in the final analysis, PPO (Proximal Policy Optimization), SAC (Soft Actor Critic), and DQN (Deep Q-Network), are presented in more detail. The results showed that training the agent was successful overall. In the test environment PPO performed best: the agent trained with it achieved on average 86.4% of the maximum possible reward. DQN performed worst and is not suitable for use in the implemented test environment.
Keywords
reinforcement learning;computer games;Unity;agent;machine learning;master's theses;
Data
Language: Slovenian
Year of publishing: 2021
Typology: 2.09 - Master's Thesis
Organization: UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher: [J. Banko]
UDC: 004.85:004.96(043.2)
COBISS: 67936771
Views: 375
Downloads: 67
Other data
Secondary language: English
Secondary title: Reinforcement learning of game-playing agents in the Unity engine
Secondary abstract: This master's thesis deals with reinforcement learning algorithms, using the playing of computer games as an example. The purpose of the thesis is to implement a game in the Unity engine and to analyze the effectiveness of reinforcement learning algorithms for a computer player. The theoretical foundations of reinforcement learning are described, and the PPO (Proximal Policy Optimization), SAC (Soft Actor Critic), and DQN (Deep Q-Network) algorithms used in the final analysis are presented in detail. The results show that the training of the agent was successful overall. The best algorithm in the test environment was PPO, with which the agent achieved 86.4% of the maximum possible reward on average. The worst was DQN, which is not suitable for use in the implemented test environment.
Secondary keywords: reinforcement learning;computer games;Unity;agent;machine learning;
Type (COBISS): Master's thesis/paper
Thesis comment: University of Maribor, Faculty of Electrical Engineering and Computer Science, Computer Science and Information Technologies
Pages: VIII, 53 pp.
ID: 12934011