diploma thesis
Jan Urankar (Author), Janez Demšar (Mentor)

Abstract

The aim of this diploma thesis is to investigate the differences between the play of agents trained with artificial intelligence and the play of humans in poker. The first steps were to study the poker variant called No Limit Texas Hold'em and to learn the rules of the game. The next step was to build a poker agent that could be taught to play the game by means of machine learning. We chose an algorithm from the family of reinforcement learning algorithms, called counterfactual regret minimization. We trained the agent to play poker. We then initialized two agents that played against each other while we recorded their moves. Once we had generated enough agent games, we looked for data on games played by humans. In the next step we prepared a script to compare the agents' game data with the human game data. In this script we analyzed several aspects that formed the basis for conclusions about the playing style of a given player or agent. Based on the experiment, we concluded that agents trained with our algorithm are much more active and aggressive than humans. Active play in this context means that a player plays many games; aggressive play means that the player bets a lot and also often deceives, or 'bluffs'. These findings agree with those of other teams that have used artificial intelligence to train intelligent poker agents with the counterfactual regret minimization algorithm.

Keywords

artificial intelligence;poker;game playing;computer and information science;university studies;diploma theses;

Details

Language: Slovenian
Year of publication:
Typology: 2.11 - Diploma thesis
Organization: UL FRI - Fakulteta za računalništvo in informatiko
Publisher: [J. Urankar]
UDC: 004.8:685.811(043.2)
COBISS: 77195779
Views: 269
Downloads: 40

Other data

Secondary language: English
Secondary title: Machine learning in poker
Secondary abstract: The goal of this diploma thesis is to find the differences between human players and artificial agents in poker. The first step was researching the specific version of poker, named No Limit Texas Hold'em, and learning the rules of the game. The next step was the creation of an intelligent poker agent that is trained, using machine learning, to play this version of poker. We decided to use an algorithm that belongs to the family of machine learning algorithms known as reinforcement learning. The algorithm is called counterfactual regret minimization. After selecting the algorithm, we trained the intelligent agent and created two instances of it. These two agents then played poker against each other while we monitored and recorded every move they made. Once we had generated enough games between the agents, we created a script for analysing the data. In the script we analysed several aspects of the game, which were the basis for our conclusions on the playing style of a given virtual agent versus a human player. Our conclusion, based on the experiment, is that virtual agents trained with our algorithm play much more actively and more aggressively than humans. Active play in this context means that a player plays many games. Aggressive play means that the player bets a lot and often bluffs. These findings are in line with the findings of other researchers who used counterfactual regret minimization to create an intelligent agent.
Secondary keywords: machine learning;big data;poker;counterfactual regret minimization;computer science;diploma;university and higher education theses;
Type of work (COBISS): Diploma thesis
Study programme: 1000468
Comment on the material: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: 44 pp.
ID: 13390862