Undergraduate thesis
Luka Zorko (Author), Igor Kononenko (Mentor)

Abstract

In this thesis I researched and compared different types of recurrent neural networks for the purpose of generating artificial articles. I described and tested several types of recurrent neural networks: the classic RNN, the Long Short-Term Memory (LSTM) network, and a simplified version of LSTM, the Gated Recurrent Unit (GRU). I trained the models on a collection of sports articles obtained from the internet and on Shakespeare's play Romeo and Juliet. Each network was run with seven different settings, and for each setting I generated 6 short texts, which I then reviewed and graded on a scale from 1 to 5. On the generated texts I also tested 5 models (3 character-based and 2 word-based) for classifying, i.e. distinguishing, human-written and generated articles. In recognizing generated texts, the word-based model using a Bidirectional Long Short-Term Memory (BLSTM) network performed best; in text generation, the plain recurrent neural network (RNN) received the worst grades and the LSTM network the best.
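The LSTM architecture compared in the abstract can be illustrated with a minimal sketch of a single LSTM step on a one-hot-encoded character sequence. This is a generic, hypothetical NumPy implementation of the standard LSTM gating equations, not the thesis's actual code; all sizes and names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [x; h_prev] to the four stacked gate pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    H = h_prev.size
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    o = sigmoid(z[2 * H:3 * H])  # output gate
    g = np.tanh(z[3 * H:4 * H])  # candidate cell state
    c = f * c_prev + i * g       # new cell state (long-term memory)
    h = o * np.tanh(c)           # new hidden state (short-term memory)
    return h, c

rng = np.random.default_rng(0)
V, H = 5, 8                      # toy vocabulary size and hidden size (assumptions)
W = rng.normal(scale=0.1, size=(4 * H, V + H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for ch in [0, 3, 1]:             # a toy sequence of character indices
    x = np.eye(V)[ch]            # one-hot encoding of the character
    h, c = lstm_step(x, h, c, W, b)

print(h.shape)                   # hidden state after reading the sequence
```

The separate cell state `c`, updated through the forget and input gates, is what lets LSTM retain information over longer spans than the classic RNN; the GRU variant mentioned in the abstract merges these gates into a simpler two-gate update.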

Keywords

neural networks;natural language processing;computer and information science;university studies;undergraduate theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [L. Zorko]
UDC: 004.8:81'322(043.2)
COBISS: 77913347

Other data

Secondary language: English
Secondary title: Use of recurrent neural networks for generating artificial text
Secondary abstract: In this thesis I researched and compared different types of recurrent neural networks for natural language generation. I described and tested various types of recurrent neural networks: the classic RNN, the Long Short-Term Memory (LSTM) network, and a simplified version of LSTM called the Gated Recurrent Unit (GRU). I trained the models on a collection of short sports articles from the internet and on Shakespeare's play Romeo and Juliet. Each type of neural net was run with 7 different settings. For each of the settings I generated 6 short text outputs and graded them from 1 to 5. I also tested the effectiveness of 5 models (3 character-based and 2 word-based) in differentiating human-written and generated articles. In this task the word-based Bidirectional Long Short-Term Memory (BLSTM) network performed best, while in the task of text generation the regular RNN performed the worst and the LSTM performed the best.
Secondary keywords: artificial intelligence;neural networks;natural language processing;computer and information science;diploma;artificial intelligence;computational linguistics;computer science;university and higher-education theses;
Type (COBISS): Bachelor thesis/paper
Study programme: 1000468
Thesis comment: Univ. of Ljubljana, Faculty of Computer and Information Science
Pages: 44 pp.
ID: 13418741