Bachelor's thesis
Abstract
A graph is a non-Euclidean data structure that is difficult to analyze directly with machine learning methods that process data in vector form. For this reason, machine learning methods for vector embedding, which transform a graph into a vector space, have become popular in recent years. In this thesis, we build a graph from English Wikipedia articles by following the hyperlinks they contain. We carry out the experiment for films and music albums. We embed the nodes of the resulting graph in a vector space, which enables a more efficient analysis of the graph, in which we focus on visualization, similarity, and the classification of films and albums into genres. We compare the embeddings produced by the DeepWalk, node2vec, and SDNE methods. On average, we achieve 88.5 % accuracy when classifying films and 89.3 % accuracy when classifying albums.
Keywords
machine learning;graph;node embedding;random walk;autoencoder;bachelor's theses;
Data
Language: |
Slovenian |
Year of publishing: |
2020 |
Typology: |
2.11 - Undergraduate Thesis |
Organization: |
UM FERI - Faculty of Electrical Engineering and Computer Science |
Publisher: |
[V. Keršič] |
UDC: |
004.85:004.422.63(043.2) |
COBISS: |
38602243 |
Other data
Secondary language: |
English |
Secondary title: |
Machine learning methods for vector embedding of graph nodes |
Secondary abstract: |
A graph is a non-Euclidean data structure that is hard to analyze directly with machine learning methods that process data in vector form. Therefore, in recent years, machine learning methods for vector embedding, which transform graphs into a vector space, have gained a lot of traction. In the thesis, we construct a graph from English Wikipedia articles by following the hyperlinks they contain. We conduct experiments for movies and music albums. We embed the nodes of the obtained graph in a vector space, which allows us to analyze them more efficiently, focusing on visualization, similarity, and classification of movies and albums into genres. We compare the embeddings produced by the DeepWalk, node2vec, and SDNE methods. We achieve, on average, a classification accuracy of 88.5 % for movies and 89.3 % for albums. |
Secondary keywords: |
machine learning;graph;node embeddings;random walk;autoencoder; |
Type (COBISS): |
Bachelor thesis/paper |
Thesis comment: |
University of Maribor, Faculty of Electrical Engineering and Computer Science, Computer Science and Information Technologies |
Pages: |
XII, 42 pp. |
ID: |
11967015 |