Master's thesis
Tevž Šart (Author), Sašo Karakatič (Mentor)

Abstract

This master's thesis addresses the problem of finding suitable journals in which authors can publish their scientific articles. The first part deals with extracting knowledge from unstructured data, for which we used word embeddings. The second part covers building a software solution that vectorizes scientific articles and journals. The aim of the thesis was to determine whether machine learning and text vectorization techniques can reveal similarities between the scientific articles of different authors and journals, and thereby show whether an author publishes their articles in appropriate journals. The input corpus was obtained from Scopus, an online database of scientific articles. We analyzed the results produced by the software solution to answer the research questions and, in turn, to accept or reject the hypotheses.
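The idea of comparing an article against journals in a shared vector space can be illustrated with a small sketch. It is not the thesis implementation: it assumes plain TF-IDF weighting with cosine similarity, treats each journal as the concatenated text of its articles, and uses made-up journal names and texts. All function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_journals`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch, not taken from the thesis).
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_journals(article_text, journal_texts):
    """Rank journals by similarity to the article, most similar first."""
    names = list(journal_texts)
    docs = [tokenize(journal_texts[name]) for name in names]
    docs.append(tokenize(article_text))
    vectors = tfidf_vectors(docs)
    article_vec = vectors[-1]
    scores = [(name, cosine(article_vec, vec))
              for name, vec in zip(names, vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    journals = {
        "Journal A": "neural networks deep learning text classification embeddings",
        "Journal B": "power grid voltage transformer electrical load",
    }
    article = "Word embeddings for text classification with neural networks"
    for name, score in rank_journals(article, journals):
        print(name, round(score, 3))
```

A low similarity to the journal where an article actually appeared would be the kind of signal the abstract describes. A learned embedding such as Doc2Vec, listed among the keywords of this record, could replace the TF-IDF vectors while the ranking step stays the same.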

Keywords

word embeddings;text vectorization;natural language processing;master's theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher: [T. Šart]
UDC: 004.85:004.775(043.2)
COBISS: 60445699

Other data

Secondary language: English
Secondary title: Machine learning based analysis of scientific journals and authors
Secondary abstract: As part of the master's thesis, we focused on the issue of finding suitable journals for the publication of scientific articles by various authors. In the first part, we focused on acquiring knowledge from unstructured data. We used the word embedding method to gain useful knowledge. In the second part, we focused on building a software solution for the vectorization of scientific articles and journals. The purpose of the master's thesis was to determine whether machine learning and text vectorization techniques can be used to identify similarities between the scientific articles of different authors and journals, and thus to determine whether an author publishes their scientific articles in the correct journals. The input corpus was obtained from Scopus, an online database of scientific articles. Using the results of the software solution, we performed an analysis that answered the posed research questions and, consequently, let us accept or reject the set hypotheses.
Secondary keywords: DOC2VEC;TF-IDF;word embedding;text vectorization;natural language processing;
Type (COBISS): Master's thesis/paper
Thesis comment: Univ. of Maribor, Faculty of Electrical Engineering and Computer Science, Informatics and Technologies of Communication
Pages: IX, 59 f.
ID: 12678869