Master's thesis
Aleš Pečovnik (Author), Milan Ojsteršek (Mentor)

Abstract

The master's thesis describes the field of named entity recognition, including the available approaches. Machine learning is covered in more detail, including supervised, semi-supervised, and unsupervised learning, with descriptions of several commonly used methods. As part of the thesis, we built and tested a system for named entity recognition in texts written in Slovenian, using machine learning with the conditional random fields method. The thesis describes the individual parts of the system and the data source used for training, followed by the testing procedure, the results obtained, and the findings. Results for the geographical-name and personal-name categories were satisfactory, but various possibilities remain for further progress in this field, which is also necessary given the growing demand for such systems.

Keywords

natural language processing;machine learning;named entity recognition;conditional random fields;theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher: A. Pečovnik
UDC: 004.5:004.86(043.2)
COBISS: 22123286

Other data

Secondary language: English
Secondary title: Named entity recognition in texts using machine learning methods
Secondary abstract: The thesis describes the field of named entity recognition along with possible approaches, with machine learning methods described in detail. This includes some of the most commonly used supervised, semi-supervised, and unsupervised methods. The purpose of the master's thesis was to develop and test a system for named entity recognition in Slovenian texts using a machine learning method, for which the conditional random field method was chosen. Individual parts of the system and the data source used are further elaborated, followed by the course of the experiment itself, together with the obtained results and findings. Results for the geographical and personal name entities were satisfactory; however, there are still various possibilities for further progress in the field of named entity recognition, which is also necessary due to the growing demand for such systems.
Secondary keywords: natural language processing;machine learning;named entity recognition;conditional random fields;
URN: URN:SI:UM:
Type (COBISS): Master's thesis/paper
Thesis comment: University of Maribor, Faculty of Electrical Engineering and Computer Science, Computer Science and Information Technologies
Pages: VIII, 79 f.
ID: 11000640