undergraduate thesis
Jasmina Pegan (Author), Marko Robnik Šikonja (Mentor), Polona Gantar (Co-mentor)

Abstract

The goal of this thesis is to develop a classifier for antonym detection. The solution uses a database of pre-trained word embeddings for Slovene. We first assembled a training set of antonyms and synonyms. We then searched for the most suitable classification model, examining several support vector machine models and several deep neural networks. For selected words, we retrieved semantically related words and applied the trained model to them, which yielded candidate pairs of antonyms and synonyms. We evaluated the accuracy of the results on a test set. The best-rated model achieves a classification accuracy of 0.70.

Keywords

antonyms;synonyms;word embeddings;machine learning;classification;computer science;computer and information science;computer science and mathematics;interdisciplinary studies;university studies;diploma theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [J. Pegan]
UDC: 004.85:81'373.422(043.2)
COBISS: 1538361795

Other data

Secondary language: English
Secondary title: Antonym detection with word embeddings
Secondary abstract: This thesis aims to develop a classifier for antonym detection. A database of pre-trained word embeddings for Slovene was used to build the solution. First, we assembled a training set consisting of synonyms and antonyms. Then we searched for the most appropriate classification model, examining several support vector machine models and several deep neural networks. We applied the trained model to groups of words closest to the selected words and thus obtained candidate pairs of synonyms and antonyms. The accuracy of the results was evaluated on a test set. The top-rated model reaches a classification accuracy of 0.70.
Secondary keywords: antonyms;synonyms;word embeddings;machine learning;classification;computer science;computer and information science;computer science and mathematics;interdisciplinary studies;diploma;
Type (COBISS): Bachelor thesis/paper
Study programme: 1000407
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: University of Ljubljana, Faculty of Computer and Information Science
Pages: 41 pp.
ID: 11225347