Language: | Slovenian |
---|---|
Year of publication: | 2023 |
Typology: | 2.11 - Undergraduate thesis |
Organization: | UL FRI - Fakulteta za računalništvo in informatiko (Faculty of Computer and Information Science) |
Publisher: | [B. Bulić] |
UDC: | 004.8:81'322(043.2) |
COBISS: | 168959747 |
Views: | 70 |
Downloads: | 8 |
Rating: | 0 (0 votes) |
Metadata: |
Secondary language: | English |
---|---|
Secondary title: | Word sense induction in Slovene using large language models |
Secondary abstract: | In this thesis, we developed a procedure for discovering new word meanings. We extracted the list of observed words from a word-sense disambiguation dataset. Sentences containing each observed word were obtained from the news database of the Event Registry service. We represented the words as vectors using the multilingual-BERT-Base, Cased and SloBERTa models and clustered them in various ways. We compared the results with the data from the disambiguation dataset and manually checked some words with known semantic shifts. The obtained results are not promising. We believe the main reason is an unsuitable text database. |
Secondary keywords: | meanings of words;sentence vector embedding;clustering;BERT;natural language processing;word sense induction;computer science;computer and information science;computer science and mathematics;interdisciplinary studies;diploma;computational linguistics;computing;university and higher education theses; |
Type of work (COBISS): | Undergraduate thesis |
Study programme: | 1000407 |
Embargo end date (OpenAIRE): | 1970-01-01 |
Comment on the material: | University of Ljubljana, Faculty of Computer and Information Science |
Pages: | 37 pp. |
ID: | 19937509 |