Prepoznavanje motivov v pravljicah s pomočjo velikih jezikovnih modelov

undergraduate thesis

Domen Beden (Author), Marko Robnik Šikonja (Mentor)

Abstract

This thesis investigates the use of large language models (LLMs) for the automatic recognition of narrative motifs in folktales. We first present the folkloristic theory of motifs, the classification systems (ATU, Thompson), and modern digital collections. We then cover the fundamental concepts of natural language processing and the architecture of large models, with an emphasis on instruction-based learning. In the experimental part, we use the Gemma 7B model, which we train on structured examples of stories and motifs. We test several training strategies (full fine-tuning, LoRA, distillation with Gemini 2.5 Pro) and carry out a quantitative and qualitative evaluation of the results. We find that LLMs are capable of effective motif classification, especially when the data are enriched with chains of thought. The work contributes to the development of tools for computational folkloristics and points to possibilities for further research.

Keywords

large language models;natural language processing;folktales;motifs in folktales;undergraduate theses;

Data

Language:	Slovenian
Year of publishing:	2025
Typology:	2.11 - Undergraduate Thesis
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[D. Beden]
UDC:	004.85:82(043.2)
COBISS:	247357187

Other data

Secondary language:	English
Secondary title:	Detection of folkloristic motifs with large language models
Secondary abstract:	This thesis explores the use of large language models (LLMs) for the automatic recognition of narrative motifs in folktales. We begin by presenting folkloristic theory on motifs, classification systems (ATU, Thompson), and modern digital corpora. We outline key natural language processing concepts and LLM architectures, with an emphasis on instruction-based learning. In the experimental part, we fine-tune the Gemma 7B model on structured examples linking stories and motifs. We evaluate several learning strategies (full fine-tuning, LoRA, distillation using Gemini 2.5 Pro) and perform both quantitative and qualitative evaluations. Our results show that LLMs can effectively classify motifs, especially when the dataset is enriched with chain-of-thought explanations. This work contributes to the field of computational folkloristics and opens pathways for further research.
Secondary keywords:	large language models;natural language processing;motifs in folktales;folktales;computer and information science;diploma;
Type (COBISS):	Bachelor thesis/paper
Study programme:	1000468
Thesis comment:	Univ. of Ljubljana, Faculty of Computer and Information Science
Pages:	1 online resource (1 PDF file (74 pp.))
ID:	27132409