Master's thesis

Abstract

Classifying news about companies is a very time-consuming process, because each news item has to be read and assigned a meaning based on its content. With the development of data mining methods, this process can be automated so that news items are classified in negligible time. In this master's thesis we developed a system for acquiring, cleaning, and classifying news. We collected news from free online sources and built a corpus of texts, which we first processed with the Orange tool and then used to build predictive models with various algorithms. Using visualizations and a confusion matrix, we showed the quality of the predictive models and evaluated them based on their performance. Finally, using the ML.NET library, we developed an automatic classification system that classifies news items into groups according to their content with 80 % accuracy.

Keywords

data mining;text classification;trading;news;securities;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UM FOV - Faculty of Organizational Sciences
Publisher: [J. Jakič]
UDC: 004.6
COBISS: 72527363
Views: 294
Downloads: 22

Other data

Secondary language: English
Secondary title: Implementation of the securities news classification system
Secondary abstract: Classifying news about companies is a very time-consuming process, as each news item has to be read in full to determine what it means. By using existing data mining methods, we can automate this process and classify news in a negligible amount of time. In our master's thesis, we developed a system for obtaining, refining and classifying news. We obtained news from free online sources and created a corpus of texts. We first processed the texts with the Orange tool, then built predictive models using different algorithms. Using visualizations and confusion matrices, we demonstrated the quality of the predictive models, which were then evaluated based on their performance. Finally, we developed an automatic classification system using the ML.NET library, which is capable of classifying news into groups with 80 % accuracy.
Secondary keywords: Data mining;University and higher-education theses;
Type (COBISS): Master's thesis/paper
Thesis comment: University of Maribor, Faculty of Organizational Sciences
Pages: VI, 69 f.
ID: 12578501