Master's thesis
Abstract
Classifying news about companies is a very time-consuming process, because each news item must be read and assigned a meaning based on its content. With the development of data mining methods, this process can be automated, so that news items are classified in negligible time. In this master's thesis we developed a system for acquiring, cleaning, and classifying news. We collected news from free online sources and built a corpus of texts, which we first processed with the Orange tool and then used to build predictive models with various algorithms. Using visualizations and a confusion matrix, we showed the quality of the predictive models and evaluated them by their performance. Finally, using the ML.NET library, we developed an automatic classification system that classifies news items into groups according to their content with 80 % accuracy.
Keywords
data mining;text classification;trading;news;securities;
Data
Language: Slovenian
Year of publishing: 2021
Typology: 2.09 - Master's Thesis
Organization: UM FOV - Faculty of Organizational Sciences
Publisher: [J. Jakič]
UDC: 004.6
COBISS: 72527363
Views: 294
Downloads: 22
Other data
Secondary language: English
Secondary title: Implementation of the securities news classification system
Secondary abstract: Classifying news about companies is a very time-consuming process, as each news item has to be read in full to determine what it is about. With existing data mining methods, we can automate this process and classify news in a negligible amount of time. In this master's thesis, we developed a system for obtaining, cleaning and classifying news. We obtained news from free online sources and created a corpus of texts. We first processed the texts with the Orange tool, then built predictive models using different algorithms. Using visualizations and confusion matrices, we demonstrated the quality of the predictive models, which were then evaluated based on their performance. Finally, we developed an automatic classification system using the ML.NET library, which classifies news into groups with 80 % accuracy.
Secondary keywords: Data mining;University and higher-education theses;
Type (COBISS): Master's thesis/paper
Thesis comment: University of Maribor, Faculty of Organizational Sciences
Pages: VI, 69 f.
ID: 12578501