Klasifikacija s porazdeljeno predstavitvijo

undergraduate thesis

Ivan Drljepan (Author), Marko Robnik Šikonja (Mentor)

Keywords

machine learning;classification;hash tables;"big data";distributed representation;computer science;computer and information science;university study programme;diploma theses;

Data

Language:	Slovenian
Year of publishing:	2013
Typology:	2.11 - Undergraduate Thesis
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[I. Drljepan]
UDC:	004.83(043.2)
COBISS:	9689684

Other data

Secondary language:	English
Secondary title:	Distributed representation based classification
Secondary abstract:	Machine learning increasingly faces datasets that require learning from a large number of learning samples. For such problems, some otherwise successful methods require too much time and/or space to be viable. The aim of the thesis was to implement and test a distributed representation based classification method whose classification speed is independent of the number of learning samples. We show that, for high-dimensional problems, an implementation that preserves constant classification time requires too much space to be practical. By using hash tables, we preserved an almost constant, fast classification time for low-dimensional problems. This is made possible by low memory consumption, which is crucial for the method's classification speed. However, for low-dimensional problems, a high number of learning samples causes learning saturation, which lowers the classification rate. With more dimensions the classification rate improves, but at the cost of higher memory consumption and longer classification time. Empirical evaluation showed that, compared to the related nearest neighbors method, distributed representation based classification is faster and uses less space, while the classification rates show no statistically significant differences. We determined that the method is suitable for sequential problems and that there are problems for which it is entirely unsuitable. Thus the method does not offer a general solution; under certain circumstances, however, it can solve problems faster and with less space while maintaining a comparable classification rate.
Secondary keywords:	machine learning;classification;hash tables;big data;distributed representation;computer science;computer and information science;diploma;
File type:	application/pdf
Type (COBISS):	Bachelor thesis/paper
Thesis comment:	University of Ljubljana, Faculty of Computer and Information Science
Pages:	45 pp.
ID:	24181899

Recommended works:

Klasifikacija s porazdeljeno predstavitvijo

2013, undergraduate thesis

Integracija sistemov CRM

2011, undergraduate thesis

Algoritem kot storitev

2021, undergraduate thesis

Anonimizacija sodnih odločb z metodami strojnega učenja

2020, undergraduate thesis

Ocenjevanje atributov s posplošitvami algoritma Relief

2019, undergraduate thesis