undergraduate thesis

Abstract

Distributed representation based classification

Keywords

machine learning;classification;hash tables;"big data";distributed representation;computer science;computer and information science;university study programme;diploma theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [I. Drljepan]
UDC: 004.83(043.2)
COBISS: 9689684

Other data

Secondary language: English
Secondary title: Distributed representation based classification
Secondary abstract: Machine learning increasingly faces datasets that require learning from a large number of learning samples. For such problems, some otherwise successful methods require too much time and/or space to be viable. The aim of this thesis was to implement and test a distributed representation based classification method whose classification speed is independent of the number of learning samples. We show that, for high-dimensional problems, an implementation that preserves constant classification time requires too much space to be practical. By using hash tables, we preserved almost constant, fast classification for low-dimensional problems. This is made possible by low memory consumption, which is crucial for the method's classification speed. However, in low-dimensional problems a large number of learning samples causes learning saturation, which lowers the classification rate. With more dimensions the classification rate improves, but at the cost of higher memory consumption and longer classification time. Empirical evaluation showed that, compared to the related nearest neighbors method, distributed representation based classification is faster and uses less space, while the classification rates show no statistically significant differences. We determined that the method is suitable for sequential problems and that there are problems for which it is entirely unsuitable. Thus the method does not offer a general solution; under certain circumstances, however, it can solve problems faster and with less space while maintaining a comparable classification rate.
Secondary keywords: machine learning;classification;hash tables;big data;distributed representation;computer science;computer and information science;diploma;
File type: application/pdf
Type (COBISS): Bachelor thesis/paper
Thesis comment: Univerza v Ljubljani, Fak. za računalništvo in informatiko
Pages: 45 pp.
ID: 24181899