diplomsko delo

Povzetek

Klasifikacija s porazdeljeno predstavitvijo

Ključne besede

strojno učenje;klasifikacija;razpršene tabele;"big data";porazdeljena predstavitev;računalništvo;računalništvo in informatika;univerzitetni študij;diplomske naloge;

Podatki

Jezik: Slovenski jezik
Leto izida:
Tipologija: 2.11 - Diplomsko delo
Organizacija: UL FRI - Fakulteta za računalništvo in informatiko
Založnik: [I. Drljepan]
UDK: 004.83(043.2)
COBISS: 9689684 Povezava se bo odprla v novem oknu
Št. ogledov: 57
Št. prenosov: 4
Ocena: 0 (0 glasov)
Metapodatki: JSON JSON-RDF JSON-LD TURTLE N-TRIPLES XML RDFA MICRODATA DC-XML DC-RDF RDF

Ostali podatki

Sekundarni jezik: Angleški jezik
Sekundarni naslov: Distributed representation based classification
Sekundarni povzetek: Machine learning is increasingly met with datasets that require learning on a large number of learning samples. In solving these problems, some successful methods require too much time and/or space, for them to be viable. The aim of the thesis was the implementation and testing of the distributed representation based classification method of which classification speed is independent of the number of learning samples. We show that an implementation, which preserves a constant classification time, in case of high-dimensional problems requires too much space for it to be practical. By using hash tables we preserved an almost constant, fast classification for low-dimensional problems. It is made possible by a low memory consumption which is crucial for this method's classification speed. However, with low-dimensional problems, high number of learning samples causes learning saturation, which results in a drop of the classification rate. With more dimensions classification rate improves, but on account of higher memory consumption and longer classification time. Empirical evaluation has shown that, compared to the related nearest neighbors method, distributed representation based classification is faster and uses less space, while classification rates show no statistically significant differences. We determined that the method is suitable for sequential problems and that there are existing problems which are entirely unsuitable for it. Thus the method does not offer a general solution, however, under certain circumstances, it can solve problems faster, requires less space and at the same time maintain comparable classification rate.
Sekundarne ključne besede: machine learning;classification;hash tables;big data;distributed representation;computer science;computer and information science;diploma;
Vrsta datoteke: application/pdf
Vrsta dela (COBISS): Diplomsko delo/naloga
Komentar na gradivo: Univerza v Ljubljani, Fak. za računalništvo in informatiko
Strani: 45 str.
ID: 24181899