Master's thesis
Rok Gomišček (Author), Tomaž Curk (Mentor)

Abstract

Examples in databases are often described by a very large number of attributes. Identifying the attributes that truly matter for classification, along with their mutual dependencies, is therefore a major challenge. One way to reduce the dimensionality of the space and to identify important attributes and examples is non-negative matrix factorization. In this master's thesis we first studied the basics of non-negative matrix factorization and several ways of displaying data and factor models in matrices. We propose several ways to visualize and interpret models obtained by factorization. We evaluated the methods on several data sets and found that each method reveals useful information about the model. Clustering the factorized matrices can yield purer clusters than clustering the original data. Projecting examples into the factor space shows which factors influence particular classes. Adding attributes to this projection also allows conclusions about the relationship between examples and attributes of the original space.
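As an illustration of the factorization the abstract refers to, the following is a minimal sketch of non-negative matrix factorization, X ≈ WH, using multiplicative update rules. It is not taken from the thesis: the choice of update rule, the toy matrix `X`, the rank, and all function names are assumptions made only for this example.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5


def nmf(X, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize a non-negative matrix X (n x m) into W (n x rank) and H (rank x m).

    Assumption: multiplicative updates for the squared-error objective;
    the thesis may rely on a different algorithm.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Hypothetical toy data: 4 examples (rows) described by 4 attributes (columns).
X = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 4, 3],
]

W, H = nmf(X, rank=2)

# Each row of W gives one example's coordinates in the factor space;
# the index of the largest coordinate is that example's dominant factor.
dominant = [max(range(len(row)), key=lambda k: row[k]) for row in W]
```

In this sketch the rows of `W` place examples in the factor space, while the rows of `H` show how strongly each original attribute contributes to each factor. Reading the two together is one simple way to relate examples, factors, and attributes.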

Keywords

non-negative matrix factorization;factor model;data visualization;computer science;computer and information science;master's theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [R. Gomišček]
UDC: 004.65(043.2)
COBISS: 1536577987
Views: 1457
Downloads: 253

Other data

Secondary language: English
Secondary title: Visualization and interpretation of models obtained with non-negative matrix factorization
Secondary abstract: Databases often describe their examples with a large number of attributes. Identifying the attributes that are truly important for classification, and establishing their mutual dependence, therefore poses a significant challenge. One way of reducing the dimensionality of the space and identifying important attributes and examples is non-negative matrix factorization. In this master's thesis we first examined the basics of non-negative matrix factorization and a few ways of visualizing data and factor models in matrices. We propose a few ways of presenting and understanding the models obtained with factorization. We evaluated the effectiveness of the methods on several data sets and found that each method reveals useful information about a model. Clustering the factorized matrices can produce purer clusters than clustering the source data. By projecting examples into the factor space we can see which factors affect certain classes. Adding attributes to this projection makes it possible to infer the link between the examples and the attributes of the source space.
Secondary keywords: non-negative matrix factorization;factor model;data visualization;computer science;computer and information science;master's degree;
File type: application/pdf
Type (COBISS): Master's thesis/paper
Thesis comment: Univ. of Ljubljana, Fac. of Computer and Information Science
Pages: 68 pp.
ID: 8966558