Diploma thesis

Abstract

Ocenjevanje atributov v neuravnoteženih problemih (Attribute evaluation in imbalanced problems)

Keywords

machine learning; imbalanced datasets; attribute evaluation; CORElearn; decision trees; computer science; computer and information science; higher professional study programme; diploma theses

Details

Language: Slovenian
Year of publication:
Typology: 2.11 - Diploma thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [D. Rački]
UDC: 004(043.2)
COBISS: 8631892

Other data

Secondary language: English
Secondary title: Attribute evaluation on imbalanced data sets
Secondary abstract: We analyze the performance of attribute evaluation measures on imbalanced datasets at different levels of imbalance. We sample real-world datasets at class ratios of 1:5, 1:10, 1:50, 1:100, 1:500 and 1:1000. We build decision tree models and, for each attribute evaluation measure, compute AUC with stratified 5x2 cross-validation. To test the significance of the differences we use the Friedman test. With the Nemenyi test we determine and graphically display the similarities and differences between the measures. We find that the best performing measure at unaltered class ratios is MDL, and that for class ratio 1:5 the best measure is the angular distance. For ratios 1:10 and 1:50 the best measure is ReliefF, and for class ratios 1:100, 1:500 and 1:1000 the best performing measure is information gain. The worst performing measure at all class ratios is accuracy.
Secondary keywords: machine learning; imbalanced datasets; attribute evaluation; CORElearn; decision trees; computer science; computer and information science; diploma
File type: application/pdf
Work type (COBISS): Diploma thesis
Comment on the material: Univ. of Ljubljana, Faculty of Computer and Information Science
Pages: 66 pp.
ID: 23936536