Aleš Papič (Author), Igor Kononenko (Author), Zoran Bosnić (Author)

Abstract

The quantity of generated data increases daily, which makes the data difficult to process. In supervised learning, labeling training examples can be especially tedious and costly. One of the aims of positive and unlabeled (PU) learning is to train a binary classifier from partially labeled data, representing a strategy for combining supervised and semi-supervised learning and reducing the cost of fully labeling the data. Still, the main strength of PU learning arises when the negative data are not directly available or are too diverse. Although generative approaches have shown promising results in this field, they also bring shortcomings, such as high computational cost, training instability, and an inability to generate fully labeled datasets. In this paper, we propose a novel Conditional Generative PU framework (CGenPU) with a built-in auxiliary classifier. We develop a novel loss function to learn the distribution of positive and negative examples, which leads to a unique, desirable equilibrium under a nonparametric assumption. CGenPU is evaluated against existing generative approaches using both synthetic and real data. The characteristics of various methods, including ours, are illustrated with different toy examples. The results demonstrate state-of-the-art performance on standard positive and unlabeled learning benchmark datasets. Given only ten labeled CIFAR-10 examples, CGenPU achieves a classification accuracy of 84%, while the current state-of-the-art D-GAN framework achieves 54%. Moreover, CGenPU is the first single-stage generative framework for PU learning.

Keywords

positive and unlabeled learning;partially supervised learning;generative adversarial networks;deep learning

Data

Language: English
Year of publishing: 2023
Typology: 1.01 - Original Scientific Article
Organization: UL FRI - Faculty of Computer and Information Science
UDC: 004.8
COBISS: 148488195
ISSN: 0957-4174

Other data

Secondary language: Slovenian
Secondary keywords: učenje iz pozitivnih in neoznačenih primerov;delno nadzorovano učenje;generativne nasprotniške mreže;globoko učenje;
Type (COBISS): Article
Pages: pp. 1-13
Issue: Vol. 224
Chronology: Aug. 2023
DOI: 10.1016/j.eswa.2023.120046
ID: 18957941