bachelor's thesis
Matej Kalc (Author), Jure Demšar (Mentor), Jan Hartman (Co-mentor), Davorin Kopič (Co-mentor)

Abstract

In the context of online advertising, Click-Through Rate (CTR) is the probability that a user clicks on an ad. CTR prediction is done using machine learning methods, such as Factorization machines (FM) and neural networks. Various improved versions of these traditional approaches have been proposed in the last decade, and the main goal of this thesis is to evaluate these upgrades. We evaluated the models in two phases: first using different combinations of parameters, and then using Bayesian optimization for parameter tuning. In the first phase, the results showed that the group of models that use neural networks achieves a higher Area Under the ROC Curve (AUC). Kernel-extended Factorization Machine, a model newly proposed during the Data Science project competition at the Faculty of Computer Science, performed worse than the FM model. In the second phase, we applied Bayesian optimization to the models to achieve an even higher AUC. Surprisingly, the second generation of the Deep&Cross model achieved a higher AUC than Deep Factorization Machine, which had the highest AUC in the first phase. During the evaluation, we also tested the degree of FM and concluded that there is no need for a degree higher than two.
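
As an illustration of the two concepts the abstract relies on, the following is a minimal sketch of a degree-two FM prediction and of the AUC metric. It is not the thesis implementation: the function names (`fm_score`, `predict_ctr`, `auc`) and all parameter values are hypothetical, and the sigmoid link used to turn the FM score into a click probability is an assumption.

```python
import math

def fm_score(x, w0, w, V):
    """Degree-two FM score: w0 + sum_i w_i*x_i + sum_{i<j} <V_i, V_j>*x_i*x_j.

    The pairwise term uses the standard linear-time identity
    0.5 * sum_f ((sum_i V[i][f]*x_i)^2 - sum_i (V[i][f]*x_i)^2).
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):  # loop over latent factors
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

def predict_ctr(x, w0, w, V):
    """Map the FM score to a probability with a sigmoid (assumed link function)."""
    return 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, V)))

def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy values: three features, two latent factors.
x = [1.0, 0.0, 1.0]
w0, w = 0.1, [0.2, 0.3, -0.1]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
score = fm_score(x, w0, w, V)
toy_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.5])
```

The pairwise AUC above is quadratic in the number of examples and is only meant to make the definition concrete; a rank-based computation would be preferable on real data.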

Keywords

CTR prediction; factorization machines; deep learning; online learning; Bayesian optimization; computer and information science; diploma thesis

Data

Language: English
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [M. Kalc]
UDC: 004(043.2)
COBISS: 76724995

Other data

Secondary language: Slovenian
Secondary title: Ocenjevanje modelov za napovedovanje verjetnosti klika (Evaluation of models for click probability prediction)
Secondary abstract: CTR is the probability that a user clicks on an ad. CTR is predicted using machine learning methods, such as Factorization machines (FM) and neural networks. In the last decade, various improved versions of FM and models with improved neural networks have been proposed. The main goal of the thesis is to evaluate the models on the same dataset. We evaluated these models in two phases: using different combinations of parameters and using Bayesian optimization for parameter tuning. In the first phase, the results showed that the group of models that use neural networks achieved a higher Area Under the ROC Curve (AUC). Kernel-extended Factorization machine, a new model proposed during the Data Science project at the Faculty of Computer Science, performed worse than the FM model. In the second phase, we applied Bayesian optimization to the models and achieved an even higher AUC. The Deep&Cross V2 model surprisingly achieved a higher AUC than Deep Factorization Machine, the model with the higher AUC in the first phase. During testing, we evaluated various degrees of FM and concluded that there is no need for a degree higher than two.
Secondary keywords: click probability prediction; deep learning; online learning; Bayesian optimization; computer and information science; university study programme; diploma theses; Computer science; University and higher-education works
Type (COBISS): Bachelor thesis/paper
Study programme: 1000468
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: 42
ID: 13345769