Master's thesis
Abstract
Data fusion methods based on matrix factorization share a cold-start problem: at the beginning of a run there is too little data on which the algorithms could start learning.
In this master's thesis we focus on the DFMF method and adapt it so that the cold-start problem is addressed by transferring knowledge from elsewhere.
We implement several adaptations of the method and first test them on synthetic data, where, under cross-validation, most of the adaptations achieve higher AUC values than the basic version.
We then apply the adapted methods to the real problem of identifying the bacterial hosts of viruses, where a set of laboratory-confirmed interactions is available and serves as the basis for suggesting potential new ones.
Knowledge transfer is performed with a convolutional neural network for predicting the taxonomic classification of organisms, which we adapt so that the vectors from its last layer can be used to initialize the factor matrices in the DFMF method.
Under cross-validation, two of the adapted versions achieve approximately the same accuracy as the basic DFMF method, while the others perform worse.
Finally, we present some potential new interactions between bacteriophages and bacteria, predicted with the basic method and with the adapted version that gives the best results.
Keywords
predictive models;model fusion;matrix tri-factorization;transfer learning;neural networks;bioinformatics;bacteriophages;bacteria;computer science;computer and information science;master's theses;
Data
Language: |
Slovenian |
Year of publishing: |
2020 |
Typology: |
2.09 - Master's Thesis |
Organization: |
UL FRI - Faculty of Computer and Information Science |
Publisher: |
[U. Bajc] |
UDC: |
004.8:578.81(043.2) |
COBISS: |
39995651 |
Views: |
787 |
Downloads: |
135 |
Other data
Secondary language: |
English |
Secondary title: |
Inferring viral bacterial hosts by fusing predictive models |
Secondary abstract: |
Data fusion methods based on matrix factorization share a cold start problem: initially there is too little data for the algorithms to begin learning.
In this master's thesis we focus on the DFMF method and adapt it so that the cold start problem is addressed by transfer learning.
We implement several adaptations of the method and evaluate them by cross-validation on artificially created data, where most of the adaptations reach higher AUC values than the basic version.
We then apply the adapted methods to the real problem of determining the bacterial hosts of viruses, where a set of laboratory-confirmed interactions is available, on the basis of which we wish to suggest potential new ones.
Transfer learning is achieved with a convolutional neural network for predicting the taxonomic classification of organisms, which we adapt so that the vectors from its last layer can be used to initialize the factor matrices in the DFMF method.
Cross-validation shows that two of the adapted versions reach approximately the same accuracy as the basic DFMF method, whereas the others perform worse.
Finally, we present some potential new interactions between bacteriophages and bacteria, predicted with the basic method and with the adapted version that gives the best results. |
Secondary keywords: |
predictive models;model fusion;matrix tri-factorization;transfer learning;neural networks;bioinformatics;bacteriophages;bacteria;computer science;computer and information science;master's degree; |
Type (COBISS): |
Master's thesis/paper |
Study programme: |
1000471 |
Embargo end date (OpenAIRE): |
1970-01-01 |
Thesis comment: |
Univ. v Ljubljani, Fak. za računalništvo in informatiko |
Pages: |
101 pp. |
ID: |
12168676 |