Master's thesis
Tomaž Borštnik (Author), Tomaž Curk (Mentor)

Abstract

Interactions between proteins and RNA play an important role in regulating gene expression and, consequently, in the functioning of cells. Errors in these interactions are often associated with the onset of diseases such as neuropathies and cancer. Knowing the interaction sites is therefore essential for understanding, discovering and regulating gene expression and for treating these diseases. This master's thesis focuses on modeling the sites of protein-RNA interaction based on simulated data from the RBDmap method, which is a continuation of the study by Castello and colleagues published in 2012. We simulated the RBDmap data using the PDB database, which stores 3D structures of protein-RNA complexes. To predict individual amino acids or shorter sequences within fragments, we tested a range of machine learning methods, such as support vector machines, classification trees, the naive Bayes classifier and k-nearest neighbors. We also developed a method that identifies amino acids interacting with RNA based on the properties of amino acid fragments and of the entire protein. The performance of the method is comparable to that of currently existing methods (AUC 0.783). Contrary to expectations, describing fragments did not, in general, help improve the predictive models.

Keywords

model building;imbalanced data;protein-RNA;PDB;computer science;computer and information science;master's theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [T. Borštnik]
UDC: 004.85:577(043.2)
COBISS: 1536579779
Views: 1421
Downloads: 492

Other data

Secondary language: English
Secondary title: Prediction of amino acids interacting with RNA
Secondary abstract: Interactions between proteins and RNA play an important role in the regulation of gene expression and therefore in the functioning of cells. Errors in these interactions are often related to the development of diseases such as neuropathies and cancer. Knowing the locations of interactions is therefore crucial for understanding, discovering and regulating gene expression and for treating those diseases. The master's thesis focuses on modeling the amino acids interacting with RNA based on simulated data from the RBDmap method, which is a continuation of the study by Castello et al. from 2012. The RBDmap data were simulated using the PDB database of 3D structures of ribonucleoprotein complexes. A number of machine learning methods, such as support vector machines, classification trees, the naive Bayes classifier and k-nearest neighbours, were evaluated for predicting individual amino acids and fragments of amino acids interacting with RNA. Moreover, a method was developed to determine amino acids interacting with RNA, which considers the characteristics of fragments of amino acids and of the entire protein. The method achieved an AUC of 0.783, which is comparable with current methods. Including features describing fragments did not improve the predictive model.
Secondary keywords: building models;imbalanced data;protein-RNA;PDB;computer science;computer and information science;master's degree;
File type: application/pdf
Type (COBISS): Master's thesis/paper
Study programme: 1000471
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: University of Ljubljana, Faculty of Computer and Information Science
Pages: 56 pp.
ID: 9057533