diplomsko delo
Matic Pajnič (Author), Tomaž Curk (Mentor), Uroš Lotrič (Co-mentor)

Abstract

Cilj diplomske naloge je bil implementirati sekvenčni algoritem za iskanje krajših zaporedij v genomih, ga pohitriti s paralelizacijo ter implementirati paralelno verzijo na grafični kartici. Sekvenčni algoritem je moral poiskati krajša zaporedja znakov v danem zaporedju, ki predstavlja genom. Izračunati je moral frekvence pojavitev zaporedij in pogostost interakcij na podanih položajih ter na naključno premaknjenih položajih, za vsako krajše zaporedje posebej. Na podlagi pojavitev je nato moral določiti, katera krajša zaporedja so za določene položaje v genomu bolj značilna. Na podlagi podatkov o interakcijah med proteini in genomom na določenih položajih, ter na podlagi najdenih krajših zaporedij znakov, je moral nato izračunati in statistično ovrednotiti pogostost pojavitve interakcij. Sekvenčni algoritem smo implementirali v programskem jeziku C, paralelizacijo sekvenčnega algoritma pa smo izvedli na podlagi arhitekture OpenCL, ki omogoča implementacijo algoritmov na grafičnih karticah.

Keywords

paralelno računanje;GPE;sinhronizacija;niti;bioinformatika;proteini;DNA;RNA;geni;računalništvo;računalništvo in informatika;univerzitetni študij;diplomske naloge;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [M. Pajnič]
UDC: 004:575.111(043.2)
COBISS: 1536599491 Link will open in a new window
Views: 1309
Downloads: 403
Average score: 0 (0 votes)
Metadata: JSON JSON-RDF JSON-LD TURTLE N-TRIPLES XML RDFA MICRODATA DC-XML DC-RDF RDF

Other data

Secondary language: English
Secondary title: Implementation of parallel algorithm for k-mer enrichment analysis of genomic sequences
Secondary abstract: The goal of this thesis was to implement a sequential algorithm that would search for subsequences in a genome. To accelerate the execution time of this algorithm we designed a parallel version and implemented the parallel version on a graphics card. The sequential algorithm had to search for predefined subsequences in a genome that was represented as a sequence of characters. It had to calculate the frequencies of sequence occurrences and the frequencies of interactions on predefined positions and on randomly modified positions in the genome, for each subsequence. Based on these frequencies it had to identify sequences that were more frequent on certain locations in a given genome. Based on data about protein-RNA interactions on certain locations in the genome, and based on the found character sequences, the algorithm had to calculate and statistically evaluate the frequencies of interactions. The sequential algorithm was implemented in the C programming language, while the parallelization was implemented on the OpenCL architecture.
Secondary keywords: parallel computing;GPU;synchronization;threads;bioinformatics;proteins;DNA;RNA;genes;computer science;computer and information science;diploma;
File type: application/pdf
Type (COBISS): Bachelor thesis/paper
Study programme: 1000468
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: 55 str.
ID: 9043470