Sparse matrix multiplication on CUDA (Množenje redkih matrik na arhitekturi CUDA)

Diploma thesis

Klemen Pravdič (Author), Tomaž Dobravec (Mentor)

Abstract

Sparse matrix multiplication on CUDA

Keywords

sparse matrix multiplication; sparse matrix; GPU processing; CUDA; computer science; university study programme; diploma theses

Details

Language:	Slovenian
Year of publication:	2013
Typology:	2.11 - Diploma thesis
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[K. Pravdič]
UDC:	512.643.122(043.2)
COBISS:	9985364
Views:	54
Downloads:	5

Other data

Secondary language:	English
Secondary title:	Sparse matrix multiplication on CUDA
Secondary abstract:	Sparse matrix multiplication is a common operation in linear algebra and an important building block of other algorithms. A sparse matrix is a matrix populated primarily with zeros. This thesis presents two algorithms for sparse matrix multiplication: the row-column algorithm and the row-row (also known as row-wise) algorithm. For both algorithms it describes a sequential implementation on the CPU and a parallel implementation on the GPU. The algorithms were implemented in the C programming language; the parallel implementations target a GPU with the CUDA architecture. We describe the sparse matrix storage formats (CSR, CSC, and COO) used in the implementations, and we describe the CUDA architecture to the extent needed to understand the parallel implementations. Timings of all implementations were measured and compared with each other. For testing we used sparse matrices from the Matrix Market repository along with sparse matrices of various densities and dimensions that we generated ourselves. On the GPU we stored the product both as a sparse and as a dense matrix. We found that the row-row algorithm is faster than the row-column algorithm and that, under certain conditions, the parallel implementation of the row-row algorithm outperforms the sequential one. The performance of the parallel row-row algorithm depends on the density and dimensions of the input matrices; for it to be efficient, input matrices with smaller dimensions should be denser. The row-row algorithm on CUDA performs better when groups of implicitly synchronized threads (warps) are used.
Secondary keywords:	sparse matrix multiplication; sparse matrix; GPU processing; CUDA; computer science; diploma
File type:	application/pdf
Type of work (COBISS):	Diploma thesis
Comment on the material:	University of Ljubljana, Faculty of Computer and Information Science
Pages:	51 pp.
ID:	24168169
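
The secondary abstract names the CSR storage format and the row-row (row-wise) algorithm without showing either. As an illustration only, the following is a minimal sequential C sketch of that idea, assuming both inputs are in CSR and the product is stored as a dense row-major array (one of the two product layouts the abstract mentions). It is not taken from the thesis; the type name csr_matrix and the function name spmm_row_row_dense are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* CSR (compressed sparse row) storage: row i owns the entries
 * col_idx[row_ptr[i] .. row_ptr[i + 1] - 1] and the matching vals.
 * The type name is hypothetical, chosen for this sketch. */
typedef struct {
    int rows, cols;
    const int *row_ptr;   /* length rows + 1 */
    const int *col_idx;   /* length nnz, column of each stored entry */
    const double *vals;   /* length nnz, value of each stored entry */
} csr_matrix;

/* Row-row product C = A * B, with C stored dense in row-major order.
 * For every stored entry A[i][k], row k of B is scaled by it and
 * accumulated into row i of C, so only nonzeros are ever touched.
 * C must have room for A->rows * B->cols doubles. */
void spmm_row_row_dense(const csr_matrix *A, const csr_matrix *B, double *C)
{
    assert(A->cols == B->rows);

    size_t total = (size_t)A->rows * (size_t)B->cols;
    for (size_t n = 0; n < total; ++n)
        C[n] = 0.0;

    for (int i = 0; i < A->rows; ++i) {
        for (int p = A->row_ptr[i]; p < A->row_ptr[i + 1]; ++p) {
            int k = A->col_idx[p];        /* column of A = row of B */
            double a_ik = A->vals[p];
            for (int q = B->row_ptr[k]; q < B->row_ptr[k + 1]; ++q)
                C[(size_t)i * B->cols + B->col_idx[q]] += a_ik * B->vals[q];
        }
    }
}
```

In this sketch each output row depends only on one row of A, so the outer loop over i is the natural unit to distribute across GPU threads or warps; how the thesis actually maps the work onto CUDA is not stated in this record.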

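Recommended works:

Sparse matrix multiplication on CUDA (Množenje redkih matrik na arhitekturi CUDA)

2013, diploma thesis

Quicksort implementations for CPU and GPU (Izvedbe hitrega urejanja za CPE in GPE)

2013, diploma thesis

Parallel evolutionary algorithm for knowledge discovery from a gene regulatory network model (Paralelni evolucijski algoritem za odkrivanje znanja iz modela genskega regulatornega omrežja)

2013, diploma thesis

Comparison of graphics processing units and central processing units (Primerjava grafičnih procesnih enot in centralnih procesnih enot)

2009, diploma thesis

Text-to-speech synthesis system (Sistem za sintezo govora iz besedila)

2013, diploma thesis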