Undergraduate thesis
Marko Rus (Author), Matej Kristan (Mentor), Domen Tabernik (Co-mentor)

Abstract

Convolutional neural networks achieve outstanding results in computer vision. The central operation of these networks is convolution with a kernel of small, fixed size. In practice, the standard approach to enlarging the receptive field is therefore pooling of adjacent pixels, which for many computer vision problems does not yield a satisfactory output resolution. This problem is addressed by so-called dilation, which spreads the units of the convolution kernel over a wider area and thus enlarges the receptive field. The dilation factor is set manually and does not change during training, which can be a problem, since in general we do not know its optimal value. A learnable receptive field size is offered by a recently proposed method in which the convolution kernel is composed of displaced aggregation units (DAU). Each kernel has its own set of parameters and its own receptive field size. In this thesis we address the question of whether the model's degrees of freedom can be reduced without loss of accuracy. We propose three ways of reducing the degrees of freedom by sharing displacements across the inputs and outputs. We implement the forward and backward pass for these three variants, embed them in convolutional neural network architectures of different sizes, and evaluate them on the problem of classifying images into 10 classes. All variants have more than 50% fewer parameters than the original DAU layer. The experimental results show that the model with output-independent displacements has substantially lower computational complexity than the original DAU layer, while classification accuracy drops by less than 2%.
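The DAU mechanism described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the thesis implementation: it models each kernel as a set of point units at continuous learnable displacements (omitting the Gaussian smoothing of the units), and it shows the "output-independent displacements" variant, in which all output channels reuse one displacement set per input channel. The function names `bilinear_shift` and `dau_forward` are hypothetical.

```python
import numpy as np

def bilinear_shift(x, dy, dx):
    """Shift a 2-D map by a continuous displacement (dy, dx) using
    bilinear interpolation, clamping at the borders."""
    H, W = x.shape
    ys = np.clip(np.arange(H) - dy, 0, H - 1)
    xs = np.clip(np.arange(W) - dx, 0, W - 1)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    return ((1 - wy) * (1 - wx) * x[np.ix_(y0, x0)]
            + (1 - wy) * wx       * x[np.ix_(y0, x1)]
            + wy       * (1 - wx) * x[np.ix_(y1, x0)]
            + wy       * wx       * x[np.ix_(y1, x1)])

def dau_forward(x, w, mu):
    """DAU-style forward pass with displacements shared across output
    channels (the 'output-independent' variant).
    x:  (C_in, H, W)   input feature maps
    w:  (C_out, C_in, K)  per-unit aggregation weights
    mu: (C_in, K, 2)   one (dy, dx) displacement per input channel and unit
    """
    C_out, C_in, K = w.shape
    # Each input channel is shifted only once per unit; because the
    # displacements do not depend on the output channel, all C_out
    # outputs reuse these pre-shifted maps -- the source of the
    # computational savings reported in the abstract.
    shifted = np.stack([[bilinear_shift(x[c], *mu[c, k]) for k in range(K)]
                        for c in range(C_in)])          # (C_in, K, H, W)
    return np.einsum('ock,ckhw->ohw', w, shifted)       # (C_out, H, W)
```

Because the displacement tensor `mu` has shape (C_in, K, 2) rather than (C_out, C_in, K, 2), this variant stores C_out times fewer displacement parameters than a layer with per-output displacements, matching the parameter-sharing idea evaluated in the thesis.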

Keywords

convolution;neural networks;parameter sharing;classification;machine learning;computer science;computer and information science;computer science and mathematics;interdisciplinary studies;university studies;undergraduate theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [M. Rus]
UDC: 004.8(043.2)
COBISS: 1538342851

Other data

Secondary language: English
Secondary title: DAU convolutional neural networks with reduced degrees of freedom
Secondary abstract: Convolutional neural networks have demonstrated excellent performance on computer vision tasks. The central operation of these networks is convolution with a small, fixed-size kernel. In practice, the standard approach to increasing the receptive field is therefore to pool adjacent pixels, which for many computer vision tasks does not yield a sufficient output resolution. The problem is addressed by so-called dilation, which spreads the units of the convolution kernel over a wider area, thereby increasing the receptive field. The dilation factor is set manually and does not change during training, which can be a problem, as we generally do not know its optimal value. A learnable receptive field size is provided by a recently proposed method in which the convolution kernel consists of displaced aggregation units (DAU). Each kernel has its own set of parameters and its own receptive field size. In this thesis we address the question of whether the model's degrees of freedom can be reduced without loss of accuracy. We propose three ways to reduce the degrees of freedom by sharing displacements across the inputs and outputs. We implement the forward and backward pass for these three variants, embed them in convolutional neural network architectures of different sizes, and evaluate them on the problem of classifying images into 10 classes. All variants have more than 50% fewer parameters than the original DAU layer. The experimental results show that the model with output-independent displacements has significantly lower computational complexity than the original DAU layer, while classification accuracy drops by less than 2%.
Secondary keywords: convolution;neural networks;parameter sharing;classification;machine learning;computer science;computer and information science;computer science and mathematics;interdisciplinary studies;diploma;
Type (COBISS): Bachelor thesis/paper
Study programme: 1000407
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: 33 pp.
ID: 11220290