master's thesis
Andraž Puc (Author), Vitomir Štruc (Mentor)

Abstract

Generative adversarial networks (GANs) have in recent years become one of the most attractive deep models for generating semantic data, including in the field of biometrics. The concept of two modules, a generator and a discriminator, that oppose each other during training proves to be an extremely powerful way of designing a network architecture. In theory, training a GAN model results in a contest between the two modules. In practice, it turns out to be difficult to train such a system so that it is stable, but various methods and approaches make successful training of such models possible. With a well-trained model, the generative nature of GANs allows practically unlimited uses of the data created in this way. In recent years, researchers have been particularly interested in generating and manipulating images of human faces. The best results in this area are achieved by StyleGAN and its successor StyleGAN2, which were created specifically to generate hyper-realistic images of human faces with the option of further modifying the generated images. In this work we describe the development of a model for manipulating the hairstyle in a high-resolution image of a human face, using various image processing methods. Our approaches are fundamentally based on the disentangled latent space of the StyleGAN2 generator, in which an arbitrary real image can be represented by encoding it as a latent vector. To manipulate the latent projections of images directly, we use a conditional shift across hyperplanes, which we find by training a support vector machine (SVM) classifier on a labelled dataset. The model additionally enables direct transfer of a person's entire hairstyle from a reference image onto the input image, for which we use a variety of classical imaging methods in combination with modern processes made possible by the exceptional power of the StyleGAN2 generator.
We also highlight the problem of preserving facial identity in the processes used, and address it with the described post-correction procedures of final encoding and transfer of key facial characteristics. We present the results of our approach and additionally analyse the key elements of the model that affect the success of such manipulations. We also carry out a comparative study in which we compare our results with some of the current state-of-the-art models that likewise specifically enable hairstyle manipulation in an image of a human face.

Keywords

artificial intelligence;deep model;generative adversarial network;StyleGAN2;latent vector;image manipulation;hairstyle;master's theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UL FE - Faculty of Electrical Engineering
Publisher: [A. Puc]
UDC: 004(043.3)
COBISS: 95455747

Other data

Secondary language: English
Secondary title: Virtual hairstyles with generative neural models
Secondary abstract: Generative adversarial networks (GANs) have in recent years proven to be one of the most ground-breaking approaches among deep models for generating semantic data, including in the field of biometrics. The concept of two modules, a generator and a discriminator, that oppose each other during training is an incredibly powerful idea for designing a neural network architecture. In theory, it leads to an adversarial game between the two modules. In practice, training such systems in a stable manner is difficult; however, various methods and approaches enable such models to be trained successfully. The generative nature of a well-trained GAN allows practically unlimited uses of the data generated in this way. In recent years, researchers have shown particular interest in generating and manipulating images of human faces. The best results in this field are achieved by StyleGAN and its successor StyleGAN2, which were created specifically for generating hyper-realistic images of human faces, with the possibility of further manipulating the generated images. In this work we describe the development of our model for hairstyle manipulation on high-resolution images of human faces, in which we use various image processing methods. Our approaches are based on the disentangled latent space of the StyleGAN2 generator, in which we first reconstruct an arbitrary real image. To manipulate the latent image projections directly, we use a conditional manipulation approach based on hyperplanes, which we determine by training a support vector machine (SVM) classifier on a labelled dataset. The model additionally allows a complete hairstyle to be transferred from a reference image onto the input image.
We use various classical imaging methods in combination with modern approaches enabled mainly by the powerful StyleGAN2 generator. Additionally, we emphasize the importance of preserving facial identity when employing the described methods. This is addressed with specific post-correction processes of final encoding and the transfer of key facial characteristics. Lastly, we present our results and analyse the key elements of the model that affect the success of our manipulations. We perform a comparative study in which we compare our results with some current state-of-the-art models that also specifically allow hairstyle manipulation on an image of a human face.
Secondary keywords: artificial intelligence;deep model;generative adversarial network;StyleGAN2;latent vector;image manipulation;hairstyle;
Type (COBISS): Master's thesis/paper
Study programme: 1000316
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. of Ljubljana, Faculty of Electrical Engineering
Pages: XX, 98 pp.
ID: 14381849