undergraduate thesis
Amadej Tratnik (Author), Luka Čehovin (Mentor)

Abstract

Object detection in images is a topical subject in industry and research, as it enables automatic recognition of individual objects in an image, often faster and more accurately than the human eye. With the rise of deep neural networks, semantic segmentation has become an especially interesting field, as it allows information to be extracted down to the level of individual pixels. In this thesis we addressed the problem of recognizing a person in a video and replacing the background with arbitrary content. We designed a suitably accurate and diverse dataset of people and their binary masks, implemented and trained two convolutional neural networks for segmentation, Fast-SCNN and UNet, compared them, and analyzed the results. We further optimized the Fast-SCNN architecture with ONNX Runtime, a tool intended for production use, enabling it to run on a CPU in real time. With a suitably annotated training set and the optimized version of the Fast-SCNN network, we achieved an average of 27 frames per second when recognizing a person in video and 29 frames per second when recognizing a person in real time via a webcam.
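As an illustration of the background-replacement step the abstract describes, the following is a minimal sketch, not the thesis implementation. It assumes a segmentation network has already produced a binary person mask for a frame; the function name `replace_background` and the array layout (height, width, channels) are assumptions made for this example.

```python
import numpy as np


def replace_background(frame, mask, background):
    """Keep person pixels from `frame`, take all other pixels from `background`.

    frame, background: uint8 arrays of shape (H, W, 3)
    mask: array of shape (H, W), nonzero where a person was segmented
    (hypothetical helper for illustration only)
    """
    if frame.shape != background.shape:
        raise ValueError("frame and background must have the same shape")
    if mask.shape != frame.shape[:2]:
        raise ValueError("mask must match the frame's height and width")

    # Add a trailing axis so the (H, W, 1) mask broadcasts over the colour channels.
    person = mask.astype(bool)[..., None]
    return np.where(person, frame, background)
```

In a video or webcam loop, a function like this would be called once per frame with the mask predicted for that frame, so its cost adds to the per-frame inference time that determines the achievable frame rate.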

Keywords

neural networks;deep neural networks;object detection;university studies;diploma theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [A. Tratnik]
UDC: 004.93:004.8(043.2)
COBISS: 104184067

Other data

Secondary language: English
Secondary title: Virtual backgrounds as semantic segmentation using deep neural networks
Secondary abstract: Object detection is a current topic in industry and research. It enables automatic identification of individual objects in an image, often faster and more accurately than the human eye. With the rise of deep neural networks, semantic segmentation is particularly interesting, as it allows information to be extracted from an image at the pixel level. As part of this bachelor's thesis, we addressed the problem of identifying a person in a video and replacing the background with any given content. We designed a diverse and suitably accurate dataset of people and their binary masks, and implemented and trained two convolutional neural networks for semantic segmentation, Fast-SCNN and UNet. We then compared the two networks and analyzed the results. The Fast-SCNN network was further optimized with ONNX Runtime to enable real-time execution on the CPU. With an appropriately annotated dataset and the optimized version of the Fast-SCNN neural network, we achieved an average of 27 FPS on videos and 29 FPS in real-time webcam segmentation.
Secondary keywords: semantic segmentation;deep learning;computer vision;computer science;diploma;deep learning (machine learning);university and higher-education theses;
Type (COBISS): Bachelor thesis/paper
Study programme: 1000468
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. of Ljubljana, Faculty of Computer and Information Science
Pages: 56 pp.
ID: 14889989