bachelor's thesis
Abstract
Visual object tracking has recently shifted towards target segmentation, which has increased the demand for video datasets in which objects are segmented in every frame. However, manually annotating large segmented video datasets is time-consuming and costly. We address this problem by introducing the Semi-supervised Annotation by Tracking algorithm (SAT), which is specialized for segmenting targets in the visual object tracking domain with minimal user input. The annotation pipeline is split into two modules. The anchor frame segmentation module predicts a segmentation mask from a few (approximately four) user clicks on the object of interest. It is used to segment the target in a subset of frames, called anchors, spread throughout the sequence. A mask propagation module then propagates the segmentation masks from the anchors to the in-between frames. On the VOT dataset, SAT achieves an IoU of 73% with only 5% of frames annotated by the user, outperforms the winner of the DAVIS2020 challenge, IVOS, and the winner of the DAVIS2018 challenge, IVS, by 40% and 67%, respectively, and shortens the annotation time by 98%. On the DAVIS interactive challenges, SAT performs comparably to the state of the art in video object segmentation.
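The abstract reports quality as IoU at a given share of user-annotated frames. As a rough illustration only, the sketch below computes the IoU of two binary masks and picks anchor frame indices for a chosen annotation fraction. The function names `mask_iou` and `pick_anchor_frames` are hypothetical, and the evenly spaced anchor selection is an assumption; the abstract does not state how SAT chooses its anchors.

```python
from typing import List, Sequence

# A binary mask as rows of 0/1 values, where 1 marks a target pixel.
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so IoU is defined as 1.0 here.
    return inter / union if union else 1.0


def pick_anchor_frames(num_frames: int, fraction: float) -> List[int]:
    """Anchor frame indices for a given annotation fraction.

    Assumption: anchors are evenly spaced and include the first and
    last frame. This is an illustrative choice, not SAT's documented rule.
    """
    if num_frames <= 0:
        return []
    count = max(1, round(num_frames * fraction))
    if count == 1:
        return [0]
    step = (num_frames - 1) / (count - 1)
    return sorted({round(i * step) for i in range(count)})
```

For a 100-frame sequence, `pick_anchor_frames(100, 0.05)` yields five anchors spanning frame 0 to frame 99. Averaging `mask_iou` over all frames of a sequence would give a per-sequence score comparable in spirit to the figure quoted above.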
Keywords
convolutional neural networks;video object segmentation;video object tracking;computer and information science;diploma;
Data
Language: |
English |
Year of publishing: |
2021 |
Typology: |
2.11 - Undergraduate Thesis |
Organization: |
UL FRI - Faculty of Computer and Information Science |
Publisher: |
[J. Pelhan] |
UDC: |
004.8(043.2) |
COBISS: |
63110659 |
Views: |
362 |
Downloads: |
99 |
Other data
Secondary language: |
Slovenian |
Secondary title: |
Delno-avtomatska metoda za segmentacijo objekta v videoposnetku |
Secondary abstract: |
In visual object tracking, rapid progress has recently established the practice of reporting the target location with segmentation masks, which has increased the demand for fully segmented video datasets. Manually annotating video datasets is time-consuming and expensive, and this thesis addresses precisely that problem. We present SAT, a method for semi-automatic segmentation of objects in video, specialized for efficient annotation of visual tracking sequences. We split the segmentation of a video into two modules. The first module efficiently segments objects in individual frames, since it needs only a few clicks on the object boundary to estimate the segmentation mask. The second module, which is based on the recently introduced D3S tracker, propagates the masks to the remaining frames of the video. On the VOT2020 dataset, SAT achieves an IoU of 73% with only 5% of frames annotated, which is a 40% improvement over IVOS, the winning method of the DAVIS2020 interactive challenge, and a 67% improvement over IVS, the winning method of the DAVIS2018 interactive challenge. SAT shortens manual video annotation time by 98%. On the DAVIS interactive challenge, SAT achieves results comparable to state-of-the-art methods in video segmentation. |
Secondary keywords: |
convolutional neural networks;video segmentation;object tracking;computer and information science;university study programme;bachelor's theses; |
Type (COBISS): |
Bachelor thesis/paper |
Study programme: |
1000468 |
Thesis comment: |
Univ. v Ljubljani, Fak. za računalništvo in informatiko |
Pages: |
69 pp. |
ID: |
12891441 |