Master's thesis
Nejc Pušnik (Author), Aleš Holobar (Mentor)

Abstract

In this master's thesis, we built a voice lock on the Raspberry Pi platform. We wrote a Java program that captures an audio signal through a microphone and extracts Mel-frequency cepstral coefficients (MFCCs) from it. We then compared and classified recordings of 49 Slovenian words spoken by seven different people, using the distance computed by the dynamic time warping (DTW) algorithm. We analysed the influence of the number of Mel-frequency cepstral coefficients, the length of the spoken word, the number of vowels in the spoken word, the speakers' gender, and noise. For recordings with a signal-to-noise ratio of 25 dB, the lowest recognition error obtained was 9.45%; for recordings with a signal-to-noise ratio of 15 dB, it was approximately 26.97%.
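
The comparison step described in the abstract can be illustrated with a minimal Java sketch. It is not the thesis code. It assumes each recording has already been converted to a sequence of MFCC feature vectors, uses Euclidean distance between frames, and assigns a query to the template with the smallest DTW distance. The class name `DtwSketch`, its method names, and the nearest-template decision rule are all hypothetical choices made for illustration.

```java
import java.util.Arrays;
import java.util.Map;

// Illustrative sketch only; names and design choices are assumptions,
// not taken from the thesis.
public class DtwSketch {

    // Euclidean distance between two feature vectors of equal length
    // (e.g. the MFCCs of one frame from each recording).
    static double frameDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Textbook DTW: accumulated cost of the cheapest monotonic alignment
    // between sequences x and y (no window constraint, no normalisation).
    static double dtwDistance(double[][] x, double[][] y) {
        int n = x.length;
        int m = y.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double local = frameDistance(x[i - 1], y[j - 1]);
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = local + best;
            }
        }
        return cost[n][m];
    }

    // Returns the label of the template closest to the query under DTW,
    // or null if no templates are given.
    static String classify(double[][] query, Map<String, double[][]> templates) {
        String bestLabel = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[][]> entry : templates.entrySet()) {
            double d = dtwDistance(query, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                bestLabel = entry.getKey();
            }
        }
        return bestLabel;
    }
}
```

Because DTW lets one frame align with several frames of the other sequence, two utterances of the same word spoken at different speeds can still yield a small distance, which is why it suits comparing spoken words of varying duration.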

Keywords

voice locks; dynamic time warping; Mel-frequency cepstral coefficients; master's theses

Data

Language: Slovenian
Year of publishing:
Typology: 2.09 - Master's Thesis
Organization: UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher: N. Pušnik
UDC: 004.357:004.934.8'1(043.2)
COBISS: 20969750
Views: 1083
Downloads: 119

Other data

Secondary language: English
Secondary title: Voice Lock on Raspberry Pi Platform
Secondary abstract: In the master's thesis, we created a voice lock on the Raspberry Pi platform. In the Java programming language, we created a program that uses a microphone to capture an acoustic signal and extracts the Mel-frequency cepstral coefficients from it. Then, on the basis of the distance calculated by the dynamic time warping algorithm, we compared and classified the recordings of the pronunciation of 49 Slovene words by seven different people. We analysed the influence of the number of Mel-frequency cepstral coefficients, the length of the spoken word, the number of vowels in the spoken word, the speakers' gender, and noise. For the recordings with a signal-to-noise ratio of 25 dB, the minimum recognition error was 9.45%, whereas for the recordings with a signal-to-noise ratio of 15 dB, the recognition error increased to 26.97%.
Secondary keywords: voice lock; Raspberry Pi; dynamic time warping; Mel-frequency cepstral coefficients
URN: URN:SI:UM:
Type (COBISS): Master's thesis/paper
Thesis comment: University of Maribor, Faculty of Electrical Engineering and Computer Science, Computer Science and Information Technologies
Pages: X, 71 pp.
ID: 10850326