diplomsko delo
Luka Končar (Author), Zoran Bosnić (Mentor)

Abstract

Pretvorba besedila v govor je uporabna na različnih področjih. Z globokim učenjem lahko za glas take pretvorbe uporabimo poljubno osebo, če le imamo nekaj minut posnetkov njenega govora. Pretvorba posnetkov v nabor podatkov za učenje modelov je zamudno, zato smo izdelali programsko opremo, ki ta postopek olajša. Nato smo izdelali modele z uporabo implementacije Tacotrona in dveh vokoderjev: Griffin-Lim in WaveRNN. Na koncu smo izvedli primerjavo teh dveh vokoderjev in ugotovili, da je Griffin-Lim veliko hitrejši pri sintetiziranju govora kot WaveRNN, a je kvaliteta govora bistveno slabša.

Keywords

pretvorba besedila v govor;univerzitetni študij;diplomske naloge;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UL FRI - Faculty of Computer and Information Science
Publisher: [L. Končar]
UDC: 004.8(043.2)
COBISS: 102623747 Link will open in a new window
Views: 104
Downloads: 51
Average score: 0 (0 votes)
Metadata: JSON JSON-RDF JSON-LD TURTLE N-TRIPLES XML RDFA MICRODATA DC-XML DC-RDF RDF

Other data

Secondary language: English
Secondary title: Deep learning for text-to-speech
Secondary abstract: Text-to-speech (TTS) is useful in a variety of areas. With deep learning we can use any person's voice for TTS, if only we have a few minutes of recordings of their speech. Converting the recordings into a dataset useful for model training is time consuming, so we created software that makes this process easier. We then created models using Tacotron and two vocoders: Griffin-Lim and WaveRNN. In the end we performed a comparison of these two vocoders and found that Griffin-Lim is much faster at synthesizing speech than WaveRNN, but the quality of speech is significantly worse.
Secondary keywords: deep learning;text-to-speech;computer and information science;diploma;Globoko učenje (strojno učenje);Računalništvo;Univerzitetna in visokošolska dela;
Type (COBISS): Bachelor thesis/paper
Study programme: 1000468
Embargo end date (OpenAIRE): 1970-01-01
Thesis comment: Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages: 36 str.
ID: 14808613