Uporaba globokega učenja za pretvorbo besedila v govor

undergraduate thesis

Luka Končar (Author), Zoran Bosnić (Mentor)

Abstract

Text-to-speech conversion is useful in a variety of fields. With deep learning, any person's voice can be used for the conversion, provided we have a few minutes of recordings of their speech. Converting recordings into a dataset for model training is time-consuming, so we built software that simplifies this process. We then built models using an implementation of Tacotron and two vocoders: Griffin-Lim and WaveRNN. Finally, we compared the two vocoders and found that Griffin-Lim synthesizes speech much faster than WaveRNN, but the quality of its speech is considerably worse.

Keywords

text-to-speech;university studies;undergraduate theses;

Data

Language:	Slovenian
Year of publishing:	2022
Typology:	2.11 - Undergraduate Thesis
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[L. Končar]
UDC:	004.8(043.2)
COBISS:	102623747

Other data

Secondary language:	English
Secondary title:	Deep learning for text-to-speech
Secondary abstract:	Text-to-speech (TTS) is useful in a variety of areas. With deep learning, we can use any person's voice for TTS, provided we have a few minutes of recordings of their speech. Converting the recordings into a dataset suitable for model training is time-consuming, so we created software that makes this process easier. We then created models using Tacotron and two vocoders: Griffin-Lim and WaveRNN. Finally, we compared the two vocoders and found that Griffin-Lim synthesizes speech much faster than WaveRNN, but its speech quality is significantly worse.
Secondary keywords:	deep learning;text-to-speech;computer and information science;diploma;Deep learning (machine learning);Computer science;University and higher-education theses;
Type (COBISS):	Bachelor thesis/paper
Study programme:	1000468
Embargo end date (OpenAIRE):	1970-01-01
Thesis comment:	Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages:	36 pp.
ID:	14808613