Language: | Slovenian |
---|---|
Year of publishing: | 2022 |
Typology: | 2.11 - Undergraduate Thesis |
Organization: | UL FRI - Faculty of Computer and Information Science |
Publisher: | [L. Končar] |
UDC: | 004.8(043.2) |
COBISS: | 102623747 |
Views: | 104 |
Downloads: | 51 |
Secondary language: | English |
---|---|
Secondary title: | Deep learning for text-to-speech |
Secondary abstract: | Text-to-speech (TTS) is useful in a variety of areas. With deep learning, we can use any person's voice for TTS, provided we have a few minutes of recordings of their speech. Converting the recordings into a dataset suitable for model training is time-consuming, so we created software that makes this process easier. We then created models using Tacotron and two vocoders: Griffin-Lim and WaveRNN. Finally, we compared the two vocoders and found that Griffin-Lim synthesizes speech much faster than WaveRNN, but the quality of its speech is significantly worse. |
Secondary keywords: | deep learning;text-to-speech;computer and information science;diploma;Deep learning (machine learning);Computer science;University and higher education theses |
Type (COBISS): | Bachelor thesis/paper |
Study programme: | 1000468 |
Embargo end date (OpenAIRE): | 1970-01-01 |
Thesis comment: | University of Ljubljana, Faculty of Computer and Information Science |
Pages: | 36 pp. |
ID: | 14808613 |