Language: | Slovenian |
---|---|
Year of publication: | 2021 |
Typology: | 2.11 - Undergraduate thesis |
Organization: | UL FRI - Faculty of Computer and Information Science |
Publisher: | [T. Šabanov] |
UDC: | 004.8:81'322(043.2) |
COBISS: | 75236355 |
Views: | 326 |
Downloads: | 74 |
Rating: | 0 (0 votes) |
Metadata: |
Secondary language: | English |
---|---|
Secondary title: | Slovene speech synthesis using multi-speaker datasets |
Secondary abstract: | In this thesis, we addressed the problem of Slovene speech synthesis based on a relatively small dataset. We described older approaches to speech synthesis, such as articulatory and formant synthesis, and more modern approaches, such as unit selection and speech synthesis with deep neural networks. We created a dataset consisting of 30 hours of speech from four speakers for use in speech synthesis. We used the ForwardTacotron architecture to generate mel-spectrograms and the HiFi-GAN architecture to generate waveforms from these spectrograms. We created a basic model for male speech, which can be fine-tuned for new speakers. The best system we created, which simulates natural speech, achieved a good mean opinion score from listeners (4.07 on a scale of 1 to 5). |
Secondary keywords: | Slovene speech synthesis;deep neural networks;Tacotron model;computer and information science;diploma;Computational linguistics;Artificial intelligence;Computer science;University and higher-education theses |
Type of work (COBISS): | Undergraduate thesis |
Study programme: | 1000468 |
Embargo end (OpenAIRE): | 1970-01-01 |
Comment on the material: | University of Ljubljana, Faculty of Computer and Information Science |
Pages: | 40 pp. |
ID: | 13296241 |