Samodejno učeči se sistem za sintezo človeškega govora

Doctoral dissertation

Damjan Šonc (Author), Dušan Kodek (Mentor)

Keywords

speech synthesis; speech composition; unit selection synthesis; automatic speech segmentation; trainable system; computer science; dissertations

Details

Language:	Slovenian
Year of publication:	2011
Typology:	2.08 - Doctoral dissertation
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[D. Šonc]
UDC:	004.934.5(043.3)
COBISS:	256431616

Other details

Secondary language:	English
Secondary title:	Trainable speech synthesis system
Secondary abstract:	The ultimate goal of speech synthesis is to build a system that can convert arbitrary written messages into intelligible, natural-sounding speech. Such a system should also run on everyday hardware platforms such as a personal computer. The solutions that have appeared over the last five decades can be divided into three generations. Unfortunately, even the latest third-generation systems are far from generating perfectly natural-sounding speech. Currently, the best synthetic speech quality is obtained from unit selection synthesis systems. Building an adequate database of speech units for such systems requires a lot of work from trained engineers. The main objective of this Ph.D. thesis was to develop a system that can learn to produce high-quality synthetic speech from text and the corresponding speech samples only, without requiring skilled human labor or trained ASR (Automatic Speech Recognition) systems. Instead, the system should use statistical machine learning techniques and automatic speech segmentation algorithms that do not require ASR. For the purposes of the thesis, a prototype speech synthesis system named Learn to Speak by Yourself (LSY) was constructed. LSY is a unit selection synthesis system. Its core is a newly developed algorithm for automatic speech segmentation that does not require an ASR system. The algorithm exploits the spectral differences between the phonemes (allophones) of a language. This approach is particularly useful for Slovene and other languages with a relatively small number of speakers, for which it is harder to find skilled engineers or well-trained ASR systems for speech database construction. The system can start from scratch, i.e. no speech unit database is required; the database is built automatically during the learning process.
To generate speech samples, LSY uses a sinusoidal generator. Statistical results from listening tests show that synthetic speech produced by the generator in a synthesis-by-analysis process cannot be distinguished from natural human speech. We may conclude that, in theory, LSY can produce perfectly natural-sounding synthetic speech. At present, the speech produced by the prototype version of LSY is highly intelligible but not yet natural-sounding. The main reason is that only a few minutes of speech samples were fed to the prototype, whereas the literature recommends at least one hour of speech samples, and systems with five hours or more are not uncommon. Future work will concentrate on methods for the automatic extraction of prosody parameters from the speech samples. We would also like to improve the algorithm for automatic speech segmentation.
Secondary keywords:	speech synthesis; unit selection synthesis; automatic speech segmentation; trainable speech synthesis; computer science; doctoral dissertations; theses; speech; dissertations; synthesis
File type:	application/pdf
Type of work (COBISS):	Doctoral dissertation
Comment on the material:	Univ. of Ljubljana, Faculty of Computer and Information Science
Pages:	1 optical disc (CD-ROM)
ID:	23936564

Recommended works:

Samodejno učeči se sistem za sintezo človeškega govora

2011, doctoral dissertation

Določanje višine govora in prozodičnih značilnosti pri avtomatski sintezi govora

1999, undergraduate thesis (university study programme)

Parallel computation in the Stan probabilistic programming language

2022, no subtitle information

Razvoj glasovno podprtih sistemov v programskem okolju Smalltalk

2003, undergraduate thesis (university study programme)

Argument based machine learning

2009, doctoral dissertation