Transkripcija klavirske glasbe s konvolucijskimi nevronskimi mrežami

Master's thesis

Miha Pešič (Author), Matija Marolt (Mentor)

Abstract

In this master's thesis we address the problem of automatic transcription of piano music. Using machine learning methods, we aim to automatically detect the piano notes played in an audio recording. Following the latest research in the field, we implemented a solution based on convolutional neural networks. In addition to training on annotated collections of recordings, we developed a training data generator that prepares spectrograms and reference annotation matrices from MIDI files in real time while the neural network is being trained. For training, we collected a large number of MIDI files from various musical genres. We prepared a test set that contains 60 recordings from six additional genres alongside 10 recordings of classical music. We compared the results of models trained in different ways. In frame-wise evaluation, training with the generator achieves a slightly lower F-measure than training on real music recordings. In note-wise evaluation without offsets, training with the generator performs better, while in note-wise evaluation with offsets it performs considerably worse than training on real recordings.

Keywords

piano music;transcription;neural network;computer science;computer and information science;master's theses;

Data

Language:	Slovenian
Year of publishing:	2020
Typology:	2.09 - Master's Thesis
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[M. Pešič]
UDC:	004.85:780.8(043.2)
COBISS:	1538538435

Other data

Secondary language:	English
Secondary title:	Transcription of piano music with convolutional neural networks
Secondary abstract:	In this thesis we tackle the problem of automatic transcription of piano music. We aim to transcribe the piano notes played in an audio recording using machine learning techniques. We follow the latest developments in the field and implement a solution based on convolutional neural networks. In addition to training on annotated piano music datasets, we introduce a synthetic data generator that runs in real time during training and uses MIDI files to generate training spectrograms and ground-truth data. To train our models, we collected a large set of MIDI files covering various genres of music. We also prepared a test set which comprises 60 piano recordings of 6 different genres in addition to 10 recordings of classical music. We compare the results of models trained in different ways. In frame-wise evaluation, training on real piano recordings yields a slightly higher F-measure than training on synthetic data. In note-wise evaluation without offsets, training on synthetic data gives better results; in note-wise evaluation with offsets, however, training on real recordings is considerably better.
Secondary keywords:	piano music;transcription;neural network;computer science;computer and information science;master's degree;
Type (COBISS):	Master's thesis/paper
Study programme:	1000471
Thesis comment:	University of Ljubljana, Faculty of Computer and Information Science
Pages:	69 pp.
ID:	11416812