Secondary abstract: |
Segmentation splits audio recordings into meaningful parts. Longer recordings can be split into individual songs, interviews, comments and the like. Songs can be split into structural elements such as chorus and verse, which can be further split into motives and stanzas. Segmentation is usually the first step towards a semantic description of recordings, since it lets us treat individual parts separately and simplifies procedures such as discovering melody, harmony and rhythm, and finding similarities between parts. Segmentation algorithms are based on finding similar features, e.g. timbre, harmony or rhythm, in short parts of the analyzed audio signal. Current approaches mostly deal with popular music, which is more commercially interesting. Lately, music segmentation research has been extending to other genres as well.
In this thesis we provide an overview of research in music segmentation. We point out the problems of current approaches, e.g. lower segmentation performance on folk music caused by low-quality sound recordings, inaccurate singing by amateur singers, distractions during recording or poor performances. We present a novel approach for two-level segmentation of field recordings of folk music. The method uses audio recordings only and does not rely on prior knowledge provided by an expert. On the higher level, the method splits field recordings into individual units of the same type (speech, solo singing, choral singing …). On the lower level, the method further splits the units containing solo and choral singing into individual repeating parts. The proposed algorithm combines an existing approach for high-level segmentation of field recordings with a newly developed low-level segmentation method. The low-level segmentation method is also used to improve the high-level segmentation.
As part of this thesis we have tested and evaluated our method on a collection of field recordings of Slovenian folk music. |