The SeFiRe field recording dataset

The repository contains the SeFiRe dataset, a collection of annotated field recording samples that can be used for training audio labelling algorithms. The dataset consists of 5-second clips extracted from a variety of field recordings. Each clip carries a single label describing its contents:

s: speech
1: solo singing (one voice)
2: choir singing (more than one voice, can also be unison)
i: instrumental (vocals may also be present)
x: (background) noise or a very noisy clip belonging to one of the above categories.
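
When working with the labels in code, the five codes above can be kept in a small lookup table. The following is a minimal Python sketch; the names `LABELS` and `describe_label` are illustrative and are not part of the dataset.

```python
# Label codes as listed above. The dictionary and function names are
# illustrative assumptions, not something shipped with the dataset.
LABELS = {
    "s": "speech",
    "1": "solo singing",
    "2": "choir singing",
    "i": "instrumental",
    "x": "noise",
}


def describe_label(code: str) -> str:
    """Return the description for a label code, or raise ValueError."""
    key = code.strip()  # tolerate stray whitespace or a trailing newline
    if key not in LABELS:
        raise ValueError(f"unknown label code: {code!r}")
    return LABELS[key]
```

Note that the codes 1 and 2 are kept as strings here, so they should not be converted to integers when the annotations are read.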

You will find two folders in the repository. The samples folder contains the five-second audio clips; their annotations are given in sample labels.txt, which is a tab-delimited text file. The external samples folder contains annotations of five-second clips extracted from a number of online sources. We can't include these clips in the dataset, but external sources.txt provides links to the audio files, together with the location of each extracted clip and its label.
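
Because sample labels.txt is tab-delimited, it can be read with Python's standard `csv` module. The sketch below assumes that each row holds the clip's file name in the first column and its label in the second, with no header row. The column layout is not documented here, so inspect the file and adjust `name_col` and `label_col` if needed. The function names are illustrative.

```python
import csv
from collections import Counter


def read_labels(lines, name_col=0, label_col=1):
    """Parse tab-delimited rows into a {clip name: label} dictionary.

    `lines` is any iterable of text lines, for example an open file.
    The default column positions are assumptions; adjust them to the file.
    """
    labels = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) <= max(name_col, label_col):
            continue  # skip blank or incomplete rows
        labels[row[name_col].strip()] = row[label_col].strip()
    return labels


def label_counts(labels):
    """Count how many clips carry each label."""
    return Counter(labels.values())


# Possible usage (the path and encoding are assumptions):
# with open("samples/sample labels.txt", newline="", encoding="utf-8") as f:
#     print(label_counts(read_labels(f)))
```

Counting the labels first is a quick way to see how balanced the classes are before training a classifier on the clips.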

Feel free to use the dataset for your research. If you do, please cite the following paper:

M. Marolt, C. Bohak, A. Kavčič, and M. Pesek, Automatic segmentation of ethnomusicological field recordings, Applied Sciences, vol. 9, iss. 3, pp. 1-12, 2019, https://doi.org/10.3390/app9030439
