
Database of annotated field recording samples that can be used for training audio labelling algorithms


matijama/field-recording-db

The SeFiRe field recording dataset

This repository contains the SeFiRe dataset: annotated field recording samples that can be used to train audio labelling algorithms. The dataset consists of 5-second clips extracted from a variety of field recordings. Each clip carries a single label describing its contents:

  • s: speech
  • 1: solo singing (one voice)
  • 2: choir singing (more than one voice, can also be unison)
  • i: instrumental (vocals may also be present)
  • x: (background) noise or a very noisy clip belonging to one of the above categories.
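For use in code, the label codes above can be collected into a lookup table. The following is a minimal sketch; the names `SEFIRE_LABELS` and `describe_label` are hypothetical and not part of the dataset.

```python
# Label codes as listed above; the dictionary and function names are
# illustrative assumptions, not something shipped with the dataset.
SEFIRE_LABELS = {
    "s": "speech",
    "1": "solo singing",
    "2": "choir singing",
    "i": "instrumental",
    "x": "noise",
}


def describe_label(code: str) -> str:
    """Return the class name for a label code, or raise ValueError."""
    key = code.strip().lower()
    if key not in SEFIRE_LABELS:
        raise ValueError(f"unknown label code: {code!r}")
    return SEFIRE_LABELS[key]
```

Rejecting unknown codes explicitly, rather than returning a default, makes annotation typos visible early when preparing training data.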

The repository contains two folders. Samples contains the five-second audio samples; their annotations are given in sample labels.txt, a tab-delimited text file. External samples contains annotations of five-second clips extracted from a number of online sources. We can't include these clips in the dataset, but external sources.txt provides links to the audio files, together with the location of each extracted clip and its label.
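A tab-delimited annotation file of this kind could be read along the following lines. This is only a sketch: the column layout is not documented here, so the code assumes the clip name is in the first column and the label in the last. The function name `read_labels` and the example rows are made up for illustration.

```python
import csv
import io
from collections import Counter
from typing import Iterable, List, Tuple


def read_labels(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse tab-delimited rows into (clip_name, label) pairs.

    Assumption: the clip name is the first column and the label is the
    last one. Check this against the real file before relying on it.
    """
    pairs = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        pairs.append((row[0].strip(), row[-1].strip()))
    return pairs


# Made-up rows for illustration only; they are not taken from the dataset.
example = io.StringIO("clip_a.wav\ts\nclip_b.wav\t2\n\nclip_c.wav\ts\n")
pairs = read_labels(example)
counts = Counter(label for _, label in pairs)  # clips per label code
```

For the real annotations, an open file handle can be passed in place of the `StringIO` object, for example one obtained with `open(path, newline="", encoding="utf-8")`, where `path` points to the annotation file in your checkout.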

Feel free to use the dataset for your research. If you use it, please cite the following paper:
