seorim0/SE_Tutorials

Speech Enhancement Tutorials

This repo contains Speech Enhancement (SE) tutorials, with a focus on the time-frequency domain. You can use it to experiment with various speech enhancement techniques.

Update:

  • 2024.05.15 Uploaded code

Coming soon:

  • Upload baseline code
  • Upload performance rank table
  • Update performance rank table
  • Add some explanations
  • Add some analysis tools
  • Add current DNN-based SE models
  • Update references

Requirements

This repo is tested with Ubuntu 22.04, PyTorch 2.0.1, Python 3.10, and CUDA 11.7. Install the package dependencies with:

pip install -r requirements.txt    
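To compare a local environment with the tested versions listed above, a small check like the following may help. It is only a sketch and not part of the repo: the helper names `installed_version` and `environment_report` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. It reads package metadata instead of importing PyTorch, so it also runs where PyTorch is missing.

```python
import sys
from importlib import metadata

# Versions the README says the repo was tested with.
TESTED_PYTHON = (3, 10)
TESTED_TORCH = "2.0.1"


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Compare the running environment with the tested versions."""
    torch_version = installed_version("torch")
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_matches": sys.version_info[:2] == TESTED_PYTHON,
        "torch": torch_version,
        # Local build suffixes such as "+cu117" are ignored in the comparison.
        "torch_matches": torch_version is not None
        and torch_version.split("+")[0] == TESTED_TORCH,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A mismatch does not necessarily mean the code will fail; it only means the combination differs from the one the repo was tested with.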

Getting started

  1. Install the necessary libraries.
  2. Download the VoiceBank+DEMAND database, or prepare your own database, and place it in the '../Dataset/' folder, laid out as follows:
├── 📦 SE_Tutorials
│   ├── 📂 models
│   │   ├── 📂 ref
│   │   │   └── ...
│   │   ├── ED_FNN.py
│   │   └── ED_CNN.py
│   ├── options.py
│   ├── train_interface.py
│   └── ...
└── 📦 Dataset
    └── 📂 VBD (or ...)
        ├── 📂 train
        │   ├── clean
        │   └── noisy
        └── 📂 test
            ├── clean
            └── noisy
  3. Run train_interface.py.
  • To adjust any parameter settings, edit options.py.
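Before training, it can be useful to confirm that the dataset folder matches the layout above. The sketch below is not part of the repo. The function name `list_pairs` is hypothetical, and it assumes that each noisy file and its clean reference share the same file name and use the .wav extension, which this README does not state.

```python
from pathlib import Path


def list_pairs(dataset_root, split):
    """Return sorted (noisy, clean) path pairs for one split.

    Assumption (not confirmed by the repo): a noisy file and its clean
    reference share the same file name and use the .wav extension.
    """
    split_dir = Path(dataset_root) / split
    clean_dir, noisy_dir = split_dir / "clean", split_dir / "noisy"
    for folder in (clean_dir, noisy_dir):
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")

    clean = {p.name: p for p in clean_dir.glob("*.wav")}
    noisy = {p.name: p for p in noisy_dir.glob("*.wav")}
    # File names present on only one side have no partner to train against.
    unmatched = sorted(set(clean) ^ set(noisy))
    if unmatched:
        raise ValueError(f"{split}: files without a partner: {unmatched[:5]}")
    return [(noisy[name], clean[name]) for name in sorted(clean)]


if __name__ == "__main__":
    root = Path("../Dataset/VBD")
    for split in ("train", "test"):
        print(split, len(list_pairs(root, split)), "pairs")
```

Run from inside the SE_Tutorials folder, this prints the number of pairs per split, or raises an error that names the missing folder or the unmatched files.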

Quick start

We have prepared a .ipynb notebook that you can run directly.

Baseline model architecture

Techniques

The following techniques are available in this repo:

  • noisy database generation
  • normalization
  • compression
  • domain
  • joint loss function
  • perceptual loss function
  • adversarial training
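To make two of the items above concrete, here is a minimal NumPy sketch of noisy database generation (mixing clean speech with noise at a target SNR) and of power-law magnitude compression. It illustrates the general techniques and is not the repo's implementation: the function names `mix_at_snr` and `compress_magnitude` are hypothetical, and the default exponent of 0.3 is an assumed value, not one taken from this repo.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat or trim the noise so it covers the whole utterance.
    noise = np.resize(noise, clean.shape)
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against silent noise
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def compress_magnitude(spectrum, exponent=0.3):
    """Power-law compress the magnitude of a complex spectrum, keeping the phase."""
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    return (magnitude ** exponent) * np.exp(1j * phase)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s tone at 16 kHz
    noise = rng.standard_normal(8000)  # shorter than the utterance, so it is repeated
    noisy = mix_at_snr(clean, noise, snr_db=5.0)
    spectrum = np.fft.rfft(noisy)
    print(np.abs(compress_magnitude(spectrum)).max())
```

Compression with an exponent below 1 reduces the dynamic range of the spectrum, so low-energy components contribute more to a loss computed on the compressed values.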

Performance ranks (using VoiceBank+DEMAND database)

The scores in this table are taken from the values reported in each model's paper.

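| Model | Params (M) | Causality | PESQ | CSIG | CBAK | COVL | STOI | SSNR | Year | Input | Code |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Noisy | - | - | 1.97 | 3.35 | 2.44 | 2.63 | 0.91 | 1.68 | - | - | - |
| SEGAN | 97.47 | | 2.16 | 3.48 | 2.94 | 2.80 | 0.92 | 7.73 | 2017 | Time | |
| MetricGAN | - | | 2.86 | 3.99 | 3.18 | 3.42 | - | - | 2019 | Magnitude | |
| PHASEN | 0.92 | | 2.99 | 4.21 | 3.55 | 3.62 | - | 10.08 | 2020 | Magnitude+Phase | |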

Reference

Contact

Please get in touch with us if you have any questions or suggestions.
E-mail: allmindfine@yonsei.ac.kr