Learnable Window #108

Open
gunikavashisht13472 opened this issue Nov 3, 2021 · 3 comments

Comments

@gunikavashisht13472

Could you please elaborate on why you have not used Learnable_window in STFT, Mel spectrograms, and MFCC, but have used it in their inverse counterparts?

@KinWaiCheuk
Owner

The learnable kernels are disabled by default in STFT, Mel spectrograms, MFCC, and all of their inverse counterparts.

If you are referring to the argument refresh_win shown below, it is not for learnable kernels. It recalculates window_sumsquare for different audio lengths, which is essential to obtain a correct inverse. If all of your audio clips are the same length, you can set refresh_win=False to speed up the calculation slightly.

def inverse_stft(self, X, kernel_cos, kernel_sin, onesided=True, length=None, refresh_win=True):
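
To illustrate why the window sum-square depends on audio length, here is a minimal pure-Python sketch of the general overlap-add idea. It is not nnAudio's internal code: the helper names `hann` and `window_sumsquare_envelope`, and the values `n_fft=8` and `hop_length=2`, are assumptions chosen only for illustration.

```python
import math


def hann(n_fft):
    # Periodic Hann window of length n_fft (illustrative helper).
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]


def window_sumsquare_envelope(n_frames, n_fft, hop_length):
    # Sum the squared window over every overlapping frame position.
    # An inverse STFT divides the overlap-added signal by this envelope.
    win_sq = [w * w for w in hann(n_fft)]
    envelope = [0.0] * (n_fft + hop_length * (n_frames - 1))
    for frame in range(n_frames):
        start = frame * hop_length
        for i, value in enumerate(win_sq):
            envelope[start + i] += value
    return envelope


# Two clips of different lengths produce different numbers of frames.
short_env = window_sumsquare_envelope(n_frames=4, n_fft=8, hop_length=2)
long_env = window_sumsquare_envelope(n_frames=10, n_fft=8, hop_length=2)

# The envelopes differ in length, and also in value near the clip boundary.
print(len(short_env), len(long_env))
print(short_env[10], long_env[10])
```

Because the envelope tapers at the start and end of each clip, an envelope cached for one length cannot be reused for a clip of a different length. When every clip has the same length, the cached envelope stays valid, which is the case where skipping the recalculation is safe.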

@gunikavashisht13472
Author

Thank you for the answer. I want to use RNNs instead of CNNs in my model. Will this code work for RNNs too?

@KinWaiCheuk
Owner

> Thank you for the answer. I want to use RNNs instead of CNNs in my model. Will this code work for RNNs too?

Yes, it will work. nnAudio only handles spectrogram extraction on the GPU. Once you have the spectrogram, you can feed it to any model of your choice. The CNN in the example is just a demonstration of how to use nnAudio together with a PyTorch model.
