Suggestions on implementing multi-scale quantization #3402

anfatima opened this issue Apr 30, 2024 · 3 comments

Summary

Is multiscale quantization (https://papers.nips.cc/paper_files/paper/2017/hash/b6617980ce90f637e68c3ebe8b9be745-Abstract.html) supported? I have been reading the FAISS code, but so far it seems that it is not, and there doesn't seem to be a straightforward way to implement it in Python without significantly affecting performance.

Any suggestions on the fastest way to add support for it (if it is not supported)? Are there alternative solutions that deal with the problem of large variance in the norms of the data points? If it is not supported, why not?
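
For concreteness, here is a minimal sketch of the core idea using existing FAISS primitives: uniformly quantize each vector's norm and encode its unit-norm direction with a product quantizer. This is only an illustration, not the paper's full method (which also learns a rotation and applies the scale to IVF residuals); all parameters and data below are made up.

```python
import numpy as np
import faiss

d, M, nbits = 128, 16, 8      # dimension, PQ sub-quantizers, bits per sub-quantizer
n_scale_bits = 4              # bits for the uniformly quantized norm

# Toy data with deliberately varied norms (values are made up).
rng = np.random.default_rng(0)
x = rng.standard_normal((10000, d)).astype(np.float32)
x *= rng.uniform(0.5, 5.0, size=(len(x), 1)).astype(np.float32)

# Split each vector into a scalar norm and a unit-norm direction.
norms = np.linalg.norm(x, axis=1, keepdims=True)
directions = x / np.maximum(norms, 1e-12)

# Uniform scalar quantization of the norms.
lo, hi = norms.min(), norms.max()
step = (hi - lo) / (2 ** n_scale_bits - 1)
scale_codes = np.round((norms - lo) / step).astype(np.uint8)

# Product quantization of the unit-norm directions.
pq = faiss.ProductQuantizer(d, M, nbits)
pq.train(directions)
dir_codes = pq.compute_codes(directions)

# Reconstruct: decoded direction rescaled by the dequantized norm.
x_hat = pq.decode(dir_codes) * (lo + scale_codes.astype(np.float32) * step)
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```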

Platform

Faiss version: 1.7.4

Running on:

  • CPU
  • GPU

Interface:

  • C++
  • Python
mdouze added the question label May 6, 2024

mdouze commented May 6, 2024

Yes, it would be interesting to try it out.
What's weird in the paper is that the experiments are performed on Deep1M and SIFT1M, which are both normalized datasets, so the justification for multiscale quantization is not convincing.


anfatima commented May 7, 2024

They say that a large variance in the norms affects retrieval performance, so it is more of an argument about how large norm variance degrades codebook performance.

Also, another paper assessing the performance of compressed embeddings (https://arxiv.org/pdf/1909.01264) advocates uniform quantization over k-means as a coarse quantizer. So I was wondering if uniformly quantizing the scalar component (as in multiscale quantization) and then using that bucket to derive the quantized residual could lead to better retrieval.
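
To make that concrete, here is a rough sketch of the "uniform norm bucket, then residual" scheme (purely illustrative; the function name and parameters are invented for this example, and my construction of the residual is only one way to read the idea):

```python
import numpy as np

def norm_bucket_residuals(x, n_buckets=16):
    # Uniformly bucket vectors by norm (the bucket index plays the role
    # of a coarse code), then form the residual relative to the bucket's
    # representative scale. Everything here is illustrative.
    norms = np.linalg.norm(x, axis=1)
    lo, hi = norms.min(), norms.max()
    step = (hi - lo) / n_buckets
    bucket = np.minimum(((norms - lo) / step).astype(int), n_buckets - 1)
    bucket_scale = lo + (bucket + 0.5) * step        # bucket midpoint
    directions = x / np.maximum(norms, 1e-12)[:, None]
    # A fine quantizer (e.g. a PQ trained per bucket) would encode this part.
    residual = x - directions * bucket_scale[:, None]
    return bucket, residual

x = np.random.rand(1000, 64).astype(np.float32)
bucket, residual = norm_bucket_residuals(x)
```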

I will test it out with what is currently available in FAISS, and if it leads to an improvement in retrieval, I will open a thread on how to implement it in FAISS for better runtime performance.

Thanks!


mdouze commented May 10, 2024

Sure. NB that many clustering variants can be implemented in Python without much performance impact; see e.g. the k-means implementation in

https://github.com/facebookresearch/faiss/blob/main/contrib/clustering.py#L330
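
Stripped to its essentials, the pattern there is roughly the following: FAISS handles the brute-force assignment step, and the update step stays in numpy, which is exactly where a variant like the one discussed above would plug in. This is a simplified illustration of the pattern, not the contrib code itself.

```python
import numpy as np
import faiss

def kmeans_variant(x, k, niter=20, seed=123):
    # Plain k-means where only the nearest-centroid assignment uses FAISS;
    # the numpy update step is where custom variant logic would go.
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), k, replace=False)].copy()
    for _ in range(niter):
        index = faiss.IndexFlatL2(x.shape[1])
        index.add(centroids)
        _, assign = index.search(x, 1)       # fast brute-force assignment
        assign = assign.ravel()
        for c in range(k):
            members = x[assign == c]
            if len(members):                 # skip empty clusters
                centroids[c] = members.mean(axis=0)
    return centroids, assign

x = np.random.rand(20000, 32).astype(np.float32)
centroids, assign = kmeans_variant(x, k=256)
```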
