Hi there, I'm trying to use HNSW + IVFPQ for fast search. Thanks to FAISS, the code is very simple:
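import faiss
import numpy as np

d = 128  # vector dimensionality
nb = int(1e9)  # number of database vectors; reduce this if the array does not fit in memory
# placeholder random data with the intended shapes (nb, d) and (1, d)
rng = np.random.default_rng()
xb = rng.random((nb, d), dtype=np.float32)
xq = rng.random((1, d), dtype=np.float32)
k = 10  # number of nearest neighbours to return
# set HNSW index parameters
M = 64  # number of connections each vertex will have
ef_search = 32  # size of the candidate list explored during search
ef_construction = 64  # size of the candidate list explored during index construction
quantizer = faiss.IndexHNSWFlat(d, M)
quantizer.hnsw.efConstruction = ef_construction
quantizer.hnsw.efSearch = ef_search
nlist = 2048  # number of Voronoi cells (inverted lists)
nbits = 8  # bits per PQ sub-code; when using IVF+PQ, higher nbits values are not supported
m = 8  # number of PQ sub-vectors; d must be divisible by m
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, nbits)
# train on a random 10% sample (needs at least max(nlist, 2**nbits) vectors)
train_count = int(np.floor(0.1 * len(xb)))
train_inds = np.random.choice(len(xb), train_count, replace=False)
index.train(xb[train_inds])
index.add(xb)
D, I = index.search(xq, k)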
Judging from the returned D, the distance computation in Product Quantization (PQ) does not seem to rely on the Hamming distance. As far as I understand, PQ generates binary codes, and the distance between them is typically computed using the Hamming metric. Is my understanding correct?
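To make the question concrete, here is a minimal NumPy sketch of how PQ distances are commonly computed, as I understand it: each code is a list of centroid indices (one per sub-vector), and the distance between a float query and a code is a sum of squared L2 table lookups (asymmetric distance computation), not a Hamming distance over bits. The toy sizes, the random `codebooks`, and the helper names `pq_encode`, `pq_decode`, and `adc_distance` are assumptions for illustration only; this is not FAISS's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, ksub = 8, 4, 4  # toy sizes: dimension, sub-vectors, centroids per sub-space
dsub = d // m

# Hypothetical codebooks; a real PQ learns these with k-means on training data.
codebooks = rng.standard_normal((m, ksub, dsub)).astype(np.float32)

def pq_encode(x):
    """Return one centroid index per sub-vector (the PQ code)."""
    parts = x.reshape(m, dsub)
    # squared L2 from each sub-vector to every centroid of its codebook
    dists = ((parts[:, None, :] - codebooks) ** 2).sum(axis=2)  # shape (m, ksub)
    return dists.argmin(axis=1).astype(np.uint8)

def pq_decode(code):
    """Rebuild the approximate vector from its centroid indices."""
    return np.concatenate([codebooks[j, code[j]] for j in range(m)])

def adc_distance(q, code):
    """Asymmetric distance: float query against a quantized database vector."""
    q_parts = q.reshape(m, dsub)
    table = ((q_parts[:, None, :] - codebooks) ** 2).sum(axis=2)  # shape (m, ksub)
    return float(table[np.arange(m), code].sum())

x = rng.standard_normal(d).astype(np.float32)
q = rng.standard_normal(d).astype(np.float32)
code = pq_encode(x)
approx = adc_distance(q, code)
exact_to_reconstruction = float(((q - pq_decode(code)) ** 2).sum())
```

In this sketch `approx` equals the squared L2 distance between the query and the reconstructed vector, which would explain why the values in `D` look like float distances rather than integer bit counts.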
My second question: in this case, the indexing procedure involves three modules: HNSW, IVF, and PQ, and the inputs are float vectors. Could you please clarify the order of these steps and the distance metric used in the indexing procedure? Thanks in advance.
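My current guess at the order, written as a toy NumPy sketch so it can be corrected: the coarse quantizer picks the nearest IVF cell for each vector (in my code that lookup would be done by the HNSW index over the `nlist` centroids; here it is brute force), the residual to that centroid is PQ-encoded, and the code is stored in that cell's inverted list. At search time the closest `nprobe` cells are scanned and candidates are ranked by an approximate squared L2 distance. The random centroids and codebooks, and the names `nearest_cells`, `sub_tables`, `add`, and `search`, are hypothetical stand-ins, not FAISS internals.

```python
import numpy as np

rng = np.random.default_rng(0)
d, nlist, m, ksub = 8, 4, 2, 4  # toy sizes
dsub = d // m

coarse_centroids = rng.standard_normal((nlist, d))   # stand-in for trained IVF centroids
pq_codebooks = rng.standard_normal((m, ksub, dsub))  # stand-in for trained PQ codebooks
inverted_lists = {c: [] for c in range(nlist)}

def nearest_cells(x, n):
    # Brute-force stand-in for the HNSW lookup over the coarse centroids.
    d2 = ((coarse_centroids - x) ** 2).sum(axis=1)
    return np.argsort(d2)[:n]

def sub_tables(v):
    # Squared L2 from each sub-vector of v to every centroid of its codebook.
    return ((v.reshape(m, dsub)[:, None, :] - pq_codebooks) ** 2).sum(axis=2)

def add(vec_id, x):
    cell = int(nearest_cells(x, 1)[0])         # step 1: coarse assignment
    residual = x - coarse_centroids[cell]      # step 2: residual to the cell centroid
    code = sub_tables(residual).argmin(axis=1) # step 3: PQ-encode the residual
    inverted_lists[cell].append((vec_id, code))

def search(q, k, nprobe=2):
    hits = []
    for cell in nearest_cells(q, nprobe):
        table = sub_tables(q - coarse_centroids[cell])  # per-cell lookup table
        for vec_id, code in inverted_lists[int(cell)]:
            hits.append((float(table[np.arange(m), code].sum()), vec_id))
    return sorted(hits)[:k]

xb_toy = rng.standard_normal((20, d))
for i, x in enumerate(xb_toy):
    add(i, x)
results = search(xb_toy[0], k=3, nprobe=nlist)
```

Each entry of `results` is an `(approximate squared L2 distance, vector id)` pair, sorted by distance.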