Bulk indexing - Saving a model with a long indexable name triggers as many queries as the number of ngrams #43

Open
Startouf opened this issue Sep 3, 2020 · 1 comment

Startouf commented Sep 3, 2020

The update code triggers one insert per ngram.

This leads to as many insert requests as there are ngrams to be indexed, which becomes terribly slow when doing batch updates (especially if one forgets to implement the update_if option).

I suggest the following:

  • Use a bulk write to add all the ngrams in a single request (see the sketch after this list)
  • (Might be an idea for a new feature) Provide a default implementation of update_if based on dirty tracking: if the indexed fields are Mongoid fields, the _changed? methods can be used to know whether an update is needed
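
A minimal sketch of both ideas, assuming a dedicated search_ngrams collection and plain trigram splitting (the collection name, document shape, and ngram logic are illustrative assumptions, not this gem's actual internals); the bulk write uses the Ruby driver's insert_many, and the guard uses Mongoid's dirty tracking:

```ruby
require 'mongoid'

class Artwork
  include Mongoid::Document
  field :title, type: String

  # Dirty-tracking guard: skip reindexing entirely when the indexed field
  # has not changed (Mongoid generates <field>_changed? for every field).
  before_save :reindex_ngrams, if: :title_changed?

  private

  def reindex_ngrams
    # Plain trigrams for illustration; the gem's real ngram logic differs.
    ngrams = (0..title.to_s.length - 3).map { |i| title[i, 3] }.uniq
    docs   = ngrams.map { |ng| { document_id: id, ngram: ng } }

    coll = Mongoid.default_client[:search_ngrams]
    coll.delete_many(document_id: id)           # drop this document's stale ngrams
    coll.insert_many(docs) unless docs.empty?   # one bulk insert instead of one per ngram
  end
end
```

A real implementation could go further and combine the delete and the inserts into a single collection.bulk_write call, so a batch update costs one round trip per document rather than one per ngram.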
Startouf changed the title from "Bulk indexing - Saving a model with a long indexable name triggers as many queries as the name length" to "Bulk indexing - Saving a model with a long indexable name triggers as many queries as the number of ngrams" on Sep 3, 2020
dblock added the chore label on Sep 3, 2020

KieranP commented Apr 27, 2021

100%. The search code's performance isn't great either. It generates one query for every ngram in the search term instead of a single query with multiple OR conditions, and then one query for every search result to fetch the original object. Highly inefficient.
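
A rough sketch of the single-query approach described here, reusing the hypothetical search_ngrams collection and trigram splitting from the comment above (assumptions about the schema, not this gem's real layout): one $in query over all the ngrams, then one $in query to load the matching originals:

```ruby
# Search side: one query for all ngrams, one query for all originals.
def ngram_search(model_class, query)
  ngrams = (0..query.length - 3).map { |i| query[i, 3] }.uniq
  return [] if ngrams.empty?

  # Single find with $in instead of one query per ngram (an $in over the
  # ngram field behaves like multiple OR conditions).
  hits = Mongoid.default_client[:search_ngrams]
                .find(ngram: { '$in' => ngrams })
                .to_a

  # Rank document ids by how many of the query's ngrams they matched.
  ranked_ids = hits.group_by { |h| h['document_id'] }
                   .sort_by { |_id, group| -group.size }
                   .map(&:first)

  # Single fetch of the original objects instead of one query per result.
  by_id = model_class.in(id: ranked_ids).each_with_object({}) { |m, h| h[m.id] = m }
  ranked_ids.map { |doc_id| by_id[doc_id] }.compact
end
```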
