Adapt prefix thresholds #4273
base: release-v1.6.0
Conversation
/benchmark search_wiki

Here are your search_wiki benchmarks diff 👊
I like what you did there!
Great thinking @ManyTheFish! Could you add, very visibly in the PR description, a table indicating the benefits of this PR for index size and indexing time, and the tradeoff for search time?
/benchmark search_songs

Here are your search_songs benchmarks diff 👊
After discussing with Many, removing this PR from v1.6.0 since the benchmark results are not satisfying.
This PR changes the prefix database creation thresholds to reduce their impact on the indexing time.
Explanations
There are two thresholds that we can change in the prefix database:
- `threshold`
- `max_prefix_length`
Prior values and drawbacks
On Meilisearch v1.5, we have 100 as the `threshold` and 4 as the `max_prefix_length`, meaning that we compute and store every prefix that has at least 100 words prefixed by it and fewer than 5 characters.

By gathering some statistics on the `movies`, `Wikipedia`, and `e-commerce` datasets, I found out that 1-character prefixes make up only 10% of the total count of stored prefixes on a small dataset (10MB), and less than 1% on a bigger dataset (2GB). This is a shame, because the 1-character prefixes are the most important ones regarding search-time gain.
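To picture how the two knobs interact, here is a minimal, hypothetical sketch of the selection rule (not Meilisearch's actual implementation): a prefix is stored only if at least `threshold` words start with it and it is no longer than `max_prefix_length` characters. The word list and the toy threshold of 3 below are made up for illustration:

```rust
use std::collections::HashMap;

/// Hypothetical sketch: keep a prefix only if at least `threshold` words
/// start with it and it has at most `max_prefix_length` characters.
fn compute_prefixes(words: &[&str], threshold: usize, max_prefix_length: usize) -> Vec<String> {
    // Count, for every candidate prefix, how many words it prefixes.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for word in words {
        for len in 1..=max_prefix_length.min(word.chars().count()) {
            let prefix: String = word.chars().take(len).collect();
            *counts.entry(prefix).or_insert(0) += 1;
        }
    }
    // Keep only the prefixes that reach the threshold.
    let mut prefixes: Vec<String> = counts
        .into_iter()
        .filter(|(_, count)| *count >= threshold)
        .map(|(prefix, _)| prefix)
        .collect();
    prefixes.sort();
    prefixes
}

fn main() {
    // Toy word list with a low threshold to make the effect visible.
    let words = ["roll", "rolling", "rolls", "rock", "film"];
    let prefixes = compute_prefixes(&words, 3, 3);
    // "r", "ro", and "rol" each prefix at least 3 words; "film" is alone,
    // so none of its prefixes reach the threshold.
    println!("{prefixes:?}"); // prints ["r", "ro", "rol"]
}
```

Raising `threshold` or lowering `max_prefix_length` both shrink the output set, which is exactly the lever this PR pulls.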
Conversely, the prefixes containing 3 or 4 characters represent more than 30% of the prefixes on a small dataset, which is acceptable, but more than 80% of the prefixes on big datasets.

In this configuration, on big datasets, we allocate between 80 and 100 times more resources (indexing time, space, etc.) to the least important prefixes than to the most important ones, which seems suboptimal to me.
New values
This PR changes the `threshold` from 100 to 500 and the `max_prefix_length` from 4 to 3 in order to rebalance the distribution.
This suggested change is conservative. We could consider raising the thresholds even more, but I'm afraid that would have a significant impact on search time, and since we are already in the prerelease phase, we will not be able to run many more benchmarks.
Benchmarks results
Indexing Time
In terms of indexing time, there is a small gain from modifying the prefix thresholds, but it is below expectations. Even though this change reduces the number of computed prefixes, the computation time doesn't change that much. The total indexing-time gain is between 5 and 10%.
Below are some indexing graphs of the e-commerce and the Wikipedia dataset showing the time spent to index the documents for each document addition:
Database size
As with the indexing time, the total database-size gain is between 5 and 10%.
Below are some indexing graphs of the e-commerce and the Wikipedia dataset showing the size of the database after each document addition:
Search time
Some search queries lost performance:

One search time is multiplied by 2, the other by 8. This may be because `roll` and `film` are no longer computed as prefixes, so Meilisearch must compute them at search time.

Related
- Google sheet with some prefix stats: https://docs.google.com/spreadsheets/d/1EOL4Bmg_2RW2WGt6DN5Ec7V2hsbIc42eC9iOjy5vzN8/edit#gid=558148218
- Google sheet with benchmarks: https://docs.google.com/spreadsheets/d/1PT7tS-UW3mdUqZUau1lsfvHokgTuHLWW8O_2awzZM3Y/edit#gid=0
@ManyTheFish
Is it worth merging this PR?

We may try other thresholds: for example, keep or raise the `threshold` but go back to a `max_prefix_length` of `4`, to maybe re-compute the `roll` and `film` prefixes and regain the time lost in the search benchmarks.