Support for dynamic chunk size #1224

Open
wants to merge 4 commits into
base: master

morgo (Contributor) commented Dec 7, 2022

A Pull Request should be associated with an Issue.

We wish to have discussions in Issues. A single issue may be targeted by multiple PRs.
If you're offering a new feature or fixing anything, we'd like to know beforehand in Issues,
and potentially we'll be able to point development in a particular direction.

Related issue: #1204

Further notes in https://github.com/github/gh-ost/blob/master/.github/CONTRIBUTING.md
Thank you! We are open to PRs, but please understand if for technical reasons we are unable to accept each and any PR

Description

This PR adjusts the chunkSize dynamically, based on execution time feedback. It is disabled by default for backward compatibility, but should be considered as the default for future releases. As well as improving performance (my motivation), it is a little safer than a static chunkSize for wide tables with a lot of secondary indexes.

There are also some guard rails in place: if the observed execution time is 5x the target, the chunkSize is quickly scaled down. This is not a problem, because the chunkSize will scale back up later if the slowdown was a temporary blip.

Here's what I observed with some sample tables:

| Test | Default Chunk Size (1000) | Dynamic Chunk Size (default 50ms target) |
|---|---|---|
| Typical Table (stock) | 1m10s (copy) | 58s (copy) |
| Skinny Table (skinnytable) | 38s (copy) | 30s (copy) |

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

meiji163 (Contributor) commented Dec 7, 2023

This looks promising, I'll do some testing on larger tables.
