[WIP] Migrate Training Update Process to Sidekiq #5754
base: master
Hello @ragesoss, I have moved the process to Sidekiq and it is working fine. LMKWYT: if this seems feasible, I'll continue with this approach and add the other features.
Hey, I tried a few things to identify whether a Sidekiq job is already running, but I can't figure out a clean way to do it. You suggested handling this at the database level, but would adding a boolean column to every training content record be a good way to do it?
A few of the options I tried:
I don't know the best approach here, but one that might work would be: include a column for update status on TrainingLibrary and TrainingModule records, but make it able to handle more detail than just a boolean state. (It might or might not make sense to include more than one column.) The idea would be to keep track of whether a record is not scheduled for an update, scheduled for an update that hasn't started yet, or in an update that has started but not yet completed successfully (with a timestamp for when it started). So the update process would update that status in the database as soon as it starts, and then we could use some logic like 'if it's been more than X amount of time since the start of an update, check whether an update process is running, and if not, it probably errored'. Then, except for the case where an update failed but the status still shows it as updating, we wouldn't need to check Redis to show state to the user.
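The status tracking suggested above could be sketched roughly as follows. This is plain Ruby rather than an ActiveRecord model, and the class name `UpdateStatus`, the state names, and the 15-minute `STALE_AFTER` threshold are all assumptions made for illustration, not the dashboard's actual schema.

```ruby
# Hypothetical sketch of a three-state update status with a start timestamp.
# In the real app this would presumably live in columns on TrainingLibrary
# and TrainingModule; here it is a plain object so the logic is easy to see.
class UpdateStatus
  STALE_AFTER = 15 * 60 # seconds; stands in for the "X amount of time" above

  attr_reader :state, :started_at

  def initialize
    @state = :idle
    @started_at = nil
  end

  # An update has been requested but has not started yet.
  def schedule!
    @state = :scheduled
    @started_at = nil
  end

  # The update process calls this as soon as it begins work.
  def start!(now = Time.now)
    @state = :updating
    @started_at = now
  end

  # Called only when the update finishes without errors.
  def complete!
    @state = :idle
    @started_at = nil
  end

  # True when an update started long ago and never completed. Only in this
  # case would the job system need to be consulted to see if it is still running.
  def possibly_failed?(now = Time.now)
    @state == :updating && (now - @started_at) > STALE_AFTER
  end
end
```

With something like this, the user-facing state can be read from the record alone, and the check against the job system is reserved for the `possibly_failed?` case.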
Hi @ragesoss, here is the working version (I haven't written tests for it yet, but it should work fine once I write the tests and make minor changes accordingly).
Errors like InvalidWikiContentError and NoMatchingWikiPagesFound are recorded on all the rows in the table, since they are not slug-specific. After hitting /reload_trainings, a timer (currently 15 minutes) starts and every row in training_libraries has its update_status set to 1 (scheduled). When we start fetching content from Wikimedia for a slug, its state changes to 2 (started updating). After calling .inflate on that slug, if no errors occur the state goes back to the default 0, meaning the update is complete; if an error occurs, the state is left at 2 rather than reset to 0, so we can detect that an error occurred. After 15 minutes, the timer checks all rows, and if a row still has update_status 2, the error stored in that row's update_error is rendered to the user. We could also check whether this 15-minute timer thread is currently running to see whether an update is already in progress. I still have a few blockers: how to deal with YAML content (should I exclude it entirely from the new workflow, or add conditions in between to make it work?), and how to deal with the ModuleNotFound error, specifically where to store it if it occurs. I'll work on these and add new commits as I figure them out.
@ragesoss I had a doubt.
That typically happens because you have done training updates in both wiki_education
I wanted to ask how these errors should behave in production: if an error is raised, should it stop the workflow, or should that slug be skipped and the workflow continue?
Hopefully such errors won't occur in production, but skipping that slug and continuing would be nice.
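A minimal sketch of the skip-and-continue behaviour, using the status codes described earlier in the thread (0 = complete, 1 = scheduled, 2 = started updating). The `Row` struct, the `update_all` method, and the `loader` callable are hypothetical stand-ins for the real records and the fetch-and-inflate step; none of them come from the PR itself.

```ruby
# Status codes as described in the thread.
COMPLETE  = 0
SCHEDULED = 1
UPDATING  = 2

# Hypothetical stand-in for a training_libraries row.
Row = Struct.new(:slug, :update_status, :update_error)

# `loader` is any callable that fetches and inflates one slug and raises on
# failure. Returns the rows whose update did not complete.
def update_all(rows, loader)
  # Mark everything as scheduled before any work starts.
  rows.each do |row|
    row.update_status = SCHEDULED
    row.update_error = nil
  end

  rows.each do |row|
    row.update_status = UPDATING
    begin
      loader.call(row.slug)
      row.update_status = COMPLETE
    rescue StandardError => e
      # Leave the status at UPDATING, record the error, and move on to the
      # next slug instead of aborting the whole run.
      row.update_error = "#{e.class}: #{e.message}"
    end
  end

  rows.select { |row| row.update_status == UPDATING }
end
```

For example, calling `update_all(rows, ->(slug) { raise ArgumentError, 'boom' if slug == 'bad' })` would leave only the `bad` row at status 2 with its error stored, while the other slugs finish at 0.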
Hey @ragesoss, I was wondering whether it would be feasible to add a new table to record the errors raised during the background update process, so we don't need to iterate over all the rows in all 3 tables to do any operations.
This would reduce the complexity of the workflow and reduce iteration on the dashboard.
I'd rather not add a table just to hold this information. If the idea would be to limit the data to just one entry per record type, perhaps a
Okay, I'll use setting records to store the data, and I'll start writing tests for the functionality.
Hey @ragesoss
Hey @ragesoss, I am adding a new initializer: WikiEduDashboard/config/initializers/training_update_process_cleanup.rb
Its purpose is to clean up any leftovers from the last run (in development). It works locally for me, but when I push it to GitHub I get this error:
Am I missing a step or doing something wrong?
I don't see that error in the CI log. Where during the build is it happening?
It started after commit e00b7e0.
Fixes #4712
This PR aims to migrate the training update process to Sidekiq, so that resources on the main thread are kept free.