Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add job.runAt property to expose timestamp that job is supposed to run at #1732

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

hyperair
Copy link

@hyperair hyperair commented Mar 9, 2023

Fixes: #1728

  • feat(job): add runAt field to job key in redis
  • feat(job): add runAt property to Job
  • test(job): extend job tests to check runAt correctness

@manast
Copy link
Contributor

manast commented Mar 9, 2023

Thank you for the PR.
I think that it would be better to retrieve runtAt lazily, as it is difficult to keep it in sync at all times. Since runAt can be calculated from the delayed set, we can just return it when it is actually needed, for example when calling getJobs("delayed"), maybe Job could have as well an async method getRunAt() that can be called when needed.

@hyperair
Copy link
Author

hyperair commented Mar 10, 2023

@manast While that works for jobs that are in the delayed state, it doesn't work for calculating something like a "job wait time" metric, i.e. how long after after a job's runAt that it actually starts running, because there's no timestamp for when a job was enqueued into waiting, and once the job is removed from the delayed zset, there's no score to look at either.

I don't think storing the job's runAt on the job key and keeping it in sync is any worse than the way delay is stored on the job key today, and I think the way this PR does it, it's a more useful property and stays in sync better than job.delay does.

@manast
Copy link
Contributor

manast commented Mar 16, 2023

@hyperair could we delete the "delay" field on the job then as it does not seem to be useful? Regarding runAt, I think it could be misleading as it should not represent "when" the job will be run, it represents how much the job will wait at least before it is run, or the timestamp from which it could run, but for different reasons it could run after that, so "processedOn" and "runAt" would not be the same most of the time.

@manast
Copy link
Contributor

manast commented Mar 16, 2023

@hyperair then the other question is, should "runAt" be cleared after the job has been moved from the delayed set to the wait list?

@manast
Copy link
Contributor

manast commented Mar 16, 2023

so in a sense "runAt" would be the same as the score in the delay set, just a cached value that makes it more convenient when getting the jobs.

@manast
Copy link
Contributor

manast commented Mar 16, 2023

@manast While that works for jobs that are in the delayed state, it doesn't work for calculating something like a "job wait time" metric, i.e. how long after after a job's runAt that it actually starts running, because there's no timestamp for when a job was enqueued into waiting, and once the job is removed from the delayed zset, there's no score to look at either.

Ok, this seems to be quite a niche use-case, since usually you are mostly interested in the total time it took from when the job was added to a queue up to when it was completed 🤔

@hyperair
Copy link
Author

@manast

could we delete the "delay" field on the job then as it does not seem to be useful?

Yep, it'll be an API break, but if you're okay with it I'll drop the delay field.

Regarding runAt, I think it could be misleading as it should not represent "when" the job will be run, it represents how much the job will wait at least before it is run, or the timestamp from which it could run, but for different reasons it could run after that, so "processedOn" and "runAt" would not be the same most of the time.

Hmm, that's true. Will one of the following work, or do you have suggestions for a better name?

  • delayedUntil -- might not be good either since it implies that the job is delayed, which may not be the case for a job that was added without delay
  • notBefore
  • scheduledFor

then the other question is, should "runAt" be cleared after the job has been moved from the delayed set to the wait list?
so in a sense "runAt" would be the same as the score in the delay set, just a cached value that makes it more convenient when getting the jobs.

Nope, the idea is for runAt (or alternative name) to be the timestamp from which a job should be available for execution

  • For delayed -> waiting -> active, it'll be the same as the delayedTimestamp encoded in the score of the job
  • For job.changeDelay or job.moveToDelayed, same as above
  • For waiting -> active, i.e. jobs scheduled without delay, it should be the same as job.timestamp
  • For (completed|failed) -> waiting -> active, it should be the timestamp at which job.retry() was called

Ok, this seems to be quite a niche use-case, since usually you are mostly interested in the total time it took from when the job was added to a queue up to when it was completed 🤔

In my case I'm generating two timing metrics for this:

  • job_wait_time -- the time a job spent waiting in queue, excluding the time before runAt (useful for troubleshooting queue backlog, e.g. insufficient workers to process the jobs, or jobs not getting promoted in a timely fashion..)
  • job_execution_time -- the time a job spent actually executing the job (useful for troubleshooting other kinds of trouble, e.g. slow API calls, other kinds of bugs)

A single combined metric summing the above two durations up makes it more difficult to differentiate between these two kinds of problems, but even that's not really possible to generate with bullmq master right now.

@hyperair
Copy link
Author

could we delete the "delay" field on the job then as it does not seem to be useful?

Yep, it'll be an API break, but if you're okay with it I'll drop the delay field.

As it turns out, there's one purpose for delay that runAt doesn't cover -- updateParentDepsIfNeeded uses it to schedule the parent into delayed state instead of waiting

@Riley-Brown
Copy link

This would be a very useful addition for me as I cannot accurately tell when a job will after calling job.changeDelay

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature request] A way to get the timestamp that a delayed job is supposed to run
3 participants