Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide an option to opt-out of automatic Redis connection closing on worker.close #2437

Closed
kibertoad opened this issue Feb 20, 2024 · 6 comments · May be fixed by #2449
Closed

Provide an option to opt-out of automatic Redis connection closing on worker.close #2437

kibertoad opened this issue Feb 20, 2024 · 6 comments · May be fixed by #2449

Comments

@kibertoad
Copy link

kibertoad commented Feb 20, 2024

Is your feature request related to a problem? Please describe.

Currently it is easy to get Error: Connection is closed error when Redis instance is reused across multiple workers and is explicitly managed by the application. In case you close all the workers and try to close the Redis connection, you will get Error: Connection is closed, because BullMQ has already closed the connection by itself. It is also a potentially unwanted behaviour, because sometimes you may only stop one worker, but not destroy the connection for all other workers.

Describe the solution you'd like

Extra flag on worker.close() method that would avoid terminating the Redis connection/closing client.

Describe alternatives you've considered

New method worker.stop() which does the same thing as worker.close() without terminating the Redis connection. If this is introduced, probably worker.close() should start invoking worker.stop as well to reduce the duplication.

If this is accepted, we can send a PR with the change.

This should also potentially address #2402

@manast
Copy link
Contributor

manast commented Feb 25, 2024

Do you have an actual test snippet that demonstrates the issue?

@kibertoad
Copy link
Author

kibertoad commented Feb 25, 2024

@manast The error itself that we are observing is that there are tons of entries like this in logs for our automated tests:

   Error: Connection is closed.
      at close (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/redis/event_handler.js:189:25)
      at Socket.<anonymous> (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/redis/event_handler.js:156:20)
      at Object.onceWrapper (node:events:633:26)
      at Socket.emit (node:events:518:28)
      at Socket.emit (node:domain:488:12)
      at TCP.<anonymous> (node:net:337:12)
  Error: Connection is closed.
      at close (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/redis/event_handler.js:189:25)
      at Socket.<anonymous> (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/redis/event_handler.js:156:20)
      at Object.onceWrapper (node:events:633:26)
      at Socket.emit (node:events:518:28)
      at Socket.emit (node:domain:488:12)
      at TCP.<anonymous> (node:net:337:12)
  Error: Connection is closed.
      at EventEmitter.sendCommand (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Redis.js:332:28)
      at Script.execute (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Script.js:59:26)
      at EventEmitter.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/utils/Commander.js:111:27)
      at ScriptsPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/packages/app/bullmq-pro/src/classes/scripts-pro.ts:142:40)
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at WorkerPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/bullmq/src/classes/worker.ts:518:7)
  Error: Connection is closed.
      at EventEmitter.sendCommand (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Redis.js:332:28)
      at Script.execute (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Script.js:59:26)
      at EventEmitter.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/utils/Commander.js:111:27)
      at ScriptsPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/packages/app/bullmq-pro/src/classes/scripts-pro.ts:142:40)
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at WorkerPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/bullmq/src/classes/worker.ts:518:7)
  Error: Connection is closed.
      at EventEmitter.sendCommand (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Redis.js:332:28)
      at Script.execute (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Script.js:59:26)
      at EventEmitter.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/utils/Commander.js:111:27)
      at ScriptsPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/packages/app/bullmq-pro/src/classes/scripts-pro.ts:142:40)
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at WorkerPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/bullmq/src/classes/worker.ts:518:7)
  Error: Connection is closed.
      at EventEmitter.sendCommand (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Redis.js:332:28)
      at Script.execute (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Script.js:59:26)
      at EventEmitter.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/utils/Commander.js:111:27)
      at ScriptsPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/packages/app/bullmq-pro/src/classes/scripts-pro.ts:142:40)
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at WorkerPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/bullmq/src/classes/worker.ts:518:7)
  Error: Connection is closed.
      at EventEmitter.sendCommand (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Redis.js:332:28)
      at Script.execute (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/Script.js:59:26)
      at EventEmitter.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/ioredis/built/utils/Commander.js:111:27)
      at ScriptsPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/packages/app/bullmq-pro/src/classes/scripts-pro.ts:142:40)
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at WorkerPro.moveToActive (/home/runner/_work/autopilot/autopilot/out/node_modules/bullmq/src/classes/worker.ts:518:7)

My assumption was that it's due to one worker closing connection for all other workers, but based on what you've said in taskforcesh/bullmq-proxy#7 (comment), it seems that they should all be fully independent. If each worker is owning their own connection, what might cause it to have Connection closed error? I would assume that if worker itself was stopped, it shouldn't try to do moveToActive anymore.

Surprisingly the issue seems to go away if we are closing the parent Redis connection before we start closing any workers, which seems counter-intuitive. We use this defensive check:

	public async dispose(): Promise<void> {
		if (this.redis.status === 'end') {
			return
		}

		// On test forcing the worker to close to not wait for current job to finish
		await this.worker?.close(this.config.isTest)
		await this.queue?.close()
	}

So it seems that dispose works correctly only when workers are not closed explicitly at all (because if parent redis is closed, per-worker connections won't be closed).

I'll try to create an isolated reproduction, but I'm not sure it's easy to get in a small amount of simple tests.

(this is with BullMQ Pro 6.9.6 / BullMQ 4.17.0)

@manast
Copy link
Contributor

manast commented Feb 25, 2024

I do not think that when you duplicate an ioredis connection a parent-child relationship is established, so it should not really matter in which order they are closed.

@kibertoad
Copy link
Author

@manast Surprisingly, majority of these errors are gone now that we have updated to bullmq-pro 7.0.0 and bumped bullmq to the latest version.

the problem itself reminds a lot this: #2385 (comment)

Wonder why throw this error in the first place, is there a real use-case for it? Just not doing anything silently sounds like a generally more useful approach, as the end goal - connection is closed - is achieved.

@manast
Copy link
Contributor

manast commented Mar 15, 2024

@kibertoad there is no optimal solution to this problem. We cannot know why ioredis could generate an exception when we try to close it. In this particular case it happens because we are closing before the connection was stablished, but there could be many other cases.

@kibertoad
Copy link
Author

Got it! Closing as addressed then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants