Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty error message, db-pool size surpassed #3214

Open
steve-chavez opened this issue Feb 7, 2024 · 1 comment · May be fixed by #3468
Open

Empty error message, db-pool size surpassed #3214

steve-chavez opened this issue Feb 7, 2024 · 1 comment · May be fixed by #3468
Labels
idea Needs of discussion to become an enhancement, not ready for implementation needs-repro pending reproduction QOS

Comments

@steve-chavez
Copy link
Member

steve-chavez commented Feb 7, 2024

Environment

  • PostgreSQL version: PostgreSQL 14.1
  • PostgREST version: 10.2.0
  • Operating system: Ubuntu

Problem

A running instance with a db-pool = 60 reports the following problems.

1. The authenticator role surpases db-pool connections.

Here it can be seen it reaches 71 connections.

select usename as rolname, count(*) as total_connections, sum(case when state = 'active' then 1 else 0 end) as active_connections, sum(case when state = 'idle' then 1 else 0 end) as idle_connections from pg_stat_activity group by usename;


     rolname       | total_connections | active_connections | idle_connections
---------------------+-------------------+--------------------+------------------
authenticator       |                71 |                  0 |               71

At some other time it reports 130 connections and surpasses pg max_connections:

01/Feb/2024:17:43:04 +0000: {"code":"PGRST000","details":"FATAL:  remaining connection slots are reserved for non-replication superuser connections\n","hint":null,"message":"Database connection error. Retrying the connection."}

2. PostgREST replies with an empty error message.

$ curl localhost:3000/x  -i

400 Bad Request
{ code: '', details: null, hint: null, message: '' }

A subsequent request succeeds since PostgREST recovers though.

I wasn't able to find the root cause of this. This instance in particular just went through a migration from ipv4 to ipv6 only.

However it's clear that the hasql pool is not recycling the pool connections. An empty error message shouldn't happen too.

Proposal

We need to increase the pool log traces. Log whenever the hasql pool reaper works (ref). This way at least we can find out if the pool is misbehaving. Currently it's not possible to know if the reaper fails.

Workarounds

A connection limit can be set on the authenticator role so the db-pool max is guaranteed.

alter role authenticator connection limit <db-pool + 1 for listen channel>

An idle timeout can be set in case the pool doesn't recycle idle connections.

alter role authenticator set idle_session_timeout to 1800000;

Related

@steve-chavez steve-chavez added QOS needs-repro pending reproduction idea Needs of discussion to become an enhancement, not ready for implementation labels Feb 7, 2024
@steve-chavez
Copy link
Member Author

Update on nikita-volkov/hasql-pool#43, we now have nikita-volkov/hasql-pool#45.

So with those new observations, we should be able to add two more metrics:

  • Total number of created connections (counter)
  • Number of used connections (gauge)

And then we can close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
idea Needs of discussion to become an enhancement, not ready for implementation needs-repro pending reproduction QOS
Development

Successfully merging a pull request may close this issue.

1 participant