@hellobrett hellobrett commented Feb 8, 2022

Depending on the timing of container startup, the wait-for-db command may produce false positives, i.e., it reports the database as ready when it is not. In fact, I think the current implementation of wait-for-db does not work; it only gets lucky. The connection object is lazy and must actually be used before it raises the OperationalError exception. This PR requests a cursor object instead, which reliably tests that the database is functional.
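To illustrate the idea (this is a minimal sketch, not the exact diff in this PR): the check only counts as a success once a cursor has been opened and a trivial query has run. The names wait_for_db, get_connection, errors, attempts, delay and sleep are hypothetical, and sqlite3 is used purely as a stand-in so the snippet runs without a Postgres server.

```python
import sqlite3
import time


def wait_for_db(get_connection, errors=(Exception,), attempts=30, delay=1.0,
                sleep=time.sleep):
    """Return True once a cursor can be opened and a trivial query succeeds.

    get_connection is a hypothetical callable returning a DB-API style
    connection. Merely obtaining the object proves nothing if it is lazy,
    so the connection is exercised with a cursor and a query.
    """
    for _ in range(attempts):
        try:
            conn = get_connection()
            cursor = conn.cursor()  # forces a lazy connection to really connect
            try:
                cursor.execute("SELECT 1")
            finally:
                cursor.close()
            return True
        except errors:
            # Database not ready yet: wait and retry.
            sleep(delay)
    return False


if __name__ == "__main__":
    # Stand-in demo: an in-memory SQLite database is always available.
    ready = wait_for_db(lambda: sqlite3.connect(":memory:"),
                        errors=(sqlite3.OperationalError,))
    print("Database available!" if ready else "Database unavailable")
```

In a Django management command, I would expect get_connection to return django.db.connections["default"] and errors to be django.db.utils.OperationalError, but treat that wiring as an assumption rather than the code in this PR.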

Steps to Reproduce:

You can reproduce this pretty reliably by repeating the following steps a few times:

docker system prune  #<-- remove all stopped containers
docker-compose up

docker system prune deletes any stopped containers, so docker-compose up has to create them again. In this case, the app container will sometimes start before the db container is ready, yielding the error below. Notice in the log that the app_1 container is spinning up while the database is still initializing, and it reports a false positive:

app_1  | Waiting for database...
app_1  | Database available!

Full log output and stack trace for the false positive:

db_1   | performing post-bootstrap initialization ... sh: locale: not found
db_1   | 2022-02-08 13:28:26.326 UTC [30] WARNING:  no usable system locales were found
db_1   | ok
app_1  | Waiting for database...
app_1  | Database available!
db_1   | syncing data to disk ...
db_1   | WARNING: enabling "trust" authentication for local connections
db_1   | You can change this by editing pg_hba.conf or using the option -A, or
db_1   | --auth-local and --auth-host, the next time you run initdb.
db_1   | ok
db_1   |
db_1   | Success. You can now start the database server using:
db_1   |
db_1   |     pg_ctl -D /var/lib/postgresql/data -l logfile start
db_1   |
db_1   | waiting for server to start....2022-02-08 13:28:27.202 UTC [36] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1   | 2022-02-08 13:28:27.215 UTC [37] LOG:  database system was shut down at 2022-02-08 13:28:26 UTC
db_1   | 2022-02-08 13:28:27.219 UTC [36] LOG:  database system is ready to accept connections
db_1   |  done
db_1   | server started
app_1  | Traceback (most recent call last):
app_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 216, in ensure_connection
app_1  |     self.connect()
app_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 194, in connect
app_1  |     self.connection = self.get_new_connection(conn_params)
app_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
app_1  |     connection = Database.connect(**conn_params)
app_1  |   File "/usr/local/lib/python3.7/site-packages/psycopg2/__init__.py", line 130, in connect
app_1  |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
app_1  | psycopg2.OperationalError: connection to server at "db" (172.24.0.2), port 5432 failed: Connection refused
app_1  | 	Is the server running on that host and accepting TCP/IP connections?
