Skip to content

Conversation

josecelano
Copy link
Member

@josecelano josecelano commented Nov 16, 2023

After discussing how to implement a proper container HEALTHCHECK for the Index this final implementation:

  • Doesn't use curl, wget or other system dependency to avoid maintaining that dependency and security attacks. It uses a new binary that you can use like this: cargo run --bin health_check http://localhost:PORT/health_check.
  • Adds an API to the tracker statistics importer so that other processes can communicate with the cronjob.
  • Exposes a /health_check endpoint in both APIs: the Index API and the Importer API (console cronjob).
  • The endpoint for the Index core API is public but the one for the cronjob it runs only on the localhost.

You can run the health_check from inside the container with the following commands:

For the API:

/ # /usr/bin/health_check http://localhost:3001/health_check
Health check ...
STATUS: 200 OK

For the Importer:

/ # /usr/bin/health_check http://localhost:3002/health_check
Health check ...
STATUS: 200 OK

NOTES

  • The Index application has two services: the rest API and the tracker statistic importer. The healthcheck should check both. The API service should not know the "importer".
  • For the time being, we do not check the services properly. We only ensure that they are running. In the future, we can add more checks to the health_check handlers like checking database connection, SMTP server, connection to the tracker, etcetera. However, I don't know if we should check those things here. This healthcheck is only to ensure the container's processes are up and running. Maybe it's not the right place to check its dependencies. Those dependencies can fail at runtime. Maybe a better approach would be to monitor each service independently.
  • The new /health_check endpoint could be used for monitoring.
  • Right now, It's only used to wait for the container to be healthy after docker compose up.

@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from 5c71143 to aa08bb2 Compare November 16, 2023 17:18
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from aa08bb2 to 307b41d Compare November 16, 2023 17:33
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from 307b41d to 02f8bf5 Compare November 16, 2023 17:37
@josecelano josecelano marked this pull request as draft November 16, 2023 17:37
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from 02f8bf5 to 9aac0c9 Compare November 16, 2023 17:41
@josecelano
Copy link
Member Author

josecelano commented Nov 16, 2023

This is how it looks while starting:

CONTAINER ID   IMAGE                       COMMAND                  CREATED         STATUS                            PORTS                                                                                                                             NAMES
b0496f94ce2f   torrust-index               "/usr/local/bin/entr…"   5 seconds ago   Up 3 seconds (health: starting)   0.0.0.0:3001-3002->3001-3002/tcp, :::3001-3002->3001-3002/tcp                                                                     torrust-index-1
2387a1402423   torrust/tracker:develop     "/usr/local/bin/entr…"   5 seconds ago   Up 3 seconds                      0.0.0.0:1212->1212/tcp, :::1212->1212/tcp, 0.0.0.0:7070->7070/tcp, :::7070->7070/tcp, 0.0.0.0:6969->6969/udp, :::6969->6969/udp   torrust-tracker-1
147594c5c442   mysql:8.0                   "docker-entrypoint.s…"   5 seconds ago   Up 4 seconds (healthy)            0.0.0.0:3306->3306/tcp, :::3306->3306/tcp, 33060/tcp                                                                              torrust-mysql-1
8a3d58ce84aa   dockage/mailcatcher:0.8.2   "entrypoint mailcatc…"   5 seconds ago   Up 4 seconds                      0.0.0.0:1025->1025/tcp, :::1025->1025/tcp, 0.0.0.0:1080->1080/tcp, :::1080->1080/tcp                                              torrust-mailcatcher-1

And this is when is healthy:

CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS                    PORTS                                                                                                                             NAMES
82ec9d15fb87   torrust-index               "/usr/local/bin/entr…"   21 seconds ago   Up 20 seconds (healthy)   0.0.0.0:3001-3002->3001-3002/tcp, :::3001-3002->3001-3002/tcp                                                                     torrust-index-1
858c668d3cab   torrust/tracker:develop     "/usr/local/bin/entr…"   21 seconds ago   Up 20 seconds             0.0.0.0:1212->1212/tcp, :::1212->1212/tcp, 0.0.0.0:7070->7070/tcp, :::7070->7070/tcp, 0.0.0.0:6969->6969/udp, :::6969->6969/udp   torrust-tracker-1
b05a2efef1a5   mysql:8.0                   "docker-entrypoint.s…"   21 seconds ago   Up 20 seconds (healthy)   0.0.0.0:3306->3306/tcp, :::3306->3306/tcp, 33060/tcp                                                                              torrust-mysql-1
647055d30674   dockage/mailcatcher:0.8.2   "entrypoint mailcatc…"   21 seconds ago   Up 20 seconds             0.0.0.0:1025->1025/tcp, :::1025->1025/tcp, 0.0.0.0:1080->1080/tcp, :::1080->1080/tcp   

@josecelano
Copy link
Member Author

josecelano commented Nov 16, 2023

The solution works, but in order to run tests with independent ports, I need to add a new option in the settings:

#...
[net]
port = 3001
#...
[health_check]
port = 3002
#...

Because we need to bind the service to different ports for each test to avoid port collisions. E2E tests work because they use only one shared env.

I would prefer to add a healthcheck endpoint to the API, but since we also have the "statistics importer", I don't like to check if the importer is working from inside the API. Unless both services (The API and the importer) use another service to report their status. We could inject that service into both services.

For the time being, I think prefer to continue with the current solution because I think API and importer should not be coupled anyway. The health_check function can ask both the API and the importer their states.

@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from c3eedc0 to c094557 Compare November 17, 2023 13:32
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from c094557 to 05d7ac9 Compare November 17, 2023 13:39
@josecelano
Copy link
Member Author

The solution works, but in order to run tests with independent ports, I need to add a new option in the settings:

#...
[net]
port = 3001
#...
[health_check]
port = 3002
#...

Because we need to bind the service to different ports for each test to avoid port collisions. E2E tests work because they use only one shared env.

I would prefer to add a healthcheck endpoint to the API, but since we also have the "statistics importer", I don't like to check if the importer is working from inside the API. Unless both services (The API and the importer) use another service to report their status. We could inject that service into both services.

For the time being, I think prefer to continue with the current solution because I think API and importer should not be coupled anyway. The health_check function can ask both the API and the importer their states.

Finally, instead of having one new API only for the health checker service I decided to add an API to the importer (cronjob). So we have two APIs in the Index:

  • The core API.
  • The importer API (cronjob). It only has the health_check endpoint.

The reason why I changed my mind is we need to implement a communication mechanism between the cronjob and the health checker API, that couples the importer to the checker, so we would need anyway an API for the importer (I discarded other solutions because they seemed to be too complex). At this point, I prefer both services to have their own health checker and expose the status via an API endpoint. This way they are decoupled.

I still have to add a new configuration option for the importer with the API port. Tests are failing because they are trying to bind the socket to the same address 3002. We need to add the config option and bind to free ports when we create ephemeral configurations for isolated servers.

@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from 05d7ac9 to a392ea1 Compare November 17, 2023 13:50
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from f18bad5 to 2a2d625 Compare November 17, 2023 15:31
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from 2a2d625 to af06cfb Compare November 17, 2023 15:51
@josecelano josecelano marked this pull request as ready for review November 17, 2023 15:51
@josecelano
Copy link
Member Author

josecelano commented Nov 17, 2023

ACK 0e8b32b

@codecov-commenter
Copy link

codecov-commenter commented Nov 17, 2023

Codecov Report

Attention: 44 lines in your changes are missing coverage. Please review.

Comparison is base (04b70ba) 42.05% compared to head (0e8b32b) 42.27%.

Files Patch % Lines
src/console/tracker_statistics_importer.rs 42.37% 34 Missing ⚠️
src/app.rs 0.00% 7 Missing ⚠️
src/web/api/v1/routes.rs 40.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #395      +/-   ##
===========================================
+ Coverage    42.05%   42.27%   +0.21%     
===========================================
  Files           79       80       +1     
  Lines         4829     4887      +58     
===========================================
+ Hits          2031     2066      +35     
- Misses        2798     2821      +23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@josecelano josecelano added Continuous Integration Workflows and Automation - Developer - Torrust Improvement Experience Quality & Assurance Relates to QA, Testing, and CI labels Nov 17, 2023
For both services:

- The Index API
- The Tracker Stattistics Importer (console cronjob)

An API was added for the Importer. The Importer API has two endpoints:

- GET /health_check -> to get the crobjob status. It only checks that
  is running periodically.
- POST /heartbeat -> used by the cronjob to inform it's alive.

And a new endpoint was added the the Index API:

- GET /health_check -> to check API responsiveness.
@josecelano josecelano linked an issue Nov 17, 2023 that may be closed by this pull request
@josecelano josecelano force-pushed the issue-384-add-index-container-healthcheck branch from af06cfb to 0e8b32b Compare November 17, 2023 16:10
@josecelano
Copy link
Member Author

Hi @da2ce7, this is ready for review. I can do the same for the Tracker. Although adding an API to the importer seems to be too much, the API is just a few lines and enormously increases the service's observability. It can be helpful for other things in the future. Some examples:

  • We can easily decouple the importer into an independent container.
  • We could monitor the service if we make the endpoint public.

@josecelano josecelano merged commit fa3bba7 into torrust:develop Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

- Developer - Torrust Improvement Experience Continuous Integration Workflows and Automation Quality & Assurance Relates to QA, Testing, and CI

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Implement health check for the index application

2 participants