Merged
Changes from 8 commits
14 changes: 14 additions & 0 deletions .github/workflows/ci.yaml
@@ -47,6 +47,20 @@ jobs:
        env:
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

  alert-test:
    name: Test Prometheus Alert Rules
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - name: Checkout repo
        uses: actions/checkout@v5
      - name: Install prometheus snap
        run: sudo snap install prometheus
      - name: Check validity of prometheus alert rules
        run: promtool check rules src/prometheus_alert_rules/*
      - name: Run unit tests for prometheus alert rules
        run: promtool test rules tests/alerts/*.yaml

  build:
    name: Build charm
    uses: canonical/data-platform-workflows/.github/workflows/[email protected]
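
For running the same two checks outside CI, the `alert-test` steps could be approximated with a small Python wrapper. This is only a sketch: it assumes `promtool` is already on `PATH`, and the helper names `promtool_commands` and `run_alert_checks` are hypothetical, not part of the repository.

```python
import glob
import subprocess


def promtool_commands(rules_glob="src/prometheus_alert_rules/*",
                      tests_glob="tests/alerts/*.yaml"):
    """Build the two promtool invocations used by the alert-test job.

    In CI the shell expands the globs; here glob.glob does it explicitly.
    """
    rule_files = sorted(glob.glob(rules_glob))
    test_files = sorted(glob.glob(tests_glob))
    return [
        ["promtool", "check", "rules", *rule_files],
        ["promtool", "test", "rules", *test_files],
    ]


def run_alert_checks():
    """Run both commands in order; check=True raises on the first failure."""
    for argv in promtool_commands():
        subprocess.run(argv, check=True)
```

Unlike the shell, `glob.glob` returns an empty list when nothing matches, so a caller may want to treat an empty file list as an error before invoking `promtool`.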
11 changes: 10 additions & 1 deletion docs/reference/alert-rules.md
@@ -48,8 +48,17 @@ This page contains a markdown version of the alert rules described in the `postg
| `PatroniPostgresqlDown` | ![critical] | Patroni PostgreSQL instance is down.<br>Check for errors in the Loki logs. |
| `PatroniHasNoLeader` | ![critical] | Patroni instance has no leader node.<br>A leader node (neither primary nor standby) cannot be found inside a cluster.<br>Check for errors in the Loki logs. |

## `PgbackrestExporterK8s`

| Alert | Severity | Notes |
| ----- | -------- | ----- |
| `PgBackRestBackupError` | ![critical] | Backup failed for a stanza.<br>The last pgBackRest backup ended with error status > 0.<br>Check the pgBackRest logs for the stanza. |
| `PgBackRestBackupTooOld` | ![warning] | No recent backup available.<br>The last pgBackRest backup is older than 7 days.<br>Consider checking your backup schedule, capacity, and logs. |
| `PgBackRestStanzaError` | ![warning] | A stanza has reported errors.<br>Status > 0 indicates problems such as missing stanza path or no valid backups.<br>Check pgBackRest logs for details. |
| `PgBackRestRepoError` | ![warning] | A repository has reported errors.<br>Status > 0 indicates the repo may be inaccessible, out of space, or otherwise unhealthy.<br>Check pgBackRest logs and storage system. |
| `PgBackRestExporterError` | ![critical] | The pgBackRest exporter failed to fetch data.<br>Metric `pgbackrest_exporter_status == 0` indicates exporter-side issues.<br>This may be a misconfiguration or runtime error; check exporter logs. |

<!-- Badges -->
[info]: https://img.shields.io/badge/info-blue
[warning]: https://img.shields.io/badge/warning-yellow
[critical]: https://img.shields.io/badge/critical-red
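
As a reading aid for the `PgbackrestExporterK8s` table above, the conditions in its Notes column can be written as plain predicates. This is a sketch only: the function and parameter names are hypothetical stand-ins for exporter metrics, the actual rules are PromQL files under `src/prometheus_alert_rules/`, and only the thresholds stated in the table (status > 0, 7 days, `pgbackrest_exporter_status == 0`) are taken from the source.

```python
SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60


def firing_alerts(backup_error_status, seconds_since_last_backup,
                  stanza_status, repo_status, exporter_status):
    """Return the names of alerts whose table condition holds.

    All parameters are hypothetical stand-ins for exporter metrics.
    """
    alerts = []
    if backup_error_status > 0:
        alerts.append("PgBackRestBackupError")
    if seconds_since_last_backup > SEVEN_DAYS_SECONDS:
        alerts.append("PgBackRestBackupTooOld")
    if stanza_status > 0:
        alerts.append("PgBackRestStanzaError")
    if repo_status > 0:
        alerts.append("PgBackRestRepoError")
    if exporter_status == 0:
        alerts.append("PgBackRestExporterError")
    return alerts
```

Note the inverted sense of the last check: for the stanza, repo, and backup metrics a non-zero status signals trouble, whereas for the exporter a value of 0 is the failure case.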

18 changes: 17 additions & 1 deletion src/charm.py
@@ -104,6 +104,7 @@
    MONITORING_USER,
    PATRONI_PASSWORD_KEY,
    PEER,
    PGBACKREST_METRICS_PORT,
    PLUGIN_OVERRIDES,
    POSTGRES_LOG_FILES,
    REPLICATION_PASSWORD_KEY,
@@ -213,6 +214,7 @@ def __init__(self, *args):
        self.pgbackrest_server_service = "pgbackrest server"
        self.ldap_sync_service = "ldap-sync"
        self.metrics_service = "metrics_server"
        self.pgbackrest_metrics_service = "pgbackrest_metrics_service"
        self._unit = self.model.unit.name
        self._name = self.model.app.name
        self._namespace = self.model.name
@@ -299,6 +301,7 @@ def _generate_metrics_jobs(self, enable_tls: bool) -> dict:
        """Generate spec for Prometheus scraping."""
        return [
            {"static_configs": [{"targets": [f"*:{METRICS_PORT}"]}]},
            {"static_configs": [{"targets": [f"*:{PGBACKREST_METRICS_PORT}"]}]},
            {
                "static_configs": [{"targets": ["*:8008"]}],
                "scheme": "https" if enable_tls else "http",
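
To make the effect of the added scrape target concrete, here is a standalone sketch of the job list that uses only what is visible in this hunk. The port values come from `src/constants.py`; the function name without the leading underscore is hypothetical, and any keys of the third job that the diff elides are left out rather than guessed.

```python
METRICS_PORT = "9187"
PGBACKREST_METRICS_PORT = "9854"


def generate_metrics_jobs(enable_tls: bool) -> list:
    """Standalone approximation of the scrape job list shown in the hunk."""
    return [
        # Existing exporter target on METRICS_PORT.
        {"static_configs": [{"targets": [f"*:{METRICS_PORT}"]}]},
        # New pgbackrest exporter target added by this change.
        {"static_configs": [{"targets": [f"*:{PGBACKREST_METRICS_PORT}"]}]},
        # Job on port 8008; its remaining keys are elided in the diff.
        {
            "static_configs": [{"targets": ["*:8008"]}],
            "scheme": "https" if enable_tls else "http",
        },
    ]
```

The sketch annotates the return type as `list` because that is what the body returns; the method in the diff is annotated `-> dict`.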
@@ -1805,7 +1808,7 @@ def _generate_ldap_service(self) -> dict:
        }

    def _generate_metrics_service(self) -> dict:
-       """Generate the metrics service definition."""
+       """Generate the postgresql metrics service definition."""
        return {
            "override": "replace",
            "summary": "postgresql metrics exporter",
@@ -1827,6 +1830,18 @@ def _generate_metrics_service(self) -> dict:
            },
        }

    def _generate_pgbackrest_metrics_service(self) -> dict:
        """Generate the pgbackrest metrics service definition."""
        return {
            "override": "replace",
            "summary": "pgbackrest metrics exporter",
            "command": "/usr/bin/pgbackrest_exporter",

@Deezzir (Contributor Author) commented on Sep 27, 2025:

We cannot use `/start-pgbackrest-exporter.sh` at the moment because the current script is not prepared to run in a non-snapped environment (reference), the way the `start-exporter.sh` script is.

Member:

Thanks for pointing this out. Could you create an issue in the repo for us to address this later? Thanks.

            "startup": "enabled",
            "after": [self.postgresql_service],
            "user": WORKLOAD_OS_USER,
            "group": WORKLOAD_OS_GROUP,
        }

    def _postgresql_layer(self) -> Layer:
        """Returns a Pebble configuration layer for PostgreSQL."""
        pod_name = self._unit_name_to_pod_name(self._unit)
@@ -1871,6 +1886,7 @@ def _postgresql_layer(self) -> Layer:
                    "startup": "disabled",
                },
                self.metrics_service: self._generate_metrics_service(),
                self.pgbackrest_metrics_service: self._generate_pgbackrest_metrics_service(),
                self.rotate_logs_service: {
                    "override": "replace",
                    "summary": "rotate logs",
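
The two `src/charm.py` changes above (the new service definition and its registration in the layer) can be sketched without the charm class as plain dictionaries. This is an illustration under assumptions: `services_fragment` is a hypothetical helper, the default service name `"postgresql"` is assumed for the example, and only the keys shown in the diff are reproduced.

```python
WORKLOAD_OS_USER = "postgres"
WORKLOAD_OS_GROUP = "postgres"


def generate_pgbackrest_metrics_service(postgresql_service: str) -> dict:
    """Plain-dict version of the Pebble service definition from the diff."""
    return {
        "override": "replace",
        "summary": "pgbackrest metrics exporter",
        "command": "/usr/bin/pgbackrest_exporter",
        "startup": "enabled",
        # Start ordering: the exporter is listed after the database service.
        "after": [postgresql_service],
        "user": WORKLOAD_OS_USER,
        "group": WORKLOAD_OS_GROUP,
    }


def services_fragment(postgresql_service: str = "postgresql") -> dict:
    """Hypothetical helper: the slice of the layer's services map added here.

    The default service name "postgresql" is an assumption for illustration.
    """
    return {
        "pgbackrest_metrics_service": generate_pgbackrest_metrics_service(
            postgresql_service
        ),
    }
```

The `command` points directly at the exporter binary rather than a wrapper script, which matches the constraint raised in the review thread.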
1 change: 1 addition & 0 deletions src/constants.py
@@ -17,6 +17,7 @@
WORKLOAD_OS_GROUP = "postgres"
WORKLOAD_OS_USER = "postgres"
METRICS_PORT = "9187"
PGBACKREST_METRICS_PORT = "9854"
POSTGRESQL_DATA_PATH = "/var/lib/postgresql/data/pgdata"
POSTGRESQL_LOGS_PATH = "/var/log/postgresql"
POSTGRESQL_LOGS_PATTERN = "postgresql*.log"