Commit graph

10 commits

Author SHA1 Message Date
benoit 78ef0f6ada Fix cluster_node_count's management of replication states
The service now supports the `streaming` state.

Since we dont check for lag or timeline in this service, a healthy node
is :

* leader : in a running state
* standby_leader : running (pre Patroni 3.0.4), streaming otherwise
* standby & sync_standby : running (pre Patroni 3.0.4), streaming otherwise

Updated the tests for this service.
2024-01-09 06:50:00 +01:00
benoit 46db3e2d15 Fix the cluster_has_leader service for standby clusters
Before this patch we checked the expected standby leader state
was `running` for all versions of Patroni.

With this patch, for:
* Patroni < 3.0.4, standby leaders are in `running` state.
* Patroni >= 3.0.4, standby leaders can be in `streaming` or `in
archive recovey` state. We will raise a warning for the latter.

The tests where modified to account for this.

Co-authored-by: Denis Laxalde <denis@laxalde.org>
2023-12-18 13:17:37 +01:00
benoit 8d6b8502b6 cluster_has_replica: fix the way a healthy replica is detected
For patroni >= version 3.0.4:
* the role is `replica` or `sync_standby`
* the state is `streaming` or `in archive recovery`
* the timeline is the same as the leader
* the lag is lower or equal to `max_lag`

For prio versions of patroni:
* the role is `replica` or `sync_standby`
* the state is `running`
* the timeline is the same as the leader
* the lag is lower or equal to `max_lag`

Additionnally, we now display the timeline in the perfstats. We also try
to display the perf stats of unhealthy replica as much as possible.

Update tests for cluster_has_replica:
* Fix the tests to make them work with the new algotithm
* Add a specific test for tl divergences
2023-11-11 10:50:35 +01:00
benoit 259f04587b Add a node_is_leader service to check for the leader states
It's possible to check for any kind of leader of specifically for a
standby leader.
2023-08-23 18:22:49 +02:00
benoit 8883d6bdc4 Add standby-leader as a valid leader type for cluster_has_leader 2023-08-23 18:22:49 +02:00
benoit 99bf1c5bb5 Add new service cluster_has_scheduled_action 2023-08-23 12:08:19 +02:00
benoit d99faeba15 Add tests for the output of the script and support pre/post 3.0.4
* Change all replica status from `running` to `streaming`
* Add an option to pytest to change the state back to `running`
* Also tests the output of the script
* Add a quick test script for live clusters
2023-08-23 10:53:09 +02:00
benoit 77722f40c1 Fix liveness check
The liveness probe used to return something. It looks like it doesn't do
it anymore and it breaks the `node_is_alive` check.

Issue: #31
2023-08-21 15:13:22 +02:00
benoit a01a535680 Add sync_standby as an acceptable state for a replica 2023-08-21 13:11:08 +02:00
benoit 8519416c11 Rename test to tests 2022-07-11 12:42:59 +02:00