Commit graph

29 commits

Author SHA1 Message Date
benoit e0589b97a8 Black run 2024-02-27 11:29:52 +01:00
benoit 364a385a2f Fix cluster_has_leader in archive recovery tests
Since the commit fixing cluster_node_count, replication states are also
overridden for standby_leaders, so the tests had to be adapted.
2024-01-09 06:50:00 +01:00
benoit 78ef0f6ada Fix cluster_node_count's management of replication states
The service now supports the `streaming` state.

Since we don't check for lag or timeline in this service, a healthy node
is:

* leader : in a running state
* standby_leader : running (pre Patroni 3.0.4), streaming otherwise
* standby & sync_standby : running (pre Patroni 3.0.4), streaming otherwise
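
As a minimal sketch of the rules listed above (not the actual check_patroni
code): the function name, its signature, and the idea of passing the Patroni
version as a tuple are assumptions made for illustration only.

```python
from typing import Tuple


# Hypothetical helper: the name and signature are illustrative only.
def is_healthy_member(
    role: str, state: str, patroni_version: Tuple[int, int, int]
) -> bool:
    """Return True when a member counts as healthy for cluster_node_count."""
    if role == "leader":
        # A leader is healthy when it is in a running state.
        return state == "running"
    # standby_leader, standby and sync_standby share the same rule:
    # `running` before Patroni 3.0.4, `streaming` from 3.0.4 onwards.
    expected = "running" if patroni_version < (3, 0, 4) else "streaming"
    return state == expected
```

Tuples compare element by element, so `(3, 0, 3) < (3, 0, 4)` selects the
older `running` rule without any string parsing.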

Updated the tests for this service.
2024-01-09 06:50:00 +01:00
benoit 46db3e2d15 Fix the cluster_has_leader service for standby clusters
Before this patch we checked the expected standby leader state
was `running` for all versions of Patroni.

With this patch, for:
* Patroni < 3.0.4, standby leaders are in `running` state.
* Patroni >= 3.0.4, standby leaders can be in `streaming` or `in
archive recovery` state. We will raise a warning for the latter.
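
The version split above could be sketched roughly as follows. This is an
illustration under assumptions, not the service's real code: the function
name, the string return values, and the choice to report any other state as
`critical` are all hypothetical.

```python
from typing import Tuple


# Hypothetical helper: names and return values are illustrative only.
def standby_leader_status(
    state: str, patroni_version: Tuple[int, int, int]
) -> str:
    """Map a standby leader state to 'ok', 'warning' or 'critical'."""
    if patroni_version < (3, 0, 4):
        # Older Patroni reports standby leaders as `running`.
        return "ok" if state == "running" else "critical"
    if state == "streaming":
        return "ok"
    if state == "in archive recovery":
        # Accepted as a leader state, but flagged with a warning.
        return "warning"
    # Assumption: any other state is treated as a failure.
    return "critical"
```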

The tests were modified to account for this.

Co-authored-by: Denis Laxalde <denis@laxalde.org>
2023-12-18 13:17:37 +01:00
benoit 8d6b8502b6 cluster_has_replica: fix the way a healthy replica is detected
For Patroni >= 3.0.4:
* the role is `replica` or `sync_standby`
* the state is `streaming` or `in archive recovery`
* the timeline is the same as the leader
* the lag is lower than or equal to `max_lag`

For prior versions of Patroni:
* the role is `replica` or `sync_standby`
* the state is `running`
* the timeline is the same as the leader
* the lag is lower than or equal to `max_lag`
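
The two sets of conditions above can be read as a single predicate. The
sketch below is only an illustration: the `Member` dataclass, the function
name and its parameters are hypothetical and do not come from check_patroni.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Member:
    """Hypothetical, simplified view of a Patroni cluster member."""

    role: str
    state: str
    timeline: int
    lag: int


def is_healthy_replica(
    member: Member,
    leader_timeline: int,
    max_lag: int,
    patroni_version: Tuple[int, int, int],
) -> bool:
    """Apply the four conditions listed above to one member."""
    if member.role not in ("replica", "sync_standby"):
        return False
    if patroni_version >= (3, 0, 4):
        valid_states = ("streaming", "in archive recovery")
    else:
        valid_states = ("running",)
    return (
        member.state in valid_states
        and member.timeline == leader_timeline
        and member.lag <= max_lag
    )
```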

Additionally, we now display the timeline in the perf stats. We also try
to display the perf stats of unhealthy replicas as much as possible.

Update tests for cluster_has_replica:
* Fix the tests to make them work with the new algorithm
* Add a specific test for tl divergences
2023-11-11 10:50:35 +01:00
Denis Laxalde a8c4a3125d Work around nagiosplugin issue about stdout in tests
We basically apply the change from
https://github.com/mpounsett/nagiosplugin/issues/24 as a fixture, but
only when nagiosplugin's version is old.
2023-10-13 11:45:39 +02:00
Denis Laxalde 4035f1a3da Add compat for old pytest in type hints 2023-10-13 11:45:39 +02:00
Denis Laxalde 903b83e211 Use fake HTTP server for the Patroni API in tests
We introduce a patroni_api fixture, defined in tests/conftest.py, which
sets up an HTTP server serving files in a temporary directory. The
server is itself defined by the PatroniAPI class; it has a 'routes()'
context manager method to be used in actual tests to set up expected
responses based on specified JSON files.
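
To illustrate the general idea of serving files from a temporary directory
over HTTP, here is a minimal standard-library sketch. It is not the
PatroniAPI class from tests/conftest.py; the `serve_directory` name and its
behaviour are assumptions made for this example.

```python
import threading
from contextlib import contextmanager
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path
from typing import Iterator


# Hypothetical helper, not the fixture defined in the commit.
@contextmanager
def serve_directory(directory: Path) -> Iterator[str]:
    """Serve files from `directory` over HTTP and yield the base URL."""
    handler = partial(SimpleHTTPRequestHandler, directory=str(directory))
    # Port 0 asks the OS for any free port, so parallel tests do not clash.
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address[:2]
        yield f"http://{host}:{port}"
    finally:
        # Stop the serving loop, release the socket, then wait for the thread.
        server.shutdown()
        server.server_close()
        thread.join()
```

A test could write a JSON file named `cluster` into the directory and then
request `<base URL>/cluster`, so the HTTP client code under test is exercised
for real instead of being mocked.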

We set up some logging in order to improve debugging.

The direct advantage of this is that PatroniResource.rest_api() method
is now covered by the test suite.

Coverage before this commit:

  Name                        Stmts   Miss  Cover
  -----------------------------------------------
  check_patroni/__init__.py       3      0   100%
  check_patroni/cli.py          193     18    91%
  check_patroni/cluster.py      113      0   100%
  check_patroni/convert.py       23      5    78%
  check_patroni/node.py         146      1    99%
  check_patroni/types.py         50     23    54%
  -----------------------------------------------
  TOTAL                         528     47    91%

and after this commit:

  Name                        Stmts   Miss  Cover
  -----------------------------------------------
  check_patroni/__init__.py       3      0   100%
  check_patroni/cli.py          193     18    91%
  check_patroni/cluster.py      113      0   100%
  check_patroni/convert.py       23      5    78%
  check_patroni/node.py         146      1    99%
  check_patroni/types.py         50      9    82%
  -----------------------------------------------
  TOTAL                         528     33    94%

In actual test functions, we either invoke patroni_api.routes() to
configure which JSON file(s) should be served for each endpoint, or we
define dedicated fixtures (e.g. cluster_config_has_changed()) to
configure this for several test functions or the whole module.

The 'old_replica_state' parametrized fixture is used when needed to
adjust such fixtures, e.g. in cluster_has_replica_ok(), to modify the
JSON content using cluster_api_set_replica_running() (previously in
tests/tools.py, now in tests/__init__.py).

The dependency on pytest-mock is no longer needed.
2023-10-06 10:40:29 +02:00
Denis Laxalde 34f576ea0f Turn --use-old-replica-state into a parametrized fixture
Instead of requiring the user to run the test suite with and without the
--use-old-replica-state flag, we introduce an 'old_replica_state()'
parametrized fixture that is used only when needed (i.e. in
test_cluster_{has_replica,node_count}.py).
2023-10-06 10:33:04 +02:00
Denis Laxalde ea92809cb3 Introduce a 'runner' test fixture
Instead of defining the CliRunner value in each test, we use a fixture.
The CliRunner is also configured with stdout and stderr separated
because mixing them would pose problems if we use stderr for other
purposes in tests, e.g. to emit log messages from a forthcoming HTTP
server.
2023-10-03 09:54:13 +02:00
Denis Laxalde d34e597e61 Use the tmp_path fixture instead of writing files to tests/ 2023-10-03 09:54:13 +02:00
Denis Laxalde bc2d2917c3 Introduce a fake_restapi test fixture
This fixture itself uses the 'use_old_replica_state' fixture, so it no
longer needs to be used explicitly in test functions.
2023-10-03 09:54:13 +02:00
Denis Laxalde c3cdb8cdd4 Set a default value to status parameter of my_mock in tests
Most of the time, it's 200, so the default value simplifies usage in
actual tests.
2023-10-03 09:54:13 +02:00
Denis Laxalde 123c300911 Add type hints in tests/conftest.py 2023-10-03 09:54:13 +02:00
benoit ee3837fab1 Add info and options about sync standby to cluster_has_replica
* Add `--sync-warning` and `--sync-critical`
* Add `sync_replica` to track the number of sync replicas in the perf data
* Add `MEMBER-sync` to track if a member is a sync replica in the perf data
2023-08-24 17:34:08 +02:00
benoit 259f04587b Add a node_is_leader service to check for the leader states
It's possible to check for any kind of leader or specifically for a
standby leader.
2023-08-23 18:22:49 +02:00
benoit 8883d6bdc4 Add standby-leader as a valid leader type for cluster_has_leader 2023-08-23 18:22:49 +02:00
benoit 46dd431775 Fix test for node_is_primary 2023-08-23 15:41:15 +02:00
benoit 2bcddf9f87 Add a --is-sync and --is-async to node_is_replica 2023-08-23 15:41:15 +02:00
benoit 99bf1c5bb5 Add new service cluster_has_scheduled_action 2023-08-23 12:08:19 +02:00
benoit d99faeba15 Add tests for the output of the script and support pre/post 3.0.4
* Change all replica status from `running` to `streaming`
* Add an option to pytest to change the state back to `running`
* Also test the output of the script
* Add a quick test script for live clusters
2023-08-23 10:53:09 +02:00
benoit 77722f40c1 Fix liveness check
The liveness probe used to return something. It looks like it no longer
does, which breaks the `node_is_alive` check.

Issue: #31
2023-08-21 15:13:22 +02:00
benoit a01a535680 Add sync_standby as an acceptable state for a replica 2023-08-21 13:11:08 +02:00
benoit 021b572e53 Redefining cluster_node_count using Patroni 3.0.4's new status indicators
Previously, replica nodes were labeled with a `running` state. As a
result, our checks were based on nodes marked as `running` through
the `--running-[warning|critical]` options.

However, with the recent changes in Patroni 3.0.4, replica nodes now
carry a `streaming` state. This shift in terminology calls for an
adjustment in our approach. A new state, `healthy_member`, has been
introduced to encompass both `running` and `streaming` nodes.

Key Modifications:

* The existing `--running-[warning|critical]` option is now designated
  as `--healthy-[warning|critical]`.
* Introduction of the `healthy_member` perfdata, which serves as the
  reference point for the aforementioned options.
* Updates to documentation, help messages, and tests.
2023-08-21 11:59:55 +02:00
benoit df744bf7dc Use isort to automatically sort imports 2023-03-20 14:56:11 +01:00
benoit 7256c1894a Fix tests for the urllib3 to requests change 2023-03-20 12:25:32 +01:00
benoit 275901006b Add spellcheck + tox in requirements-dev.txt 2023-03-02 17:32:18 +01:00
benoit 908669f073 Add a --save option when state files are used
The checks `cluster_config_has_changed` and `node_tl_has_changed` use a
state file to store the previous value of the config hash and the
timeline.

Previously the check would fail if something changed, but the new value
would be saved directly. This behaviour has changed. The new value
is saved only if `--save` is passed to the check.
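
A rough sketch of this behaviour, under stated assumptions: the function
name, the plain-text state file format, and the decision to report "no
change" when no state file exists yet are hypothetical, not taken from
check_patroni.

```python
from pathlib import Path
from typing import Optional


# Hypothetical helper: name, file format and first-run behaviour are assumed.
def value_has_changed(state_file: Path, current: str, save: bool = False) -> bool:
    """Compare `current` with the stored value; store it only when `save` is set."""
    previous: Optional[str] = None
    if state_file.exists():
        previous = state_file.read_text()
    # Assumption: with no stored value there is nothing to compare against.
    changed = previous is not None and previous != current
    if save:
        # Without --save the stored value is left untouched, so the check
        # keeps reporting the change until someone acknowledges it.
        state_file.write_text(current)
    return changed
```

With this shape, a changed value keeps being reported on every run until the
check is invoked once with `save=True`.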

This mimics the way [check_pgactivity] manages this kind of check.

[check_pgactivity]: https://github.com/OPMDG/check_pgactivity
2023-03-02 17:32:18 +01:00
benoit 8519416c11 Rename test to tests 2022-07-11 12:42:59 +02:00