This exception is only present in recent versions of requests, and
typically not in the version distributed by Debian bullseye. Since
requests' JSONDecodeError is in general a subclass of
json.JSONDecodeError, we use the latter, but also handle the plain
ValueError (which json.JSONDecodeError is a subclass of) because
requests might use simplejson (which uses its own JSONDecodeError, also
a subclass of ValueError).
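The reasoning above can be sketched with the standard library alone. The helper name `parse_body` is hypothetical and only illustrates the `except` clause; it is not the project's actual code.

```python
import json


def parse_body(text):
    """Return the decoded JSON document, or None if the body is not valid JSON."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError; listing both
        # keeps the intent explicit while also covering decoders (such as
        # simplejson) that raise their own ValueError subclass.
        return None


# The subclass relationship the commit message relies on.
assert issubclass(json.JSONDecodeError, ValueError)

print(parse_body('{"state": "running"}'))
print(parse_body("not json"))
```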
* Add `--sync-warning` and `--sync-critical`
* Add `sync_replica` to track the number of sync replicas in the perf data
* Add `MEMBER-sync` to track if a member is a sync replica in the perf data
For `node_is_alive`, it seemed to be a good idea to exit with a
`CRITICAL` when the target doesn't exist. But for all the rest, UNKNOWN
(which corresponds to a configuration error) seems better.
Previously, if a node wasn't reachable, we would get an UNKNOWN error
instead of a CRITICAL error.
```
NODEISALIVE UNKNOWN - Connection failed for all provided endpoints
```
We now get the correct error.
```
NODEISALIVE CRITICAL - This node is not alive (patroni is not running). | is_alive=0;;@0
```
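A minimal sketch of the rule described above, using the conventional Nagios plugin exit codes. The function `status_on_connection_failure` is a hypothetical name chosen for illustration and does not reflect how the check is actually structured.

```python
from enum import IntEnum


class Status(IntEnum):
    """Conventional Nagios plugin exit codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def status_on_connection_failure(service: str) -> Status:
    """Pick the status to report when no endpoint could be reached."""
    if service == "node_is_alive":
        # An unreachable node is exactly what this service is meant to detect.
        return Status.CRITICAL
    # For every other service, an unreachable target is treated as a
    # configuration error.
    return Status.UNKNOWN


print(status_on_connection_failure("node_is_alive").name)
print(status_on_connection_failure("cluster_has_leader").name)
```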
* Change all replica status from `running` to `streaming`
* Add an option to pytest to change the state back to `running`
* Also test the output of the script
* Add a quick test script for live clusters
Previously, replica nodes were labeled with a `running` state. As a
result, our checks were based on nodes marked as `running` through
the `--running-[warning|critical]` options.
However, with the recent changes in Patroni 3.0.4, replica nodes now
carry a `streaming` state. This shift in terminology calls for an
adjustment in our approach. A new state, `healthy_member`, has been
introduced to encompass both `running` and `streaming` nodes.
Key Modifications:
* The existing `--running-[warning|critical]` option has been renamed
to `--healthy-[warning|critical]`.
* Introduction of the `healthy_member` perfdata, which serves as the
reference point for the aforementioned options.
* Updates to documentation, help messages, and tests.
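As a rough illustration of the `healthy_member` count, the sketch below treats both states as healthy. The shape of the member dictionaries and the helper name `count_healthy_members` are assumptions made for the example.

```python
# States that count as healthy: `running` (older Patroni replicas)
# and `streaming` (replicas since Patroni 3.0.4).
HEALTHY_STATES = {"running", "streaming"}


def count_healthy_members(members):
    """Count the members whose state is either running or streaming."""
    return sum(1 for member in members if member.get("state") in HEALTHY_STATES)


# Hypothetical cluster description, for illustration only.
members = [
    {"name": "p1", "role": "leader", "state": "running"},
    {"name": "p2", "role": "replica", "state": "streaming"},
    {"name": "p3", "role": "replica", "state": "stopped"},
]

print(f"healthy_member={count_healthy_members(members)}")
```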
Since Patroni 3.0.4, the nominal state of a standby node is "streaming"
instead of "running". Some services need to be changed to account for that.
Reported in issue #28
Since the desired state is for there to be no restart pending state, it
makes more sense to modify the service logic so that the return code
reflects this. As a result, the test for the service `node_is_pending`
has been reversed.
* it is now possible to specify a comma-separated list of endpoints
* the documentation has been updated to explain that:
+ for node services, if several addresses are specified, they should
point to different interfaces on the same server.
+ for cluster services, several addresses should be used because we
want the cluster status, so the more APIs we try, the better our chance
of getting a reply.
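The idea of trying each endpoint in turn can be sketched as follows. The names `parse_endpoints` and `first_reply` are hypothetical, and the `fetch` callable stands in for the real HTTP request so that the example stays self-contained.

```python
from typing import Any, Callable, List


def parse_endpoints(value: str) -> List[str]:
    """Split a comma-separated option value into a list of endpoints."""
    return [item.strip() for item in value.split(",") if item.strip()]


def first_reply(endpoints: List[str], fetch: Callable[[str], Any]) -> Any:
    """Return the first successful reply, trying the endpoints in order."""
    for endpoint in endpoints:
        try:
            return fetch(endpoint)
        except ConnectionError:
            # This endpoint is unreachable; try the next one.
            continue
    raise ConnectionError("Connection failed for all provided endpoints")


def fake_fetch(endpoint: str) -> dict:
    """Stand-in for an HTTP call: only the second address answers."""
    if endpoint == "http://10.0.0.2:8008":
        return {"answered_by": endpoint}
    raise ConnectionError(endpoint)


endpoints = parse_endpoints("http://10.0.0.1:8008, http://10.0.0.2:8008")
print(first_reply(endpoints, fake_fetch))
```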
The checks `cluster_config_has_changed` and `node_tl_has_changed` use a
state file to store the previous value of the config hash and the
timeline.
Previously, the check would fail if something changed, but the new value
would be saved directly. This behaviour has changed. The new value
is saved only if `--save` is passed to the check.
This mimics the way [check_pgactivity] manages this kind of check.
[check_pgactivity]: https://github.com/OPMDG/check_pgactivity
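A minimal sketch of the save-on-request behaviour. The state file format (a single value stored as plain text) and the function `check_value` are assumptions for the example, as is the choice to report "unchanged" when nothing has been stored yet.

```python
import tempfile
from pathlib import Path


def check_value(state_file: Path, new_value: str, save: bool = False) -> bool:
    """Return True if new_value differs from the value in state_file.

    The stored value is overwritten only when save is True.
    """
    old_value = state_file.read_text() if state_file.exists() else None
    # Assumption: with no stored value there is nothing to compare against.
    changed = old_value is not None and old_value != new_value
    if save:
        state_file.write_text(new_value)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "timeline.state"
    check_value(state, "1", save=True)           # record the baseline
    print(check_value(state, "2"))               # change detected, not saved
    print(check_value(state, "2"))               # still reported as changed
    print(check_value(state, "2", save=True))    # acknowledged and saved
    print(check_value(state, "2"))               # no longer reported
```

Without `save=True`, the check keeps reporting the change on every run until someone acknowledges it.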
Stop using ctx.parent.params to get the verbose and timeout parameters
parsed in main and use ctx.obj instead.
ctx.parent is typed as Optional[Context], which forces us to test
whether it is None before using it. This is useless in our case because
we know it is set, and the resulting code is ugly.
The mypy error:
Item "None" of "Optional[Context]" has no attribute "params"
[union-attr]
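To show why `ctx.obj` avoids the guard, here is a sketch that uses a simplified stand-in for click's context class rather than click itself. The `Context` dataclass below only models the `params`, `parent` and `obj` attributes, and the two helper functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Context:
    """Simplified stand-in for click.Context (not the real class)."""
    params: Dict[str, Any] = field(default_factory=dict)
    parent: Optional["Context"] = None
    obj: Any = None


def timeout_via_parent(ctx: Context) -> int:
    # The type checker requires this guard because parent may be None.
    if ctx.parent is None:
        raise RuntimeError("no parent context")
    return ctx.parent.params["timeout"]


def timeout_via_obj(ctx: Context) -> int:
    # No guard needed: obj is filled once in main and shared with the
    # subcommand contexts.
    return ctx.obj["timeout"]


# main parses the options and stores them in obj; the subcommand
# context sees the same object.
shared = {"timeout": 5, "verbose": 1}
root = Context(params=dict(shared), obj=shared)
child = Context(parent=root, obj=root.obj)

print(timeout_via_parent(child), timeout_via_obj(child))
```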