New upstream version 2.0.0

commit c52e34116d
.coveragerc (new file) | 3
@@ -0,0 +1,3 @@
+[run]
+include =
+    check_patroni/*
.gitignore (vendored) | 2
@@ -1,10 +1,10 @@
 __pycache__/
 check_patroni.egg-info
-tests/*.state_file
 tests/config.ini
 vagrant/.vagrant
 vagrant/*.state_file
 .*.swp
+.coverage
 .venv/
 .tox/
 dist/
CHANGELOG.md | 26
@@ -1,13 +1,37 @@
 # Change log

-## Unreleased
+## check_patroni 2.0.0 - 2024-04-09
+
+### Changed
+
+* In `cluster_node_count`, a healthy standby, sync replica or standby leader cannot be "in
+  archive recovery" because this service doesn't check for lag and timelines.

 ### Added

+* Add the timeline in the `cluster_has_replica` perfstats. (#50)
+* Add a mention about shell completion support and shell versions in the doc. (#53)
+* Add the leader type and whether it's archiving to the `cluster_has_leader` perfstats. (#58)

 ### Fixed

+* Add compatibility with [requests](https://requests.readthedocs.io)
+  version 2.25 and higher.
+* Fix what `cluster_has_replica` deems a healthy replica. (#50, reported by @mbanck)
+* Fix `cluster_has_replica` to display perfstats for replicas whenever possible (healthy or not). (#50)
+* Fix `cluster_has_leader` to correctly check for standby leaders. (#58, reported by @mbanck)
+* Fix `cluster_node_count` to correctly manage replication states. (#50, reported by @mbanck)

 ### Misc

+* Improve the documentation for `node_is_replica`.
+* Improve test coverage by running an HTTP server to fake the Patroni API (#55
+  by @dlax).
+* Work around old pytest versions in type annotations in the test suite.
+* Declare compatibility with click version 7.1 (or higher).
+* In tests, work around nagiosplugin 1.3.2 not properly handling stdout
+  redirection.

 ## check_patroni 1.0.0 - 2023-08-28

 Check patroni is now tagged as Production/Stable.
@@ -43,15 +43,14 @@ A vagrant file can be found in [this
 repository](https://github.com/ioguix/vagrant-patroni) to generate a patroni/etcd
 setup.

-The `README.md` can be geneated with `./docs/make_readme.sh`.
+The `README.md` can be generated with `./docs/make_readme.sh`.

 ## Executing Tests

 Crafting repeatable tests using a live Patroni cluster can be intricate. To
-simplify the development process, interactions with Patroni's API are
-substituted with a mock function that yields an HTTP return code and a JSON
-object outlining the cluster's status. The JSON files containing this
-information are housed in the `./tests/json` directory.
+simplify the development process, a fake HTTP server is set up as a test
+fixture and serves static files (either from the `tests/json` directory or from
+in-memory data).

 An important consideration is that there is a potential drawback: if the JSON
 data is incorrect or if modifications have been made to Patroni without

@@ -61,21 +60,15 @@ erroneously.
 The tests are executed automatically for each PR using the CI (see
 `.github/workflow/lint.yml` and `.github/workflow/tests.yml`).

-Running the tests manually:
+Running the tests,

-* Using patroni's nominal replica state of `streaming` (since v3.0.4):
+* manually:

 ```bash
-pytest ./tests
+pytest --cov tests
 ```

-* Using patroni's nominal replica state of `running` (before v3.0.4):
-
-```bash
-pytest --use-old-replica-state ./tests
-```
-
-* Using tox:
+* or using tox:

 ```bash
 tox -e lint  # mypy + flake8 + black + isort + codespell

@@ -83,9 +76,9 @@ Running the tests manually:
 tox -e py    # pytests and "lint" tests for the default version of python
 ```

-Please note that when dealing with any service that checks the state of a node
-in patroni's `cluster` endpoint, the corresponding JSON test file must be added
-in `./tests/tools.py`.
+Please note that when dealing with any service that checks the state of a node,
+the related tests must use the `old_replica_state` fixture to test with both
+old (pre 3.0.4) and new replica states.

 A bash script, `check_patroni.sh`, is provided to facilitate testing all
 services on a Patroni endpoint (`./vagrant/check_patroni.sh`). It requires one

@@ -99,17 +92,3 @@ Here's an example usage:
 ```bash
 ./vagrant/check_patroni.sh http://10.20.30.51:8008
 ```

-## Release
-
-Update the Changelog.
-
-The package is generated and uploaded to pypi when a `v*` tag is created (see
-`.github/workflow/publish.yml`).
-
-Alternatively, the release can be done manually with:
-
-```
-tox -e build
-tox -e upload
-```
@@ -2,6 +2,7 @@ include *.md
 include mypy.ini
 include pytest.ini
 include tox.ini
+include .coveragerc
 include .flake8
 include pyproject.toml
 recursive-include docs *.sh
README.md | 124
@@ -45,7 +45,7 @@ Commands:
   node_is_leader           Check if the node is a leader node.
   node_is_pending_restart  Check if the node is in pending restart...
   node_is_primary          Check if the node is the primary with the...
-  node_is_replica          Check if the node is a running replica...
+  node_is_replica          Check if the node is a replica with no...
   node_patroni_version     Check if the version is equal to the input
   node_tl_has_changed      Check if the timeline has changed.
 ```
@@ -60,7 +60,7 @@ $ pip install git+https://github.com/dalibo/check_patroni.git

 check_patroni works on python 3.6, we keep it that way because patroni also
 supports it and there are still lots of RH 7 variants around. That being said
-python 3.6 has been EOL for age and there is no support for it in the github
+python 3.6 has been EOL for ages and there is no support for it in the github
 CI.

 ## Support
@@ -98,8 +98,8 @@ A match is found when: `start <= VALUE <= end`.

 For example, the following command will raise:

-* a warning if there is less than 1 nodes, wich can be translated to outside of range [2;+INF[
-* a critical if there are no nodes, wich can be translated to outside of range [1;+INF[
+* a warning if there are fewer than 2 nodes, which can be translated to outside of range [2;+INF[
+* a critical if there are no nodes, which can be translated to outside of range [1;+INF[

 ```
 check_patroni -e https://10.20.199.3:8008 cluster_has_replica --warning 2: --critical 1:
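The bracket notation above maps directly onto Nagios-style ranges like `2:` or `1:5`. As a rough illustration (a simplified sketch; the real parsing is done by the nagiosplugin library, and this ignores the `@` inversion and `~` forms), the matching rule `start <= VALUE <= end` can be evaluated as:

```python
def in_range(value: float, spec: str) -> bool:
    """Check a value against a simplified Nagios range like "2:", ":5", "1:5" or "5"."""
    start_s, sep, end_s = spec.partition(":")
    if not sep:
        # A bare number N means the range [0; N].
        start, end = 0.0, float(spec)
    else:
        # A missing bound means -inf (left) or +inf (right).
        start = float(start_s) if start_s else float("-inf")
        end = float(end_s) if end_s else float("inf")
    return start <= value <= end
```

With `--warning 2:`, a value of 1 falls outside `[2;+INF[`, so the check raises a warning, as in the example above.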
@@ -115,6 +115,30 @@ Several options are available:
 * `--cert_file`: your certificate or the concatenation of your certificate and private key
 * `--key_file`: your private key (optional)

+## Shell completion
+
+We use the [click] library which supports shell completion natively.
+
+Shell completion can be added by typing the following command or adding it to
+a file specific to your shell of choice.
+
+* for Bash (add to `~/.bashrc`):
+  ```
+  eval "$(_CHECK_PATRONI_COMPLETE=bash_source check_patroni)"
+  ```
+* for Zsh (add to `~/.zshrc`):
+  ```
+  eval "$(_CHECK_PATRONI_COMPLETE=zsh_source check_patroni)"
+  ```
+* for Fish (add to `~/.config/fish/completions/check_patroni.fish`):
+  ```
+  eval "$(_CHECK_PATRONI_COMPLETE=fish_source check_patroni)"
+  ```
+
+Please note that shell completion is not supported for all shell versions: for
+example, only Bash versions 4.4 and newer are supported.
+
+[click]: https://click.palletsprojects.com/en/8.1.x/shell-completion/

 ## Cluster services
@@ -152,11 +176,27 @@ Usage: check_patroni cluster_has_leader [OPTIONS]

 This check applies to any kind of leaders including standby leaders.

+A leader is a node with the "leader" role and a "running" state.
+
+A standby leader is a node with a "standby_leader" role and a "streaming" or
+"in archive recovery" state. Please note that log shipping could be stuck
+because the WAL are not available or applicable. Patroni doesn't provide
+information about the origin cluster (timeline or lag), so we cannot check
+if there is a problem in that particular case. That's why we issue a warning
+when the node is "in archive recovery". We suggest using other supervision
+tools to do this (eg. check_pgactivity).
+
 Check:
 * `OK`: if there is a leader node.
-* `CRITICAL`: otherwise
+* `WARNING`: if there is a standby leader in archive mode.
+* `CRITICAL`: otherwise.

-Perfdata: `has_leader` is 1 if there is a leader node, 0 otherwise
+Perfdata:
+* `has_leader` is 1 if there is any kind of leader node, 0 otherwise
+* `is_standby_leader_in_arc_rec` is 1 if the standby leader node is "in
+  archive recovery", 0 otherwise
+* `is_standby_leader` is 1 if there is a standby leader node, 0 otherwise
+* `is_leader` is 1 if there is a "classical" leader node, 0 otherwise

 Options:
   --help  Show this message and exit.
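The four perfdata values listed for `cluster_has_leader` could be derived from a Patroni `/cluster` payload along these lines. This is an illustrative sketch based on the documented rules, not the project's actual code:

```python
def leader_perfdata(members: list) -> dict:
    """Derive the documented leader perfdata flags from /cluster members (sketch)."""
    # A "classical" leader: role "leader" with a "running" state.
    is_leader = any(
        m["role"] == "leader" and m["state"] == "running" for m in members
    )
    standby = [m for m in members if m["role"] == "standby_leader"]
    # A standby leader is healthy when "streaming" or "in archive recovery".
    is_standby_leader = any(
        m["state"] in ("streaming", "in archive recovery") for m in standby
    )
    in_arc_rec = any(m["state"] == "in archive recovery" for m in standby)
    return {
        "has_leader": int(is_leader or is_standby_leader),
        "is_leader": int(is_leader),
        "is_standby_leader": int(is_standby_leader),
        "is_standby_leader_in_arc_rec": int(in_arc_rec),
    }
```

A standby leader "in archive recovery" thus yields `has_leader=1` while still tripping the warning context on `is_standby_leader_in_arc_rec`.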
@@ -169,10 +209,27 @@ Usage: check_patroni cluster_has_replica [OPTIONS]

 Check if the cluster has healthy replicas and/or if some are sync standbies

+For patroni (and this check):
+* a replica is `streaming` if `pg_stat_wal_receiver` says so.
+* a replica is `in archive recovery` if it's not `streaming` and has a `restore_command`.
+
 A healthy replica:
-* is in running or streaming state (V3.0.4)
-* has a replica or sync_standby role
-* has a lag lower or equal to max_lag
+* has a `replica` or `sync_standby` role
+* has the same timeline as the leader and
+* is in `running` state (patroni < V3.0.4)
+* is in `streaming` or `in archive recovery` state (patroni >= V3.0.4)
+* has a lag lower than or equal to `max_lag`
+
+Please note that a replica `in archive recovery` could be stuck because the
+WAL are not available or applicable (the server's timeline has diverged from
+the leader's). We already detect the latter but we will miss the former.
+Therefore, it's preferable to check for the lag in addition to the healthy
+state if you rely on log shipping to help lagging standbies catch up.
+
+Since we require a healthy replica to have the same timeline as the leader,
+it's possible that we raise alerts when the cluster is performing a
+switchover or failover and the standbies are in the process of catching up
+with the new leader. The alert shouldn't last long.

 Check:
 * `OK`: if the healthy_replica count and their lag are compatible with the replica count threshold.
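The healthy-replica rules above can be condensed into a small predicate. This is a simplified sketch (field names follow Patroni's `/cluster` output; the real logic lives in check_patroni and handles more edge cases):

```python
from typing import Optional


def is_healthy_replica(member: dict, leader_timeline: int, max_lag: Optional[int]) -> bool:
    """Simplified version of the healthy-replica rules listed above (illustrative)."""
    # Must have a replica or sync_standby role.
    if member["role"] not in ("replica", "sync_standby"):
        return False
    # Accept both the pre-3.0.4 ("running") and post-3.0.4 nominal states.
    if member["state"] not in ("running", "streaming", "in archive recovery"):
        return False
    # Must share the leader's timeline.
    if member.get("timeline") != leader_timeline:
        return False
    # Lag check, only when a threshold is given; an "unknown" lag fails it.
    if max_lag is not None:
        lag = member.get("lag")
        if not isinstance(lag, int) or lag > max_lag:
            return False
    return True
```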
@@ -182,8 +239,9 @@ Usage: check_patroni cluster_has_replica [OPTIONS]
 Perfdata:
 * healthy_replica & unhealthy_replica count
 * the number of sync_replica, they are included in the previous count
 * the lag of each replica labelled with "member name"_lag
+* the timeline of each replica labelled with "member name"_timeline
 * a boolean to tell if the node is a sync standby labelled with "member name"_sync

 Options:
   -w, --warning TEXT  Warning threshold for the number of healthy replica
@@ -241,26 +299,37 @@ Usage: check_patroni cluster_node_count [OPTIONS]

 Count the number of nodes in the cluster.

+The role refers to the role of the server in the cluster. Possible values
+are:
+* master or leader
+* replica
+* standby_leader
+* sync_standby
+* demoted
+* promoted
+* uninitialized
+
 The state refers to the state of PostgreSQL. Possible values are:
 * initializing new cluster, initdb failed
 * running custom bootstrap script, custom bootstrap failed
 * starting, start failed
 * restarting, restart failed
-* running, streaming (for a replica V3.0.4)
+* running, streaming, in archive recovery
 * stopping, stopped, stop failed
 * creating replica
 * crashed

-The role refers to the role of the server in the cluster. Possible values
-are:
-* master or leader (V3.0.0+)
-* replica
-* demoted
-* promoted
-* uninitialized
+The "healthy" checks only ensure that:
+* a leader has the running state
+* a standby_leader has the running or streaming (V3.0.4) state
+* a replica or sync_standby has the running or streaming (V3.0.4) state
+
+Since we don't check the lag or timeline, "in archive recovery" is not
+considered a valid state for this service. See cluster_has_leader and
+cluster_has_replica for specialized checks.

 Check:
-* Compares the number of nodes against the normal and healthy (running + streaming) nodes warning and critical thresholds.
+* Compares the number of nodes against the normal and healthy nodes warning and critical thresholds.
 * `OK`: If they are not provided.

 Perfdata:
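The "healthy" rules for `cluster_node_count` boil down to a small role/state table. An illustrative counting sketch under those rules (not the actual implementation):

```python
def count_healthy(members: list) -> int:
    """Count members considered healthy by the cluster_node_count rules (sketch)."""
    healthy = 0
    for m in members:
        role, state = m["role"], m["state"]
        if role in ("leader", "master") and state == "running":
            healthy += 1
        elif role == "standby_leader" and state in ("running", "streaming"):
            healthy += 1
        # "in archive recovery" deliberately does NOT count for this service.
        elif role in ("replica", "sync_standby") and state in ("running", "streaming"):
            healthy += 1
    return healthy
```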
@@ -307,7 +376,7 @@ Usage: check_patroni node_is_pending_restart [OPTIONS]

 Check if the node is in pending restart state.

-This situation can arise if the configuration has been modified but requiers
+This situation can arise if the configuration has been modified but requires
 a restart of PostgreSQL to take effect.

 Check:
@@ -368,12 +437,21 @@ Options:
 ```
 Usage: check_patroni node_is_replica [OPTIONS]

-Check if the node is a running replica with no noloadbalance tag.
+Check if the node is a replica with no noloadbalance tag.

 It is possible to check if the node is synchronous or asynchronous. If
 nothing is specified any kind of replica is accepted. When checking for a
 synchronous replica, it's not possible to specify a lag.

+This service uses the following Patroni endpoints: replica, asynchronous
+and synchronous. The first two implement the `lag` tag. For these endpoints
+the state of a replica node doesn't reflect the replication state
+(`streaming` or `in archive recovery`), we only know if it's `running`. The
+timeline is also not checked.
+
+Therefore, if a cluster is using asynchronous replication, it is recommended
+to check for the lag to detect a divergence as soon as possible.
+
 Check:
 * `OK`: if the node is a running replica with noloadbalance tag and the lag is under the maximum threshold.
 * `CRITICAL`: otherwise
|
38
RELEASE.md
Normal file
38
RELEASE.md
Normal file
|
@ -0,0 +1,38 @@
|
||||||
|
# Release HOW TO
|
||||||
|
|
||||||
|
## Preparatory changes
|
||||||
|
|
||||||
|
* Review the **Unreleased** section, if any, in `CHANGELOG.md` possibly adding
|
||||||
|
any missing item from closed issues, merged pull requests, or directly the
|
||||||
|
git history[^git-changes],
|
||||||
|
* Rename the **Unreleased** section according to the version to be released,
|
||||||
|
with a date,
|
||||||
|
* Bump the version in `check_patroni/__init__.py`,
|
||||||
|
* Rebuild the `README.md` (`cd docs; ./make_readme.sh`),
|
||||||
|
* Commit these changes (either on a dedicated branch, before submitting a pull
|
||||||
|
request or directly on the `master` branch) with the commit message `release
|
||||||
|
X.Y.Z`.
|
||||||
|
* Then, when changes landed in the `master` branch, create an annotated (and
|
||||||
|
possibly signed) tag, as `git tag -a [-s] -m 'release X.Y.Z' vX.Y.Z`,
|
||||||
|
and,
|
||||||
|
* Push with `--follow-tags`.
|
||||||
|
|
||||||
|
[^git-changes]: Use `git log $(git describe --tags --abbrev=0).. --format=%s
|
||||||
|
--reverse` to get commits from the previous tag.
|
||||||
|
|
||||||
|
## PyPI package
|
||||||
|
|
||||||
|
The package is generated and uploaded to pypi when a `v*` tag is created (see
|
||||||
|
`.github/workflow/publish.yml`).
|
||||||
|
|
||||||
|
Alternatively, the release can be done manually with:
|
||||||
|
|
||||||
|
```
|
||||||
|
tox -e build
|
||||||
|
tox -e upload
|
||||||
|
```
|
||||||
|
|
||||||
|
## GitHub release
|
||||||
|
|
||||||
|
Draft a new release from the release page, choosing the tag just pushed and
|
||||||
|
copy the relevant change log section as a description.
|
|
@@ -1,5 +1,5 @@
 import logging

-__version__ = "1.0.0"
+__version__ = "2.0.0"

 _log: logging.Logger = logging.getLogger(__name__)
@ -226,29 +226,40 @@ def cluster_node_count(
|
||||||
) -> None:
|
) -> None:
|
||||||
"""Count the number of nodes in the cluster.
|
"""Count the number of nodes in the cluster.
|
||||||
|
|
||||||
\b
|
|
||||||
The state refers to the state of PostgreSQL. Possible values are:
|
|
||||||
* initializing new cluster, initdb failed
|
|
||||||
* running custom bootstrap script, custom bootstrap failed
|
|
||||||
* starting, start failed
|
|
||||||
* restarting, restart failed
|
|
||||||
* running, streaming (for a replica V3.0.4)
|
|
||||||
* stopping, stopped, stop failed
|
|
||||||
* creating replica
|
|
||||||
* crashed
|
|
||||||
|
|
||||||
\b
|
\b
|
||||||
The role refers to the role of the server in the cluster. Possible values
|
The role refers to the role of the server in the cluster. Possible values
|
||||||
are:
|
are:
|
||||||
* master or leader (V3.0.0+)
|
* master or leader
|
||||||
* replica
|
* replica
|
||||||
|
* standby_leader
|
||||||
|
* sync_standby
|
||||||
* demoted
|
* demoted
|
||||||
* promoted
|
* promoted
|
||||||
* uninitialized
|
* uninitialized
|
||||||
|
|
||||||
|
\b
|
||||||
|
The state refers to the state of PostgreSQL. Possible values are:
|
||||||
|
* initializing new cluster, initdb failed
|
||||||
|
* running custom bootstrap script, custom bootstrap failed
|
||||||
|
* starting, start failed
|
||||||
|
* restarting, restart failed
|
||||||
|
* running, streaming, in archive recovery
|
||||||
|
* stopping, stopped, stop failed
|
||||||
|
* creating replica
|
||||||
|
* crashed
|
||||||
|
|
||||||
|
\b
|
||||||
|
The "healthy" checks only ensures that:
|
||||||
|
* a leader has the running state
|
||||||
|
* a standby_leader has the running or streaming (V3.0.4) state
|
||||||
|
* a replica or sync-standby has the running or streaming (V3.0.4) state
|
||||||
|
|
||||||
|
Since we dont check the lag or timeline, "in archive recovery" is not considered a valid state
|
||||||
|
for this service. See cluster_has_leader and cluster_has_replica for specialized checks.
|
||||||
|
|
||||||
\b
|
\b
|
||||||
Check:
|
Check:
|
||||||
* Compares the number of nodes against the normal and healthy (running + streaming) nodes warning and critical thresholds.
|
* Compares the number of nodes against the normal and healthy nodes warning and critical thresholds.
|
||||||
* `OK`: If they are not provided.
|
* `OK`: If they are not provided.
|
||||||
|
|
||||||
\b
|
\b
|
||||||
|
@ -285,17 +296,38 @@ def cluster_has_leader(ctx: click.Context) -> None:
|
||||||
|
|
||||||
This check applies to any kind of leaders including standby leaders.
|
This check applies to any kind of leaders including standby leaders.
|
||||||
|
|
||||||
|
A leader is a node with the "leader" role and a "running" state.
|
||||||
|
|
||||||
|
A standby leader is a node with a "standby_leader" role and a "streaming"
|
||||||
|
or "in archive recovery" state. Please note that log shipping could be
|
||||||
|
stuck because the WAL are not available or applicable. Patroni doesn't
|
||||||
|
provide information about the origin cluster (timeline or lag), so we
|
||||||
|
cannot check if there is a problem in that particular case. That's why we
|
||||||
|
issue a warning when the node is "in archive recovery". We suggest using
|
||||||
|
other supervision tools to do this (eg. check_pgactivity).
|
||||||
|
|
||||||
\b
|
\b
|
||||||
Check:
|
Check:
|
||||||
* `OK`: if there is a leader node.
|
* `OK`: if there is a leader node.
|
||||||
* `CRITICAL`: otherwise
|
* 'WARNING': if there is a stanby leader in archive mode.
|
||||||
|
* `CRITICAL`: otherwise.
|
||||||
|
|
||||||
|
\b
|
||||||
|
Perfdata:
|
||||||
|
* `has_leader` is 1 if there is any kind of leader node, 0 otherwise
|
||||||
|
* `is_standby_leader_in_arc_rec` is 1 if the standby leader node is "in
|
||||||
|
archive recovery", 0 otherwise
|
||||||
|
* `is_standby_leader` is 1 if there is a standby leader node, 0 otherwise
|
||||||
|
* `is_leader` is 1 if there is a "classical" leader node, 0 otherwise
|
||||||
|
|
||||||
Perfdata: `has_leader` is 1 if there is a leader node, 0 otherwise
|
|
||||||
"""
|
"""
|
||||||
check = nagiosplugin.Check()
|
check = nagiosplugin.Check()
|
||||||
check.add(
|
check.add(
|
||||||
ClusterHasLeader(ctx.obj.connection_info),
|
ClusterHasLeader(ctx.obj.connection_info),
|
||||||
nagiosplugin.ScalarContext("has_leader", None, "@0:0"),
|
nagiosplugin.ScalarContext("has_leader", None, "@0:0"),
|
||||||
|
nagiosplugin.ScalarContext("is_standby_leader_in_arc_rec", "@1:1", None),
|
||||||
|
nagiosplugin.ScalarContext("is_leader", None, None),
|
||||||
|
nagiosplugin.ScalarContext("is_standby_leader", None, None),
|
||||||
ClusterHasLeaderSummary(),
|
ClusterHasLeaderSummary(),
|
||||||
)
|
)
|
||||||
check.main(verbose=ctx.obj.verbose, timeout=ctx.obj.timeout)
|
check.main(verbose=ctx.obj.verbose, timeout=ctx.obj.timeout)
|
||||||
|
@ -341,11 +373,29 @@ def cluster_has_replica(
|
||||||
) -> None:
|
) -> None:
|
||||||
"""Check if the cluster has healthy replicas and/or if some are sync standbies
|
"""Check if the cluster has healthy replicas and/or if some are sync standbies
|
||||||
|
|
||||||
|
\b
|
||||||
|
For patroni (and this check):
|
||||||
|
* a replica is `streaming` if the `pg_stat_wal_receiver` say's so.
|
||||||
|
* a replica is `in archive recovery`, if it's not `streaming` and has a `restore_command`.
|
||||||
|
|
||||||
\b
|
\b
|
||||||
A healthy replica:
|
A healthy replica:
|
||||||
* is in running or streaming state (V3.0.4)
|
* has a `replica` or `sync_standby` role
|
||||||
* has a replica or sync_standby role
|
* has the same timeline as the leader and
|
||||||
* has a lag lower or equal to max_lag
|
* is in `running` state (patroni < V3.0.4)
|
||||||
|
* is in `streaming` or `in archive recovery` state (patroni >= V3.0.4)
|
||||||
|
* has a lag lower or equal to `max_lag`
|
||||||
|
|
||||||
|
Please note that replica `in archive recovery` could be stuck because the WAL
|
||||||
|
are not available or applicable (the server's timeline has diverged for the
|
||||||
|
leader's). We already detect the latter but we will miss the former.
|
||||||
|
Therefore, it's preferable to check for the lag in addition to the healthy
|
||||||
|
state if you rely on log shipping to help lagging standbies to catch up.
|
||||||
|
|
||||||
|
Since we require a healthy replica to have the same timeline as the
|
||||||
|
leader, it's possible that we raise alerts when the cluster is performing a
|
||||||
|
switchover or failover and the standbies are in the process of catching up with
|
||||||
|
the new leader. The alert shouldn't last long.
|
||||||
|
|
||||||
\b
|
\b
|
||||||
Check:
|
Check:
|
||||||
|
@ -357,8 +407,9 @@ def cluster_has_replica(
|
||||||
Perfdata:
|
Perfdata:
|
||||||
* healthy_replica & unhealthy_replica count
|
* healthy_replica & unhealthy_replica count
|
||||||
* the number of sync_replica, they are included in the previous count
|
* the number of sync_replica, they are included in the previous count
|
||||||
* the lag of each replica labelled with "member name"_lag
|
* the lag of each replica labelled with "member name"_lag
|
||||||
* a boolean to tell if the node is a sync stanbdy labelled with "member name"_sync
|
* the timeline of each replica labelled with "member name"_timeline
|
||||||
|
* a boolean to tell if the node is a sync stanbdy labelled with "member name"_sync
|
||||||
"""
|
"""
|
||||||
|
|
||||||
tmax_lag = size_to_byte(max_lag) if max_lag is not None else None
|
tmax_lag = size_to_byte(max_lag) if max_lag is not None else None
|
||||||
|
@ -377,6 +428,7 @@ def cluster_has_replica(
|
||||||
),
|
),
|
||||||
nagiosplugin.ScalarContext("unhealthy_replica"),
|
nagiosplugin.ScalarContext("unhealthy_replica"),
|
||||||
nagiosplugin.ScalarContext("replica_lag"),
|
nagiosplugin.ScalarContext("replica_lag"),
|
||||||
|
nagiosplugin.ScalarContext("replica_timeline"),
|
||||||
nagiosplugin.ScalarContext("replica_sync"),
|
        nagiosplugin.ScalarContext("replica_sync"),
    )
    check.main(verbose=ctx.obj.verbose, timeout=ctx.obj.timeout)
@@ -569,10 +621,20 @@ def node_is_leader(ctx: click.Context, check_standby_leader: bool) -> None:
 def node_is_replica(
     ctx: click.Context, max_lag: str, check_is_sync: bool, check_is_async: bool
 ) -> None:
-    """Check if the node is a running replica with no noloadbalance tag.
+    """Check if the node is a replica with no noloadbalance tag.

-    It is possible to check if the node is synchronous or asynchronous. If nothing is specified any kind of replica is accepted.
-    When checking for a synchronous replica, it's not possible to specify a lag.
+    It is possible to check if the node is synchronous or asynchronous. If
+    nothing is specified any kind of replica is accepted. When checking for a
+    synchronous replica, it's not possible to specify a lag.
+
+    This service is using the following Patroni endpoints: replica, asynchronous
+    and synchronous. The first two implement the `lag` tag. For these endpoints
+    the state of a replica node doesn't reflect the replication state
+    (`streaming` or `in archive recovery`), we only know if it's `running`. The
+    timeline is also not checked.
+
+    Therefore, if a cluster is using asynchronous replication, it is
+    recommended to check for the lag to detect a divergence as soon as possible.

     \b
     Check:
@@ -610,7 +672,7 @@ def node_is_pending_restart(ctx: click.Context) -> None:
     """Check if the node is in pending restart state.

     This situation can arise if the configuration has been modified but
-    requiers a restart of PostgreSQL to take effect.
+    requires a restart of PostgreSQL to take effect.

     \b
     Check:
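The docstring above maps the service's flags onto Patroni's REST endpoints (`replica`, `synchronous`, `asynchronous`, the first two taking a `lag` query parameter). A minimal sketch of that mapping, outside the diff; the helper name is illustrative, not part of check_patroni:

```python
from typing import Optional


def pick_endpoint(
    check_is_sync: bool, check_is_async: bool, max_lag: Optional[str]
) -> str:
    """Choose the Patroni health endpoint node_is_replica would query.

    /synchronous takes no lag parameter; /replica and /asynchronous
    accept ?lag=<max_lag> and answer 200 only for a running replica
    within that lag.
    """
    if check_is_sync:
        return "synchronous"
    name = "asynchronous" if check_is_async else "replica"
    return name if max_lag is None else f"{name}?lag={max_lag}"
```

For example, `pick_endpoint(False, False, "1MB")` yields `"replica?lag=1MB"`.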
@@ -1,7 +1,7 @@
 import hashlib
 import json
 from collections import Counter
-from typing import Iterable, Union
+from typing import Any, Iterable, Union

 import nagiosplugin

@@ -14,25 +14,52 @@ def replace_chars(text: str) -> str:


 class ClusterNodeCount(PatroniResource):
-    def probe(self: "ClusterNodeCount") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
+        def debug_member(member: Any, health: str) -> None:
+            _log.debug(
+                "Node %(node_name)s is %(health)s: role %(role)s state %(state)s.",
+                {
+                    "node_name": member["name"],
+                    "health": health,
+                    "role": member["role"],
+                    "state": member["state"],
+                },
+            )
+
+        # get the cluster info
         item_dict = self.rest_api("cluster")

         role_counters: Counter[str] = Counter()
         roles = []
         status_counters: Counter[str] = Counter()
         statuses = []
+        healthy_member = 0

         for member in item_dict["members"]:
-            roles.append(replace_chars(member["role"]))
-            statuses.append(replace_chars(member["state"]))
+            state, role = member["state"], member["role"]
+            roles.append(replace_chars(role))
+            statuses.append(replace_chars(state))
+
+            if role == "leader" and state == "running":
+                healthy_member += 1
+                debug_member(member, "healthy")
+                continue
+
+            if role in ["standby_leader", "replica", "sync_standby"] and (
+                (self.has_detailed_states() and state == "streaming")
+                or (not self.has_detailed_states() and state == "running")
+            ):
+                healthy_member += 1
+                debug_member(member, "healthy")
+                continue
+
+            debug_member(member, "unhealthy")
         role_counters.update(roles)
         status_counters.update(statuses)

         # The actual check: members, healthy_members
         yield nagiosplugin.Metric("members", len(item_dict["members"]))
-        yield nagiosplugin.Metric(
-            "healthy_members",
-            status_counters["running"] + status_counters.get("streaming", 0),
-        )
+        yield nagiosplugin.Metric("healthy_members", healthy_member)

         # The performance data : role
         for role in role_counters:
@@ -48,74 +75,149 @@ class ClusterNodeCount(PatroniResource):


 class ClusterHasLeader(PatroniResource):
-    def probe(self: "ClusterHasLeader") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("cluster")

         is_leader_found = False
+        is_standby_leader_found = False
+        is_standby_leader_in_arc_rec = False
         for member in item_dict["members"]:
-            if (
-                member["role"] in ("leader", "standby_leader")
-                and member["state"] == "running"
-            ):
+            if member["role"] == "leader" and member["state"] == "running":
                 is_leader_found = True
                 break

+            if member["role"] == "standby_leader":
+                if member["state"] not in ["streaming", "in archive recovery"]:
+                    # for patroni >= 3.0.4 any state would be wrong
+                    # for patroni < 3.0.4 a state different from running would be wrong
+                    if self.has_detailed_states() or member["state"] != "running":
+                        continue
+
+                if member["state"] in ["in archive recovery"]:
+                    is_standby_leader_in_arc_rec = True
+
+                is_standby_leader_found = True
+                break
         return [
             nagiosplugin.Metric(
                 "has_leader",
+                1 if is_leader_found or is_standby_leader_found else 0,
+            ),
+            nagiosplugin.Metric(
+                "is_standby_leader_in_arc_rec",
+                1 if is_standby_leader_in_arc_rec else 0,
+            ),
+            nagiosplugin.Metric(
+                "is_standby_leader",
+                1 if is_standby_leader_found else 0,
+            ),
+            nagiosplugin.Metric(
+                "is_leader",
                 1 if is_leader_found else 0,
-            )
+            ),
         ]


 class ClusterHasLeaderSummary(nagiosplugin.Summary):
-    def ok(self: "ClusterHasLeaderSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return "The cluster has a running leader."

     @handle_unknown
-    def problem(self: "ClusterHasLeaderSummary", results: nagiosplugin.Result) -> str:
-        return "The cluster has no running leader."
+    def problem(self, results: nagiosplugin.Result) -> str:
+        return "The cluster has no running leader or the standby leader is in archive recovery."


 class ClusterHasReplica(PatroniResource):
-    def __init__(
-        self: "ClusterHasReplica",
-        connection_info: ConnectionInfo,
-        max_lag: Union[int, None],
-    ):
+    def __init__(self, connection_info: ConnectionInfo, max_lag: Union[int, None]):
         super().__init__(connection_info)
         self.max_lag = max_lag

-    def probe(self: "ClusterHasReplica") -> Iterable[nagiosplugin.Metric]:
-        item_dict = self.rest_api("cluster")
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
+        def debug_member(member: Any, health: str) -> None:
+            _log.debug(
+                "Node %(node_name)s is %(health)s: lag %(lag)s, state %(state)s, tl %(tl)s.",
+                {
+                    "node_name": member["name"],
+                    "health": health,
+                    "lag": member["lag"],
+                    "state": member["state"],
+                    "tl": member["timeline"],
+                },
+            )
+
+        # get the cluster info
+        cluster_item_dict = self.rest_api("cluster")

         replicas = []
         healthy_replica = 0
         unhealthy_replica = 0
         sync_replica = 0
-        for member in item_dict["members"]:
-            # FIXME are there other acceptable states
+        leader_tl = None
+
+        # Look for replicas
+        for member in cluster_item_dict["members"]:
             if member["role"] in ["replica", "sync_standby"]:
-                # patroni 3.0.4 changed the standby state from running to streaming
-                if (
-                    member["state"] in ["running", "streaming"]
-                    and member["lag"] != "unknown"
-                ):
+                if member["lag"] == "unknown":
+                    # This could happen if the node is stopped
+                    # nagiosplugin doesn't handle strings in perfstats
+                    # so we have to ditch all the stats in that case
+                    debug_member(member, "unhealthy")
+                    unhealthy_replica += 1
+                    continue
+                else:
                     replicas.append(
                         {
                             "name": member["name"],
                             "lag": member["lag"],
+                            "timeline": member["timeline"],
                             "sync": 1 if member["role"] == "sync_standby" else 0,
                         }
                     )

-                    if member["role"] == "sync_standby":
-                        sync_replica += 1
+                # Get the leader tl if we haven't already
+                if leader_tl is None:
+                    # If there are no leaders, we will loop here for all
+                    # members because leader_tl will remain None. it's not
+                    # a big deal since having no leader is rare.
+                    for tmember in cluster_item_dict["members"]:
+                        if tmember["role"] == "leader":
+                            leader_tl = int(tmember["timeline"])
+                            break

-                    if self.max_lag is None or self.max_lag >= int(member["lag"]):
-                        healthy_replica += 1
-                        continue
-            unhealthy_replica += 1
+                    _log.debug(
+                        "Patroni's leader_timeline is %(leader_tl)s",
+                        {
+                            "leader_tl": leader_tl,
+                        },
+                    )
+
+                # Test for an unhealthy replica
+                if (
+                    self.has_detailed_states()
+                    and not (
+                        member["state"] in ["streaming", "in archive recovery"]
+                        and int(member["timeline"]) == leader_tl
+                    )
+                ) or (
+                    not self.has_detailed_states()
+                    and not (
+                        member["state"] == "running"
+                        and int(member["timeline"]) == leader_tl
+                    )
+                ):
+                    debug_member(member, "unhealthy")
+                    unhealthy_replica += 1
+                    continue
+
+                if member["role"] == "sync_standby":
+                    sync_replica += 1
+
+                if self.max_lag is None or self.max_lag >= int(member["lag"]):
+                    debug_member(member, "healthy")
+                    healthy_replica += 1
+                else:
+                    debug_member(member, "unhealthy")
+                    unhealthy_replica += 1

         # The actual check
         yield nagiosplugin.Metric("healthy_replica", healthy_replica)
@@ -127,6 +229,11 @@ class ClusterHasReplica(PatroniResource):
         yield nagiosplugin.Metric(
             f"{replica['name']}_lag", replica["lag"], context="replica_lag"
         )
+        yield nagiosplugin.Metric(
+            f"{replica['name']}_timeline",
+            replica["timeline"],
+            context="replica_timeline",
+        )
         yield nagiosplugin.Metric(
             f"{replica['name']}_sync", replica["sync"], context="replica_sync"
         )
@@ -140,7 +247,7 @@ class ClusterHasReplica(PatroniResource):

 class ClusterConfigHasChanged(PatroniResource):
     def __init__(
-        self: "ClusterConfigHasChanged",
+        self,
         connection_info: ConnectionInfo,
         config_hash: str,  # Always contains the old hash
         state_file: str,  # Only used to update the hash in the state_file (when needed)
@@ -151,7 +258,7 @@ class ClusterConfigHasChanged(PatroniResource):
         self.config_hash = config_hash
         self.save = save

-    def probe(self: "ClusterConfigHasChanged") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("config")

         new_hash = hashlib.md5(json.dumps(item_dict).encode()).hexdigest()
@@ -183,23 +290,21 @@ class ClusterConfigHasChanged(PatroniResource):


 class ClusterConfigHasChangedSummary(nagiosplugin.Summary):
-    def __init__(self: "ClusterConfigHasChangedSummary", config_hash: str) -> None:
+    def __init__(self, config_hash: str) -> None:
         self.old_config_hash = config_hash

     # Note: It would be helpful to display the old / new hash here. Unfortunately, it's not a metric.
     # So we only have the old / expected one.
-    def ok(self: "ClusterConfigHasChangedSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return f"The hash of patroni's dynamic configuration has not changed ({self.old_config_hash})."

     @handle_unknown
-    def problem(
-        self: "ClusterConfigHasChangedSummary", results: nagiosplugin.Result
-    ) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         return f"The hash of patroni's dynamic configuration has changed. The old hash was {self.old_config_hash}."


 class ClusterIsInMaintenance(PatroniResource):
-    def probe(self: "ClusterIsInMaintenance") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("cluster")

         # The actual check
@@ -212,7 +317,7 @@ class ClusterIsInMaintenance(PatroniResource):


 class ClusterHasScheduledAction(PatroniResource):
-    def probe(self: "ClusterIsInMaintenance") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("cluster")

         scheduled_switchover = 0
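The member-health rule introduced in `cluster_node_count` above condenses to a standalone predicate (a sketch over the `/cluster` member fields shown in the diff; the function name is illustrative, not check_patroni's API):

```python
def is_healthy(role: str, state: str, has_detailed_states: bool) -> bool:
    # A running leader is always healthy.
    if role == "leader":
        return state == "running"
    # Standby leaders and (sync) replicas must be "streaming" on
    # Patroni >= 3.0.4; older versions only ever report "running".
    if role in ("standby_leader", "replica", "sync_standby"):
        return state == ("streaming" if has_detailed_states else "running")
    return False
```

A member "in archive recovery" is therefore counted as unhealthy here, which is the 2.0.0 behaviour change described in the changelog.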
@@ -7,7 +7,7 @@ from .types import APIError, ConnectionInfo, PatroniResource, handle_unknown


 class NodeIsPrimary(PatroniResource):
-    def probe(self: "NodeIsPrimary") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         try:
             self.rest_api("primary")
         except APIError:
@@ -16,24 +16,22 @@ class NodeIsPrimary(PatroniResource):


 class NodeIsPrimarySummary(nagiosplugin.Summary):
-    def ok(self: "NodeIsPrimarySummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return "This node is the primary with the leader lock."

     @handle_unknown
-    def problem(self: "NodeIsPrimarySummary", results: nagiosplugin.Result) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         return "This node is not the primary with the leader lock."


 class NodeIsLeader(PatroniResource):
     def __init__(
-        self: "NodeIsLeader",
-        connection_info: ConnectionInfo,
-        check_is_standby_leader: bool,
+        self, connection_info: ConnectionInfo, check_is_standby_leader: bool
     ) -> None:
         super().__init__(connection_info)
         self.check_is_standby_leader = check_is_standby_leader

-    def probe(self: "NodeIsLeader") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         apiname = "leader"
         if self.check_is_standby_leader:
             apiname = "standby-leader"
@@ -46,26 +44,23 @@ class NodeIsLeader(PatroniResource):


 class NodeIsLeaderSummary(nagiosplugin.Summary):
-    def __init__(
-        self: "NodeIsLeaderSummary",
-        check_is_standby_leader: bool,
-    ) -> None:
+    def __init__(self, check_is_standby_leader: bool) -> None:
         if check_is_standby_leader:
             self.leader_kind = "standby leader"
         else:
             self.leader_kind = "leader"

-    def ok(self: "NodeIsLeaderSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return f"This node is a {self.leader_kind} node."

     @handle_unknown
-    def problem(self: "NodeIsLeaderSummary", results: nagiosplugin.Result) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         return f"This node is not a {self.leader_kind} node."


 class NodeIsReplica(PatroniResource):
     def __init__(
-        self: "NodeIsReplica",
+        self,
         connection_info: ConnectionInfo,
         max_lag: str,
         check_is_sync: bool,
@@ -76,7 +71,7 @@ class NodeIsReplica(PatroniResource):
         self.check_is_sync = check_is_sync
         self.check_is_async = check_is_async

-    def probe(self: "NodeIsReplica") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         try:
             if self.check_is_sync:
                 api_name = "synchronous"
@@ -95,12 +90,7 @@ class NodeIsReplica(PatroniResource):


 class NodeIsReplicaSummary(nagiosplugin.Summary):
-    def __init__(
-        self: "NodeIsReplicaSummary",
-        lag: str,
-        check_is_sync: bool,
-        check_is_async: bool,
-    ) -> None:
+    def __init__(self, lag: str, check_is_sync: bool, check_is_async: bool) -> None:
         self.lag = lag
         if check_is_sync:
             self.replica_kind = "synchronous replica"
@@ -109,7 +99,7 @@ class NodeIsReplicaSummary(nagiosplugin.Summary):
         else:
             self.replica_kind = "replica"

-    def ok(self: "NodeIsReplicaSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         if self.lag is None:
             return (
                 f"This node is a running {self.replica_kind} with no noloadbalance tag."
@@ -117,14 +107,14 @@ class NodeIsReplicaSummary(nagiosplugin.Summary):
         return f"This node is a running {self.replica_kind} with no noloadbalance tag and the lag is under {self.lag}."

     @handle_unknown
-    def problem(self: "NodeIsReplicaSummary", results: nagiosplugin.Result) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         if self.lag is None:
             return f"This node is not a running {self.replica_kind} with no noloadbalance tag."
         return f"This node is not a running {self.replica_kind} with no noloadbalance tag and a lag under {self.lag}."


 class NodeIsPendingRestart(PatroniResource):
-    def probe(self: "NodeIsPendingRestart") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("patroni")

         is_pending_restart = item_dict.get("pending_restart", False)
@@ -137,19 +127,17 @@ class NodeIsPendingRestart(PatroniResource):


 class NodeIsPendingRestartSummary(nagiosplugin.Summary):
-    def ok(self: "NodeIsPendingRestartSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return "This node doesn't have the pending restart flag."

     @handle_unknown
-    def problem(
-        self: "NodeIsPendingRestartSummary", results: nagiosplugin.Result
-    ) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         return "This node has the pending restart flag."


 class NodeTLHasChanged(PatroniResource):
     def __init__(
-        self: "NodeTLHasChanged",
+        self,
         connection_info: ConnectionInfo,
         timeline: str,  # Always contains the old timeline
         state_file: str,  # Only used to update the timeline in the state_file (when needed)
@@ -160,7 +148,7 @@ class NodeTLHasChanged(PatroniResource):
         self.timeline = timeline
         self.save = save

-    def probe(self: "NodeTLHasChanged") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("patroni")
         new_tl = item_dict["timeline"]

@@ -193,27 +181,23 @@ class NodeTLHasChanged(PatroniResource):


 class NodeTLHasChangedSummary(nagiosplugin.Summary):
-    def __init__(self: "NodeTLHasChangedSummary", timeline: str) -> None:
+    def __init__(self, timeline: str) -> None:
         self.timeline = timeline

-    def ok(self: "NodeTLHasChangedSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return f"The timeline is still {self.timeline}."

     @handle_unknown
-    def problem(self: "NodeTLHasChangedSummary", results: nagiosplugin.Result) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         return f"The expected timeline was {self.timeline} got {results['timeline'].metric}."


 class NodePatroniVersion(PatroniResource):
-    def __init__(
-        self: "NodePatroniVersion",
-        connection_info: ConnectionInfo,
-        patroni_version: str,
-    ) -> None:
+    def __init__(self, connection_info: ConnectionInfo, patroni_version: str) -> None:
         super().__init__(connection_info)
         self.patroni_version = patroni_version

-    def probe(self: "NodePatroniVersion") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         item_dict = self.rest_api("patroni")

         version = item_dict["patroni"]["version"]
@@ -232,21 +216,21 @@ class NodePatroniVersion(PatroniResource):


 class NodePatroniVersionSummary(nagiosplugin.Summary):
-    def __init__(self: "NodePatroniVersionSummary", patroni_version: str) -> None:
+    def __init__(self, patroni_version: str) -> None:
         self.patroni_version = patroni_version

-    def ok(self: "NodePatroniVersionSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return f"Patroni's version is {self.patroni_version}."

     @handle_unknown
-    def problem(self: "NodePatroniVersionSummary", results: nagiosplugin.Result) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         # FIXME find a way to make the following work, check is perf data can be strings
         # return f"The expected patroni version was {self.patroni_version} got {results['patroni_version'].metric}."
         return f"Patroni's version is not {self.patroni_version}."


 class NodeIsAlive(PatroniResource):
-    def probe(self: "NodeIsAlive") -> Iterable[nagiosplugin.Metric]:
+    def probe(self) -> Iterable[nagiosplugin.Metric]:
         try:
             self.rest_api("liveness")
         except APIError:
@@ -255,9 +239,9 @@ class NodeIsAlive(PatroniResource):


 class NodeIsAliveSummary(nagiosplugin.Summary):
-    def ok(self: "NodeIsAliveSummary", results: nagiosplugin.Result) -> str:
+    def ok(self, results: nagiosplugin.Result) -> str:
         return "This node is alive (patroni is running)."

     @handle_unknown
-    def problem(self: "NodeIsAliveSummary", results: nagiosplugin.Result) -> str:
+    def problem(self, results: nagiosplugin.Result) -> str:
         return "This node is not alive (patroni is not running)."
@@ -1,3 +1,5 @@
+import json
+from functools import lru_cache
 from typing import Any, Callable, List, Optional, Tuple, Union
 from urllib.parse import urlparse

@@ -28,11 +30,11 @@ class Parameters:
     verbose: int


-@attr.s(auto_attribs=True, slots=True)
+@attr.s(auto_attribs=True, eq=False, slots=True)
 class PatroniResource(nagiosplugin.Resource):
     conn_info: ConnectionInfo

-    def rest_api(self: "PatroniResource", service: str) -> Any:
+    def rest_api(self, service: str) -> Any:
         """Try to connect to all the provided endpoints for the requested service"""
         for endpoint in self.conn_info.endpoints:
             cert: Optional[Union[Tuple[str, str], str]] = None
@@ -71,10 +73,31 @@ class PatroniResource(nagiosplugin.Resource):

         try:
             return r.json()
-        except requests.exceptions.JSONDecodeError:
+        except (json.JSONDecodeError, ValueError):
             return None
         raise nagiosplugin.CheckError("Connection failed for all provided endpoints")

+    @lru_cache(maxsize=None)
+    def has_detailed_states(self) -> bool:
+        # get patroni's version to find out if the "streaming" and "in archive recovery" states are available
+        patroni_item_dict = self.rest_api("patroni")
+
+        if tuple(
+            int(v) for v in patroni_item_dict["patroni"]["version"].split(".", 2)
+        ) >= (3, 0, 4):
+            _log.debug(
+                "Patroni's version is %(version)s, more detailed states can be used to check for the health of replicas.",
+                {"version": patroni_item_dict["patroni"]["version"]},
+            )
+
+            return True
+
+        _log.debug(
+            "Patroni's version is %(version)s, the running state and the timelines must be used to check for the health of replicas.",
+            {"version": patroni_item_dict["patroni"]["version"]},
+        )
+        return False
+

 HandleUnknown = Callable[[nagiosplugin.Summary, nagiosplugin.Results], Any]
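The version gate behind `has_detailed_states()` above is a plain tuple comparison: the `streaming` and `in archive recovery` member states only exist from Patroni 3.0.4 onward. Sketched standalone, assuming a plain `X.Y.Z` version string (which is also what the diff's parsing assumes):

```python
def has_detailed_states(patroni_version: str) -> bool:
    # "3.0.4" -> (3, 0, 4); tuples compare element by element,
    # so "3.1.0" -> (3, 1, 0) also passes the gate.
    return tuple(int(v) for v in patroni_version.split(".", 2)) >= (3, 0, 4)
```

A version string with a suffix (e.g. a release candidate) would fail the `int()` conversion; the upstream code makes the same assumption.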
@@ -42,7 +42,7 @@ $ pip install git+https://github.com/dalibo/check_patroni.git
 
 check_patroni works on python 3.6, we keep it that way because patroni also
 supports it and there are still lots of RH 7 variants around. That being said
-python 3.6 has been EOL for age and there is no support for it in the github
+python 3.6 has been EOL for ages and there is no support for it in the github
 CI.
 
 ## Support
@@ -80,8 +80,8 @@ A match is found when: `start <= VALUE <= end`.
 
 For example, the following command will raise:
 
-* a warning if there is less than 1 nodes, wich can be translated to outside of range [2;+INF[
-* a critical if there are no nodes, wich can be translated to outside of range [1;+INF[
+* a warning if there are fewer than 2 nodes, which can be translated to outside of range [2;+INF[
+* a critical if there are no nodes, which can be translated to outside of range [1;+INF[
 
 ```
 check_patroni -e https://10.20.199.3:8008 cluster_has_replica --warning 2: --critical 1:
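The range semantics behind `--warning 2: --critical 1:` can be sketched as follows (the `violates` helper is hypothetical and for illustration only; check_patroni relies on nagiosplugin for the real range parsing):

```python
def violates(value: int, range_start: int) -> bool:
    # a nagios-style range "N:" means [N;+INF[ and an alert is raised
    # when the value falls outside of it, i.e. below N
    return value < range_start

# with --warning 2: --critical 1: and a single healthy replica:
print(violates(1, 2))  # True: warning raised
print(violates(1, 1))  # False: critical not raised
print(violates(0, 1))  # True: critical raised
```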
@@ -97,6 +97,30 @@ Several options are available:
 * `--cert_file`: your certificate or the concatenation of your certificate and private key
 * `--key_file`: your private key (optional)
 
+## Shell completion
+
+We use the [click] library which supports shell completion natively.
+
+Shell completion can be added by typing the following command or adding it to
+a file specific to your shell of choice.
+
+* for Bash (add to `~/.bashrc`):
+```
+eval "$(_CHECK_PATRONI_COMPLETE=bash_source check_patroni)"
+```
+* for Zsh (add to `~/.zshrc`):
+```
+eval "$(_CHECK_PATRONI_COMPLETE=zsh_source check_patroni)"
+```
+* for Fish (add to `~/.config/fish/completions/check_patroni.fish`):
+```
+eval "$(_CHECK_PATRONI_COMPLETE=fish_source check_patroni)"
+```
+
+Please note that shell completion is not supported for all shell versions; for
+example, only Bash versions 4.4 and newer are supported.
+
+[click]: https://click.palletsprojects.com/en/8.1.x/shell-completion/
_EOF_
readme
readme "## Cluster services"
1
mypy.ini
@@ -1,4 +1,5 @@
 [mypy]
+files = .
 show_error_codes = true
 strict = true
 exclude = build/
@@ -4,7 +4,7 @@ isort
 flake8
 mypy==0.961
 pytest
-pytest-mock
+pytest-cov
 types-requests
 setuptools
 tox
7
setup.py
@@ -41,12 +41,12 @@ setup(
         "attrs >= 17, !=21.1",
         "requests",
         "nagiosplugin >= 1.3.2",
-        "click >= 8.0.1",
+        "click >= 7.1",
     ],
     extras_require={
         "test": [
-            "pytest",
-            "pytest-mock",
+            "importlib_metadata; python_version < '3.8'",
+            "pytest >= 6.0.2",
         ],
     },
     entry_points={
@@ -56,4 +56,3 @@ setup(
     },
     zip_safe=False,
 )
-
@@ -0,0 +1,65 @@
+import json
+import logging
+import shutil
+from contextlib import contextmanager
+from functools import partial
+from http.server import HTTPServer, SimpleHTTPRequestHandler
+from pathlib import Path
+from typing import Any, Iterator, Mapping, Union
+
+logger = logging.getLogger(__name__)
+
+
+class PatroniAPI(HTTPServer):
+    def __init__(self, directory: Path, *, datadir: Path) -> None:
+        self.directory = directory
+        self.datadir = datadir
+        handler_cls = partial(SimpleHTTPRequestHandler, directory=str(directory))
+        super().__init__(("", 0), handler_cls)
+
+    def serve_forever(self, *args: Any) -> None:
+        logger.info(
+            "starting fake Patroni API at %s (directory=%s)",
+            self.endpoint,
+            self.directory,
+        )
+        return super().serve_forever(*args)
+
+    @property
+    def endpoint(self) -> str:
+        return f"http://{self.server_name}:{self.server_port}"
+
+    @contextmanager
+    def routes(self, mapping: Mapping[str, Union[Path, str]]) -> Iterator[None]:
+        """Temporarily install specified files in served directory, thus
+        building "routes" from given mapping.
+
+        The 'mapping' defines target route paths as keys and files to be
+        installed in served directory as values. Mapping values of type 'str'
+        are assumed to be file paths relative to the 'datadir'.
+        """
+        for route_path, fpath in mapping.items():
+            if isinstance(fpath, str):
+                fpath = self.datadir / fpath
+            shutil.copy(fpath, self.directory / route_path)
+        try:
+            yield None
+        finally:
+            for fname in mapping:
+                (self.directory / fname).unlink()
+
+
+def cluster_api_set_replica_running(in_json: Path, target_dir: Path) -> Path:
+    # starting from 3.0.4 the state of replicas is streaming or in archive recovery
+    # instead of running
+    with in_json.open() as f:
+        js = json.load(f)
+    for node in js["members"]:
+        if node["role"] in ["replica", "sync_standby", "standby_leader"]:
+            if node["state"] in ["streaming", "in archive recovery"]:
+                node["state"] = "running"
+    assert target_dir.is_dir()
+    out_json = target_dir / in_json.name
+    with out_json.open("w") as f:
+        json.dump(js, f)
+    return out_json
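The rewrite performed by `cluster_api_set_replica_running` can be shown on an inline dict (a sketch without the file round-trip; the member data below is made up):

```python
cluster = {
    "members": [
        {"name": "srv1", "role": "leader", "state": "running"},
        {"name": "srv2", "role": "replica", "state": "streaming"},
        {"name": "srv3", "role": "standby_leader", "state": "in archive recovery"},
    ]
}

# downgrade the detailed (Patroni >= 3.0.4) states back to the old "running"
for node in cluster["members"]:
    if node["role"] in ["replica", "sync_standby", "standby_leader"]:
        if node["state"] in ["streaming", "in archive recovery"]:
            node["state"] = "running"

print([n["state"] for n in cluster["members"]])  # ['running', 'running', 'running']
```

The leader is left untouched: only replica-like roles are rewritten, which is what lets the same JSON fixtures exercise both the old and the new replica-state handling.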
@@ -1,12 +1,76 @@
-def pytest_addoption(parser):
-    """
-    Add CLI options to `pytest` to pass those options to the test cases.
-    These options are used in `pytest_generate_tests`.
-    """
-    parser.addoption("--use-old-replica-state", action="store_true", default=False)
-
-
-def pytest_generate_tests(metafunc):
-    metafunc.parametrize(
-        "use_old_replica_state", [metafunc.config.getoption("use_old_replica_state")]
-    )
+import logging
+import sys
+from pathlib import Path
+from threading import Thread
+from typing import Any, Iterator, Tuple
+from unittest.mock import patch
+
+if sys.version_info >= (3, 8):
+    from importlib.metadata import version as metadata_version
+else:
+    from importlib_metadata import version as metadata_version
+
+import pytest
+from click.testing import CliRunner
+
+from . import PatroniAPI
+
+logger = logging.getLogger(__name__)
+
+
+def numversion(pkgname: str) -> Tuple[int, ...]:
+    version = metadata_version(pkgname)
+    return tuple(int(v) for v in version.split(".", 3))
+
+
+if numversion("pytest") >= (6, 2):
+    TempPathFactory = pytest.TempPathFactory
+else:
+    from _pytest.tmpdir import TempPathFactory
+
+
+@pytest.fixture(scope="session", autouse=True)
+def nagioplugin_runtime_stdout() -> Iterator[None]:
+    # work around https://github.com/mpounsett/nagiosplugin/issues/24 when
+    # nagiosplugin is older than 1.3.3
+    if numversion("nagiosplugin") < (1, 3, 3):
+        target = "nagiosplugin.runtime.Runtime.stdout"
+        with patch(target, None):
+            logger.warning("patching %r", target)
+            yield None
+    else:
+        yield None
+
+
+@pytest.fixture(
+    params=[False, True],
+    ids=lambda v: "new-replica-state" if v else "old-replica-state",
+)
+def old_replica_state(request: Any) -> Any:
+    return request.param
+
+
+@pytest.fixture(scope="session")
+def datadir() -> Path:
+    return Path(__file__).parent / "json"
+
+
+@pytest.fixture(scope="session")
+def patroni_api(
+    tmp_path_factory: TempPathFactory, datadir: Path
+) -> Iterator[PatroniAPI]:
+    """A fake HTTP server for the Patroni API serving files from a temporary
+    directory.
+    """
+    httpd = PatroniAPI(tmp_path_factory.mktemp("api"), datadir=datadir)
+    t = Thread(target=httpd.serve_forever)
+    t.start()
+    yield httpd
+    httpd.shutdown()
+    t.join()
+
+
+@pytest.fixture
+def runner() -> CliRunner:
+    """A CliRunner with stdout and stderr not mixed."""
+    return CliRunner(mix_stderr=False)
33
tests/json/cluster_has_leader_ko_standby_leader.json
Normal file
@@ -0,0 +1,33 @@
+{
+  "members": [
+    {
+      "name": "srv1",
+      "role": "standby_leader",
+      "state": "stopped",
+      "api_url": "https://10.20.199.3:8008/patroni",
+      "host": "10.20.199.3",
+      "port": 5432,
+      "timeline": 51
+    },
+    {
+      "name": "srv2",
+      "role": "replica",
+      "state": "streaming",
+      "api_url": "https://10.20.199.4:8008/patroni",
+      "host": "10.20.199.4",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    },
+    {
+      "name": "srv3",
+      "role": "replica",
+      "state": "streaming",
+      "api_url": "https://10.20.199.5:8008/patroni",
+      "host": "10.20.199.5",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    }
+  ]
+}
@@ -0,0 +1,33 @@
+{
+  "members": [
+    {
+      "name": "srv1",
+      "role": "standby_leader",
+      "state": "in archive recovery",
+      "api_url": "https://10.20.199.3:8008/patroni",
+      "host": "10.20.199.3",
+      "port": 5432,
+      "timeline": 51
+    },
+    {
+      "name": "srv2",
+      "role": "replica",
+      "state": "streaming",
+      "api_url": "https://10.20.199.4:8008/patroni",
+      "host": "10.20.199.4",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    },
+    {
+      "name": "srv3",
+      "role": "replica",
+      "state": "streaming",
+      "api_url": "https://10.20.199.5:8008/patroni",
+      "host": "10.20.199.5",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    }
+  ]
+}
@@ -3,7 +3,7 @@
     {
       "name": "srv1",
       "role": "standby_leader",
-      "state": "running",
+      "state": "streaming",
       "api_url": "https://10.20.199.3:8008/patroni",
       "host": "10.20.199.3",
       "port": 5432,
35
tests/json/cluster_has_replica_ko_all_replica.json
Normal file
@@ -0,0 +1,35 @@
+{
+  "members": [
+    {
+      "name": "srv1",
+      "role": "replica",
+      "state": "running",
+      "api_url": "https://10.20.199.3:8008/patroni",
+      "host": "10.20.199.3",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    },
+    {
+      "name": "srv2",
+      "role": "replica",
+      "state": "running",
+      "api_url": "https://10.20.199.4:8008/patroni",
+      "host": "10.20.199.4",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    },
+    {
+      "name": "srv3",
+      "role": "replica",
+      "state": "running",
+      "api_url": "https://10.20.199.5:8008/patroni",
+      "host": "10.20.199.5",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+
+    }
+  ]
+}
33
tests/json/cluster_has_replica_ko_wrong_tl.json
Normal file
@@ -0,0 +1,33 @@
+{
+  "members": [
+    {
+      "name": "srv1",
+      "role": "leader",
+      "state": "running",
+      "api_url": "https://10.20.199.3:8008/patroni",
+      "host": "10.20.199.3",
+      "port": 5432,
+      "timeline": 51
+    },
+    {
+      "name": "srv2",
+      "role": "replica",
+      "state": "running",
+      "api_url": "https://10.20.199.4:8008/patroni",
+      "host": "10.20.199.4",
+      "port": 5432,
+      "timeline": 50,
+      "lag": 1000000
+    },
+    {
+      "name": "srv3",
+      "role": "replica",
+      "state": "streaming",
+      "api_url": "https://10.20.199.5:8008/patroni",
+      "host": "10.20.199.5",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    }
+  ]
+}
|
@ -12,7 +12,7 @@
|
||||||
{
|
{
|
||||||
"name": "srv2",
|
"name": "srv2",
|
||||||
"role": "replica",
|
"role": "replica",
|
||||||
"state": "streaming",
|
"state": "in archive recovery",
|
||||||
"api_url": "https://10.20.199.4:8008/patroni",
|
"api_url": "https://10.20.199.4:8008/patroni",
|
||||||
"host": "10.20.199.4",
|
"host": "10.20.199.4",
|
||||||
"port": 5432,
|
"port": 5432,
|
||||||
|
|
26
tests/json/cluster_has_replica_patroni_verion_3.0.0.json
Normal file
@@ -0,0 +1,26 @@
+{
+  "state": "running",
+  "postmaster_start_time": "2021-08-11 07:02:20.732 UTC",
+  "role": "master",
+  "server_version": 110012,
+  "cluster_unlocked": false,
+  "xlog": {
+    "location": 1174407088
+  },
+  "timeline": 51,
+  "replication": [
+    {
+      "usename": "replicator",
+      "application_name": "srv1",
+      "client_addr": "10.20.199.3",
+      "state": "streaming",
+      "sync_state": "async",
+      "sync_priority": 0
+    }
+  ],
+  "database_system_identifier": "6965971025273547206",
+  "patroni": {
+    "version": "3.0.0",
+    "scope": "patroni-demo"
+  }
+}
26
tests/json/cluster_has_replica_patroni_verion_3.1.0.json
Normal file
@@ -0,0 +1,26 @@
+{
+  "state": "running",
+  "postmaster_start_time": "2021-08-11 07:02:20.732 UTC",
+  "role": "master",
+  "server_version": 110012,
+  "cluster_unlocked": false,
+  "xlog": {
+    "location": 1174407088
+  },
+  "timeline": 51,
+  "replication": [
+    {
+      "usename": "replicator",
+      "application_name": "srv1",
+      "client_addr": "10.20.199.3",
+      "state": "streaming",
+      "sync_state": "async",
+      "sync_priority": 0
+    }
+  ],
+  "database_system_identifier": "6965971025273547206",
+  "patroni": {
+    "version": "3.1.0",
+    "scope": "patroni-demo"
+  }
+}
33
tests/json/cluster_node_count_ko_in_archive_recovery.json
Normal file
@@ -0,0 +1,33 @@
+{
+  "members": [
+    {
+      "name": "srv1",
+      "role": "standby_leader",
+      "state": "in archive recovery",
+      "api_url": "https://10.20.199.3:8008/patroni",
+      "host": "10.20.199.3",
+      "port": 5432,
+      "timeline": 51
+    },
+    {
+      "name": "srv2",
+      "role": "replica",
+      "state": "in archive recovery",
+      "api_url": "https://10.20.199.4:8008/patroni",
+      "host": "10.20.199.4",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    },
+    {
+      "name": "srv3",
+      "role": "replica",
+      "state": "streaming",
+      "api_url": "https://10.20.199.5:8008/patroni",
+      "host": "10.20.199.5",
+      "port": 5432,
+      "timeline": 51,
+      "lag": 0
+    }
+  ]
+}
@@ -1,30 +1,20 @@
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_api_status_code_200(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_pending_restart_ok", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"]
-    )
+def test_api_status_code_200(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    with patroni_api.routes({"patroni": "node_is_pending_restart_ok.json"}):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "node_is_pending_restart"]
+        )
     assert result.exit_code == 0
 
 
-def test_api_status_code_404(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "Fake test", 404)
+def test_api_status_code_404(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"]
+        main, ["-e", patroni_api.endpoint, "node_is_pending_restart"]
     )
     assert result.exit_code == 3
@@ -1,23 +1,29 @@
+from pathlib import Path
+from typing import Iterator
+
 import nagiosplugin
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import here, my_mock
+from . import PatroniAPI
+
+
+@pytest.fixture(scope="module", autouse=True)
+def cluster_config_has_changed(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes({"config": "cluster_config_has_changed.json"}):
+        yield None
 
 
 def test_cluster_config_has_changed_ok_with_hash(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_config_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_config_has_changed",
             "--hash",
             "96b12d82571473d13e890b893734e731",
@@ -31,22 +37,20 @@ def test_cluster_config_has_changed_ok_with_hash(
 
 
 def test_cluster_config_has_changed_ok_with_state_file(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
-    runner = CliRunner()
-
-    with open(here / "cluster_config_has_changed.state_file", "w") as f:
+    state_file = tmp_path / "cluster_config_has_changed.state_file"
+    with state_file.open("w") as f:
         f.write('{"hash": "96b12d82571473d13e890b893734e731"}')
 
-    my_mock(mocker, "cluster_config_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_config_has_changed",
             "--state-file",
-            str(here / "cluster_config_has_changed.state_file"),
+            str(state_file),
         ],
     )
     assert result.exit_code == 0
@@ -57,16 +61,13 @@ def test_cluster_config_has_changed_ok_with_state_file(
 
 
 def test_cluster_config_has_changed_ko_with_hash(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_config_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_config_has_changed",
             "--hash",
             "96b12d82571473d13e890b8937ffffff",
@@ -80,24 +81,21 @@ def test_cluster_config_has_changed_ko_with_hash(
 
 
 def test_cluster_config_has_changed_ko_with_state_file_and_save(
-    mocker: MockerFixture,
-    use_old_replica_state: bool,
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
-    runner = CliRunner()
-
-    with open(here / "cluster_config_has_changed.state_file", "w") as f:
+    state_file = tmp_path / "cluster_config_has_changed.state_file"
+    with state_file.open("w") as f:
         f.write('{"hash": "96b12d82571473d13e890b8937ffffff"}')
 
-    my_mock(mocker, "cluster_config_has_changed", 200)
     # test without saving the new hash
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_config_has_changed",
             "--state-file",
-            str(here / "cluster_config_has_changed.state_file"),
+            str(state_file),
         ],
     )
     assert result.exit_code == 2
@@ -106,7 +104,8 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save(
         == "CLUSTERCONFIGHASCHANGED CRITICAL - The hash of patroni's dynamic configuration has changed. The old hash was 96b12d82571473d13e890b8937ffffff. | is_configuration_changed=1;;@1:1\n"
     )
 
-    cookie = nagiosplugin.Cookie(here / "cluster_config_has_changed.state_file")
+    state_file = tmp_path / "cluster_config_has_changed.state_file"
+    cookie = nagiosplugin.Cookie(state_file)
     cookie.open()
     new_config_hash = cookie.get("hash")
     cookie.close()
@@ -118,10 +117,10 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_config_has_changed",
             "--state-file",
-            str(here / "cluster_config_has_changed.state_file"),
+            str(state_file),
             "--save",
         ],
     )
@@ -131,7 +130,7 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save(
         == "CLUSTERCONFIGHASCHANGED CRITICAL - The hash of patroni's dynamic configuration has changed. The old hash was 96b12d82571473d13e890b8937ffffff. | is_configuration_changed=1;;@1:1\n"
     )
 
-    cookie = nagiosplugin.Cookie(here / "cluster_config_has_changed.state_file")
+    cookie = nagiosplugin.Cookie(state_file)
     cookie.open()
     new_config_hash = cookie.get("hash")
     cookie.close()
@@ -140,22 +139,20 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save(
 
 
 def test_cluster_config_has_changed_params(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
     # This one is placed last because it seems like the exceptions are not flushed from stderr for the next tests.
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_config_has_changed", 200)
+    fake_state_file = tmp_path / "fake_file_name.state_file"
    result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_config_has_changed",
             "--hash",
             "640df9f0211c791723f18fc3ed9dbb95",
             "--state-file",
-            str(here / "fake_file_name.state_file"),
+            str(fake_state_file),
         ],
     )
     assert result.exit_code == 3
@@ -1,54 +1,139 @@
+from pathlib import Path
+from typing import Iterator, Union
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI, cluster_api_set_replica_running
 
 
-def test_cluster_has_leader_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_leader_ok", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_leader"]
-    )
-    assert result.exit_code == 0
+@pytest.fixture
+def cluster_has_leader_ok(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_leader_ok.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_leader_ok")
+def test_cluster_has_leader_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"])
     assert (
         result.stdout
-        == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0\n"
+        == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0 is_leader=1 is_standby_leader=0 is_standby_leader_in_arc_rec=0;@1:1\n"
     )
+    assert result.exit_code == 0
+
+
+@pytest.fixture
+def cluster_has_leader_ok_standby_leader(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_leader_ok_standby_leader.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_leader_ok_standby_leader")
 def test_cluster_has_leader_ok_standby_leader(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_leader_ok_standby_leader", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_leader"]
-    )
-    assert result.exit_code == 0
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"])
     assert (
         result.stdout
-        == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0\n"
+        == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0 is_leader=0 is_standby_leader=1 is_standby_leader_in_arc_rec=0;@1:1\n"
     )
+    assert result.exit_code == 0
+
+
+@pytest.fixture
+def cluster_has_leader_ko(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_leader_ko.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_leader_ko")
-def test_cluster_has_leader_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_leader_ko", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_leader"]
-    )
+def test_cluster_has_leader_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"])
+    assert (
+        result.stdout
+        == "CLUSTERHASLEADER CRITICAL - The cluster has no running leader or the standby leader is in archive recovery. | has_leader=0;;@0 is_leader=0 is_standby_leader=0 is_standby_leader_in_arc_rec=0;@1:1\n"
+    )
     assert result.exit_code == 2
+
+
+@pytest.fixture
+def cluster_has_leader_ko_standby_leader(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_leader_ko_standby_leader.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_leader_ko_standby_leader")
+def test_cluster_has_leader_ko_standby_leader(
+    runner: CliRunner, patroni_api: PatroniAPI
+) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"])
     assert (
         result.stdout
-        == "CLUSTERHASLEADER CRITICAL - The cluster has no running leader. | has_leader=0;;@0\n"
+        == "CLUSTERHASLEADER CRITICAL - The cluster has no running leader or the standby leader is in archive recovery. | has_leader=0;;@0 is_leader=0 is_standby_leader=0 is_standby_leader_in_arc_rec=0;@1:1\n"
    )
+    assert result.exit_code == 2
+
+
+@pytest.fixture
+def cluster_has_leader_ko_standby_leader_archiving(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = (
+        "cluster_has_leader_ko_standby_leader_archiving.json"
+    )
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
|
cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
|
||||||
|
with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
|
||||||
|
yield None
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.usefixtures("cluster_has_leader_ko_standby_leader_archiving")
|
||||||
|
def test_cluster_has_leader_ko_standby_leader_archiving(
|
||||||
|
runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool
|
||||||
|
) -> None:
|
||||||
|
result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"])
|
||||||
|
if old_replica_state:
|
||||||
|
assert (
|
||||||
|
result.stdout
|
||||||
|
== "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0 is_leader=0 is_standby_leader=1 is_standby_leader_in_arc_rec=0;@1:1\n"
|
||||||
|
)
|
||||||
|
assert result.exit_code == 0
|
||||||
|
else:
|
||||||
|
assert (
|
||||||
|
result.stdout
|
||||||
|
== "CLUSTERHASLEADER WARNING - The cluster has no running leader or the standby leader is in archive recovery. | has_leader=1;;@0 is_leader=0 is_standby_leader=1 is_standby_leader_in_arc_rec=1;@1:1\n"
|
||||||
|
)
|
||||||
|
assert result.exit_code == 1
|
||||||
|
|
|
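Across these test diffs, the old `my_mock(mocker, ...)` request mocking is replaced by a `patroni_api` fixture that serves canned JSON over real HTTP, with a `routes()` context manager swapping payloads per test. As a rough illustration only (this is not the project's actual `PatroniAPI` from the test package; the class name, constructor, and the fact that this sketch maps paths to payload objects rather than to JSON files on disk are all assumptions), such a test double can be built from the standard library:

```python
import contextlib
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubPatroniAPI:
    """Hypothetical stand-in for a Patroni REST API: serves canned JSON per path."""

    def __init__(self) -> None:
        self.payloads: dict = {}
        payloads = self.payloads  # captured by the handler class below

        class Handler(BaseHTTPRequestHandler):
            def do_GET(self) -> None:
                # Look up the canned payload for this endpoint, default to {}.
                body = json.dumps(payloads.get(self.path.lstrip("/"), {})).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)

            def log_message(self, *args: object) -> None:
                pass  # keep test output quiet

        # Port 0 lets the OS pick a free port; serve in a background thread.
        self.server = HTTPServer(("127.0.0.1", 0), Handler)
        threading.Thread(target=self.server.serve_forever, daemon=True).start()

    @property
    def endpoint(self) -> str:
        return f"http://127.0.0.1:{self.server.server_address[1]}"

    @contextlib.contextmanager
    def routes(self, mapping: dict):
        # Temporarily install payloads, restoring the previous set on exit.
        saved = dict(self.payloads)
        self.payloads.update(mapping)
        try:
            yield
        finally:
            self.payloads.clear()
            self.payloads.update(saved)
```

Wrapping such an object in a session-scoped pytest fixture would give tests the same `patroni_api.endpoint` / `patroni_api.routes({...})` surface used throughout the diffs above.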
@@ -1,39 +1,46 @@
+from pathlib import Path
+from typing import Iterator, Union
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture

 from check_patroni.cli import main

-from .tools import my_mock
+from . import PatroniAPI, cluster_api_set_replica_running


-# TODO Lag threshold tests
-def test_cluster_has_relica_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
+@pytest.fixture
+def cluster_has_replica_ok(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_replica_ok.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None

-    my_mock(mocker, "cluster_has_replica_ok", 200, use_old_replica_state)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_replica"]
-    )
-    assert result.exit_code == 0
+
+@pytest.mark.usefixtures("cluster_has_replica_ok")
+def test_cluster_has_relica_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_replica"])
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv3_lag=0 srv3_sync=1 sync_replica=1 unhealthy_replica=0\n"
+        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=1 srv3_timeline=51 sync_replica=1 unhealthy_replica=0\n"
     )
+    assert result.exit_code == 0


+@pytest.mark.usefixtures("cluster_has_replica_ok")
 def test_cluster_has_replica_ok_with_count_thresholds(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_replica_ok", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_has_replica",
             "--warning",
             "@1",
@@ -41,48 +48,56 @@ def test_cluster_has_replica_ok_with_count_thresholds(
             "@0",
         ],
     )
-    assert result.exit_code == 0
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=0 srv2_sync=0 srv3_lag=0 srv3_sync=1 sync_replica=1 unhealthy_replica=0\n"
+        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=1 srv3_timeline=51 sync_replica=1 unhealthy_replica=0\n"
     )
+    assert result.exit_code == 0


+@pytest.mark.usefixtures("cluster_has_replica_ok")
 def test_cluster_has_replica_ok_with_sync_count_thresholds(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_replica_ok", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_has_replica",
             "--sync-warning",
             "1:",
         ],
     )
-    assert result.exit_code == 0
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv3_lag=0 srv3_sync=1 sync_replica=1;1: unhealthy_replica=0\n"
+        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=1 srv3_timeline=51 sync_replica=1;1: unhealthy_replica=0\n"
     )
+    assert result.exit_code == 0


+@pytest.fixture
+def cluster_has_replica_ok_lag(
+    patroni_api: PatroniAPI, datadir: Path, tmp_path: Path, old_replica_state: bool
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_replica_ok_lag.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_replica_ok_lag")
 def test_cluster_has_replica_ok_with_count_thresholds_lag(
-    mocker: MockerFixture,
-    use_old_replica_state: bool,
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_replica_ok_lag", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_has_replica",
             "--warning",
             "@1",
@@ -92,24 +107,35 @@ def test_cluster_has_replica_ok_with_count_thresholds_lag(
             "1MB",
         ],
     )
-    assert result.exit_code == 0
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=1024 srv2_sync=0 srv3_lag=0 srv3_sync=0 sync_replica=0 unhealthy_replica=0\n"
+        == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=1024 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=0\n"
     )
+    assert result.exit_code == 0


+@pytest.fixture
+def cluster_has_replica_ko(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_replica_ko.json"
+    patroni_path: Union[str, Path] = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_replica_ko")
 def test_cluster_has_replica_ko_with_count_thresholds(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_replica_ko", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_has_replica",
             "--warning",
             "@1",
@@ -117,24 +143,22 @@ def test_cluster_has_replica_ko_with_count_thresholds(
             "@0",
         ],
     )
-    assert result.exit_code == 1
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA WARNING - healthy_replica is 1 (outside range @0:1) | healthy_replica=1;@1;@0 srv3_lag=0 srv3_sync=0 sync_replica=0 unhealthy_replica=1\n"
+        == "CLUSTERHASREPLICA WARNING - healthy_replica is 1 (outside range @0:1) | healthy_replica=1;@1;@0 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=1\n"
     )
+    assert result.exit_code == 1


+@pytest.mark.usefixtures("cluster_has_replica_ko")
 def test_cluster_has_replica_ko_with_sync_count_thresholds(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_replica_ko", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_has_replica",
             "--sync-warning",
             "2:",
@@ -142,25 +166,36 @@ def test_cluster_has_replica_ko_with_sync_count_thresholds(
             "1:",
         ],
     )
-    assert result.exit_code == 2
+    # The lag on srv2 is "unknown". We don't handle string in perfstats so we have to scratch all the second node stats
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA CRITICAL - sync_replica is 0 (outside range 1:) | healthy_replica=1 srv3_lag=0 srv3_sync=0 sync_replica=0;2:;1: unhealthy_replica=1\n"
+        == "CLUSTERHASREPLICA CRITICAL - sync_replica is 0 (outside range 1:) | healthy_replica=1 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0;2:;1: unhealthy_replica=1\n"
     )
+    assert result.exit_code == 2


+@pytest.fixture
+def cluster_has_replica_ko_lag(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_replica_ko_lag.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_replica_ko_lag")
 def test_cluster_has_replica_ko_with_count_thresholds_and_lag(
-    mocker: MockerFixture,
-    use_old_replica_state: bool,
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_replica_ko_lag", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_has_replica",
             "--warning",
             "@1",
@@ -170,8 +205,84 @@ def test_cluster_has_replica_ko_with_count_thresholds_and_lag(
             "1MB",
         ],
     )
-    assert result.exit_code == 2
     assert (
         result.stdout
-        == "CLUSTERHASREPLICA CRITICAL - healthy_replica is 0 (outside range @0:0) | healthy_replica=0;@1;@0 srv2_lag=10241024 srv2_sync=0 srv3_lag=20000000 srv3_sync=0 sync_replica=0 unhealthy_replica=2\n"
+        == "CLUSTERHASREPLICA CRITICAL - healthy_replica is 0 (outside range @0:0) | healthy_replica=0;@1;@0 srv2_lag=10241024 srv2_sync=0 srv2_timeline=51 srv3_lag=20000000 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=2\n"
     )
+    assert result.exit_code == 2
+
+
+@pytest.fixture
+def cluster_has_replica_ko_wrong_tl(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_replica_ko_wrong_tl.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_replica_ko_wrong_tl")
+def test_cluster_has_replica_ko_wrong_tl(
+    runner: CliRunner, patroni_api: PatroniAPI
+) -> None:
+    result = runner.invoke(
+        main,
+        [
+            "-e",
+            patroni_api.endpoint,
+            "cluster_has_replica",
+            "--warning",
+            "@1",
+            "--critical",
+            "@0",
+            "--max-lag",
+            "1MB",
+        ],
+    )
+    assert (
+        result.stdout
+        == "CLUSTERHASREPLICA WARNING - healthy_replica is 1 (outside range @0:1) | healthy_replica=1;@1;@0 srv2_lag=1000000 srv2_sync=0 srv2_timeline=50 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=1\n"
+    )
+    assert result.exit_code == 1
+
+
+@pytest.fixture
+def cluster_has_replica_ko_all_replica(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_has_replica_ko_all_replica.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_has_replica_ko_all_replica")
+def test_cluster_has_replica_ko_all_replica(
+    runner: CliRunner, patroni_api: PatroniAPI
+) -> None:
+    result = runner.invoke(
+        main,
+        [
+            "-e",
+            patroni_api.endpoint,
+            "cluster_has_replica",
+            "--warning",
+            "@1",
+            "--critical",
+            "@0",
+            "--max-lag",
+            "1MB",
+        ],
+    )
+    assert (
+        result.stdout
+        == "CLUSTERHASREPLICA CRITICAL - healthy_replica is 0 (outside range @0:0) | healthy_replica=0;@1;@0 srv1_lag=0 srv1_sync=0 srv1_timeline=51 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=3\n"
+    )
+    assert result.exit_code == 2

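The fixtures in this file all branch on `old_replica_state` and route the cluster payload through `cluster_api_set_replica_running`, so the same test can emulate a Patroni older than 3.0.4, where replicas report a plain `running` state instead of `streaming` or `in archive recovery`. As a hedged sketch only (the helper's real implementation lives in the test package's `__init__`; the function body below is an assumption based on its name and call sites):

```python
import json
from pathlib import Path


def set_replica_running(cluster_file: Path, target_dir: Path) -> Path:
    """Sketch: rewrite new-style member states back to plain "running",
    emulating the cluster endpoint of an old Patroni (< 3.0.4)."""
    data = json.loads(cluster_file.read_text())
    for member in data.get("members", []):
        # Only non-primary roles carry the new, more precise states.
        if member.get("role") in ("replica", "sync_standby", "standby_leader"):
            if member.get("state") in ("streaming", "in archive recovery"):
                member["state"] = "running"
    # Write the downgraded payload next to the other temporary test data
    # and hand the new path back to the fixture.
    out = target_dir / cluster_file.name
    out.write_text(json.dumps(data))
    return out
```

The fixture then serves either the original JSON file or this rewritten copy, which is why each test can assert on both the `state_streaming` and the legacy `state_running` perfdata variants.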
@@ -1,20 +1,17 @@
 from click.testing import CliRunner
-from pytest_mock import MockerFixture

 from check_patroni.cli import main

-from .tools import my_mock
+from . import PatroniAPI


 def test_cluster_has_scheduled_action_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_scheduled_action_ok", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_scheduled_action"]
-    )
+    with patroni_api.routes({"cluster": "cluster_has_scheduled_action_ok.json"}):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "cluster_has_scheduled_action"]
+        )
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -23,14 +20,14 @@ def test_cluster_has_scheduled_action_ok(


 def test_cluster_has_scheduled_action_ko_switchover(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_scheduled_action_ko_switchover", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_scheduled_action"]
-    )
+    with patroni_api.routes(
+        {"cluster": "cluster_has_scheduled_action_ko_switchover.json"}
+    ):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "cluster_has_scheduled_action"]
+        )
     assert result.exit_code == 2
     assert (
         result.stdout
@@ -39,14 +36,14 @@ def test_cluster_has_scheduled_action_ko_switchover(


 def test_cluster_has_scheduled_action_ko_restart(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_has_scheduled_action_ko_restart", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_has_scheduled_action"]
-    )
+    with patroni_api.routes(
+        {"cluster": "cluster_has_scheduled_action_ko_restart.json"}
+    ):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "cluster_has_scheduled_action"]
+        )
     assert result.exit_code == 2
     assert (
         result.stdout

@@ -1,20 +1,17 @@
 from click.testing import CliRunner
-from pytest_mock import MockerFixture

 from check_patroni.cli import main

-from .tools import my_mock
+from . import PatroniAPI


 def test_cluster_is_in_maintenance_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_is_in_maintenance_ok", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_is_in_maintenance"]
-    )
+    with patroni_api.routes({"cluster": "cluster_is_in_maintenance_ok.json"}):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "cluster_is_in_maintenance"]
+        )
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -23,14 +20,12 @@ def test_cluster_is_in_maintenance_ok(


 def test_cluster_is_in_maintenance_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_is_in_maintenance_ko", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_is_in_maintenance"]
-    )
+    with patroni_api.routes({"cluster": "cluster_is_in_maintenance_ko.json"}):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "cluster_is_in_maintenance"]
+        )
     assert result.exit_code == 2
     assert (
         result.stdout
@@ -39,14 +34,14 @@ def test_cluster_is_in_maintenance_ko(


 def test_cluster_is_in_maintenance_ok_pause_false(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_is_in_maintenance_ok_pause_false", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_is_in_maintenance"]
-    )
+    with patroni_api.routes(
+        {"cluster": "cluster_is_in_maintenance_ok_pause_false.json"}
+    ):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "cluster_is_in_maintenance"]
+        )
     assert result.exit_code == 0
     assert (
         result.stdout

@@ -1,22 +1,33 @@
+from pathlib import Path
+from typing import Iterator, Union
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture

 from check_patroni.cli import main

-from .tools import my_mock
+from . import PatroniAPI, cluster_api_set_replica_running


+@pytest.fixture
+def cluster_node_count_ok(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_node_count_ok.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_node_count_ok")
 def test_cluster_node_count_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_node_count_ok", 200, use_old_replica_state)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "cluster_node_count"]
-    )
-    assert result.exit_code == 0
-    if use_old_replica_state:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_node_count"])
+    if old_replica_state:
         assert (
             result.output
             == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3 members=3 role_leader=1 role_replica=2 state_running=3\n"
@@ -26,19 +37,18 @@ def test_cluster_node_count_ok(
             result.output
             == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3 members=3 role_leader=1 role_replica=2 state_running=1 state_streaming=2\n"
         )
+    assert result.exit_code == 0


+@pytest.mark.usefixtures("cluster_node_count_ok")
 def test_cluster_node_count_ok_with_thresholds(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_node_count_ok", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_node_count",
             "--warning",
             "@0:1",
@@ -50,8 +60,7 @@ def test_cluster_node_count_ok_with_thresholds(
             "@0:1",
         ],
     )
-    assert result.exit_code == 0
-    if use_old_replica_state:
+    if old_replica_state:
         assert (
             result.output
             == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3;@2;@1 members=3;@1;@2 role_leader=1 role_replica=2 state_running=3\n"
@@ -61,19 +70,31 @@ def test_cluster_node_count_ok_with_thresholds(
             result.output
             == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3;@2;@1 members=3;@1;@2 role_leader=1 role_replica=2 state_running=1 state_streaming=2\n"
         )
+    assert result.exit_code == 0


+@pytest.fixture
+def cluster_node_count_healthy_warning(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_node_count_healthy_warning.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_node_count_healthy_warning")
 def test_cluster_node_count_healthy_warning(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_node_count_healthy_warning", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_node_count",
             "--healthy-warning",
             "@2",
@@ -81,8 +102,7 @@ def test_cluster_node_count_healthy_warning(
             "@0:1",
         ],
     )
-    assert result.exit_code == 1
-    if use_old_replica_state:
+    if old_replica_state:
         assert (
             result.output
             == "CLUSTERNODECOUNT WARNING - healthy_members is 2 (outside range @0:2) | healthy_members=2;@2;@1 members=2 role_leader=1 role_replica=1 state_running=2\n"
@@ -92,19 +112,31 @@ def test_cluster_node_count_healthy_warning(
             result.output
             == "CLUSTERNODECOUNT WARNING - healthy_members is 2 (outside range @0:2) | healthy_members=2;@2;@1 members=2 role_leader=1 role_replica=1 state_running=1 state_streaming=1\n"
         )
+    assert result.exit_code == 1


+@pytest.fixture
+def cluster_node_count_healthy_critical(
+    patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
+) -> Iterator[None]:
+    cluster_path: Union[str, Path] = "cluster_node_count_healthy_critical.json"
+    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
+    if old_replica_state:
+        cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
+        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
+    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
+        yield None
+
+
+@pytest.mark.usefixtures("cluster_node_count_healthy_critical")
 def test_cluster_node_count_healthy_critical(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "cluster_node_count_healthy_critical", 200, use_old_replica_state)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "cluster_node_count",
             "--healthy-warning",
             "@2",
@@ -112,24 +144,35 @@ def test_cluster_node_count_healthy_critical(
             "@0:1",
],
|
],
|
||||||
)
|
)
|
||||||
assert result.exit_code == 2
|
|
||||||
assert (
|
assert (
|
||||||
result.output
|
result.output
|
||||||
== "CLUSTERNODECOUNT CRITICAL - healthy_members is 1 (outside range @0:1) | healthy_members=1;@2;@1 members=3 role_leader=1 role_replica=2 state_running=1 state_start_failed=2\n"
|
== "CLUSTERNODECOUNT CRITICAL - healthy_members is 1 (outside range @0:1) | healthy_members=1;@2;@1 members=3 role_leader=1 role_replica=2 state_running=1 state_start_failed=2\n"
|
||||||
)
|
)
|
||||||
|
assert result.exit_code == 2
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def cluster_node_count_warning(
|
||||||
|
patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
|
||||||
|
) -> Iterator[None]:
|
||||||
|
cluster_path: Union[str, Path] = "cluster_node_count_warning.json"
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
|
||||||
|
if old_replica_state:
|
||||||
|
cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
|
||||||
|
with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
|
||||||
|
yield None
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.usefixtures("cluster_node_count_warning")
|
||||||
def test_cluster_node_count_warning(
|
def test_cluster_node_count_warning(
|
||||||
mocker: MockerFixture, use_old_replica_state: bool
|
runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool
|
||||||
) -> None:
|
) -> None:
|
||||||
runner = CliRunner()
|
|
||||||
|
|
||||||
my_mock(mocker, "cluster_node_count_warning", 200, use_old_replica_state)
|
|
||||||
result = runner.invoke(
|
result = runner.invoke(
|
||||||
main,
|
main,
|
||||||
[
|
[
|
||||||
"-e",
|
"-e",
|
||||||
"https://10.20.199.3:8008",
|
patroni_api.endpoint,
|
||||||
"cluster_node_count",
|
"cluster_node_count",
|
||||||
"--warning",
|
"--warning",
|
||||||
"@2",
|
"@2",
|
||||||
|
@ -137,8 +180,7 @@ def test_cluster_node_count_warning(
|
||||||
"@0:1",
|
"@0:1",
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
assert result.exit_code == 1
|
if old_replica_state:
|
||||||
if use_old_replica_state:
|
|
||||||
assert (
|
assert (
|
||||||
result.stdout
|
result.stdout
|
||||||
== "CLUSTERNODECOUNT WARNING - members is 2 (outside range @0:2) | healthy_members=2 members=2;@2;@1 role_leader=1 role_replica=1 state_running=2\n"
|
== "CLUSTERNODECOUNT WARNING - members is 2 (outside range @0:2) | healthy_members=2 members=2;@2;@1 role_leader=1 role_replica=1 state_running=2\n"
|
||||||
|
@ -148,19 +190,31 @@ def test_cluster_node_count_warning(
|
||||||
result.stdout
|
result.stdout
|
||||||
== "CLUSTERNODECOUNT WARNING - members is 2 (outside range @0:2) | healthy_members=2 members=2;@2;@1 role_leader=1 role_replica=1 state_running=1 state_streaming=1\n"
|
== "CLUSTERNODECOUNT WARNING - members is 2 (outside range @0:2) | healthy_members=2 members=2;@2;@1 role_leader=1 role_replica=1 state_running=1 state_streaming=1\n"
|
||||||
)
|
)
|
||||||
|
assert result.exit_code == 1
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def cluster_node_count_critical(
|
||||||
|
patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
|
||||||
|
) -> Iterator[None]:
|
||||||
|
cluster_path: Union[str, Path] = "cluster_node_count_critical.json"
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
|
||||||
|
if old_replica_state:
|
||||||
|
cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
|
||||||
|
with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
|
||||||
|
yield None
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.usefixtures("cluster_node_count_critical")
|
||||||
def test_cluster_node_count_critical(
|
def test_cluster_node_count_critical(
|
||||||
mocker: MockerFixture, use_old_replica_state: bool
|
runner: CliRunner, patroni_api: PatroniAPI
|
||||||
) -> None:
|
) -> None:
|
||||||
runner = CliRunner()
|
|
||||||
|
|
||||||
my_mock(mocker, "cluster_node_count_critical", 200, use_old_replica_state)
|
|
||||||
result = runner.invoke(
|
result = runner.invoke(
|
||||||
main,
|
main,
|
||||||
[
|
[
|
||||||
"-e",
|
"-e",
|
||||||
"https://10.20.199.3:8008",
|
patroni_api.endpoint,
|
||||||
"cluster_node_count",
|
"cluster_node_count",
|
||||||
"--warning",
|
"--warning",
|
||||||
"@2",
|
"@2",
|
||||||
|
@ -168,8 +222,51 @@ def test_cluster_node_count_critical(
|
||||||
"@0:1",
|
"@0:1",
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
assert result.exit_code == 2
|
|
||||||
assert (
|
assert (
|
||||||
result.stdout
|
result.stdout
|
||||||
== "CLUSTERNODECOUNT CRITICAL - members is 1 (outside range @0:1) | healthy_members=1 members=1;@2;@1 role_leader=1 state_running=1\n"
|
== "CLUSTERNODECOUNT CRITICAL - members is 1 (outside range @0:1) | healthy_members=1 members=1;@2;@1 role_leader=1 state_running=1\n"
|
||||||
)
|
)
|
||||||
|
assert result.exit_code == 2
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def cluster_node_count_ko_in_archive_recovery(
|
||||||
|
patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path
|
||||||
|
) -> Iterator[None]:
|
||||||
|
cluster_path: Union[str, Path] = "cluster_node_count_ko_in_archive_recovery.json"
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
|
||||||
|
if old_replica_state:
|
||||||
|
cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path)
|
||||||
|
patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
|
||||||
|
with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
|
||||||
|
yield None
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.usefixtures("cluster_node_count_ko_in_archive_recovery")
|
||||||
|
def test_cluster_node_count_ko_in_archive_recovery(
|
||||||
|
runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool
|
||||||
|
) -> None:
|
||||||
|
result = runner.invoke(
|
||||||
|
main,
|
||||||
|
[
|
||||||
|
"-e",
|
||||||
|
patroni_api.endpoint,
|
||||||
|
"cluster_node_count",
|
||||||
|
"--healthy-warning",
|
||||||
|
"@2",
|
||||||
|
"--healthy-critical",
|
||||||
|
"@0:1",
|
||||||
|
],
|
||||||
|
)
|
||||||
|
if old_replica_state:
|
||||||
|
assert (
|
||||||
|
result.stdout
|
||||||
|
== "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3;@2;@1 members=3 role_replica=2 role_standby_leader=1 state_running=3\n"
|
||||||
|
)
|
||||||
|
assert result.exit_code == 0
|
||||||
|
else:
|
||||||
|
assert (
|
||||||
|
result.stdout
|
||||||
|
== "CLUSTERNODECOUNT CRITICAL - healthy_members is 1 (outside range @0:1) | healthy_members=1;@2;@1 members=3 role_replica=2 role_standby_leader=1 state_in_archive_recovery=2 state_streaming=1\n"
|
||||||
|
)
|
||||||
|
assert result.exit_code == 2
|
||||||
|
|
|
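The `cluster_node_count` fixtures above all follow one pattern: for the duration of a test, `patroni_api.routes(...)` maps endpoint names to JSON payload files and undoes the mapping on exit. A minimal stdlib-only sketch of such a `routes()` context manager, with a hypothetical `FakePatroniAPI` class standing in for the real test helper (the class name, endpoint value, and `_routes` attribute are illustrative, not check_patroni's actual implementation):

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class FakePatroniAPI:
    """Toy stand-in for the tests' PatroniAPI helper (illustrative only)."""

    def __init__(self) -> None:
        self.endpoint = "http://127.0.0.1:8008"
        self._routes: Dict[str, str] = {}

    @contextmanager
    def routes(self, mapping: Dict[str, str]) -> Iterator[None]:
        # Install the endpoint -> payload mapping, restore the old one on exit,
        # even if the test body raises.
        saved = dict(self._routes)
        self._routes.update(mapping)
        try:
            yield
        finally:
            self._routes = saved


api = FakePatroniAPI()
with api.routes({"cluster": "cluster_node_count_warning.json"}):
    assert api._routes["cluster"] == "cluster_node_count_warning.json"
assert "cluster" not in api._routes  # mapping restored after the block
```

Restoring state in a `finally` block is what lets the fixtures stack the `with` around `yield None` and trust that one test's routes never leak into the next.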
@@ -1,16 +1,19 @@
+from pathlib import Path
+
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_is_alive_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, None, 200)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_alive"])
+def test_node_is_alive_ok(
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
+) -> None:
+    liveness = tmp_path / "liveness"
+    liveness.touch()
+    with patroni_api.routes({"liveness": liveness}):
+        result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_alive"])
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -18,11 +21,8 @@ def test_node_is_alive_ok(mocker: MockerFixture, use_old_replica_state: bool) ->
     )
 
 
-def test_node_is_alive_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, None, 404)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_alive"])
+def test_node_is_alive_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_alive"])
     assert result.exit_code == 2
     assert (
         result.stdout
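The rewritten `node_is_alive` test serves an empty `liveness` file: only the HTTP 200 matters to the check, not the payload. A stdlib-only sketch of the same touch-and-serve step, with `tempfile` standing in for pytest's `tmp_path` fixture:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as d:
    liveness = Path(d) / "liveness"
    liveness.touch()  # an empty file is enough: the route just has to exist
    assert liveness.exists()
    assert liveness.read_text() == ""  # no payload needed for a 200 response
```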
@@ -1,28 +1,37 @@
+from typing import Iterator
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_is_leader_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_leader_ok", 200)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_leader"])
+@pytest.fixture
+def node_is_leader_ok(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes(
+        {
+            "leader": "node_is_leader_ok.json",
+            "standby-leader": "node_is_leader_ok_standby_leader.json",
+        }
+    ):
+        yield None
+
+
+@pytest.mark.usefixtures("node_is_leader_ok")
+def test_node_is_leader_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_leader"])
     assert result.exit_code == 0
     assert (
         result.stdout
         == "NODEISLEADER OK - This node is a leader node. | is_leader=1;;@0\n"
     )
-
-    my_mock(mocker, "node_is_leader_ok_standby_leader", 200)
     result = runner.invoke(
         main,
-        ["-e", "https://10.20.199.3:8008", "node_is_leader", "--is-standby-leader"],
+        ["-e", patroni_api.endpoint, "node_is_leader", "--is-standby-leader"],
     )
-    print(result.stdout)
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -30,21 +39,17 @@ def test_node_is_leader_ok(mocker: MockerFixture, use_old_replica_state: bool) -
     )
 
 
-def test_node_is_leader_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_leader_ko", 503)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_leader"])
+def test_node_is_leader_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_leader"])
     assert result.exit_code == 2
     assert (
         result.stdout
         == "NODEISLEADER CRITICAL - This node is not a leader node. | is_leader=0;;@0\n"
     )
-
-    my_mock(mocker, "node_is_leader_ko_standby_leader", 503)
     result = runner.invoke(
         main,
-        ["-e", "https://10.20.199.3:8008", "node_is_leader", "--is-standby-leader"],
+        ["-e", patroni_api.endpoint, "node_is_leader", "--is-standby-leader"],
     )
     assert result.exit_code == 2
     assert (
@@ -1,20 +1,15 @@
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_is_pending_restart_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_pending_restart_ok", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"]
-    )
+def test_node_is_pending_restart_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    with patroni_api.routes({"patroni": "node_is_pending_restart_ok.json"}):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "node_is_pending_restart"]
+        )
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -22,15 +17,11 @@ def test_node_is_pending_restart_ok(
     )
 
 
-def test_node_is_pending_restart_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_pending_restart_ko", 200)
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"]
-    )
+def test_node_is_pending_restart_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    with patroni_api.routes({"patroni": "node_is_pending_restart_ko.json"}):
+        result = runner.invoke(
+            main, ["-e", patroni_api.endpoint, "node_is_pending_restart"]
+        )
     assert result.exit_code == 2
     assert (
         result.stdout
@@ -1,16 +1,13 @@
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_is_primary_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_primary_ok", 200)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_primary"])
+def test_node_is_primary_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    with patroni_api.routes({"primary": "node_is_primary_ok.json"}):
+        result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_primary"])
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -18,11 +15,8 @@ def test_node_is_primary_ok(mocker: MockerFixture, use_old_replica_state: bool)
     )
 
 
-def test_node_is_primary_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_primary_ko", 503)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_primary"])
+def test_node_is_primary_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_primary"])
     assert result.exit_code == 2
     assert (
         result.stdout
@@ -1,16 +1,27 @@
+from typing import Iterator
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_is_replica_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_replica_ok", 200)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_replica"])
+@pytest.fixture
+def node_is_replica_ok(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes(
+        {
+            k: "node_is_replica_ok.json"
+            for k in ("replica", "synchronous", "asynchronous")
+        }
+    ):
+        yield None
+
+
+@pytest.mark.usefixtures("node_is_replica_ok")
+def test_node_is_replica_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_replica"])
     assert result.exit_code == 0
     assert (
         result.stdout
@@ -18,11 +29,8 @@ def test_node_is_replica_ok(mocker: MockerFixture, use_old_replica_state: bool)
     )
 
 
-def test_node_is_replica_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_is_replica_ko", 503)
-    result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_replica"])
+def test_node_is_replica_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_replica"])
     assert result.exit_code == 2
     assert (
         result.stdout
@@ -30,15 +38,10 @@ def test_node_is_replica_ko(mocker: MockerFixture, use_old_replica_state: bool)
     )
 
 
-def test_node_is_replica_ko_lag(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
+def test_node_is_replica_ko_lag(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 503)
     result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--max-lag", "100"]
+        main, ["-e", patroni_api.endpoint, "node_is_replica", "--max-lag", "100"]
     )
     assert result.exit_code == 2
     assert (
@@ -46,12 +49,11 @@ def test_node_is_replica_ko_lag(
         == "NODEISREPLICA CRITICAL - This node is not a running replica with no noloadbalance tag and a lag under 100. | is_replica=0;;@0\n"
     )
-
-    my_mock(mocker, "node_is_replica_ok", 503)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_is_replica",
             "--is-async",
             "--max-lag",
@@ -65,15 +67,11 @@ def test_node_is_replica_ko_lag(
     )
 
 
-def test_node_is_replica_sync_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
+@pytest.mark.usefixtures("node_is_replica_ok")
+def test_node_is_replica_sync_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 200)
     result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-sync"]
+        main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-sync"]
     )
     assert result.exit_code == 0
     assert (
@@ -82,15 +80,10 @@ def test_node_is_replica_sync_ok(
     )
 
 
-def test_node_is_replica_sync_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
+def test_node_is_replica_sync_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 503)
     result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-sync"]
+        main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-sync"]
     )
     assert result.exit_code == 2
     assert (
@@ -99,15 +92,11 @@ def test_node_is_replica_sync_ko(
     )
 
 
-def test_node_is_replica_async_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
+@pytest.mark.usefixtures("node_is_replica_ok")
+def test_node_is_replica_async_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 200)
     result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-async"]
+        main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-async"]
     )
     assert result.exit_code == 0
     assert (
@@ -116,15 +105,10 @@ def test_node_is_replica_async_ok(
     )
 
 
-def test_node_is_replica_async_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
+def test_node_is_replica_async_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 503)
     result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-async"]
+        main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-async"]
     )
     assert result.exit_code == 2
     assert (
@@ -133,18 +117,14 @@ def test_node_is_replica_async_ko(
     )
 
 
-def test_node_is_replica_params(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
+@pytest.mark.usefixtures("node_is_replica_ok")
+def test_node_is_replica_params(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_is_replica",
             "--is-async",
             "--is-sync",
@@ -157,12 +137,11 @@ def test_node_is_replica_params(
     )
 
     # We don't do the check ourselves, patroni does it and changes the return code
-    my_mock(mocker, "node_is_replica_ok", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_is_replica",
             "--is-sync",
             "--max-lag",
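The new `node_is_replica_ok` fixture maps three endpoint names to one payload with a dict comprehension; expanded, the mapping looks like this (the payload file name is taken from the diff above, the mapping itself is plain Python):

```python
# Equivalent of the fixture's dict comprehension, written out in full.
routes = {
    k: "node_is_replica_ok.json"
    for k in ("replica", "synchronous", "asynchronous")
}
assert routes == {
    "replica": "node_is_replica_ok.json",
    "synchronous": "node_is_replica_ok.json",
    "asynchronous": "node_is_replica_ok.json",
}
```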
@@ -1,22 +1,25 @@
+from typing import Iterator
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_patroni_version_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_patroni_version", 200)
+@pytest.fixture(scope="module", autouse=True)
+def node_patroni_version(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes({"patroni": "node_patroni_version.json"}):
+        yield None
+
+
+def test_node_patroni_version_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_patroni_version",
             "--patroni-version",
             "2.0.2",
@@ -29,17 +32,12 @@ def test_node_patroni_version_ok(
     )
 
 
-def test_node_patroni_version_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_patroni_version", 200)
+def test_node_patroni_version_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_patroni_version",
             "--patroni-version",
             "1.0.0",
@@ -1,23 +1,30 @@
+from pathlib import Path
+from typing import Iterator
+
 import nagiosplugin
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import here, my_mock
+from . import PatroniAPI
 
 
+@pytest.fixture
+def node_tl_has_changed(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes({"patroni": "node_tl_has_changed.json"}):
+        yield None
+
+
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ok_with_timeline(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_tl_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--timeline",
             "58",
@@ -30,23 +37,22 @@ def test_node_tl_has_changed_ok_with_timeline(
     )
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ok_with_state_file(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
-    runner = CliRunner()
-
-    with open(here / "node_tl_has_changed.state_file", "w") as f:
+    state_file = tmp_path / "node_tl_has_changed.state_file"
+    with state_file.open("w") as f:
         f.write('{"timeline": 58}')
 
-    my_mock(mocker, "node_tl_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--state-file",
-            str(here / "node_tl_has_changed.state_file"),
+            str(state_file),
         ],
     )
     assert result.exit_code == 0
@@ -56,17 +62,15 @@ def test_node_tl_has_changed_ok_with_state_file(
     )
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ko_with_timeline(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_tl_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--timeline",
             "700",
@@ -79,24 +83,23 @@ def test_node_tl_has_changed_ko_with_timeline(
     )
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ko_with_state_file_and_save(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
-    runner = CliRunner()
-
-    with open(here / "node_tl_has_changed.state_file", "w") as f:
+    state_file = tmp_path / "node_tl_has_changed.state_file"
+    with state_file.open("w") as f:
         f.write('{"timeline": 700}')
 
-    my_mock(mocker, "node_tl_has_changed", 200)
     # test without saving the new tl
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--state-file",
-            str(here / "node_tl_has_changed.state_file"),
+            str(state_file),
         ],
     )
     assert result.exit_code == 2
@@ -105,7 +108,7 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
         == "NODETLHASCHANGED CRITICAL - The expected timeline was 700 got 58. | is_timeline_changed=1;;@1:1 timeline=58\n"
|
||||||
)
|
)
|
||||||
|
|
||||||
cookie = nagiosplugin.Cookie(here / "node_tl_has_changed.state_file")
|
cookie = nagiosplugin.Cookie(state_file)
|
||||||
cookie.open()
|
cookie.open()
|
||||||
new_tl = cookie.get("timeline")
|
new_tl = cookie.get("timeline")
|
||||||
cookie.close()
|
cookie.close()
|
||||||
|
@ -117,10 +120,10 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
|
||||||
main,
|
main,
|
||||||
[
|
[
|
||||||
"-e",
|
"-e",
|
||||||
"https://10.20.199.3:8008",
|
patroni_api.endpoint,
|
||||||
"node_tl_has_changed",
|
"node_tl_has_changed",
|
||||||
"--state-file",
|
"--state-file",
|
||||||
str(here / "node_tl_has_changed.state_file"),
|
str(state_file),
|
||||||
"--save",
|
"--save",
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
|
@ -130,7 +133,7 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
|
||||||
== "NODETLHASCHANGED CRITICAL - The expected timeline was 700 got 58. | is_timeline_changed=1;;@1:1 timeline=58\n"
|
== "NODETLHASCHANGED CRITICAL - The expected timeline was 700 got 58. | is_timeline_changed=1;;@1:1 timeline=58\n"
|
||||||
)
|
)
|
||||||
|
|
||||||
cookie = nagiosplugin.Cookie(here / "node_tl_has_changed.state_file")
|
cookie = nagiosplugin.Cookie(state_file)
|
||||||
cookie.open()
|
cookie.open()
|
||||||
new_tl = cookie.get("timeline")
|
new_tl = cookie.get("timeline")
|
||||||
cookie.close()
|
cookie.close()
|
||||||
|
@ -138,23 +141,22 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
|
||||||
assert new_tl == 58
|
assert new_tl == 58
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.usefixtures("node_tl_has_changed")
|
||||||
def test_node_tl_has_changed_params(
|
def test_node_tl_has_changed_params(
|
||||||
mocker: MockerFixture, use_old_replica_state: bool
|
runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
|
||||||
) -> None:
|
) -> None:
|
||||||
# This one is placed last because it seems like the exceptions are not flushed from stderr for the next tests.
|
# This one is placed last because it seems like the exceptions are not flushed from stderr for the next tests.
|
||||||
runner = CliRunner()
|
fake_state_file = tmp_path / "fake_file_name.state_file"
|
||||||
|
|
||||||
my_mock(mocker, "node_tl_has_changed", 200)
|
|
||||||
result = runner.invoke(
|
result = runner.invoke(
|
||||||
main,
|
main,
|
||||||
[
|
[
|
||||||
"-e",
|
"-e",
|
||||||
"https://10.20.199.3:8008",
|
patroni_api.endpoint,
|
||||||
"node_tl_has_changed",
|
"node_tl_has_changed",
|
||||||
"--timeline",
|
"--timeline",
|
||||||
"58",
|
"58",
|
||||||
"--state-file",
|
"--state-file",
|
||||||
str(here / "fake_file_name.state_file"),
|
str(fake_state_file),
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
assert result.exit_code == 3
|
assert result.exit_code == 3
|
||||||
|
@ -163,9 +165,7 @@ def test_node_tl_has_changed_params(
|
||||||
== "NODETLHASCHANGED UNKNOWN: click.exceptions.UsageError: Either --timeline or --state-file should be provided for this service\n"
|
== "NODETLHASCHANGED UNKNOWN: click.exceptions.UsageError: Either --timeline or --state-file should be provided for this service\n"
|
||||||
)
|
)
|
||||||
|
|
||||||
result = runner.invoke(
|
result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_tl_has_changed"])
|
||||||
main, ["-e", "https://10.20.199.3:8008", "node_tl_has_changed"]
|
|
||||||
)
|
|
||||||
assert result.exit_code == 3
|
assert result.exit_code == 3
|
||||||
assert (
|
assert (
|
||||||
result.stdout
|
result.stdout
|
||||||
|
|
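The `--state-file` handling exercised by these tests boils down to a small JSON round trip; below is a dependency-free sketch of that round trip, where the plain `json` module stands in for the `nagiosplugin.Cookie` calls the tests actually make (the temporary directory mirrors the `tmp_path` fixture):

```python
import json
import pathlib
import tempfile

# The check's --state-file is a small JSON document holding the last
# timeline seen; the tests write it directly and read it back through
# nagiosplugin.Cookie. Plain json is used here as a stand-in.
with tempfile.TemporaryDirectory() as d:
    state_file = pathlib.Path(d) / "node_tl_has_changed.state_file"
    state_file.write_text('{"timeline": 58}')
    data = json.loads(state_file.read_text())
    print(data["timeline"])  # 58
```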
@@ -1,49 +0,0 @@
-import json
-import pathlib
-from typing import Any
-
-from pytest_mock import MockerFixture
-
-from check_patroni.types import APIError, PatroniResource
-
-here = pathlib.Path(__file__).parent
-
-
-def getjson(name: str) -> Any:
-    path = here / "json" / f"{name}.json"
-    if not path.exists():
-        raise Exception(f"path does not exist : {path}")
-
-    with path.open() as f:
-        return json.load(f)
-
-
-def my_mock(
-    mocker: MockerFixture,
-    json_file: str,
-    status: int,
-    use_old_replica_state: bool = False,
-) -> None:
-    def mock_rest_api(self: PatroniResource, service: str) -> Any:
-        if status != 200:
-            raise APIError("Test en erreur pour status code 200")
-        if json_file:
-            if use_old_replica_state and (
-                json_file.startswith("cluster_has_replica")
-                or json_file.startswith("cluster_node_count")
-            ):
-                return cluster_api_set_replica_running(getjson(json_file))
-            return getjson(json_file)
-        return None
-
-    mocker.resetall()
-    mocker.patch("check_patroni.types.PatroniResource.rest_api", mock_rest_api)
-
-
-def cluster_api_set_replica_running(js: Any) -> Any:
-    # starting from 3.0.4 the state of replicas is streaming instead of running
-    for node in js["members"]:
-        if node["role"] in ["replica", "sync_standby"]:
-            if node["state"] == "streaming":
-                node["state"] = "running"
-    return js
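For reference, the replica-state normalization performed by the deleted `cluster_api_set_replica_running` helper can be sketched standalone; the function name and sample payload below are illustrative, not part of the upstream code:

```python
from typing import Any


def set_replica_running(cluster: Any) -> Any:
    # Starting from Patroni 3.0.4, healthy replicas report the state
    # "streaming" instead of "running"; the deleted helper rewrote the
    # newer state back to "running" to emulate a pre-3.0.4 API payload.
    for node in cluster["members"]:
        if node["role"] in ("replica", "sync_standby") and node["state"] == "streaming":
            node["state"] = "running"
    return cluster


sample = {"members": [{"name": "srv2", "role": "replica", "state": "streaming"}]}
print(set_replica_running(sample)["members"][0]["state"])  # running
```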
10 tox.ini
@@ -4,11 +4,9 @@ envlist = lint, mypy, py{37,38,39,310,311}
 skip_missing_interpreters = True
 
 [testenv]
-deps =
-    pytest
-    pytest-mock
+extras = test
 commands =
-    pytest {toxinidir}/check_patroni {toxinidir}/tests {posargs:-vv}
+    pytest {toxinidir}/check_patroni {toxinidir}/tests {posargs:-vv --log-level=debug}
 
 [testenv:lint]
 skip_install = True
@@ -18,7 +16,7 @@ deps =
     flake8
     isort
 commands =
-    codespell {toxinidir}/check_patroni {toxinidir}/tests
+    codespell {toxinidir}/check_patroni {toxinidir}/tests {toxinidir}/docs/ {toxinidir}/RELEASE.md {toxinidir}/CONTRIBUTING.md
     black --check --diff {toxinidir}/check_patroni {toxinidir}/tests
     flake8 {toxinidir}/check_patroni {toxinidir}/tests
     isort --check --diff {toxinidir}/check_patroni {toxinidir}/tests
@@ -28,7 +26,7 @@ deps =
     mypy == 0.961
 commands =
     # we need to install types-requests
-    mypy --install-types --non-interactive {toxinidir}/check_patroni
+    mypy --install-types --non-interactive
 
 [testenv:build]
 deps =
@@ -100,7 +100,7 @@ http://$IP/icingaweb2/setup
 
 Finish
 
-* Screen 15: Hopefuly success
+* Screen 15: Hopefully success
 
 Login
 
@@ -66,7 +66,7 @@ icinga_setup(){
     info "# Icinga setup"
     info "#============================================================================="
 
-    ## this part is already done by the standart icinga install with the user icinga2
+    ## this part is already done by the standard icinga install with the user icinga2
     ## and a random password, here we dont really care
 
     cat << __EOF__ | sudo -u postgres psql
@@ -83,7 +83,7 @@ __EOF__
     icingacli setup config directory --group icingaweb2
     icingacli setup token create
 
-    ## this part is already done by the standart icinga install with the user icinga2
+    ## this part is already done by the standard icinga install with the user icinga2
     cat << __EOF__ > /etc/icinga2/features-available/ido-pgsql.conf
 /**
  * The db_ido_pgsql library implements IDO functionality
@@ -198,7 +198,7 @@ grafana(){
     cat << __EOF__ > /etc/grafana/grafana.ini
 [database]
 # You can configure the database connection by specifying type, host, name, user and password
-# as seperate properties or as on string using the url propertie.
+# as separate properties or as on string using the url property.
 
 # Either "mysql", "postgres" or "sqlite3", it's your choice
 type = postgres