diff --git a/.coveragerc b/.coveragerc new file mode 100644 index 0000000..94c5571 --- /dev/null +++ b/.coveragerc @@ -0,0 +1,3 @@ +[run] +include = + check_patroni/* diff --git a/.gitignore b/.gitignore index 2b5499f..f577c9b 100644 --- a/.gitignore +++ b/.gitignore @@ -1,10 +1,10 @@ __pycache__/ check_patroni.egg-info -tests/*.state_file tests/config.ini vagrant/.vagrant vagrant/*.state_file .*.swp +.coverage .venv/ .tox/ dist/ diff --git a/CHANGELOG.md b/CHANGELOG.md index 6ad4036..2c56f52 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,13 +1,37 @@ # Change log -## Unreleased +## check_patroni 2.0.0 - 2024-04-09 + +### Changed + +* In `cluster_node_count`, a healthy standby, sync replica or standby leader cannot be "in + archive recovery" because this service doesn't check for lag and timelines. ### Added +* Add the timeline in the `cluster_has_replica` perfstats. (#50) +* Add a mention about shell completion support and shell versions in the doc. (#53) +* Add the leader type and whether it's archiving to the `cluster_has_leader` perfstats. (#58) + ### Fixed +* Add compatibility with [requests](https://requests.readthedocs.io) + version 2.25 and higher. +* Fix what `cluster_has_replica` deems a healthy replica. (#50, reported by @mbanck) +* Fix `cluster_has_replica` to display perfstats for replicas whenever it's possible (healthy or not). (#50) +* Fix `cluster_has_leader` to correctly check for standby leaders. (#58, reported by @mbanck) +* Fix `cluster_node_count` to correctly manage replication states. (#50, reported by @mbanck) + ### Misc +* Improve the documentation for `node_is_replica`. +* Improve test coverage by running an HTTP server to fake the Patroni API (#55 + by @dlax). +* Work around old pytest versions in type annotations in the test suite. +* Declare compatibility with click version 7.1 (or higher). +* In tests, work around nagiosplugin 1.3.2 not properly handling stdout + redirection. + ## check_patroni 1.0.0 - 2023-08-28 Check patroni is now tagged as Production/Stable. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 28b9a33..6b4d2f3 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -43,15 +43,14 @@ A vagrant file can be found in [this repository](https://github.com/ioguix/vagrant-patroni) to generate a patroni/etcd setup. -The `README.md` can be geneated with `./docs/make_readme.sh`. +The `README.md` can be generated with `./docs/make_readme.sh`. ## Executing Tests Crafting repeatable tests using a live Patroni cluster can be intricate. To -simplify the development process, interactions with Patroni's API are -substituted with a mock function that yields an HTTP return code and a JSON -object outlining the cluster's status. The JSON files containing this -information are housed in the `./tests/json` directory. +simplify the development process, a fake HTTP server is set up as a test +fixture and serves static files (either from `tests/json` directory or from +in-memory data). An important consideration is that there is a potential drawback: if the JSON data is incorrect or if modifications have been made to Patroni without @@ -61,21 +60,15 @@ erroneously. The tests are executed automatically for each PR using the CI (see `.github/workflow/lint.yml` and `.github/workflow/tests.yml`). 
-Running the tests manually: +Running the tests, -* Using patroni's nominal replica state of `streaming` (since v3.0.4): +* manually: ```bash - pytest ./tests + pytest --cov tests ``` -* Using patroni's nominal replica state of `running` (before v3.0.4): - - ```bash - pytest --use-old-replica-state ./tests - ``` - -* Using tox: +* or using tox: ```bash tox -e lint # mypy + flake8 + black + isort + codespell @@ -83,9 +76,9 @@ Running the tests manually: tox -e py # pytests and "lint" tests for the default version of python ``` -Please note that when dealing with any service that checks the state of a node -in patroni's `cluster` endpoint, the corresponding JSON test file must be added -in `./tests/tools.py`. +Please note that when dealing with any service that checks the state of a node, +the related tests must use the `old_replica_state` fixture to test with both +old (pre 3.0.4) and new replica states. A bash script, `check_patroni.sh`, is provided to facilitate testing all services on a Patroni endpoint (`./vagrant/check_patroni.sh`). It requires one @@ -99,17 +92,3 @@ Here's an example usage: ```bash ./vagrant/check_patroni.sh http://10.20.30.51:8008 ``` - -## Release - -Update the Changelog. - -The package is generated and uploaded to pypi when a `v*` tag is created (see -`.github/workflow/publish.yml`). - -Alternatively, the release can be done manually with: - -``` -tox -e build -tox -e upload -``` diff --git a/MANIFEST.in b/MANIFEST.in index 03ee4c6..ebf59fa 100644 --- a/MANIFEST.in +++ b/MANIFEST.in @@ -2,6 +2,7 @@ include *.md include mypy.ini include pytest.ini include tox.ini +include .coveragerc include .flake8 include pyproject.toml recursive-include docs *.sh diff --git a/README.md b/README.md index af726eb..05f2cac 100644 --- a/README.md +++ b/README.md @@ -45,7 +45,7 @@ Commands: node_is_leader Check if the node is a leader node. node_is_pending_restart Check if the node is in pending restart... node_is_primary Check if the node is the primary with the... - node_is_replica Check if the node is a running replica... + node_is_replica Check if the node is a replica with no... node_patroni_version Check if the version is equal to the input node_tl_has_changed Check if the timeline has changed. ``` @@ -60,7 +60,7 @@ $ pip install git+https://github.com/dalibo/check_patroni.git check_patroni works on python 3.6, we keep it that way because patroni also supports it and there are still lots of RH 7 variants around. That being said -python 3.6 has been EOL for age and there is no support for it in the github +python 3.6 has been EOL for ages and there is no support for it in the github CI. ## Support @@ -98,8 +98,8 @@ A match is found when: `start <= VALUE <= end`. For example, the following command will raise: -* a warning if there is less than 1 nodes, wich can be translated to outside of range [2;+INF[ -* a critical if there are no nodes, wich can be translated to outside of range [1;+INF[ +* a warning if there are fewer than 2 nodes, which can be translated to outside of range [2;+INF[ +* a critical if there are no nodes, which can be translated to outside of range [1;+INF[ ``` check_patroni -e https://10.20.199.3:8008 cluster_has_replica --warning 2: --critical 1: ``` @@ -115,6 +115,30 @@ Several options are available: * `--cert_file`: your certificate or the concatenation of your certificate and private key * `--key_file`: your private key (optional) +## Shell completion + +We use the [click] library which supports shell completion natively. 
+ +Shell completion can be added by typing the following command or adding it to +a file specific to your shell of choice. + +* for Bash (add to `~/.bashrc`): + ``` + eval "$(_CHECK_PATRONI_COMPLETE=bash_source check_patroni)" + ``` +* for Zsh (add to `~/.zshrc`): + ``` + eval "$(_CHECK_PATRONI_COMPLETE=zsh_source check_patroni)" + ``` +* for Fish (add to `~/.config/fish/completions/check_patroni.fish`): + ``` + eval "$(_CHECK_PATRONI_COMPLETE=fish_source check_patroni)" + ``` + +Please note that shell completion is not supported for all shell versions, for +example only Bash versions 4.4 and above are supported. + +[click]: https://click.palletsprojects.com/en/8.1.x/shell-completion/ ## Cluster services @@ -152,11 +176,27 @@ Usage: check_patroni cluster_has_leader [OPTIONS] This check applies to any kind of leaders including standby leaders. + A leader is a node with the "leader" role and a "running" state. + + A standby leader is a node with a "standby_leader" role and a "streaming" or + "in archive recovery" state. Please note that log shipping could be stuck + because the WAL files are not available or applicable. Patroni doesn't provide + information about the origin cluster (timeline or lag), so we cannot check + if there is a problem in that particular case. That's why we issue a warning + when the node is "in archive recovery". We suggest using other supervision + tools to do this (e.g. check_pgactivity). + Check: * `OK`: if there is a leader node. - * `CRITICAL`: otherwise + * `WARNING`: if there is a standby leader in archive recovery. + * `CRITICAL`: otherwise. - Perfdata: `has_leader` is 1 if there is a leader node, 0 otherwise + Perfdata: + * `has_leader` is 1 if there is any kind of leader node, 0 otherwise + * `is_standby_leader_in_arc_rec` is 1 if the standby leader node is "in + archive recovery", 0 otherwise + * `is_standby_leader` is 1 if there is a standby leader node, 0 otherwise + * `is_leader` is 1 if there is a "classical" leader node, 0 otherwise Options: --help Show this message and exit. @@ -169,10 +209,27 @@ Usage: check_patroni cluster_has_replica [OPTIONS] Check if the cluster has healthy replicas and/or if some are sync standbies + For patroni (and this check): + * a replica is `streaming` if the `pg_stat_wal_receiver` says so. + * a replica is `in archive recovery` if it's not `streaming` and has a `restore_command`. + A healthy replica: - * is in running or streaming state (V3.0.4) - * has a replica or sync_standby role - * has a lag lower or equal to max_lag + * has a `replica` or `sync_standby` role + * has the same timeline as the leader and + * is in `running` state (patroni < V3.0.4) + * is in `streaming` or `in archive recovery` state (patroni >= V3.0.4) + * has a lag lower than or equal to `max_lag` + + Please note that a replica `in archive recovery` could be stuck because the + WAL files are not available or applicable (the server's timeline has diverged from + the leader's). We already detect the latter but we will miss the former. + Therefore, it's preferable to check for the lag in addition to the healthy + state if you rely on log shipping to help lagging standbies catch up. + + Since we require a healthy replica to have the same timeline as the leader, + it's possible that we raise alerts when the cluster is performing a + switchover or failover and the standbies are in the process of catching up + with the new leader. The alert shouldn't last long. Check: * `OK`: if the healthy_replica count and their lag are compatible with the replica count threshold. 
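(Editorial aside: the healthy-replica rules above condense into a small predicate. The sketch below is illustrative only, not check_patroni's actual code; the helper name and arguments are hypothetical, and `member` is assumed to be one entry of Patroni's `/cluster` response.)

```python
from typing import Any, Dict, Optional


def is_healthy_replica(
    member: Dict[str, Any],
    leader_tl: int,
    max_lag: Optional[int],
    detailed_states: bool,  # True when Patroni >= 3.0.4
) -> bool:
    if member["role"] not in ("replica", "sync_standby"):
        return False
    if int(member["timeline"]) != leader_tl:
        return False  # must be on the leader's timeline
    if detailed_states:
        if member["state"] not in ("streaming", "in archive recovery"):
            return False
    elif member["state"] != "running":
        return False
    # finally, the replica must not lag beyond the configured maximum
    return max_lag is None or int(member["lag"]) <= max_lag
```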
@@ -182,8 +239,9 @@ Usage: check_patroni cluster_has_replica [OPTIONS] Perfdata: * healthy_replica & unhealthy_replica count * the number of sync_replica, they are included in the previous count - * the lag of each replica labelled with "member name"_lag - * a boolean to tell if the node is a sync stanbdy labelled with "member name"_sync + * the lag of each replica labelled with "member name"_lag + * the timeline of each replica labelled with "member name"_timeline + * a boolean to tell if the node is a sync standby labelled with "member name"_sync Options: -w, --warning TEXT Warning threshold for the number of healthy replica @@ -241,26 +299,37 @@ Usage: check_patroni cluster_node_count [OPTIONS] Count the number of nodes in the cluster. + The role refers to the role of the server in the cluster. Possible values + are: + * master or leader + * replica + * standby_leader + * sync_standby + * demoted + * promoted + * uninitialized + The state refers to the state of PostgreSQL. Possible values are: * initializing new cluster, initdb failed * running custom bootstrap script, custom bootstrap failed * starting, start failed * restarting, restart failed - * running, streaming (for a replica V3.0.4) + * running, streaming, in archive recovery * stopping, stopped, stop failed * creating replica * crashed - The role refers to the role of the server in the cluster. Possible values - are: - * master or leader (V3.0.0+) - * replica - * demoted - * promoted - * uninitialized + The "healthy" check only ensures that: + * a leader has the running state + * a standby_leader has the running or streaming (V3.0.4) state + * a replica or sync_standby has the running or streaming (V3.0.4) state + + Since we don't check the lag or timeline, "in archive recovery" is not + considered a valid state for this service. See cluster_has_leader and + cluster_has_replica for specialized checks. Check: - * Compares the number of nodes against the normal and healthy (running + streaming) nodes warning and critical thresholds. + * Compares the number of nodes against the normal and healthy nodes warning and critical thresholds. * `OK`: If they are not provided. Perfdata: @@ -307,7 +376,7 @@ Usage: check_patroni node_is_pending_restart [OPTIONS] Check if the node is in pending restart state. - This situation can arise if the configuration has been modified but requiers + This situation can arise if the configuration has been modified but requires a restart of PostgreSQL to take effect. Check: @@ -368,12 +437,21 @@ Options: ``` Usage: check_patroni node_is_replica [OPTIONS] - Check if the node is a running replica with no noloadbalance tag. + Check if the node is a replica with no noloadbalance tag. It is possible to check if the node is synchronous or asynchronous. If - nothing is specified any kind of replica is accepted. When checking for a + nothing is specified any kind of replica is accepted. When checking for a synchronous replica, it's not possible to specify a lag. + This service uses the following Patroni endpoints: replica, asynchronous + and synchronous. The first two implement the `lag` tag. For these endpoints + the state of a replica node doesn't reflect the replication state + (`streaming` or `in archive recovery`); we only know if it's `running`. The + timeline is also not checked. + + Therefore, if a cluster is using asynchronous replication, it is recommended + to check for the lag to detect a divergence as soon as possible. 
+ Check: * `OK`: if the node is a running replica with no noloadbalance tag and the lag is under the maximum threshold. * `CRITICAL`: otherwise diff --git a/RELEASE.md b/RELEASE.md new file mode 100644 index 0000000..822e65d --- /dev/null +++ b/RELEASE.md @@ -0,0 +1,38 @@ +# Release HOW TO + +## Preparatory changes + +* Review the **Unreleased** section, if any, in `CHANGELOG.md` possibly adding + any missing items from closed issues, merged pull requests, or directly from the + git history[^git-changes], +* Rename the **Unreleased** section according to the version to be released, + with a date, +* Bump the version in `check_patroni/__init__.py`, +* Rebuild the `README.md` (`cd docs; ./make_readme.sh`), +* Commit these changes (either on a dedicated branch, before submitting a pull + request or directly on the `master` branch) with the commit message `release + X.Y.Z`. +* Then, once the changes have landed in the `master` branch, create an annotated (and + possibly signed) tag, as `git tag -a [-s] -m 'release X.Y.Z' vX.Y.Z`, + and, +* Push with `--follow-tags`. + +[^git-changes]: Use `git log $(git describe --tags --abbrev=0).. --format=%s + --reverse` to get commits from the previous tag. + +## PyPI package + +The package is generated and uploaded to pypi when a `v*` tag is created (see +`.github/workflow/publish.yml`). + +Alternatively, the release can be done manually with: + +``` +tox -e build +tox -e upload +``` + +## GitHub release + +Draft a new release from the release page, choosing the tag just pushed and +copying the relevant change log section as the description. diff --git a/check_patroni/__init__.py b/check_patroni/__init__.py index bd406c9..edb8bfa 100644 --- a/check_patroni/__init__.py +++ b/check_patroni/__init__.py @@ -1,5 +1,5 @@ import logging -__version__ = "1.0.0" +__version__ = "2.0.0" _log: logging.Logger = logging.getLogger(__name__) diff --git a/check_patroni/cli.py b/check_patroni/cli.py index d249219..81628d6 100644 --- a/check_patroni/cli.py +++ b/check_patroni/cli.py @@ -226,29 +226,40 @@ def cluster_node_count( ) -> None: """Count the number of nodes in the cluster. - \b - The state refers to the state of PostgreSQL. Possible values are: - * initializing new cluster, initdb failed - * running custom bootstrap script, custom bootstrap failed - * starting, start failed - * restarting, restart failed - * running, streaming (for a replica V3.0.4) - * stopping, stopped, stop failed - * creating replica - * crashed - \b The role refers to the role of the server in the cluster. Possible values are: - * master or leader (V3.0.0+) + * master or leader * replica + * standby_leader + * sync_standby * demoted * promoted * uninitialized + \b + The state refers to the state of PostgreSQL. Possible values are: + * initializing new cluster, initdb failed + * running custom bootstrap script, custom bootstrap failed + * starting, start failed + * restarting, restart failed + * running, streaming, in archive recovery + * stopping, stopped, stop failed + * creating replica + * crashed + + \b + The "healthy" check only ensures that: + * a leader has the running state + * a standby_leader has the running or streaming (V3.0.4) state + * a replica or sync_standby has the running or streaming (V3.0.4) state + + Since we don't check the lag or timeline, "in archive recovery" is not considered a valid state + for this service. See cluster_has_leader and cluster_has_replica for specialized checks. 
+ \b Check: - * Compares the number of nodes against the normal and healthy (running + streaming) nodes warning and critical thresholds. + * Compares the number of nodes against the normal and healthy nodes warning and critical thresholds. * `OK`: If they are not provided. \b @@ -285,17 +296,38 @@ def cluster_has_leader(ctx: click.Context) -> None: This check applies to any kind of leaders including standby leaders. + A leader is a node with the "leader" role and a "running" state. + + A standby leader is a node with a "standby_leader" role and a "streaming" + or "in archive recovery" state. Please note that log shipping could be + stuck because the WAL files are not available or applicable. Patroni doesn't + provide information about the origin cluster (timeline or lag), so we + cannot check if there is a problem in that particular case. That's why we + issue a warning when the node is "in archive recovery". We suggest using + other supervision tools to do this (e.g. check_pgactivity). + \b Check: * `OK`: if there is a leader node. - * `CRITICAL`: otherwise + * `WARNING`: if there is a standby leader in archive recovery. + * `CRITICAL`: otherwise. + + \b + Perfdata: + * `has_leader` is 1 if there is any kind of leader node, 0 otherwise + * `is_standby_leader_in_arc_rec` is 1 if the standby leader node is "in + archive recovery", 0 otherwise + * `is_standby_leader` is 1 if there is a standby leader node, 0 otherwise + * `is_leader` is 1 if there is a "classical" leader node, 0 otherwise - Perfdata: `has_leader` is 1 if there is a leader node, 0 otherwise """ check = nagiosplugin.Check() check.add( ClusterHasLeader(ctx.obj.connection_info), nagiosplugin.ScalarContext("has_leader", None, "@0:0"), + nagiosplugin.ScalarContext("is_standby_leader_in_arc_rec", "@1:1", None), + nagiosplugin.ScalarContext("is_leader", None, None), + nagiosplugin.ScalarContext("is_standby_leader", None, None), ClusterHasLeaderSummary(), ) check.main(verbose=ctx.obj.verbose, timeout=ctx.obj.timeout) @@ -341,11 +373,29 @@ def cluster_has_replica( ) -> None: """Check if the cluster has healthy replicas and/or if some are sync standbies + \b + For patroni (and this check): + * a replica is `streaming` if the `pg_stat_wal_receiver` says so. + * a replica is `in archive recovery` if it's not `streaming` and has a `restore_command`. + \b A healthy replica: - * is in running or streaming state (V3.0.4) - * has a replica or sync_standby role - * has a lag lower or equal to max_lag + * has a `replica` or `sync_standby` role + * has the same timeline as the leader and + * is in `running` state (patroni < V3.0.4) + * is in `streaming` or `in archive recovery` state (patroni >= V3.0.4) + * has a lag lower than or equal to `max_lag` + + Please note that a replica `in archive recovery` could be stuck because the WAL + files are not available or applicable (the server's timeline has diverged from the + leader's). We already detect the latter but we will miss the former. + Therefore, it's preferable to check for the lag in addition to the healthy + state if you rely on log shipping to help lagging standbies catch up. + + Since we require a healthy replica to have the same timeline as the + leader, it's possible that we raise alerts when the cluster is performing a + switchover or failover and the standbies are in the process of catching up with + the new leader. The alert shouldn't last long. 
\b Check: @@ -357,8 +407,9 @@ def cluster_has_replica( Perfdata: * healthy_replica & unhealthy_replica count * the number of sync_replica, they are included in the previous count - * the lag of each replica labelled with "member name"_lag - * a boolean to tell if the node is a sync stanbdy labelled with "member name"_sync + * the lag of each replica labelled with "member name"_lag + * the timeline of each replica labelled with "member name"_timeline + * a boolean to tell if the node is a sync standby labelled with "member name"_sync """ tmax_lag = size_to_byte(max_lag) if max_lag is not None else None @@ -377,6 +428,7 @@ def cluster_has_replica( ), nagiosplugin.ScalarContext("unhealthy_replica"), nagiosplugin.ScalarContext("replica_lag"), + nagiosplugin.ScalarContext("replica_timeline"), nagiosplugin.ScalarContext("replica_sync"), ) check.main(verbose=ctx.obj.verbose, timeout=ctx.obj.timeout) @@ -569,10 +621,20 @@ def node_is_leader(ctx: click.Context, check_standby_leader: bool) -> None: def node_is_replica( ctx: click.Context, max_lag: str, check_is_sync: bool, check_is_async: bool ) -> None: - """Check if the node is a running replica with no noloadbalance tag. + """Check if the node is a replica with no noloadbalance tag. - It is possible to check if the node is synchronous or asynchronous. If nothing is specified any kind of replica is accepted. - When checking for a synchronous replica, it's not possible to specify a lag. + It is possible to check if the node is synchronous or asynchronous. If + nothing is specified any kind of replica is accepted. When checking for a + synchronous replica, it's not possible to specify a lag. + + This service uses the following Patroni endpoints: replica, asynchronous + and synchronous. The first two implement the `lag` tag. For these endpoints + the state of a replica node doesn't reflect the replication state + (`streaming` or `in archive recovery`); we only know if it's `running`. The + timeline is also not checked. + + Therefore, if a cluster is using asynchronous replication, it is + recommended to check for the lag to detect a divergence as soon as possible. \b Check: @@ -610,7 +672,7 @@ def node_is_pending_restart(ctx: click.Context) -> None: """Check if the node is in pending restart state. This situation can arise if the configuration has been modified but - requiers a restart of PostgreSQL to take effect. + requires a restart of PostgreSQL to take effect. 
\b Check: diff --git a/check_patroni/cluster.py b/check_patroni/cluster.py index eed5325..4598300 100644 --- a/check_patroni/cluster.py +++ b/check_patroni/cluster.py @@ -1,7 +1,7 @@ import hashlib import json from collections import Counter -from typing import Iterable, Union +from typing import Any, Iterable, Union import nagiosplugin @@ -14,25 +14,52 @@ def replace_chars(text: str) -> str: class ClusterNodeCount(PatroniResource): - def probe(self: "ClusterNodeCount") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: + def debug_member(member: Any, health: str) -> None: + _log.debug( + "Node %(node_name)s is %(health)s: role %(role)s state %(state)s.", + { + "node_name": member["name"], + "health": health, + "role": member["role"], + "state": member["state"], + }, + ) + + # get the cluster info item_dict = self.rest_api("cluster") + role_counters: Counter[str] = Counter() roles = [] status_counters: Counter[str] = Counter() statuses = [] + healthy_member = 0 for member in item_dict["members"]: - roles.append(replace_chars(member["role"])) - statuses.append(replace_chars(member["state"])) + state, role = member["state"], member["role"] + roles.append(replace_chars(role)) + statuses.append(replace_chars(state)) + + if role == "leader" and state == "running": + healthy_member += 1 + debug_member(member, "healthy") + continue + + if role in ["standby_leader", "replica", "sync_standby"] and ( + (self.has_detailed_states() and state == "streaming") + or (not self.has_detailed_states() and state == "running") + ): + healthy_member += 1 + debug_member(member, "healthy") + continue + + debug_member(member, "unhealthy") role_counters.update(roles) status_counters.update(statuses) # The actual check: members, healthy_members yield nagiosplugin.Metric("members", len(item_dict["members"])) - yield nagiosplugin.Metric( - "healthy_members", - status_counters["running"] + status_counters.get("streaming", 0), - ) + yield nagiosplugin.Metric("healthy_members", healthy_member) # The performance data : role for role in role_counters: @@ -48,74 +75,149 @@ class ClusterNodeCount(PatroniResource): class ClusterHasLeader(PatroniResource): - def probe(self: "ClusterHasLeader") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("cluster") is_leader_found = False + is_standby_leader_found = False + is_standby_leader_in_arc_rec = False for member in item_dict["members"]: - if ( - member["role"] in ("leader", "standby_leader") - and member["state"] == "running" - ): + if member["role"] == "leader" and member["state"] == "running": is_leader_found = True break + if member["role"] == "standby_leader": + if member["state"] not in ["streaming", "in archive recovery"]: + # for patroni >= 3.0.4 any state would be wrong + # for patroni < 3.0.4 a state different from running would be wrong + if self.has_detailed_states() or member["state"] != "running": + continue + + if member["state"] in ["in archive recovery"]: + is_standby_leader_in_arc_rec = True + + is_standby_leader_found = True + break return [ nagiosplugin.Metric( "has_leader", + 1 if is_leader_found or is_standby_leader_found else 0, + ), + nagiosplugin.Metric( + "is_standby_leader_in_arc_rec", + 1 if is_standby_leader_in_arc_rec else 0, + ), + nagiosplugin.Metric( + "is_standby_leader", + 1 if is_standby_leader_found else 0, + ), + nagiosplugin.Metric( + "is_leader", 1 if is_leader_found else 0, - ) + ), ] class ClusterHasLeaderSummary(nagiosplugin.Summary): - def 
ok(self: "ClusterHasLeaderSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return "The cluster has a running leader." @handle_unknown - def problem(self: "ClusterHasLeaderSummary", results: nagiosplugin.Result) -> str: - return "The cluster has no running leader." + def problem(self, results: nagiosplugin.Result) -> str: + return "The cluster has no running leader or the standby leader is in archive recovery." class ClusterHasReplica(PatroniResource): - def __init__( - self: "ClusterHasReplica", - connection_info: ConnectionInfo, - max_lag: Union[int, None], - ): + def __init__(self, connection_info: ConnectionInfo, max_lag: Union[int, None]): super().__init__(connection_info) self.max_lag = max_lag - def probe(self: "ClusterHasReplica") -> Iterable[nagiosplugin.Metric]: - item_dict = self.rest_api("cluster") + def probe(self) -> Iterable[nagiosplugin.Metric]: + def debug_member(member: Any, health: str) -> None: + _log.debug( + "Node %(node_name)s is %(health)s: lag %(lag)s, state %(state)s, tl %(tl)s.", + { + "node_name": member["name"], + "health": health, + "lag": member["lag"], + "state": member["state"], + "tl": member["timeline"], + }, + ) + + # get the cluster info + cluster_item_dict = self.rest_api("cluster") replicas = [] healthy_replica = 0 unhealthy_replica = 0 sync_replica = 0 - for member in item_dict["members"]: - # FIXME are there other acceptable states + leader_tl = None + + # Look for replicas + for member in cluster_item_dict["members"]: if member["role"] in ["replica", "sync_standby"]: - # patroni 3.0.4 changed the standby state from running to streaming - if ( - member["state"] in ["running", "streaming"] - and member["lag"] != "unknown" - ): + if member["lag"] == "unknown": + # This could happen if the node is stopped + # nagiosplugin doesn't handle strings in perfstats + # so we have to ditch all the stats in that case + debug_member(member, "unhealthy") + unhealthy_replica += 1 + continue + else: replicas.append( { "name": member["name"], "lag": member["lag"], + "timeline": member["timeline"], "sync": 1 if member["role"] == "sync_standby" else 0, } ) - if member["role"] == "sync_standby": - sync_replica += 1 + # Get the leader tl if we haven't already + if leader_tl is None: + # If there are no leaders, we will loop here for all + # members because leader_tl will remain None. It's not + # a big deal since having no leader is rare. 
+ for tmember in cluster_item_dict["members"]: + if tmember["role"] == "leader": + leader_tl = int(tmember["timeline"]) + break - if self.max_lag is None or self.max_lag >= int(member["lag"]): - healthy_replica += 1 - continue - unhealthy_replica += 1 + _log.debug( + "Patroni's leader_timeline is %(leader_tl)s", + { + "leader_tl": leader_tl, + }, + ) + + # Test for an unhealthy replica + if ( + self.has_detailed_states() + and not ( + member["state"] in ["streaming", "in archive recovery"] + and int(member["timeline"]) == leader_tl + ) + ) or ( + not self.has_detailed_states() + and not ( + member["state"] == "running" + and int(member["timeline"]) == leader_tl + ) + ): + debug_member(member, "unhealthy") + unhealthy_replica += 1 + continue + + if member["role"] == "sync_standby": + sync_replica += 1 + + if self.max_lag is None or self.max_lag >= int(member["lag"]): + debug_member(member, "healthy") + healthy_replica += 1 + else: + debug_member(member, "unhealthy") + unhealthy_replica += 1 # The actual check yield nagiosplugin.Metric("healthy_replica", healthy_replica) @@ -127,6 +229,11 @@ class ClusterHasReplica(PatroniResource): yield nagiosplugin.Metric( f"{replica['name']}_lag", replica["lag"], context="replica_lag" ) + yield nagiosplugin.Metric( + f"{replica['name']}_timeline", + replica["timeline"], + context="replica_timeline", + ) yield nagiosplugin.Metric( f"{replica['name']}_sync", replica["sync"], context="replica_sync" ) @@ -140,7 +247,7 @@ class ClusterHasReplica(PatroniResource): class ClusterConfigHasChanged(PatroniResource): def __init__( - self: "ClusterConfigHasChanged", + self, connection_info: ConnectionInfo, config_hash: str, # Always contains the old hash state_file: str, # Only used to update the hash in the state_file (when needed) @@ -151,7 +258,7 @@ class ClusterConfigHasChanged(PatroniResource): self.config_hash = config_hash self.save = save - def probe(self: "ClusterConfigHasChanged") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("config") new_hash = hashlib.md5(json.dumps(item_dict).encode()).hexdigest() @@ -183,23 +290,21 @@ class ClusterConfigHasChanged(PatroniResource): class ClusterConfigHasChangedSummary(nagiosplugin.Summary): - def __init__(self: "ClusterConfigHasChangedSummary", config_hash: str) -> None: + def __init__(self, config_hash: str) -> None: self.old_config_hash = config_hash # Note: It would be helpful to display the old / new hash here. Unfortunately, it's not a metric. # So we only have the old / expected one. - def ok(self: "ClusterConfigHasChangedSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return f"The hash of patroni's dynamic configuration has not changed ({self.old_config_hash})." @handle_unknown - def problem( - self: "ClusterConfigHasChangedSummary", results: nagiosplugin.Result - ) -> str: + def problem(self, results: nagiosplugin.Result) -> str: return f"The hash of patroni's dynamic configuration has changed. The old hash was {self.old_config_hash}." 
class ClusterIsInMaintenance(PatroniResource): - def probe(self: "ClusterIsInMaintenance") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("cluster") # The actual check @@ -212,7 +317,7 @@ class ClusterIsInMaintenance(PatroniResource): class ClusterHasScheduledAction(PatroniResource): - def probe(self: "ClusterIsInMaintenance") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("cluster") scheduled_switchover = 0 diff --git a/check_patroni/node.py b/check_patroni/node.py index df50cff..27f93f0 100644 --- a/check_patroni/node.py +++ b/check_patroni/node.py @@ -7,7 +7,7 @@ from .types import APIError, ConnectionInfo, PatroniResource, handle_unknown class NodeIsPrimary(PatroniResource): - def probe(self: "NodeIsPrimary") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: try: self.rest_api("primary") except APIError: @@ -16,24 +16,22 @@ class NodeIsPrimary(PatroniResource): class NodeIsPrimarySummary(nagiosplugin.Summary): - def ok(self: "NodeIsPrimarySummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return "This node is the primary with the leader lock." @handle_unknown - def problem(self: "NodeIsPrimarySummary", results: nagiosplugin.Result) -> str: + def problem(self, results: nagiosplugin.Result) -> str: return "This node is not the primary with the leader lock." class NodeIsLeader(PatroniResource): def __init__( - self: "NodeIsLeader", - connection_info: ConnectionInfo, - check_is_standby_leader: bool, + self, connection_info: ConnectionInfo, check_is_standby_leader: bool ) -> None: super().__init__(connection_info) self.check_is_standby_leader = check_is_standby_leader - def probe(self: "NodeIsLeader") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: apiname = "leader" if self.check_is_standby_leader: apiname = "standby-leader" @@ -46,26 +44,23 @@ class NodeIsLeader(PatroniResource): class NodeIsLeaderSummary(nagiosplugin.Summary): - def __init__( - self: "NodeIsLeaderSummary", - check_is_standby_leader: bool, - ) -> None: + def __init__(self, check_is_standby_leader: bool) -> None: if check_is_standby_leader: self.leader_kind = "standby leader" else: self.leader_kind = "leader" - def ok(self: "NodeIsLeaderSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return f"This node is a {self.leader_kind} node." @handle_unknown - def problem(self: "NodeIsLeaderSummary", results: nagiosplugin.Result) -> str: + def problem(self, results: nagiosplugin.Result) -> str: return f"This node is not a {self.leader_kind} node." 
class NodeIsReplica(PatroniResource): def __init__( - self: "NodeIsReplica", + self, connection_info: ConnectionInfo, max_lag: str, check_is_sync: bool, @@ -76,7 +71,7 @@ class NodeIsReplica(PatroniResource): self.check_is_sync = check_is_sync self.check_is_async = check_is_async - def probe(self: "NodeIsReplica") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: try: if self.check_is_sync: api_name = "synchronous" @@ -95,12 +90,7 @@ class NodeIsReplica(PatroniResource): class NodeIsReplicaSummary(nagiosplugin.Summary): - def __init__( - self: "NodeIsReplicaSummary", - lag: str, - check_is_sync: bool, - check_is_async: bool, - ) -> None: + def __init__(self, lag: str, check_is_sync: bool, check_is_async: bool) -> None: self.lag = lag if check_is_sync: self.replica_kind = "synchronous replica" @@ -109,7 +99,7 @@ class NodeIsReplicaSummary(nagiosplugin.Summary): else: self.replica_kind = "replica" - def ok(self: "NodeIsReplicaSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: if self.lag is None: return ( f"This node is a running {self.replica_kind} with no noloadbalance tag." @@ -117,14 +107,14 @@ class NodeIsReplicaSummary(nagiosplugin.Summary): return f"This node is a running {self.replica_kind} with no noloadbalance tag and the lag is under {self.lag}." @handle_unknown - def problem(self: "NodeIsReplicaSummary", results: nagiosplugin.Result) -> str: + def problem(self, results: nagiosplugin.Result) -> str: if self.lag is None: return f"This node is not a running {self.replica_kind} with no noloadbalance tag." return f"This node is not a running {self.replica_kind} with no noloadbalance tag and a lag under {self.lag}." class NodeIsPendingRestart(PatroniResource): - def probe(self: "NodeIsPendingRestart") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("patroni") is_pending_restart = item_dict.get("pending_restart", False) @@ -137,19 +127,17 @@ class NodeIsPendingRestart(PatroniResource): class NodeIsPendingRestartSummary(nagiosplugin.Summary): - def ok(self: "NodeIsPendingRestartSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return "This node doesn't have the pending restart flag." @handle_unknown - def problem( - self: "NodeIsPendingRestartSummary", results: nagiosplugin.Result - ) -> str: + def problem(self, results: nagiosplugin.Result) -> str: return "This node has the pending restart flag." class NodeTLHasChanged(PatroniResource): def __init__( - self: "NodeTLHasChanged", + self, connection_info: ConnectionInfo, timeline: str, # Always contains the old timeline state_file: str, # Only used to update the timeline in the state_file (when needed) @@ -160,7 +148,7 @@ class NodeTLHasChanged(PatroniResource): self.timeline = timeline self.save = save - def probe(self: "NodeTLHasChanged") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("patroni") new_tl = item_dict["timeline"] @@ -193,27 +181,23 @@ class NodeTLHasChanged(PatroniResource): class NodeTLHasChangedSummary(nagiosplugin.Summary): - def __init__(self: "NodeTLHasChangedSummary", timeline: str) -> None: + def __init__(self, timeline: str) -> None: self.timeline = timeline - def ok(self: "NodeTLHasChangedSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return f"The timeline is still {self.timeline}." 
@handle_unknown - def problem(self: "NodeTLHasChangedSummary", results: nagiosplugin.Result) -> str: + def problem(self, results: nagiosplugin.Result) -> str: return f"The expected timeline was {self.timeline} got {results['timeline'].metric}." class NodePatroniVersion(PatroniResource): - def __init__( - self: "NodePatroniVersion", - connection_info: ConnectionInfo, - patroni_version: str, - ) -> None: + def __init__(self, connection_info: ConnectionInfo, patroni_version: str) -> None: super().__init__(connection_info) self.patroni_version = patroni_version - def probe(self: "NodePatroniVersion") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: item_dict = self.rest_api("patroni") version = item_dict["patroni"]["version"] @@ -232,21 +216,21 @@ class NodePatroniVersionSummary(nagiosplugin.Summary): - def __init__(self: "NodePatroniVersionSummary", patroni_version: str) -> None: + def __init__(self, patroni_version: str) -> None: self.patroni_version = patroni_version - def ok(self: "NodePatroniVersionSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return f"Patroni's version is {self.patroni_version}." @handle_unknown - def problem(self: "NodePatroniVersionSummary", results: nagiosplugin.Result) -> str: + def problem(self, results: nagiosplugin.Result) -> str: # FIXME find a way to make the following work, check if perf data can be strings # return f"The expected patroni version was {self.patroni_version} got {results['patroni_version'].metric}." return f"Patroni's version is not {self.patroni_version}." class NodeIsAlive(PatroniResource): - def probe(self: "NodeIsAlive") -> Iterable[nagiosplugin.Metric]: + def probe(self) -> Iterable[nagiosplugin.Metric]: try: self.rest_api("liveness") except APIError: @@ -255,9 +239,9 @@ class NodeIsAliveSummary(nagiosplugin.Summary): - def ok(self: "NodeIsAliveSummary", results: nagiosplugin.Result) -> str: + def ok(self, results: nagiosplugin.Result) -> str: return "This node is alive (patroni is running)." @handle_unknown - def problem(self: "NodeIsAliveSummary", results: nagiosplugin.Result) -> str: + def problem(self, results: nagiosplugin.Result) -> str: return "This node is not alive (patroni is not running)." 
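(Editorial aside: the types.py change just below gates the detailed replica states on the Patroni version. Its core is a plain integer-tuple comparison; here is a standalone sketch of that logic, mirroring the diff rather than importing from it, and assuming plain X.Y.Z version strings.)

```python
def has_detailed_states(version: str) -> bool:
    # "streaming" and "in archive recovery" replica states only exist
    # since Patroni 3.0.4; compare the version as a tuple of integers.
    return tuple(int(v) for v in version.split(".", 2)) >= (3, 0, 4)


assert has_detailed_states("3.1.0")
assert not has_detailed_states("3.0.3")
```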
diff --git a/check_patroni/types.py b/check_patroni/types.py index 16789a7..5f08dd4 100644 --- a/check_patroni/types.py +++ b/check_patroni/types.py @@ -1,3 +1,5 @@ +import json +from functools import lru_cache from typing import Any, Callable, List, Optional, Tuple, Union from urllib.parse import urlparse @@ -28,11 +30,11 @@ class Parameters: verbose: int -@attr.s(auto_attribs=True, slots=True) +@attr.s(auto_attribs=True, eq=False, slots=True) class PatroniResource(nagiosplugin.Resource): conn_info: ConnectionInfo - def rest_api(self: "PatroniResource", service: str) -> Any: + def rest_api(self, service: str) -> Any: """Try to connect to all the provided endpoints for the requested service""" for endpoint in self.conn_info.endpoints: cert: Optional[Union[Tuple[str, str], str]] = None @@ -71,10 +73,31 @@ try: return r.json() - except requests.exceptions.JSONDecodeError: + except (json.JSONDecodeError, ValueError): return None raise nagiosplugin.CheckError("Connection failed for all provided endpoints") + @lru_cache(maxsize=None) + def has_detailed_states(self) -> bool: + # get patroni's version to find out if the "streaming" and "in archive recovery" states are available + patroni_item_dict = self.rest_api("patroni") + + if tuple( + int(v) for v in patroni_item_dict["patroni"]["version"].split(".", 2) + ) >= (3, 0, 4): + _log.debug( + "Patroni's version is %(version)s, more detailed states can be used to check for the health of replicas.", + {"version": patroni_item_dict["patroni"]["version"]}, + ) + + return True + + _log.debug( + "Patroni's version is %(version)s, the running state and the timelines must be used to check for the health of replicas.", + {"version": patroni_item_dict["patroni"]["version"]}, + ) + return False + HandleUnknown = Callable[[nagiosplugin.Summary, nagiosplugin.Results], Any] diff --git a/docs/make_readme.sh b/docs/make_readme.sh index 8c5dad9..2c8f6d4 100755 --- a/docs/make_readme.sh +++ b/docs/make_readme.sh @@ -42,7 +42,7 @@ $ pip install git+https://github.com/dalibo/check_patroni.git check_patroni works on python 3.6, we keep it that way because patroni also supports it and there are still lots of RH 7 variants around. That being said -python 3.6 has been EOL for age and there is no support for it in the github +python 3.6 has been EOL for ages and there is no support for it in the github CI. ## Support @@ -80,8 +80,8 @@ A match is found when: `start <= VALUE <= end`. For example, the following command will raise: -* a warning if there is less than 1 nodes, wich can be translated to outside of range [2;+INF[ -* a critical if there are no nodes, wich can be translated to outside of range [1;+INF[ +* a warning if there are fewer than 2 nodes, which can be translated to outside of range [2;+INF[ +* a critical if there are no nodes, which can be translated to outside of range [1;+INF[ ``` check_patroni -e https://10.20.199.3:8008 cluster_has_replica --warning 2: --critical 1: ``` @@ -97,6 +97,30 @@ Several options are available: * `--cert_file`: your certificate or the concatenation of your certificate and private key * `--key_file`: your private key (optional) +## Shell completion + +We use the [click] library which supports shell completion natively. + +Shell completion can be added by typing the following command or adding it to +a file specific to your shell of choice. 
+ +* for Bash (add to `~/.bashrc`): + ``` + eval "$(_CHECK_PATRONI_COMPLETE=bash_source check_patroni)" + ``` +* for Zsh (add to `~/.zshrc`): + ``` + eval "$(_CHECK_PATRONI_COMPLETE=zsh_source check_patroni)" + ``` +* for Fish (add to `~/.config/fish/completions/check_patroni.fish`): + ``` + eval "$(_CHECK_PATRONI_COMPLETE=fish_source check_patroni)" + ``` + +Please note that shell completion is not supported for all shell versions, for +example only Bash versions 4.4 and above are supported. + +[click]: https://click.palletsprojects.com/en/8.1.x/shell-completion/ _EOF_ readme readme "## Cluster services" diff --git a/mypy.ini b/mypy.ini index e72d338..a60f543 100644 --- a/mypy.ini +++ b/mypy.ini @@ -1,4 +1,5 @@ [mypy] +files = . show_error_codes = true strict = true exclude = build/ diff --git a/requirements-dev.txt b/requirements-dev.txt index f53c628..ac12458 100644 --- a/requirements-dev.txt +++ b/requirements-dev.txt @@ -4,7 +4,7 @@ isort flake8 mypy==0.961 pytest -pytest-mock +pytest-cov types-requests setuptools tox diff --git a/setup.py b/setup.py index ad4c0f2..31825ed 100644 --- a/setup.py +++ b/setup.py @@ -41,12 +41,12 @@ setup( "attrs >= 17, !=21.1", "requests", "nagiosplugin >= 1.3.2", - "click >= 8.0.1", + "click >= 7.1", ], extras_require={ "test": [ - "pytest", - "pytest-mock", + "importlib_metadata; python_version < '3.8'", + "pytest >= 6.0.2", ], }, entry_points={ @@ -56,4 +56,3 @@ }, zip_safe=False, ) - diff --git a/tests/__init__.py b/tests/__init__.py index e69de29..3b82360 100644 --- a/tests/__init__.py +++ b/tests/__init__.py @@ -0,0 +1,65 @@ +import json +import logging +import shutil +from contextlib import contextmanager +from functools import partial +from http.server import HTTPServer, SimpleHTTPRequestHandler +from pathlib import Path +from typing import Any, Iterator, Mapping, Union + +logger = logging.getLogger(__name__) + + +class PatroniAPI(HTTPServer): + def __init__(self, directory: Path, *, datadir: Path) -> None: + self.directory = directory + self.datadir = datadir + handler_cls = partial(SimpleHTTPRequestHandler, directory=str(directory)) + super().__init__(("", 0), handler_cls) + + def serve_forever(self, *args: Any) -> None: + logger.info( + "starting fake Patroni API at %s (directory=%s)", + self.endpoint, + self.directory, + ) + return super().serve_forever(*args) + + @property + def endpoint(self) -> str: + return f"http://{self.server_name}:{self.server_port}" + + @contextmanager + def routes(self, mapping: Mapping[str, Union[Path, str]]) -> Iterator[None]: + """Temporarily install specified files in served directory, thus + building "routes" from given mapping. + + The 'mapping' defines target route paths as keys and files to be + installed in served directory as values. Mapping values of type 'str' + are assumed to be file paths relative to the 'datadir'. 
+ """ + for route_path, fpath in mapping.items(): + if isinstance(fpath, str): + fpath = self.datadir / fpath + shutil.copy(fpath, self.directory / route_path) + try: + yield None + finally: + for fname in mapping: + (self.directory / fname).unlink() + + +def cluster_api_set_replica_running(in_json: Path, target_dir: Path) -> Path: + # starting from 3.0.4 the state of replicas is streaming or in archive recovery + # instead of running + with in_json.open() as f: + js = json.load(f) + for node in js["members"]: + if node["role"] in ["replica", "sync_standby", "standby_leader"]: + if node["state"] in ["streaming", "in archive recovery"]: + node["state"] = "running" + assert target_dir.is_dir() + out_json = target_dir / in_json.name + with out_json.open("w") as f: + json.dump(js, f) + return out_json diff --git a/tests/conftest.py b/tests/conftest.py index 207ec2c..3a071c0 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -1,12 +1,76 @@ -def pytest_addoption(parser): - """ - Add CLI options to `pytest` to pass those options to the test cases. - These options are used in `pytest_generate_tests`. - """ - parser.addoption("--use-old-replica-state", action="store_true", default=False) +import logging +import sys +from pathlib import Path +from threading import Thread +from typing import Any, Iterator, Tuple +from unittest.mock import patch + +if sys.version_info >= (3, 8): + from importlib.metadata import version as metadata_version +else: + from importlib_metadata import version as metadata_version + +import pytest +from click.testing import CliRunner + +from . import PatroniAPI + +logger = logging.getLogger(__name__) -def pytest_generate_tests(metafunc): - metafunc.parametrize( - "use_old_replica_state", [metafunc.config.getoption("use_old_replica_state")] - ) +def numversion(pkgname: str) -> Tuple[int, ...]: + version = metadata_version(pkgname) + return tuple(int(v) for v in version.split(".", 3)) + + +if numversion("pytest") >= (6, 2): + TempPathFactory = pytest.TempPathFactory +else: + from _pytest.tmpdir import TempPathFactory + + +@pytest.fixture(scope="session", autouse=True) +def nagiosplugin_runtime_stdout() -> Iterator[None]: + # work around https://github.com/mpounsett/nagiosplugin/issues/24 when + # nagiosplugin is older than 1.3.3 + if numversion("nagiosplugin") < (1, 3, 3): + target = "nagiosplugin.runtime.Runtime.stdout" + with patch(target, None): + logger.warning("patching %r", target) + yield None + else: + yield None + + +@pytest.fixture( + params=[False, True], + ids=lambda v: "old-replica-state" if v else "new-replica-state", +) +def old_replica_state(request: Any) -> Any: + return request.param + + +@pytest.fixture(scope="session") +def datadir() -> Path: + return Path(__file__).parent / "json" + + +@pytest.fixture(scope="session") +def patroni_api( + tmp_path_factory: TempPathFactory, datadir: Path ) -> Iterator[PatroniAPI]: + """A fake HTTP server for the Patroni API serving files from a temporary + directory. 
+ """ + httpd = PatroniAPI(tmp_path_factory.mktemp("api"), datadir=datadir) + t = Thread(target=httpd.serve_forever) + t.start() + yield httpd + httpd.shutdown() + t.join() + + +@pytest.fixture +def runner() -> CliRunner: + """A CliRunner with stdout and stderr not mixed.""" + return CliRunner(mix_stderr=False) diff --git a/tests/json/cluster_has_leader_ko_standby_leader.json b/tests/json/cluster_has_leader_ko_standby_leader.json new file mode 100644 index 0000000..3302838 --- /dev/null +++ b/tests/json/cluster_has_leader_ko_standby_leader.json @@ -0,0 +1,33 @@ +{ + "members": [ + { + "name": "srv1", + "role": "standby_leader", + "state": "stopped", + "api_url": "https://10.20.199.3:8008/patroni", + "host": "10.20.199.3", + "port": 5432, + "timeline": 51 + }, + { + "name": "srv2", + "role": "replica", + "state": "streaming", + "api_url": "https://10.20.199.4:8008/patroni", + "host": "10.20.199.4", + "port": 5432, + "timeline": 51, + "lag": 0 + }, + { + "name": "srv3", + "role": "replica", + "state": "streaming", + "api_url": "https://10.20.199.5:8008/patroni", + "host": "10.20.199.5", + "port": 5432, + "timeline": 51, + "lag": 0 + } + ] +} diff --git a/tests/json/cluster_has_leader_ko_standby_leader_archiving.json b/tests/json/cluster_has_leader_ko_standby_leader_archiving.json new file mode 100644 index 0000000..a91ea2d --- /dev/null +++ b/tests/json/cluster_has_leader_ko_standby_leader_archiving.json @@ -0,0 +1,33 @@ +{ + "members": [ + { + "name": "srv1", + "role": "standby_leader", + "state": "in archive recovery", + "api_url": "https://10.20.199.3:8008/patroni", + "host": "10.20.199.3", + "port": 5432, + "timeline": 51 + }, + { + "name": "srv2", + "role": "replica", + "state": "streaming", + "api_url": "https://10.20.199.4:8008/patroni", + "host": "10.20.199.4", + "port": 5432, + "timeline": 51, + "lag": 0 + }, + { + "name": "srv3", + "role": "replica", + "state": "streaming", + "api_url": "https://10.20.199.5:8008/patroni", + "host": "10.20.199.5", + "port": 5432, + "timeline": 51, + "lag": 0 + } + ] +} diff --git a/tests/json/cluster_has_leader_ok_standby_leader.json b/tests/json/cluster_has_leader_ok_standby_leader.json index baca225..b5c4844 100644 --- a/tests/json/cluster_has_leader_ok_standby_leader.json +++ b/tests/json/cluster_has_leader_ok_standby_leader.json @@ -3,7 +3,7 @@ { "name": "srv1", "role": "standby_leader", - "state": "running", + "state": "streaming", "api_url": "https://10.20.199.3:8008/patroni", "host": "10.20.199.3", "port": 5432, diff --git a/tests/json/cluster_has_replica_ko_all_replica.json b/tests/json/cluster_has_replica_ko_all_replica.json new file mode 100644 index 0000000..fe82d32 --- /dev/null +++ b/tests/json/cluster_has_replica_ko_all_replica.json @@ -0,0 +1,35 @@ +{ + "members": [ + { + "name": "srv1", + "role": "replica", + "state": "running", + "api_url": "https://10.20.199.3:8008/patroni", + "host": "10.20.199.3", + "port": 5432, + "timeline": 51, + "lag": 0 + }, + { + "name": "srv2", + "role": "replica", + "state": "running", + "api_url": "https://10.20.199.4:8008/patroni", + "host": "10.20.199.4", + "port": 5432, + "timeline": 51, + "lag": 0 + }, + { + "name": "srv3", + "role": "replica", + "state": "running", + "api_url": "https://10.20.199.5:8008/patroni", + "host": "10.20.199.5", + "port": 5432, + "timeline": 51, + "lag": 0 + + } + ] +} diff --git a/tests/json/cluster_has_replica_ko_wrong_tl.json b/tests/json/cluster_has_replica_ko_wrong_tl.json new file mode 100644 index 0000000..6889484 --- /dev/null +++ 
b/tests/json/cluster_has_replica_ko_wrong_tl.json @@ -0,0 +1,33 @@ +{ + "members": [ + { + "name": "srv1", + "role": "leader", + "state": "running", + "api_url": "https://10.20.199.3:8008/patroni", + "host": "10.20.199.3", + "port": 5432, + "timeline": 51 + }, + { + "name": "srv2", + "role": "replica", + "state": "running", + "api_url": "https://10.20.199.4:8008/patroni", + "host": "10.20.199.4", + "port": 5432, + "timeline": 50, + "lag": 1000000 + }, + { + "name": "srv3", + "role": "replica", + "state": "streaming", + "api_url": "https://10.20.199.5:8008/patroni", + "host": "10.20.199.5", + "port": 5432, + "timeline": 51, + "lag": 0 + } + ] +} diff --git a/tests/json/cluster_has_replica_ok.json b/tests/json/cluster_has_replica_ok.json index 44535e0..181ed4f 100644 --- a/tests/json/cluster_has_replica_ok.json +++ b/tests/json/cluster_has_replica_ok.json @@ -12,7 +12,7 @@ { "name": "srv2", "role": "replica", - "state": "streaming", + "state": "in archive recovery", "api_url": "https://10.20.199.4:8008/patroni", "host": "10.20.199.4", "port": 5432, diff --git a/tests/json/cluster_has_replica_patroni_verion_3.0.0.json b/tests/json/cluster_has_replica_patroni_verion_3.0.0.json new file mode 100644 index 0000000..9c922b8 --- /dev/null +++ b/tests/json/cluster_has_replica_patroni_verion_3.0.0.json @@ -0,0 +1,26 @@ +{ + "state": "running", + "postmaster_start_time": "2021-08-11 07:02:20.732 UTC", + "role": "master", + "server_version": 110012, + "cluster_unlocked": false, + "xlog": { + "location": 1174407088 + }, + "timeline": 51, + "replication": [ + { + "usename": "replicator", + "application_name": "srv1", + "client_addr": "10.20.199.3", + "state": "streaming", + "sync_state": "async", + "sync_priority": 0 + } + ], + "database_system_identifier": "6965971025273547206", + "patroni": { + "version": "3.0.0", + "scope": "patroni-demo" + } +} diff --git a/tests/json/cluster_has_replica_patroni_verion_3.1.0.json b/tests/json/cluster_has_replica_patroni_verion_3.1.0.json new file mode 100644 index 0000000..91e4348 --- /dev/null +++ b/tests/json/cluster_has_replica_patroni_verion_3.1.0.json @@ -0,0 +1,26 @@ +{ + "state": "running", + "postmaster_start_time": "2021-08-11 07:02:20.732 UTC", + "role": "master", + "server_version": 110012, + "cluster_unlocked": false, + "xlog": { + "location": 1174407088 + }, + "timeline": 51, + "replication": [ + { + "usename": "replicator", + "application_name": "srv1", + "client_addr": "10.20.199.3", + "state": "streaming", + "sync_state": "async", + "sync_priority": 0 + } + ], + "database_system_identifier": "6965971025273547206", + "patroni": { + "version": "3.1.0", + "scope": "patroni-demo" + } +} diff --git a/tests/json/cluster_node_count_ko_in_archive_recovery.json b/tests/json/cluster_node_count_ko_in_archive_recovery.json new file mode 100644 index 0000000..14600a5 --- /dev/null +++ b/tests/json/cluster_node_count_ko_in_archive_recovery.json @@ -0,0 +1,33 @@ +{ + "members": [ + { + "name": "srv1", + "role": "standby_leader", + "state": "in archive recovery", + "api_url": "https://10.20.199.3:8008/patroni", + "host": "10.20.199.3", + "port": 5432, + "timeline": 51 + }, + { + "name": "srv2", + "role": "replica", + "state": "in archive recovery", + "api_url": "https://10.20.199.4:8008/patroni", + "host": "10.20.199.4", + "port": 5432, + "timeline": 51, + "lag": 0 + }, + { + "name": "srv3", + "role": "replica", + "state": "streaming", + "api_url": "https://10.20.199.5:8008/patroni", + "host": "10.20.199.5", + "port": 5432, + "timeline": 51, + "lag": 0 + } + ] +} 
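(Editorial aside: before the test diffs below, here is roughly how the JSON fixtures, the `patroni_api` fixture and `old_replica_state` combine in practice. This is a sketch of the pattern, not a verbatim test from the suite, though it only uses names defined in this diff.)

```python
from pathlib import Path
from typing import Union

from click.testing import CliRunner

from check_patroni.cli import main

from . import PatroniAPI, cluster_api_set_replica_running


def test_cluster_has_replica_ok(
    runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool, tmp_path: Path
) -> None:
    cluster_path: Union[str, Path] = "cluster_has_replica_ok.json"
    patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json"
    if old_replica_state:
        # rewrite "streaming" / "in archive recovery" back to "running"
        cluster_path = cluster_api_set_replica_running(
            patroni_api.datadir / cluster_path, tmp_path
        )
        patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json"
    # serve the cluster state and the patroni version used by has_detailed_states()
    with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}):
        result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_replica"])
    assert result.exit_code == 0
```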
diff --git a/tests/test_api.py b/tests/test_api.py index 117962b..a1996d3 100644 --- a/tests/test_api.py +++ b/tests/test_api.py @@ -1,30 +1,20 @@ from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . import PatroniAPI -def test_api_status_code_200( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_pending_restart_ok", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"] - ) +def test_api_status_code_200(runner: CliRunner, patroni_api: PatroniAPI) -> None: + with patroni_api.routes({"patroni": "node_is_pending_restart_ok.json"}): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "node_is_pending_restart"] + ) assert result.exit_code == 0 -def test_api_status_code_404( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - - my_mock(mocker, "Fake test", 404) +def test_api_status_code_404(runner: CliRunner, patroni_api: PatroniAPI) -> None: result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"] + main, ["-e", patroni_api.endpoint, "node_is_pending_restart"] ) assert result.exit_code == 3 diff --git a/tests/test_cluster_config_has_changed.py b/tests/test_cluster_config_has_changed.py index 0f1077c..606e3e4 100644 --- a/tests/test_cluster_config_has_changed.py +++ b/tests/test_cluster_config_has_changed.py @@ -1,23 +1,29 @@ +from pathlib import Path +from typing import Iterator + import nagiosplugin +import pytest from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import here, my_mock +from . import PatroniAPI + + +@pytest.fixture(scope="module", autouse=True) +def cluster_config_has_changed(patroni_api: PatroniAPI) -> Iterator[None]: + with patroni_api.routes({"config": "cluster_config_has_changed.json"}): + yield None def test_cluster_config_has_changed_ok_with_hash( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_config_has_changed", 200) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_config_has_changed", "--hash", "96b12d82571473d13e890b893734e731", @@ -31,22 +37,20 @@ def test_cluster_config_has_changed_ok_with_hash( def test_cluster_config_has_changed_ok_with_state_file( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path ) -> None: - runner = CliRunner() - - with open(here / "cluster_config_has_changed.state_file", "w") as f: + state_file = tmp_path / "cluster_config_has_changed.state_file" + with state_file.open("w") as f: f.write('{"hash": "96b12d82571473d13e890b893734e731"}') - my_mock(mocker, "cluster_config_has_changed", 200) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_config_has_changed", "--state-file", - str(here / "cluster_config_has_changed.state_file"), + str(state_file), ], ) assert result.exit_code == 0 @@ -57,16 +61,13 @@ def test_cluster_config_has_changed_ok_with_state_file( def test_cluster_config_has_changed_ko_with_hash( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_config_has_changed", 200) result = runner.invoke( 
main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_config_has_changed", "--hash", "96b12d82571473d13e890b8937ffffff", @@ -80,24 +81,21 @@ def test_cluster_config_has_changed_ko_with_hash( def test_cluster_config_has_changed_ko_with_state_file_and_save( - mocker: MockerFixture, - use_old_replica_state: bool, + runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path ) -> None: - runner = CliRunner() - - with open(here / "cluster_config_has_changed.state_file", "w") as f: + state_file = tmp_path / "cluster_config_has_changed.state_file" + with state_file.open("w") as f: f.write('{"hash": "96b12d82571473d13e890b8937ffffff"}') - my_mock(mocker, "cluster_config_has_changed", 200) # test without saving the new hash result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_config_has_changed", "--state-file", - str(here / "cluster_config_has_changed.state_file"), + str(state_file), ], ) assert result.exit_code == 2 @@ -106,7 +104,8 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save( == "CLUSTERCONFIGHASCHANGED CRITICAL - The hash of patroni's dynamic configuration has changed. The old hash was 96b12d82571473d13e890b8937ffffff. | is_configuration_changed=1;;@1:1\n" ) - cookie = nagiosplugin.Cookie(here / "cluster_config_has_changed.state_file") + state_file = tmp_path / "cluster_config_has_changed.state_file" + cookie = nagiosplugin.Cookie(state_file) cookie.open() new_config_hash = cookie.get("hash") cookie.close() @@ -118,10 +117,10 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_config_has_changed", "--state-file", - str(here / "cluster_config_has_changed.state_file"), + str(state_file), "--save", ], ) @@ -131,7 +130,7 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save( == "CLUSTERCONFIGHASCHANGED CRITICAL - The hash of patroni's dynamic configuration has changed. The old hash was 96b12d82571473d13e890b8937ffffff. | is_configuration_changed=1;;@1:1\n" ) - cookie = nagiosplugin.Cookie(here / "cluster_config_has_changed.state_file") + cookie = nagiosplugin.Cookie(state_file) cookie.open() new_config_hash = cookie.get("hash") cookie.close() @@ -140,22 +139,20 @@ def test_cluster_config_has_changed_ko_with_state_file_and_save( def test_cluster_config_has_changed_params( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path ) -> None: # This one is placed last because it seems like the exceptions are not flushed from stderr for the next tests. - runner = CliRunner() - - my_mock(mocker, "cluster_config_has_changed", 200) + fake_state_file = tmp_path / "fake_file_name.state_file" result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_config_has_changed", "--hash", "640df9f0211c791723f18fc3ed9dbb95", "--state-file", - str(here / "fake_file_name.state_file"), + str(fake_state_file), ], ) assert result.exit_code == 3 diff --git a/tests/test_cluster_has_leader.py b/tests/test_cluster_has_leader.py index 3015ce3..4319208 100644 --- a/tests/test_cluster_has_leader.py +++ b/tests/test_cluster_has_leader.py @@ -1,54 +1,139 @@ +from pathlib import Path +from typing import Iterator, Union + +import pytest from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . 
import PatroniAPI, cluster_api_set_replica_running -def test_cluster_has_leader_ok( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() +@pytest.fixture +def cluster_has_leader_ok( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_leader_ok.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None - my_mock(mocker, "cluster_has_leader_ok", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_leader"] - ) - assert result.exit_code == 0 + +@pytest.mark.usefixtures("cluster_has_leader_ok") +def test_cluster_has_leader_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"]) assert ( result.stdout - == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0\n" + == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0 is_leader=1 is_standby_leader=0 is_standby_leader_in_arc_rec=0;@1:1\n" ) + assert result.exit_code == 0 +@pytest.fixture +def cluster_has_leader_ok_standby_leader( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_leader_ok_standby_leader.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_leader_ok_standby_leader") def test_cluster_has_leader_ok_standby_leader( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_leader_ok_standby_leader", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_leader"] - ) - assert result.exit_code == 0 + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"]) assert ( result.stdout - == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0\n" + == "CLUSTERHASLEADER OK - The cluster has a running leader. 
| has_leader=1;;@0 is_leader=0 is_standby_leader=1 is_standby_leader_in_arc_rec=0;@1:1\n" ) + assert result.exit_code == 0 -def test_cluster_has_leader_ko( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() +@pytest.fixture +def cluster_has_leader_ko( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_leader_ko.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None - my_mock(mocker, "cluster_has_leader_ko", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_leader"] + +@pytest.mark.usefixtures("cluster_has_leader_ko") +def test_cluster_has_leader_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"]) + assert ( + result.stdout + == "CLUSTERHASLEADER CRITICAL - The cluster has no running leader or the standby leader is in archive recovery. | has_leader=0;;@0 is_leader=0 is_standby_leader=0 is_standby_leader_in_arc_rec=0;@1:1\n" ) assert result.exit_code == 2 + + +@pytest.fixture +def cluster_has_leader_ko_standby_leader( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_leader_ko_standby_leader.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_leader_ko_standby_leader") +def test_cluster_has_leader_ko_standby_leader( + runner: CliRunner, patroni_api: PatroniAPI +) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"]) assert ( result.stdout - == "CLUSTERHASLEADER CRITICAL - The cluster has no running leader. | has_leader=0;;@0\n" + == "CLUSTERHASLEADER CRITICAL - The cluster has no running leader or the standby leader is in archive recovery. 
| has_leader=0;;@0 is_leader=0 is_standby_leader=0 is_standby_leader_in_arc_rec=0;@1:1\n" ) + assert result.exit_code == 2 + + +@pytest.fixture +def cluster_has_leader_ko_standby_leader_archiving( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = ( + "cluster_has_leader_ko_standby_leader_archiving.json" + ) + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_leader_ko_standby_leader_archiving") +def test_cluster_has_leader_ko_standby_leader_archiving( + runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool +) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_leader"]) + if old_replica_state: + assert ( + result.stdout + == "CLUSTERHASLEADER OK - The cluster has a running leader. | has_leader=1;;@0 is_leader=0 is_standby_leader=1 is_standby_leader_in_arc_rec=0;@1:1\n" + ) + assert result.exit_code == 0 + else: + assert ( + result.stdout + == "CLUSTERHASLEADER WARNING - The cluster has no running leader or the standby leader is in archive recovery. | has_leader=1;;@0 is_leader=0 is_standby_leader=1 is_standby_leader_in_arc_rec=1;@1:1\n" + ) + assert result.exit_code == 1 diff --git a/tests/test_cluster_has_replica.py b/tests/test_cluster_has_replica.py index d2a4e7b..a6a88c0 100644 --- a/tests/test_cluster_has_replica.py +++ b/tests/test_cluster_has_replica.py @@ -1,39 +1,46 @@ +from pathlib import Path +from typing import Iterator, Union + +import pytest from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . 
import PatroniAPI, cluster_api_set_replica_running -# TODO Lag threshold tests -def test_cluster_has_relica_ok( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() +@pytest.fixture +def cluster_has_replica_ok( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_replica_ok.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None - my_mock(mocker, "cluster_has_replica_ok", 200, use_old_replica_state) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_replica"] - ) - assert result.exit_code == 0 + +@pytest.mark.usefixtures("cluster_has_replica_ok") +def test_cluster_has_relica_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_has_replica"]) assert ( result.stdout - == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv3_lag=0 srv3_sync=1 sync_replica=1 unhealthy_replica=0\n" + == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=1 srv3_timeline=51 sync_replica=1 unhealthy_replica=0\n" ) + assert result.exit_code == 0 +@pytest.mark.usefixtures("cluster_has_replica_ok") def test_cluster_has_replica_ok_with_count_thresholds( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_replica_ok", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_has_replica", "--warning", "@1", @@ -41,48 +48,56 @@ def test_cluster_has_replica_ok_with_count_thresholds( "@0", ], ) - assert result.exit_code == 0 assert ( result.stdout - == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=0 srv2_sync=0 srv3_lag=0 srv3_sync=1 sync_replica=1 unhealthy_replica=0\n" + == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=1 srv3_timeline=51 sync_replica=1 unhealthy_replica=0\n" ) + assert result.exit_code == 0 +@pytest.mark.usefixtures("cluster_has_replica_ok") def test_cluster_has_replica_ok_with_sync_count_thresholds( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_replica_ok", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_has_replica", "--sync-warning", "1:", ], ) - assert result.exit_code == 0 assert ( result.stdout - == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv3_lag=0 srv3_sync=1 sync_replica=1;1: unhealthy_replica=0\n" + == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=1 srv3_timeline=51 sync_replica=1;1: unhealthy_replica=0\n" ) + assert result.exit_code == 0 +@pytest.fixture +def cluster_has_replica_ok_lag( + patroni_api: PatroniAPI, datadir: Path, tmp_path: Path, 
old_replica_state: bool +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_replica_ok_lag.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_replica_ok_lag") def test_cluster_has_replica_ok_with_count_thresholds_lag( - mocker: MockerFixture, - use_old_replica_state: bool, + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_replica_ok_lag", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_has_replica", "--warning", "@1", @@ -92,24 +107,35 @@ def test_cluster_has_replica_ok_with_count_thresholds_lag( "1MB", ], ) - assert result.exit_code == 0 assert ( result.stdout - == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=1024 srv2_sync=0 srv3_lag=0 srv3_sync=0 sync_replica=0 unhealthy_replica=0\n" + == "CLUSTERHASREPLICA OK - healthy_replica is 2 | healthy_replica=2;@1;@0 srv2_lag=1024 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=0\n" ) + assert result.exit_code == 0 +@pytest.fixture +def cluster_has_replica_ko( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_replica_ko.json" + patroni_path: Union[str, Path] = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_replica_ko") def test_cluster_has_replica_ko_with_count_thresholds( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_replica_ko", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_has_replica", "--warning", "@1", @@ -117,24 +143,22 @@ def test_cluster_has_replica_ko_with_count_thresholds( "@0", ], ) - assert result.exit_code == 1 assert ( result.stdout - == "CLUSTERHASREPLICA WARNING - healthy_replica is 1 (outside range @0:1) | healthy_replica=1;@1;@0 srv3_lag=0 srv3_sync=0 sync_replica=0 unhealthy_replica=1\n" + == "CLUSTERHASREPLICA WARNING - healthy_replica is 1 (outside range @0:1) | healthy_replica=1;@1;@0 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=1\n" ) + assert result.exit_code == 1 +@pytest.mark.usefixtures("cluster_has_replica_ko") def test_cluster_has_replica_ko_with_sync_count_thresholds( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_replica_ko", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_has_replica", "--sync-warning", "2:", @@ -142,25 +166,36 @@ def test_cluster_has_replica_ko_with_sync_count_thresholds( "1:", ], ) - assert 
result.exit_code == 2 + # The lag on srv2 is "unknown". We don't handle strings in perfstats, so we have to drop all the stats for the second node assert ( result.stdout - == "CLUSTERHASREPLICA CRITICAL - sync_replica is 0 (outside range 1:) | healthy_replica=1 srv3_lag=0 srv3_sync=0 sync_replica=0;2:;1: unhealthy_replica=1\n" + == "CLUSTERHASREPLICA CRITICAL - sync_replica is 0 (outside range 1:) | healthy_replica=1 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0;2:;1: unhealthy_replica=1\n" ) + assert result.exit_code == 2 +@pytest.fixture +def cluster_has_replica_ko_lag( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_replica_ko_lag.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_replica_ko_lag") def test_cluster_has_replica_ko_with_count_thresholds_and_lag( - mocker: MockerFixture, - use_old_replica_state: bool, + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_replica_ko_lag", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_has_replica", "--warning", "@1", @@ -170,8 +205,84 @@ def test_cluster_has_replica_ko_with_count_thresholds_and_lag( "1MB", ], ) - assert result.exit_code == 2 assert ( result.stdout - == "CLUSTERHASREPLICA CRITICAL - healthy_replica is 0 (outside range @0:0) | healthy_replica=0;@1;@0 srv2_lag=10241024 srv2_sync=0 srv3_lag=20000000 srv3_sync=0 sync_replica=0 unhealthy_replica=2\n" + == "CLUSTERHASREPLICA CRITICAL - healthy_replica is 0 (outside range @0:0) | healthy_replica=0;@1;@0 srv2_lag=10241024 srv2_sync=0 srv2_timeline=51 srv3_lag=20000000 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=2\n" ) + assert result.exit_code == 2 + + +@pytest.fixture +def cluster_has_replica_ko_wrong_tl( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_has_replica_ko_wrong_tl.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_replica_ko_wrong_tl") +def test_cluster_has_replica_ko_wrong_tl( + runner: CliRunner, patroni_api: PatroniAPI +) -> None: + result = runner.invoke( + main, + [ + "-e", + patroni_api.endpoint, + "cluster_has_replica", + "--warning", + "@1", + "--critical", + "@0", + "--max-lag", + "1MB", + ], + ) + assert ( + result.stdout + == "CLUSTERHASREPLICA WARNING - healthy_replica is 1 (outside range @0:1) | healthy_replica=1;@1;@0 srv2_lag=1000000 srv2_sync=0 srv2_timeline=50 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=1\n" + ) + assert result.exit_code == 1 + + +@pytest.fixture +def cluster_has_replica_ko_all_replica( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path:
Union[str, Path] = "cluster_has_replica_ko_all_replica.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_has_replica_ko_all_replica") +def test_cluster_has_replica_ko_all_replica( + runner: CliRunner, patroni_api: PatroniAPI +) -> None: + result = runner.invoke( + main, + [ + "-e", + patroni_api.endpoint, + "cluster_has_replica", + "--warning", + "@1", + "--critical", + "@0", + "--max-lag", + "1MB", + ], + ) + assert ( + result.stdout + == "CLUSTERHASREPLICA CRITICAL - healthy_replica is 0 (outside range @0:0) | healthy_replica=0;@1;@0 srv1_lag=0 srv1_sync=0 srv1_timeline=51 srv2_lag=0 srv2_sync=0 srv2_timeline=51 srv3_lag=0 srv3_sync=0 srv3_timeline=51 sync_replica=0 unhealthy_replica=3\n" + ) + assert result.exit_code == 2 diff --git a/tests/test_cluster_has_scheduled_action.py b/tests/test_cluster_has_scheduled_action.py index 604393a..d1eacc1 100644 --- a/tests/test_cluster_has_scheduled_action.py +++ b/tests/test_cluster_has_scheduled_action.py @@ -1,20 +1,17 @@ from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . import PatroniAPI def test_cluster_has_scheduled_action_ok( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_scheduled_action_ok", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_scheduled_action"] - ) + with patroni_api.routes({"cluster": "cluster_has_scheduled_action_ok.json"}): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "cluster_has_scheduled_action"] + ) assert result.exit_code == 0 assert ( result.stdout @@ -23,14 +20,14 @@ def test_cluster_has_scheduled_action_ok( def test_cluster_has_scheduled_action_ko_switchover( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_scheduled_action_ko_switchover", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_scheduled_action"] - ) + with patroni_api.routes( + {"cluster": "cluster_has_scheduled_action_ko_switchover.json"} + ): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "cluster_has_scheduled_action"] + ) assert result.exit_code == 2 assert ( result.stdout @@ -39,14 +36,14 @@ def test_cluster_has_scheduled_action_ko_switchover( def test_cluster_has_scheduled_action_ko_restart( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_has_scheduled_action_ko_restart", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_has_scheduled_action"] - ) + with patroni_api.routes( + {"cluster": "cluster_has_scheduled_action_ko_restart.json"} + ): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "cluster_has_scheduled_action"] + ) assert result.exit_code == 2 assert ( result.stdout diff --git a/tests/test_cluster_is_in_maintenance.py b/tests/test_cluster_is_in_maintenance.py index 9ae94b3..fb9cb5d 100644 --- a/tests/test_cluster_is_in_maintenance.py 
+++ b/tests/test_cluster_is_in_maintenance.py @@ -1,20 +1,17 @@ from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . import PatroniAPI def test_cluster_is_in_maintenance_ok( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_is_in_maintenance_ok", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_is_in_maintenance"] - ) + with patroni_api.routes({"cluster": "cluster_is_in_maintenance_ok.json"}): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "cluster_is_in_maintenance"] + ) assert result.exit_code == 0 assert ( result.stdout @@ -23,14 +20,12 @@ def test_cluster_is_in_maintenance_ok( def test_cluster_is_in_maintenance_ko( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_is_in_maintenance_ko", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_is_in_maintenance"] - ) + with patroni_api.routes({"cluster": "cluster_is_in_maintenance_ko.json"}): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "cluster_is_in_maintenance"] + ) assert result.exit_code == 2 assert ( result.stdout @@ -39,14 +34,14 @@ def test_cluster_is_in_maintenance_ko( def test_cluster_is_in_maintenance_ok_pause_false( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_is_in_maintenance_ok_pause_false", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_is_in_maintenance"] - ) + with patroni_api.routes( + {"cluster": "cluster_is_in_maintenance_ok_pause_false.json"} + ): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "cluster_is_in_maintenance"] + ) assert result.exit_code == 0 assert ( result.stdout diff --git a/tests/test_cluster_node_count.py b/tests/test_cluster_node_count.py index 93926c3..519f205 100644 --- a/tests/test_cluster_node_count.py +++ b/tests/test_cluster_node_count.py @@ -1,22 +1,33 @@ +from pathlib import Path +from typing import Iterator, Union + +import pytest from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . 
import PatroniAPI, cluster_api_set_replica_running +@pytest.fixture +def cluster_node_count_ok( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_node_count_ok.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_node_count_ok") def test_cluster_node_count_ok( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_node_count_ok", 200, use_old_replica_state) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "cluster_node_count"] - ) - assert result.exit_code == 0 - if use_old_replica_state: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "cluster_node_count"]) + if old_replica_state: assert ( result.output == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3 members=3 role_leader=1 role_replica=2 state_running=3\n" @@ -26,19 +37,18 @@ def test_cluster_node_count_ok( result.output == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3 members=3 role_leader=1 role_replica=2 state_running=1 state_streaming=2\n" ) + assert result.exit_code == 0 +@pytest.mark.usefixtures("cluster_node_count_ok") def test_cluster_node_count_ok_with_thresholds( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_node_count_ok", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_node_count", "--warning", "@0:1", @@ -50,8 +60,7 @@ def test_cluster_node_count_ok_with_thresholds( "@0:1", ], ) - assert result.exit_code == 0 - if use_old_replica_state: + if old_replica_state: assert ( result.output == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3;@2;@1 members=3;@1;@2 role_leader=1 role_replica=2 state_running=3\n" @@ -61,19 +70,31 @@ def test_cluster_node_count_ok_with_thresholds( result.output == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3;@2;@1 members=3;@1;@2 role_leader=1 role_replica=2 state_running=1 state_streaming=2\n" ) + assert result.exit_code == 0 +@pytest.fixture +def cluster_node_count_healthy_warning( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_node_count_healthy_warning.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_node_count_healthy_warning") def test_cluster_node_count_healthy_warning( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_node_count_healthy_warning", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - 
"https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_node_count", "--healthy-warning", "@2", @@ -81,8 +102,7 @@ def test_cluster_node_count_healthy_warning( "@0:1", ], ) - assert result.exit_code == 1 - if use_old_replica_state: + if old_replica_state: assert ( result.output == "CLUSTERNODECOUNT WARNING - healthy_members is 2 (outside range @0:2) | healthy_members=2;@2;@1 members=2 role_leader=1 role_replica=1 state_running=2\n" @@ -92,19 +112,31 @@ def test_cluster_node_count_healthy_warning( result.output == "CLUSTERNODECOUNT WARNING - healthy_members is 2 (outside range @0:2) | healthy_members=2;@2;@1 members=2 role_leader=1 role_replica=1 state_running=1 state_streaming=1\n" ) + assert result.exit_code == 1 +@pytest.fixture +def cluster_node_count_healthy_critical( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_node_count_healthy_critical.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_node_count_healthy_critical") def test_cluster_node_count_healthy_critical( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_node_count_healthy_critical", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_node_count", "--healthy-warning", "@2", @@ -112,24 +144,35 @@ def test_cluster_node_count_healthy_critical( "@0:1", ], ) - assert result.exit_code == 2 assert ( result.output == "CLUSTERNODECOUNT CRITICAL - healthy_members is 1 (outside range @0:1) | healthy_members=1;@2;@1 members=3 role_leader=1 role_replica=2 state_running=1 state_start_failed=2\n" ) + assert result.exit_code == 2 +@pytest.fixture +def cluster_node_count_warning( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_node_count_warning.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_node_count_warning") def test_cluster_node_count_warning( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_node_count_warning", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_node_count", "--warning", "@2", @@ -137,8 +180,7 @@ def test_cluster_node_count_warning( "@0:1", ], ) - assert result.exit_code == 1 - if use_old_replica_state: + if old_replica_state: assert ( result.stdout == "CLUSTERNODECOUNT WARNING - members is 2 (outside range @0:2) | healthy_members=2 members=2;@2;@1 role_leader=1 role_replica=1 state_running=2\n" @@ -148,19 +190,31 @@ def test_cluster_node_count_warning( result.stdout == 
"CLUSTERNODECOUNT WARNING - members is 2 (outside range @0:2) | healthy_members=2 members=2;@2;@1 role_leader=1 role_replica=1 state_running=1 state_streaming=1\n" ) + assert result.exit_code == 1 +@pytest.fixture +def cluster_node_count_critical( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_node_count_critical.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_node_count_critical") def test_cluster_node_count_critical( - mocker: MockerFixture, use_old_replica_state: bool + runner: CliRunner, patroni_api: PatroniAPI ) -> None: - runner = CliRunner() - - my_mock(mocker, "cluster_node_count_critical", 200, use_old_replica_state) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "cluster_node_count", "--warning", "@2", @@ -168,8 +222,51 @@ def test_cluster_node_count_critical( "@0:1", ], ) - assert result.exit_code == 2 assert ( result.stdout == "CLUSTERNODECOUNT CRITICAL - members is 1 (outside range @0:1) | healthy_members=1 members=1;@2;@1 role_leader=1 state_running=1\n" ) + assert result.exit_code == 2 + + +@pytest.fixture +def cluster_node_count_ko_in_archive_recovery( + patroni_api: PatroniAPI, old_replica_state: bool, datadir: Path, tmp_path: Path +) -> Iterator[None]: + cluster_path: Union[str, Path] = "cluster_node_count_ko_in_archive_recovery.json" + patroni_path = "cluster_has_replica_patroni_verion_3.1.0.json" + if old_replica_state: + cluster_path = cluster_api_set_replica_running(datadir / cluster_path, tmp_path) + patroni_path = "cluster_has_replica_patroni_verion_3.0.0.json" + with patroni_api.routes({"cluster": cluster_path, "patroni": patroni_path}): + yield None + + +@pytest.mark.usefixtures("cluster_node_count_ko_in_archive_recovery") +def test_cluster_node_count_ko_in_archive_recovery( + runner: CliRunner, patroni_api: PatroniAPI, old_replica_state: bool +) -> None: + result = runner.invoke( + main, + [ + "-e", + patroni_api.endpoint, + "cluster_node_count", + "--healthy-warning", + "@2", + "--healthy-critical", + "@0:1", + ], + ) + if old_replica_state: + assert ( + result.stdout + == "CLUSTERNODECOUNT OK - members is 3 | healthy_members=3;@2;@1 members=3 role_replica=2 role_standby_leader=1 state_running=3\n" + ) + assert result.exit_code == 0 + else: + assert ( + result.stdout + == "CLUSTERNODECOUNT CRITICAL - healthy_members is 1 (outside range @0:1) | healthy_members=1;@2;@1 members=3 role_replica=2 role_standby_leader=1 state_in_archive_recovery=2 state_streaming=1\n" + ) + assert result.exit_code == 2 diff --git a/tests/test_node_is_alive.py b/tests/test_node_is_alive.py index c1aca26..b8bd8ed 100644 --- a/tests/test_node_is_alive.py +++ b/tests/test_node_is_alive.py @@ -1,16 +1,19 @@ +from pathlib import Path + from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . 
import PatroniAPI -def test_node_is_alive_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() - - my_mock(mocker, None, 200) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_alive"]) +def test_node_is_alive_ok( + runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path +) -> None: + liveness = tmp_path / "liveness" + liveness.touch() + with patroni_api.routes({"liveness": liveness}): + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_alive"]) assert result.exit_code == 0 assert ( result.stdout @@ -18,11 +21,8 @@ def test_node_is_alive_ok(mocker: MockerFixture, use_old_replica_state: bool) -> ) -def test_node_is_alive_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() - - my_mock(mocker, None, 404) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_alive"]) +def test_node_is_alive_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_alive"]) assert result.exit_code == 2 assert ( result.stdout diff --git a/tests/test_node_is_leader.py b/tests/test_node_is_leader.py index 2822307..aeba6a6 100644 --- a/tests/test_node_is_leader.py +++ b/tests/test_node_is_leader.py @@ -1,28 +1,37 @@ +from typing import Iterator + +import pytest from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . import PatroniAPI -def test_node_is_leader_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() +@pytest.fixture +def node_is_leader_ok(patroni_api: PatroniAPI) -> Iterator[None]: + with patroni_api.routes( + { + "leader": "node_is_leader_ok.json", + "standby-leader": "node_is_leader_ok_standby_leader.json", + } + ): + yield None - my_mock(mocker, "node_is_leader_ok", 200) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_leader"]) + +@pytest.mark.usefixtures("node_is_leader_ok") +def test_node_is_leader_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_leader"]) assert result.exit_code == 0 assert ( result.stdout == "NODEISLEADER OK - This node is a leader node. | is_leader=1;;@0\n" ) - my_mock(mocker, "node_is_leader_ok_standby_leader", 200) result = runner.invoke( main, - ["-e", "https://10.20.199.3:8008", "node_is_leader", "--is-standby-leader"], + ["-e", patroni_api.endpoint, "node_is_leader", "--is-standby-leader"], ) - print(result.stdout) assert result.exit_code == 0 assert ( result.stdout @@ -30,21 +39,17 @@ def test_node_is_leader_ok(mocker: MockerFixture, use_old_replica_state: bool) - ) -def test_node_is_leader_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_leader_ko", 503) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_leader"]) +def test_node_is_leader_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_leader"]) assert result.exit_code == 2 assert ( result.stdout == "NODEISLEADER CRITICAL - This node is not a leader node. 
| is_leader=0;;@0\n" ) - my_mock(mocker, "node_is_leader_ko_standby_leader", 503) result = runner.invoke( main, - ["-e", "https://10.20.199.3:8008", "node_is_leader", "--is-standby-leader"], + ["-e", patroni_api.endpoint, "node_is_leader", "--is-standby-leader"], ) assert result.exit_code == 2 assert ( diff --git a/tests/test_node_is_pending_restart.py b/tests/test_node_is_pending_restart.py index 8fc601f..98420c7 100644 --- a/tests/test_node_is_pending_restart.py +++ b/tests/test_node_is_pending_restart.py @@ -1,20 +1,15 @@ from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . import PatroniAPI -def test_node_is_pending_restart_ok( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_pending_restart_ok", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"] - ) +def test_node_is_pending_restart_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: + with patroni_api.routes({"patroni": "node_is_pending_restart_ok.json"}): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "node_is_pending_restart"] + ) assert result.exit_code == 0 assert ( result.stdout @@ -22,15 +17,11 @@ def test_node_is_pending_restart_ok( ) -def test_node_is_pending_restart_ko( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_pending_restart_ko", 200) - result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_pending_restart"] - ) +def test_node_is_pending_restart_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: + with patroni_api.routes({"patroni": "node_is_pending_restart_ko.json"}): + result = runner.invoke( + main, ["-e", patroni_api.endpoint, "node_is_pending_restart"] + ) assert result.exit_code == 2 assert ( result.stdout diff --git a/tests/test_node_is_primary.py b/tests/test_node_is_primary.py index ac10966..7467d67 100644 --- a/tests/test_node_is_primary.py +++ b/tests/test_node_is_primary.py @@ -1,16 +1,13 @@ from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . 
import PatroniAPI -def test_node_is_primary_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_primary_ok", 200) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_primary"]) +def test_node_is_primary_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: + with patroni_api.routes({"primary": "node_is_primary_ok.json"}): + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_primary"]) assert result.exit_code == 0 assert ( result.stdout @@ -18,11 +15,8 @@ def test_node_is_primary_ok(mocker: MockerFixture, use_old_replica_state: bool) ) -def test_node_is_primary_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_primary_ko", 503) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_primary"]) +def test_node_is_primary_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_primary"]) assert result.exit_code == 2 assert ( result.stdout diff --git a/tests/test_node_is_replica.py b/tests/test_node_is_replica.py index 0f9d21e..0cb7741 100644 --- a/tests/test_node_is_replica.py +++ b/tests/test_node_is_replica.py @@ -1,16 +1,27 @@ +from typing import Iterator + +import pytest from click.testing import CliRunner -from pytest_mock import MockerFixture from check_patroni.cli import main -from .tools import my_mock +from . import PatroniAPI -def test_node_is_replica_ok(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() +@pytest.fixture +def node_is_replica_ok(patroni_api: PatroniAPI) -> Iterator[None]: + with patroni_api.routes( + { + k: "node_is_replica_ok.json" + for k in ("replica", "synchronous", "asynchronous") + } + ): + yield None - my_mock(mocker, "node_is_replica_ok", 200) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_replica"]) + +@pytest.mark.usefixtures("node_is_replica_ok") +def test_node_is_replica_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_replica"]) assert result.exit_code == 0 assert ( result.stdout @@ -18,11 +29,8 @@ def test_node_is_replica_ok(mocker: MockerFixture, use_old_replica_state: bool) ) -def test_node_is_replica_ko(mocker: MockerFixture, use_old_replica_state: bool) -> None: - runner = CliRunner() - - my_mock(mocker, "node_is_replica_ko", 503) - result = runner.invoke(main, ["-e", "https://10.20.199.3:8008", "node_is_replica"]) +def test_node_is_replica_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: + result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_is_replica"]) assert result.exit_code == 2 assert ( result.stdout @@ -30,15 +38,10 @@ def test_node_is_replica_ko(mocker: MockerFixture, use_old_replica_state: bool) ) -def test_node_is_replica_ko_lag( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - +def test_node_is_replica_ko_lag(runner: CliRunner, patroni_api: PatroniAPI) -> None: # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 503) result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--max-lag", "100"] + main, ["-e", patroni_api.endpoint, "node_is_replica", "--max-lag", "100"] ) assert result.exit_code == 2 assert ( @@ -46,12 +49,11 @@ def test_node_is_replica_ko_lag( == "NODEISREPLICA CRITICAL - This node 
is not a running replica with no noloadbalance tag and a lag under 100. | is_replica=0;;@0\n" ) - my_mock(mocker, "node_is_replica_ok", 503) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "node_is_replica", "--is-async", "--max-lag", @@ -65,15 +67,11 @@ def test_node_is_replica_ko_lag( ) -def test_node_is_replica_sync_ok( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - +@pytest.mark.usefixtures("node_is_replica_ok") +def test_node_is_replica_sync_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 200) result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-sync"] + main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-sync"] ) assert result.exit_code == 0 assert ( @@ -82,15 +80,10 @@ def test_node_is_replica_sync_ok( ) -def test_node_is_replica_sync_ko( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - +def test_node_is_replica_sync_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 503) result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-sync"] + main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-sync"] ) assert result.exit_code == 2 assert ( @@ -99,15 +92,11 @@ def test_node_is_replica_sync_ko( ) -def test_node_is_replica_async_ok( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - +@pytest.mark.usefixtures("node_is_replica_ok") +def test_node_is_replica_async_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None: # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 200) result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-async"] + main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-async"] ) assert result.exit_code == 0 assert ( @@ -116,15 +105,10 @@ def test_node_is_replica_async_ok( ) -def test_node_is_replica_async_ko( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - +def test_node_is_replica_async_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None: # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 503) result = runner.invoke( - main, ["-e", "https://10.20.199.3:8008", "node_is_replica", "--is-async"] + main, ["-e", patroni_api.endpoint, "node_is_replica", "--is-async"] ) assert result.exit_code == 2 assert ( @@ -133,18 +117,14 @@ def test_node_is_replica_async_ko( ) -def test_node_is_replica_params( - mocker: MockerFixture, use_old_replica_state: bool -) -> None: - runner = CliRunner() - +@pytest.mark.usefixtures("node_is_replica_ok") +def test_node_is_replica_params(runner: CliRunner, patroni_api: PatroniAPI) -> None: # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 200) result = runner.invoke( main, [ "-e", - "https://10.20.199.3:8008", + patroni_api.endpoint, "node_is_replica", "--is-async", "--is-sync", @@ -157,12 +137,11 @@ def test_node_is_replica_params( ) # We don't do the check ourselves, patroni does it and changes the return code - my_mock(mocker, "node_is_replica_ok", 200) result = 
     runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_is_replica",
             "--is-sync",
             "--max-lag",
diff --git a/tests/test_node_patroni_version.py b/tests/test_node_patroni_version.py
index 333b79b..a7622ab 100644
--- a/tests/test_node_patroni_version.py
+++ b/tests/test_node_patroni_version.py
@@ -1,22 +1,25 @@
+from typing import Iterator
+
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import my_mock
+from . import PatroniAPI
 
 
-def test_node_patroni_version_ok(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
+@pytest.fixture(scope="module", autouse=True)
+def node_patroni_version(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes({"patroni": "node_patroni_version.json"}):
+        yield None
 
-    my_mock(mocker, "node_patroni_version", 200)
+
+def test_node_patroni_version_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_patroni_version",
             "--patroni-version",
             "2.0.2",
@@ -29,17 +32,12 @@ def test_node_patroni_version_ok(
     )
 
 
-def test_node_patroni_version_ko(
-    mocker: MockerFixture, use_old_replica_state: bool
-) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_patroni_version", 200)
+def test_node_patroni_version_ko(runner: CliRunner, patroni_api: PatroniAPI) -> None:
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_patroni_version",
             "--patroni-version",
             "1.0.0",
diff --git a/tests/test_node_tl_has_changed.py b/tests/test_node_tl_has_changed.py
index 3b6245e..0e921e5 100644
--- a/tests/test_node_tl_has_changed.py
+++ b/tests/test_node_tl_has_changed.py
@@ -1,23 +1,30 @@
+from pathlib import Path
+from typing import Iterator
+
 import nagiosplugin
+import pytest
 from click.testing import CliRunner
-from pytest_mock import MockerFixture
 
 from check_patroni.cli import main
 
-from .tools import here, my_mock
+from . import PatroniAPI
 
 
+@pytest.fixture
+def node_tl_has_changed(patroni_api: PatroniAPI) -> Iterator[None]:
+    with patroni_api.routes({"patroni": "node_tl_has_changed.json"}):
+        yield None
+
+
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ok_with_timeline(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_tl_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--timeline",
             "58",
@@ -30,23 +37,22 @@ def test_node_tl_has_changed_ok_with_timeline(
     )
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ok_with_state_file(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
-    runner = CliRunner()
-
-    with open(here / "node_tl_has_changed.state_file", "w") as f:
+    state_file = tmp_path / "node_tl_has_changed.state_file"
+    with state_file.open("w") as f:
         f.write('{"timeline": 58}')
 
-    my_mock(mocker, "node_tl_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--state-file",
-            str(here / "node_tl_has_changed.state_file"),
+            str(state_file),
         ],
     )
     assert result.exit_code == 0
@@ -56,17 +62,15 @@ def test_node_tl_has_changed_ok_with_state_file(
     )
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ko_with_timeline(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI
 ) -> None:
-    runner = CliRunner()
-
-    my_mock(mocker, "node_tl_has_changed", 200)
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--timeline",
             "700",
@@ -79,24 +83,23 @@ def test_node_tl_has_changed_ko_with_timeline(
     )
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_ko_with_state_file_and_save(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
-    runner = CliRunner()
-
-    with open(here / "node_tl_has_changed.state_file", "w") as f:
+    state_file = tmp_path / "node_tl_has_changed.state_file"
+    with state_file.open("w") as f:
         f.write('{"timeline": 700}')
 
-    my_mock(mocker, "node_tl_has_changed", 200)
     # test without saving the new tl
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--state-file",
-            str(here / "node_tl_has_changed.state_file"),
+            str(state_file),
         ],
     )
     assert result.exit_code == 2
@@ -105,7 +108,7 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
         == "NODETLHASCHANGED CRITICAL - The expected timeline was 700 got 58. | is_timeline_changed=1;;@1:1 timeline=58\n"
     )
 
-    cookie = nagiosplugin.Cookie(here / "node_tl_has_changed.state_file")
+    cookie = nagiosplugin.Cookie(state_file)
     cookie.open()
     new_tl = cookie.get("timeline")
     cookie.close()
@@ -117,10 +120,10 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--state-file",
-            str(here / "node_tl_has_changed.state_file"),
+            str(state_file),
             "--save",
         ],
     )
@@ -130,7 +133,7 @@ def test_node_tl_has_changed_ko_with_state_file_and_save(
         == "NODETLHASCHANGED CRITICAL - The expected timeline was 700 got 58. | is_timeline_changed=1;;@1:1 timeline=58\n"
     )
 
-    cookie = nagiosplugin.Cookie(here / "node_tl_has_changed.state_file")
+    cookie = nagiosplugin.Cookie(state_file)
     cookie.open()
     new_tl = cookie.get("timeline")
     cookie.close()
@@ -138,23 +141,22 @@
     assert new_tl == 58
 
 
+@pytest.mark.usefixtures("node_tl_has_changed")
 def test_node_tl_has_changed_params(
-    mocker: MockerFixture, use_old_replica_state: bool
+    runner: CliRunner, patroni_api: PatroniAPI, tmp_path: Path
 ) -> None:
     # This one is placed last because it seems like the exceptions are not flushed from stderr for the next tests.
-    runner = CliRunner()
-
-    my_mock(mocker, "node_tl_has_changed", 200)
+    fake_state_file = tmp_path / "fake_file_name.state_file"
     result = runner.invoke(
         main,
         [
             "-e",
-            "https://10.20.199.3:8008",
+            patroni_api.endpoint,
             "node_tl_has_changed",
             "--timeline",
             "58",
             "--state-file",
-            str(here / "fake_file_name.state_file"),
+            str(fake_state_file),
         ],
     )
     assert result.exit_code == 3
@@ -163,9 +165,7 @@ def test_node_tl_has_changed_params(
         == "NODETLHASCHANGED UNKNOWN: click.exceptions.UsageError: Either --timeline or --state-file should be provided for this service\n"
     )
 
-    result = runner.invoke(
-        main, ["-e", "https://10.20.199.3:8008", "node_tl_has_changed"]
-    )
+    result = runner.invoke(main, ["-e", patroni_api.endpoint, "node_tl_has_changed"])
     assert result.exit_code == 3
     assert (
         result.stdout
diff --git a/tests/tools.py b/tests/tools.py
deleted file mode 100644
index cad10ba..0000000
--- a/tests/tools.py
+++ /dev/null
@@ -1,49 +0,0 @@
-import json
-import pathlib
-from typing import Any
-
-from pytest_mock import MockerFixture
-
-from check_patroni.types import APIError, PatroniResource
-
-here = pathlib.Path(__file__).parent
-
-
-def getjson(name: str) -> Any:
-    path = here / "json" / f"{name}.json"
-    if not path.exists():
-        raise Exception(f"path does not exist : {path}")
-
-    with path.open() as f:
-        return json.load(f)
-
-
-def my_mock(
-    mocker: MockerFixture,
-    json_file: str,
-    status: int,
-    use_old_replica_state: bool = False,
-) -> None:
-    def mock_rest_api(self: PatroniResource, service: str) -> Any:
-        if status != 200:
-            raise APIError("Test en erreur pour status code 200")
-        if json_file:
-            if use_old_replica_state and (
-                json_file.startswith("cluster_has_replica")
-                or json_file.startswith("cluster_node_count")
-            ):
-                return cluster_api_set_replica_running(getjson(json_file))
-            return getjson(json_file)
-        return None
-
-    mocker.resetall()
-    mocker.patch("check_patroni.types.PatroniResource.rest_api", mock_rest_api)
-
-
-def cluster_api_set_replica_running(js: Any) -> Any:
-    # starting from 3.0.4 the state of replicas is streaming instead of running
-    for node in js["members"]:
-        if node["role"] in ["replica", "sync_standby"]:
-            if node["state"] == "streaming":
-                node["state"] = "running"
-    return js
diff --git a/tox.ini b/tox.ini
index 9cead5b..0948f07 100644
--- a/tox.ini
+++ b/tox.ini
@@ -4,11 +4,9 @@ envlist = lint, mypy, py{37,38,39,310,311}
 skip_missing_interpreters = True
 
 [testenv]
-deps =
-    pytest
-    pytest-mock
+extras = test
commands =
-    pytest {toxinidir}/check_patroni {toxinidir}/tests {posargs:-vv}
+    pytest {toxinidir}/check_patroni {toxinidir}/tests {posargs:-vv --log-level=debug}
 
 [testenv:lint]
 skip_install = True
@@ -18,7 +16,7 @@ deps =
     flake8
     isort
 commands =
-    codespell {toxinidir}/check_patroni {toxinidir}/tests
+    codespell {toxinidir}/check_patroni {toxinidir}/tests {toxinidir}/docs/ {toxinidir}/RELEASE.md {toxinidir}/CONTRIBUTING.md
     black --check --diff {toxinidir}/check_patroni {toxinidir}/tests
     flake8 {toxinidir}/check_patroni {toxinidir}/tests
     isort --check --diff {toxinidir}/check_patroni {toxinidir}/tests
@@ -28,7 +26,7 @@ deps =
     mypy == 0.961
 commands =
     # we need to install types-requests
-    mypy --install-types --non-interactive {toxinidir}/check_patroni
+    mypy --install-types --non-interactive
 
 [testenv:build]
 deps =
diff --git a/vagrant/README.md b/vagrant/README.md
index 81339bc..20db4e1 100644
--- a/vagrant/README.md
+++ b/vagrant/README.md
@@ -100,7 +100,7 @@ http://$IP/icingaweb2/setup
 
 Finish
 
-* Screen 15: Hopefuly success
+* Screen 15: Hopefully success
 
 Login
 
diff --git a/vagrant/provision/icinga2.bash b/vagrant/provision/icinga2.bash
index 023f45a..ab8ff26 100755
--- a/vagrant/provision/icinga2.bash
+++ b/vagrant/provision/icinga2.bash
@@ -66,7 +66,7 @@ icinga_setup(){
 info "# Icinga setup"
 info "#============================================================================="
 
-## this part is already done by the standart icinga install with the user icinga2
+## this part is already done by the standard icinga install with the user icinga2
 ## and a random password, here we dont really care
 cat << __EOF__ | sudo -u postgres psql
 
@@ -83,7 +83,7 @@ __EOF__
 icingacli setup config directory --group icingaweb2
 icingacli setup token create
 
-## this part is already done by the standart icinga install with the user icinga2
+## this part is already done by the standard icinga install with the user icinga2
 cat << __EOF__ > /etc/icinga2/features-available/ido-pgsql.conf
 /**
  * The db_ido_pgsql library implements IDO functionality
@@ -198,7 +198,7 @@ grafana(){
 cat << __EOF__ > /etc/grafana/grafana.ini
 [database]
 # You can configure the database connection by specifying type, host, name, user and password
-# as seperate properties or as on string using the url propertie.
+# as separate properties or as one string using the url property.
 # Either "mysql", "postgres" or "sqlite3", it's your choice
 type = postgres
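
Note: every migrated test module above follows the same shape — a fixture registers a static JSON payload on the fake Patroni API server, and the test drives the CLI at `patroni_api.endpoint` through the shared `runner` fixture. A minimal sketch of that pattern (the `my_service` check name and `my_service.json` payload are hypothetical placeholders, not part of the diff):

```python
from typing import Iterator

import pytest
from click.testing import CliRunner

from check_patroni.cli import main

from . import PatroniAPI


@pytest.fixture(scope="module", autouse=True)
def my_service(patroni_api: PatroniAPI) -> Iterator[None]:
    # Serve tests/json/my_service.json on the "patroni" route of the fake
    # HTTP server for the duration of this module's tests, then unregister it.
    # ("my_service" is a hypothetical check used only for illustration.)
    with patroni_api.routes({"patroni": "my_service.json"}):
        yield None


def test_my_service_ok(runner: CliRunner, patroni_api: PatroniAPI) -> None:
    # The CLI talks to the fake HTTP server instead of a mocked rest_api().
    result = runner.invoke(main, ["-e", patroni_api.endpoint, "my_service"])
    assert result.exit_code == 0
```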