patroni/debian/design.md

8.8 KiB

Integrating Patroni with Debian

Introduction

Patroni manages PostgreSQL instances and mostly expects a blank sheet, i.e prefers to initialize and insists to start and stop the database itself. Debian on the other hand includes the postgresql-common (called pg-common in the following) framework that manages concurrent major versions of PostgreSQL and possibly database instances for each of them. This document details a design on how to integrate Patroni (in form of the Debian patroni package, preferable provided by PGDG's apt repository) with Debian's pg-common framework and policies.

pg-common nomenclature and directory layout

pg-common enhances the usual libpq environment variables like PGHOST and PGDATABASE with PGCLUSTER, that is the Debian standard way of addressing a specific version/instance combination. The PGCLUSTER environment variable consists of two parts, the PostgreSQL major version and the instance name (main being the default instance name), serarated by a '/', e.g. '10/main'. The two components are usually abbreviated or referred to as '%v' and '%c', respectively.

The versioned binaries and libraries are located in /usr/lib/postgresql/%v/{bin,lib}. The configuration files usually reside in /etc/postgresql/%v/%c/ and the default log file (set via the pg_ctlcluster wrapper) is /var/log/postgresql/postgresql-%v-%c.log. The default data directory is /var/lib/postgresql/%v/%c and the listening sockets are located in /var/run/postgresql/.

Patroni configuration

The Patroni configuration is in YAML format. Different Patroni instances or installations are identified by the scope configuration option, which mostly maps to an instance name.

As Patroni is usually started via Docker or another container runtime, there is no opinioated default file system layout or even configuratin file location.

It would be desirable to create/maintain a Patroni configuration for each pg-common instance as /etc/patroni/%v-%c.yml, i.e. e.g. /etc/patroni/10-main.yml. Another possibility would be to prefix the configuration file name with patroni-.

Automatic generation of Patroni configuration

Currently, the pg-common instance specific configuration file /etc/patroni/%v-%c.yml needs to be deployed/adopted manually as the pg-common framework and pg_createcluster have no possibilty to run external programs as hooks after instance creation.

Another possibilty is to create the configuration file via an external config management program like Ansible or Puppet.

Regardless of this, a simple program that creates the instance-specific configuration from a template (e.g. /etc/patroni/patroni.yml.in) would be desirable and could be shipped in the patroni Debian package.

Debian-adopted Patroni configuration

it is possble to mimick the default Debian layout via the following configuration parameters:

data_dir = "/var/lib/postgresql/%v/%c"
bin_dir = "/usr/lib/postgresql/%v/bin"
config_dir: "/etc/postgresql/%v/%c"

The logfile location and filename as well as the socket directory have to be explicitly set via the configuration file:

unix_socket_directories = '/var/run/postgresql/'
log_directory = '/var/log/postgresql'
log_filename = 'postgresql-%v-%c.log'

The postgresql.conf configuration file

Patroni expects to deploy a configuration file and to be able to change it and keep it in sync across nodes. If a configuration file postgresql.conf is already present in the config_dir, it renames it to postgresql.base.conf and includes it at the top of the postgresql.conf it writes instead.

Initialization of the first and standby instances

Patroni by default runs initdb on the data directory during bootrap while pg-common provides the pg_createcluster command for this purpose.

It is possible to tell Patroni to run an external bootstrap command which is passed the --scope and --datadir command-line options. This makes it possible to have a small wrapper script like the following that runs pg_createcluster instead:

#!/bin/sh

for i in "$@"
do
case $i in
    --scope=*)
    SCOPE="${i#*=}"
    shift # past argument=value
    ;;
    --datadir=*)
    DATADIR="${i#*=}"
    shift # past argument=value
    ;;
    *)
       # unknown option
    ;;
esac
done

VERSION=$(echo $SCOPE | sed -e 's/\/.*//')
CLUSTER=$(echo $SCOPE | sed -e 's/.*\///')

pg_createcluster $VERSION $CLUSTER
exit $?

This requires the following in the Patroni YAML configuratio:

bootstrap:
  # Custom bootstrap method
  method: pg_createcluster
  pg_createcluster:
    command: <custom script above>

Subsequent nodes are boostrapped from the primary via base backups. Just running pg_createcluster on them is not possible as then the cluster IDs (the 'Database system identifier' in the pg_controldata output) would differ.

Again, it is possible to provide an external clone program, which can run pg_basebackup itself. The pg-common framework currently does not provide for this, but it is possible to run pg_basebackup after pg_createcluster (which creates the configuration directory) and purging the data directory:

#!/bin/sh

for i in "$@"
do
case $i in
    --scope=*)
    SCOPE="${i#*=}"
    shift # past argument=value
    ;;
    --role=*)
    ROLE="${i#*=}"
    shift # past argument=value
    ;;
    --datadir=*)
    DATADIR="${i#*=}"
    shift # past argument=value
    ;;
    --connstring=*)
    CONNSTR="${i#*=}"
    shift # past argument=value
    ;;
    *)
       # unknown option
    ;;
esac
done

VERSION=$(echo $SCOPE | sed -e 's/\/.*//')
CLUSTER=$(echo $SCOPE | sed -e 's/.*\///')

if [ -f /etc/postgresql/$VERSION/$CLUSTER/postgresql.conf ]
then
    pg_dropcluster $VERSION $CLUSTER
fi

pg_createcluster $VERSION $CLUSTER && rm -rf $DATADIR && pg_basebackup --pgdata $DATADIR -X stream --dbname=$CONNSTR
exit $?

Those two scripts could be shipped in the patroni Debian package e.g. as /usr/share/patroni/pg_createcluster_patroni and /usr/share/patroni/pg_clonecluster_patroni.

Systemd services

The Patroni daemon/agent needs to be started for each configuration file in /etc/patroni, i.e. a systemd service unit is required for each. This can be facilitated via the patroni@.service that acts as a wild card and could have the following pg-common specific content:

[Unit]
ConditionPathExists=/etc/patroni/%i.yml

[Service]
ExecStart=/usr/bin/patroni /etc/patroni/%i.yml

This makes it possible to e.g. start Patroni for the '10/main' instance with systemctl start patroni@10/main.

DCS Debian availability

The only DCS server available in Debian 9 (stretch) is zookeeper. Both etcd and consul were blacklisted before release by the Debian security team due to being written in Go.

Current testing (buster) also includes etcd and consul.

The latest Patroni Debian package is available for both Debian stable and testing/unstable via the PGDG APT repository.

DCS setup

One outstanding issue with providing a general Patroni configuration template and possibly a program that creates a Patroni configuration file per pg-common instance is the distributed consensus store, which also needs to be configured in the Patroni YAML file.

One possibility would be to introduce a file /etc/default/patroni where the DCS and it's ip-address(es) could be configured which could be sourced by the configuration file generator.

Security

The proposed/current setup should be considered a Proof of Concept. In particular, care must be taken that the postgres db/system user password is not leaked and that the DCS access is secured/hardened.

Making patronictl pg-common aware

The patronictl CLI currently looks for a configuration file ~/.config/patronictl.yml. This could potentially be a symlink to the configuration under /etc/patroni/. In the pg-common scope, it could addtionally either display all pg-common instances and/or accept the '%v/%c' instance notation in order to select a specific instance to display or act on.

Current status

As of the postgresql-common_194 and patroni_1.5.0-9 packages, the following steps are still needed for each pg-common instance %v/%c:

  1. Setup DCS and possibly install its client python packages
  2. Install postgresql-common package
  3. Disable automatic instance creation in /etc/postgresql-common/createcluster.conf
  4. Install postgresql-%v package
  5. Setup authentication via pgpass or otherwise
  6. Create a /etc/patroni/%v-%c.yml configuration file
  7. Start the patroni systemd service for the instance

In particular, no changes are required to the Patroni or postgresql-common upstream codebases.