Configuring the DRBD Reactor Promoter Plugin Freeze Feature
This article describes how to configure your environment, DRBD® options, and promoter plugin configuration file to freeze HA resources that might take a long time to start.
You can use DRBD Reactor and its promoter plugin to manage applications and services and make them highly available. Should a cluster node hosting your application fail or lose its connection to the other cluster nodes, DRBD Reactor will start the application service on another node.
The promoter plugin’s freeze feature can be useful in cases where a service in the promoter plugin’s start list of services, for example, a large database, might take a long time to start. If a node currently hosting the database resource loses its connection to the cluster, DRBD Reactor will freeze the resource and attempt to start the resource on another node. If in the meantime the original node regains its connection, DRBD Reactor will unfreeze the resource and the node will again host the resource. This could mean that after a brief network connection drop, your high-availability (HA) resource is back up and running in seconds rather than minutes.
More information about DRBD Reactor and its plugins can be found in the DRBD User’s Guide. You can also get help through DRBD Reactor’s various man pages, the --help option, and the DRBD Reactor GitHub page.
Before you configure DRBD Reactor and its promoter plugin’s freeze feature, you will need to first verify and fulfill some requirements.
The promoter plugin’s freeze feature requires cgroups v2. On newer Linux distributions, such as RHEL 9 and Ubuntu 22.04, cgroups v2 is enabled. On older versions, you might have to manually enable it.
Verify that your nodes have cgroups v2:
# ls /sys/fs/cgroup/cgroup.controllers
If this file is not present, cgroups v2 is either disabled or your Linux version does not support cgroups v2.
You can verify that you can enable cgroups v2 on your system by entering the command grep cgroup2 /proc/filesystems. If the output includes cgroup2, then you can proceed to enable the feature by using a kernel command line argument.
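The two checks above can be combined into one small helper, which is convenient when you have several nodes to inspect. This is a minimal sketch rather than part of DRBD Reactor: the function name cgroup2_status and its two optional path arguments are assumptions made for illustration (the arguments exist only so that the function can be pointed at test files).

```shell
#!/bin/sh
# Hypothetical helper: report the cgroups v2 state of this node.
#   enabled     - the unified hierarchy is mounted (cgroup.controllers exists)
#   available   - the kernel lists cgroup2, but it is not enabled yet
#   unsupported - the kernel does not list cgroup2 at all
cgroup2_status() {
    controllers="${1:-/sys/fs/cgroup/cgroup.controllers}"
    filesystems="${2:-/proc/filesystems}"

    if [ -e "$controllers" ]; then
        echo "enabled"
    elif grep -qw cgroup2 "$filesystems" 2>/dev/null; then
        echo "available"
    else
        echo "unsupported"
    fi
}

# Usage on a node: cgroup2_status
```

A result of available means that you can continue with the kernel command line argument described next. A result of unsupported means that the running kernel cannot provide cgroups v2.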
On RHEL-based systems, you can install the grubby package to make this easy.
On RHEL:
# dnf -y install grubby
Next, enter the following commands to add the kernel argument and update GRUB’s configuration:
# grubby --update-kernel=ALL --args=systemd.unified_cgroup_hierarchy=1
# grub2-mkconfig -o /boot/grub2/grub.cfg
In Ubuntu, you will need to edit the /etc/default/grub file and add the following line to the appropriate kernel entry block for your system. Unless you have multiple operating systems or kernels that you manage with GRUB, this should be the block that starts with the GRUB_DEFAULT=0 line.
GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=1
After adding the line to the /etc/default/grub file, you will need to enter the following command to apply your changes:
# update-grub
NOTE: You can safely ignore any Device or resource busy warnings related to device-mapper and os-prober. If you do not want to see these messages, you can add the GRUB_DISABLE_OS_PROBER=true line to your /etc/default/grub file to stop the operating system prober from searching for other operating systems. You likely do not need this feature unless you sometimes need to boot multiple operating systems.
Repeat these steps on all your nodes, then reboot your nodes and verify that cgroups v2 is enabled:
# ls /sys/fs/cgroup/cgroup.controllers
The DRBD Reactor promoter plugin’s freeze feature also requires that the following DRBD properties be set:
- on-no-quorum set to suspend-io;
- on-no-data-accessible set to suspend-io;
- on-suspended-primary-outdated set to force-secondary;
- and the DRBD net property rr-conflict set to retry-connect.
Use LINSTOR® to set these properties on your LINSTOR resource by entering the following commands on your LINSTOR controller node:
# linstor resource-definition drbd-options --on-no-quorum suspend-io <resource-def-name>
# linstor resource-definition drbd-options --on-no-data-accessible suspend-io <resource-def-name>
# linstor resource-definition drbd-options --on-suspended-primary-outdated force-secondary <resource-def-name>
# linstor resource-definition drbd-options --rr-conflict retry-connect <resource-def-name>
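If you manage several resource definitions, you could wrap the commands above in a small loop. The sketch below is an illustration only: set_drbd_options and the DRY_RUN variable are hypothetical names, not LINSTOR features, and the helper simply calls the same linstor resource-definition drbd-options command once per name=value pair.

```shell
#!/bin/sh
# Hypothetical helper: apply DRBD options, given as name=value pairs, to one
# LINSTOR resource definition. With DRY_RUN=1 the commands are only printed.
set_drbd_options() {
    res="$1"
    shift

    for pair in "$@"; do
        opt="${pair%%=*}"   # text before the first "="
        val="${pair#*=}"    # text after the first "="

        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "linstor resource-definition drbd-options --${opt} ${val} ${res}"
        else
            linstor resource-definition drbd-options "--${opt}" "${val}" "${res}" || return 1
        fi
    done
}

# Usage on the LINSTOR controller node (my_res is a placeholder name):
#   DRY_RUN=1 set_drbd_options my_res on-no-quorum=suspend-io rr-conflict=retry-connect
```

Running with DRY_RUN=1 first lets you review the exact commands before anything changes on the cluster. Pass the remaining options from the list in the same name=value form.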
You can verify that your resource has these DRBD properties set by entering the command:
# linstor resource-definition list-properties <resource-def-name>
Because DRBD Reactor’s promoter plugin will be controlling your HA resource, disable the DRBD auto-promote property on your resource by using the following LINSTOR command:
# linstor resource-definition drbd-options --auto-promote no <resource-def-name>
Your resource also needs the DRBD quorum property set to majority, but LINSTOR should have set this automatically when you spawned your resource.
You can verify your resource’s properties by using the linstor resource-definition list-properties <resource-def-name> command.
To enable the promoter plugin’s freeze feature, add the following lines to the promoter plugin’s TOML snippet file, located by default in /etc/drbd-reactor.d/, for your HA resource:
on-drbd-demote-failure = "reboot"
on-quorum-loss = "freeze"
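To show where these two lines sit in a complete snippet, here is a hedged sketch that writes a minimal promoter configuration file. The helper name write_promoter_snippet, the resource name, and the service name are placeholders. The surrounding [[promoter]] section, the [promoter.resources.<name>] table, and the start list follow the usual promoter plugin layout as an assumption, so compare the result with your existing snippet instead of overwriting it blindly.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal promoter snippet with the freeze
# settings for one resource. Any existing file of the same name is overwritten.
write_promoter_snippet() {
    res="$1"                          # DRBD resource name (placeholder)
    service="$2"                      # service to start on promotion (placeholder)
    dir="${3:-/etc/drbd-reactor.d}"   # default snippet directory from the article

    cat > "${dir}/${res}.toml" <<EOF
[[promoter]]
[promoter.resources.${res}]
start = ["${service}"]
on-drbd-demote-failure = "reboot"
on-quorum-loss = "freeze"
EOF
}

# Usage (as root):
#   write_promoter_snippet my_res my-database.service
```

TOML requires string values to be quoted, which is why reboot and freeze appear in double quotes. DRBD Reactor has to re-read its configuration before the change takes effect.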
Created 2022/11/30 - MAT
Reviewed 2022/12/1 - MDK