Recommendations for Fencing and STONITH Devices in Pacemaker
We are often asked for recommendations on hardware that supports fencing/STONITH in Pacemaker. We try to avoid making specific recommendations since there are so many supported options, but we are aware of a few one-size-fits-all smart Power Distribution Unit (PDU) options.
Server hardware often includes a mechanism that supports fencing/STONITH, such as Supermicro’s IPMI, Dell’s iDRAC, or HPE’s iLO. If your server hardware does not have something like this built in, one option is a smart PDU or Uninterruptible Power Supply (UPS). You may even be lucky enough to have one of those already!
If you do not have one and need to purchase one, the APC AP7900B is a model that LINBIT® trusts and has seen used in hundreds of Pacemaker clusters. https://www.apc.com/shop/us/en/products/Rack-PDU-Switched-1U-15A-100-120V-8-5-15/P-AP7900B
Newer versions of this PDU have been shipping with network features disabled by default for security reasons. The instruction manual probably covers this, and should be read to confirm that the firmware and its details have not changed. To enable these features, connect via telnet or the console and run:
tcpip -i <ip-to-configure> -s <subnet> -g <gateway>
web -h enable
web -s enable
snmp -S enable -c1 private -a1 writeplus
snmp -S enable -c2 public -a2 writeplus
reboot -Y
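Before moving on to Pacemaker, it can help to confirm that the SNMP settings above took effect by querying an outlet with the fence agent directly. The sketch below only prints the command so it can be reviewed first. It assumes the fence_apc_snmp agent is installed on the cluster node and that the "private" community configured above is the one to use; the helper name pdu_status_cmd and the placeholder address are illustrative and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) a fence_apc_snmp status query.
#   $1 = PDU address, $2 = outlet number, $3 = SNMP community (assumed "private")
pdu_status_cmd() {
    echo "fence_apc_snmp --ip=$1 --plug=$2 --community=${3:-private} --action=status"
}

# Print the query for outlet 1 on the first PDU; replace the placeholder address.
pdu_status_cmd "<ip-pdu1>" 1
```

Running the printed command against a reachable PDU should report the outlet's power status; a timeout usually points back at the tcpip or snmp settings.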
Once rebooted, you should be able to reach the PDU over the network, and can therefore configure and use it in Pacemaker. Adding the PDU fencing devices requires distinct off and on actions for each outlet on each PDU. With two nodes, each with two Power Supply Units (PSUs) plugged into separate PDUs, this translates to eight commands (2 nodes × 2 PDUs × 2 actions). The off devices are monitored so that we are alerted if a PDU fails for some reason. There is no reason to monitor the on devices.
# Node 1 - off
pcs stonith create fence_node-a_pdu1_off fence_apc_snmp pcmk_host_list=node-a.linbit.com ipaddr=<ip-pdu1> delay=5 action=off port=1 op monitor interval=60s
pcs stonith create fence_node-a_pdu2_off fence_apc_snmp pcmk_host_list=node-a.linbit.com ipaddr=<ip-pdu2> delay=5 action=off port=1 power_wait=5 op monitor interval=60s
# Node 1 - on
pcs stonith create fence_node-a_pdu1_on fence_apc_snmp pcmk_host_list=node-a.linbit.com ipaddr=<ip-pdu1> action=on port=1
pcs stonith create fence_node-a_pdu2_on fence_apc_snmp pcmk_host_list=node-a.linbit.com ipaddr=<ip-pdu2> action=on port=1
# Node 2 - off
pcs stonith create fence_node-b_pdu1_off fence_apc_snmp pcmk_host_list=node-b.linbit.com ipaddr=<ip-pdu1> delay=5 action=off port=2 op monitor interval=60s
pcs stonith create fence_node-b_pdu2_off fence_apc_snmp pcmk_host_list=node-b.linbit.com ipaddr=<ip-pdu2> delay=5 action=off port=2 power_wait=5 op monitor interval=60s
# Node 2 - on
pcs stonith create fence_node-b_pdu1_on fence_apc_snmp pcmk_host_list=node-b.linbit.com ipaddr=<ip-pdu1> action=on port=2
pcs stonith create fence_node-b_pdu2_on fence_apc_snmp pcmk_host_list=node-b.linbit.com ipaddr=<ip-pdu2> action=on port=2
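Because the eight commands above differ only in node, PDU, outlet, and action, they can also be generated rather than typed. The following is a minimal sketch that prints the same commands without running them. The function name gen_fence_cmds is hypothetical, and the node-to-outlet mapping and PDU placeholders simply mirror the example above and need adapting to your cabling.

```shell
#!/bin/sh
# Dry run: print the pcs commands for every node/PDU/action combination.
gen_fence_cmds() {
    # Each entry is "<node FQDN>:<outlet number used on both PDUs>".
    for entry in "node-a.linbit.com:1" "node-b.linbit.com:2"; do
        host=${entry%%:*}     # node-a.linbit.com
        port=${entry##*:}     # 1
        short=${host%%.*}     # node-a
        for pdu in 1 2; do
            ip="<ip-pdu${pdu}>"   # placeholder, as in the commands above
            # The example above sets power_wait=5 only on the second PDU.
            off_extra=""
            if [ "$pdu" -eq 2 ]; then
                off_extra=" power_wait=5"
            fi
            # Off device: monitored so a failed PDU is noticed.
            echo "pcs stonith create fence_${short}_pdu${pdu}_off fence_apc_snmp pcmk_host_list=${host} ipaddr=${ip} delay=5 action=off port=${port}${off_extra} op monitor interval=60s"
            # On device: no monitor operation.
            echo "pcs stonith create fence_${short}_pdu${pdu}_on fence_apc_snmp pcmk_host_list=${host} ipaddr=${ip} action=on port=${port}"
        done
    done
}

gen_fence_cmds
```

The output is grouped per PDU rather than per action, so the order differs from the listing above, but the set of commands is the same.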
If you don’t have redundant power supplies in each host, you can skip the command for the unused PDU in each section above. You should always, however, have more than one PDU, because sharing a single PDU adds a single point of failure to an otherwise shared-nothing cluster.
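This article does not show how the separate off and on devices for a node are tied together. One common approach, offered here as an assumption rather than as the method used in the clusters described above, is a Pacemaker fencing level that lists both off devices followed by both on devices, so that every power feed is cut before any is restored. The sketch below prints the corresponding pcs commands without running them. The function name gen_level_cmds is hypothetical, the cluster node names are assumed to match the pcmk_host_list values above, and the device list uses the space-separated form accepted by recent pcs releases (older releases used a comma-separated list, so check the pcs man page for your version).

```shell
#!/bin/sh
# Dry run: print one fencing-level command per node; nothing is executed.
gen_level_cmds() {
    for node in node-a node-b; do
        # Level 1 for this node: both off devices first, then both on devices.
        echo "pcs stonith level add 1 ${node}.linbit.com fence_${node}_pdu1_off fence_${node}_pdu2_off fence_${node}_pdu1_on fence_${node}_pdu2_on"
    done
}

gen_level_cmds
```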
MDK – 10/28/21