NMIS 8.6.6 introduces a new feature called 'polling failover' for monitoring hosts that are reachable by multiple/redundant paths. This page briefly describes this feature.
Most environments that deal with multi-homed nodes set aside 'loopback' addresses for monitoring, and let their routing infrastructure handle any failover, convergence and re-routing required.
This makes path changes transparent to NMIS and thus minimises negative effects of such a change.
If that is not an option in your environment, then you may want to use NMIS' new feature for explicit polling failover.
NMIS normally communicates with a node exclusively using the configured host address. If this address becomes unreachable, then NMIS attempts to switch over to the host_backup address on the fly and continues polling as normal.
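The selection logic described above can be sketched roughly as follows. This is an illustration only, not NMIS code: the function name choose_polling_address and the is_reachable callback are hypothetical, and the addresses are documentation examples.

```python
from typing import Callable, Optional


def choose_polling_address(
    host: str,
    host_backup: Optional[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the address to poll, or None if no address is reachable.

    Hypothetical helper: NMIS's real implementation may differ.
    """
    # The primary (host) address is always preferred.
    if is_reachable(host):
        return host
    # Fall back to host_backup only if one is configured and it responds.
    if host_backup and is_reachable(host_backup):
        return host_backup
    return None


# Usage with a stubbed reachability check: only the backup address is up.
up = {"192.0.2.2"}
print(choose_polling_address("192.0.2.1", "192.0.2.2", lambda a: a in up))
```

Passing the reachability check in as a callback keeps the sketch independent of how reachability is actually determined.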
To enable this capability, simply add your host's secondary address/name in the node configuration dialog (like in the screenshot below) and run a type=update operation:
As long as at least one address remains reachable, NMIS will be able to poll the node.
Should the primary address become inaccessible, NMIS switches polling over to the fallback host_backup address and raises an event called 'Node Polling Failover'; the node is also flagged as being in the 'degraded' state. This event is cleared if and when the primary address becomes reachable again.
When NMIS polling has fallen back to the secondary address, the node's status shows "Node Polling Failover" as one of the reasons for the degraded state, like in the screenshot below:
If all addresses of the node are unreachable, then NMIS flags the node as 'unreachable'.
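As a rough model of the behaviour described above, the following sketch tracks the failover event across successive poll cycles. It considers address reachability only (a real node can be degraded for other reasons), and the function name update_node and the status label 'reachable' are assumptions made for illustration, not NMIS identifiers.

```python
from typing import List, Set, Tuple

FAILOVER_EVENT = "Node Polling Failover"


def update_node(
    primary_up: bool, backup_up: bool, active_events: Set[str]
) -> Tuple[str, List[str]]:
    """Update active_events in place; return (status, actions taken)."""
    actions: List[str] = []
    if primary_up:
        # Primary is back: clear the failover event if it was raised.
        if FAILOVER_EVENT in active_events:
            active_events.discard(FAILOVER_EVENT)
            actions.append("clear " + FAILOVER_EVENT)
        return "reachable", actions
    if backup_up:
        # Polling continues via host_backup; raise the event only once.
        if FAILOVER_EVENT not in active_events:
            active_events.add(FAILOVER_EVENT)
            actions.append("raise " + FAILOVER_EVENT)
        return "degraded", actions
    # No address responds. The text does not say what happens to the
    # failover event in this case, so this sketch leaves it untouched.
    return "unreachable", actions


events: Set[str] = set()
print(update_node(False, True, events))   # primary lost, backup in use
print(update_node(True, True, events))    # primary restored
print(update_node(False, False, events))  # nothing reachable
```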
In NMIS 8.6.7 and newer, the "Node Polling Failover" event is also raised if the primary address becomes unreachable (i.e. if fpingd detects it as unpingable). Additionally, a separate event "Backup Host Down" is raised if the host_backup address is unreachable. The presence of either event causes the node to be flagged 'degraded'.
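The mapping from ping results to events in 8.6.7 and newer can be summarised in a small sketch. The function names ping_events and is_degraded are hypothetical. How the 'degraded' flag interacts with the 'unreachable' state when both addresses are down is not specified here, so the sketch does not model it.

```python
from typing import Set


def ping_events(primary_pingable: bool, backup_pingable: bool) -> Set[str]:
    """Return the events that the ping results would raise (illustrative)."""
    events: Set[str] = set()
    if not primary_pingable:
        events.add("Node Polling Failover")
    if not backup_pingable:
        events.add("Backup Host Down")
    return events


def is_degraded(events: Set[str]) -> bool:
    # Either event being present is enough to flag the node 'degraded'.
    return bool(events)


# Example: the primary answers pings, the backup does not.
raised = ping_events(primary_pingable=True, backup_pingable=False)
print(sorted(raised), is_degraded(raised))
```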