Peer cannot be discovered
From the Primary, we can run the discovery using the CLI tool with debug enabled to get further information:
If there is a connectivity error, you can test the connectivity from the poller to the Primary using curl (this is an example with default credentials):
The response should be something like this:
If this is a login error, check that the credentials used are correct on the poller. You can check the log file log/auth.log on the poller.
Another common error is a misconfiguration on the poller server. The poller needs to be able to generate the registry, a document with discovery information. For this, both properties opha_url_base and opha_hostname must be defined (both can be null if you are not using HTTPS). When the registry cannot be created, you will see this error in the opHA logs:
You can check whether the registry has been created at the following URL:
opHA cluster_id already exist!
Check the NMIS configuration of the Primary and of every poller for the cluster_id property. It can be found in nmis9/conf/Config.nmis. This value must be unique per server, so if one of the pollers has a repeated one, you can remove the property from the configuration file and NMIS will generate a new one.
Please note that if the server already has nodes, they should be exported and imported again with localised_ids once the cluster_id has been changed. Otherwise the node information won't have the same cluster_id attribute as the server, and the nodes will be treated as remote nodes (for example, they cannot be edited or polled).
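To spot which servers clash, the check can be sketched in Python. This is only a minimal illustration: it assumes you have already copied each server's cluster_id by hand from its nmis9/conf/Config.nmis, and the function name, server names and id values are hypothetical, not part of NMIS or opHA.

```python
from collections import defaultdict


def find_duplicate_cluster_ids(cluster_ids):
    """Return {cluster_id: [servers]} for ids used by more than one server.

    cluster_ids maps a server name to the cluster_id read from its Config.nmis.
    """
    servers_by_id = defaultdict(list)
    for server, cluster_id in cluster_ids.items():
        servers_by_id[cluster_id].append(server)
    # Keep only the ids that appear on two or more servers.
    return {
        cid: sorted(servers)
        for cid, servers in servers_by_id.items()
        if len(servers) > 1
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    collected = {
        "primary": "aaaa-1111",
        "poller1": "bbbb-2222",
        "poller2": "bbbb-2222",
    }
    for cid, servers in find_duplicate_cluster_ids(collected).items():
        print(f"cluster_id {cid} is shared by: {', '.join(servers)}")
```

Any server listed in the result is a candidate for removing the property so that a new cluster_id is generated.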
401 Error from the poller
opHA uses a user and password to access the registry data from the poller, but once the poller has been discovered, it uses a token for authentication. The authentication method "token" must therefore be enabled on the poller.
Check that om/conf/opCommon.nmis on the poller contains the following (where X is 1, 2 or 3; the order does not matter):
Also, the property auth_token_key should be set up in the poller configuration.
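The two conditions above can be written down as a small Python check. This is a sketch under assumptions: the settings are passed in as a plain dict copied by hand from the poller's configuration, the key names auth_method_1 to auth_method_3 are inferred from the "X is 1, 2 or 3" wording rather than confirmed, and the function name is hypothetical.

```python
def token_auth_problems(config):
    """Return a list of problems found in a poller's authentication settings.

    config is a plain dict of the relevant settings, copied by hand.
    The auth_method_1..3 key names are an assumption based on the text above.
    """
    problems = []
    methods = [config.get(f"auth_method_{n}") for n in (1, 2, 3)]
    # "token" may appear in any of the three slots.
    if "token" not in methods:
        problems.append("no auth_method_X is set to 'token'")
    # An empty or missing key counts as not set.
    if not config.get("auth_token_key"):
        problems.append("auth_token_key is not set")
    return problems


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    poller = {"auth_method_1": "htpasswd", "auth_method_2": "token"}
    for problem in token_auth_problems(poller):
        print("poller:", problem)
```

An empty list means both conditions described above are met; it does not prove that the token key matches the one used by the Primary.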
Configuring a poller over https
From the Primary, we can initiate discovery of a peer using the URL https://servername (using SSL/TLS).
However, we will not be able to query the poller, because the poller will report its URL as http://. To force the Primary to use HTTPS to the poller, the configuration item opha_url_base must be set to an https URL; otherwise, it won't work.
This can be set in <omk_dir>/conf/opCommon.nmis in the poller:
If we set the url to https://servername in the discover, the poller is going to send its registry data to the Primary, and the Primary will get the correct url_base for the peer from that information.
If opha_url_base is blank, the Primary will swap the https:// URL for http://.
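The behaviour described above can be modelled in a few lines of Python. This is a simplified illustration of the rule as stated in this section, not opHA's actual code; the function name and the example hostname are hypothetical.

```python
def url_primary_will_use(discovery_url, opha_url_base):
    """Model which URL the Primary ends up using for a peer.

    discovery_url is the URL typed in when discovering the peer.
    opha_url_base is the value configured on the poller (may be blank or None).
    """
    if opha_url_base:
        # The poller reports this value in its registry data.
        return opha_url_base
    if discovery_url.startswith("https://"):
        # Blank opha_url_base: the https:// URL is swapped for http://.
        return "http://" + discovery_url[len("https://"):]
    return discovery_url


if __name__ == "__main__":
    print(url_primary_will_use("https://servername", ""))
    print(url_primary_will_use("https://servername", "https://servername"))
```

The first call shows the failure case (the Primary falls back to http://servername); the second shows the working configuration.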
Some data is not updated in the Primary
opHA has a new feature to synchronise only the data that has been added or modified since the last synchronisation. If some data is not updated, we can force a synchronisation, adding some parameters to update only the required data types and nodes:
Two different situations have been identified as causing this issue:
- If the same node name exists on more than one poller and the configuration item opevents_auto_create_nodes is true, a new local node will be created on the primary server. This is because the event is identified only by a node name, and the primary cannot choose which of the remote nodes to assign the event to.
- If there are two Main primary servers, both primaries will change the nodes from the pollers, which can cause chaos in the environment.
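For the first situation, finding node names that exist on more than one poller can be sketched in Python as follows. It assumes the node names per poller have already been exported by some other means; the function name, poller names and node names are hypothetical.

```python
def nodes_on_multiple_pollers(nodes_by_poller):
    """Return {node_name: [pollers]} for names present on two or more pollers.

    nodes_by_poller maps a poller name to an iterable of its node names.
    """
    pollers_by_node = {}
    for poller, names in nodes_by_poller.items():
        # set() guards against a name listed twice for the same poller.
        for name in set(names):
            pollers_by_node.setdefault(name, []).append(poller)
    return {
        name: sorted(pollers)
        for name, pollers in pollers_by_node.items()
        if len(pollers) > 1
    }


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "poller1": ["router-a", "switch-b"],
        "poller2": ["router-a", "firewall-c"],
    }
    for name, pollers in nodes_on_multiple_pollers(inventory).items():
        print(f"node {name} exists on: {', '.join(pollers)}")
```

Each name reported is one for which the primary cannot decide which remote node an event belongs to.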