If you open a group, e.g. IT, you get a list of devices, and one of the columns is "Esc." for the escalation level. On our production machine this shows "-1" when a device goes down, while our test system shows "1" or "2" etc. The system showing "-1" never goes above this, so it doesn't send emails etc. when the device is down.
Can someone point me to where this may have been caused? I have checked "escalation.nmis" and "Command-Thresholds.nmis" against the test system and all looks OK.
None of this worked, so I decided to build a new server and install NMIS from scratch.
I did have a scheduled outage, so I removed it, but a device still shows -1 when it is down. The outage covered all devices for 30 minutes once a week; I have now removed all outages.
Not sure if this is relevant to your issue, but in NMIS 8.6.7, if a node has been put in a 'scheduled outage' condition, the Esc. column in the group table will render a -1 as you describe.
I have now also upgraded the system to the newest stable version, 8.6.7G, and this has not helped.
Outage handling changed in NMIS 8.6.4. Just to verify this is not a stale outage problem, look in /usr/local/nmis8/conf. If there is an outage.dat file, it can be deleted as it's no longer necessary. If there is an Outages.nmis.disabled file, it can also be deleted. If there are no scheduled outages, the Outages.nmis file should be an empty Perl hash. For example:
[root@poller-office conf]# cat Outages.nmis
%hash = ();
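The checks above can be sketched as a small shell helper. This is a hypothetical script, not part of NMIS itself; the default path /usr/local/nmis8/conf is an assumption based on this thread, so pass your own conf directory if it differs.

```shell
#!/bin/sh
# Hypothetical helper for the stale-outage cleanup described above.
# Assumes the default NMIS 8 install path; override via the first argument.
clean_stale_outages() {
    conf="${1:-/usr/local/nmis8/conf}"
    # outage.dat is obsolete since NMIS 8.6.4 and can be removed
    [ -f "$conf/outage.dat" ] && rm -f "$conf/outage.dat"
    # a leftover Outages.nmis.disabled can also be removed
    [ -f "$conf/Outages.nmis.disabled" ] && rm -f "$conf/Outages.nmis.disabled"
    # with no scheduled outages, Outages.nmis should be an empty perl hash
    if grep -q '%hash = ();' "$conf/Outages.nmis"; then
        echo "Outages.nmis is an empty hash"
    else
        echo "WARNING: Outages.nmis still has entries"
    fi
}
```

Run it as `clean_stale_outages /usr/local/nmis8/conf` and check the output before moving on to the event files.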
If this is all true and the problem still persists, I would remove all events associated with the node in /usr/local/nmis8/var/events. It is safe to remove the entire /usr/local/nmis8/var/events directory, as NMIS will automatically recreate it; but be aware that all current events will be raised again at the next collect cycle.
If the problem still persists, I would remove the node's node and view files in /usr/local/nmis8/var. These will be re-created at the next collect cycle.
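A sketch of those two reset steps, again as a hypothetical helper: the var path and the <node>-node.json / <node>-view.json filename pattern are assumptions from a default NMIS 8 install, so verify them on your system before running anything destructive.

```shell
#!/bin/sh
# Hypothetical helper: clear a node's stored events and state files.
# NMIS recreates all of these at the next collect cycle.
reset_node_state() {
    var="${1:-/usr/local/nmis8/var}"
    node="$2"
    # remove all stored events for the node; any condition that still
    # exists will raise its event again at the next collect
    rm -rf "$var/events/$node"
    # remove the per-node state files (filenames assumed; re-created on collect)
    rm -f "$var/$node-node.json" "$var/$node-view.json"
}
```

Example: `reset_node_state /usr/local/nmis8/var sw01`.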
I forgot to add that we should also look for stale files called planned_outage_open-.json in /usr/local/nmis8/var/events/<node>/current. It's not a bad idea to just delete everything in this directory.
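To see whether any such stale files exist before deleting anything, a quick find across the events tree works. The path and the planned_outage_open filename prefix come from this thread; the helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list stale planned_outage_open event files
# under each node's current-events directory (default path assumed).
list_stale_outage_events() {
    events="${1:-/usr/local/nmis8/var/events}"
    find "$events" -path '*/current/planned_outage_open*' -name '*.json' -print
}
```

If the listing looks right, pipe it through `xargs rm -f` to clean up.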
Thanks for this, I will start working on it now and let you know.
I tried the first option, removing outage.dat and Outages.nmis.disabled, and then checked Outages.nmis, which was an empty file; this made no difference. I then removed the view and node files in /nmis8/var/, still no change. I am reluctant to remove the events, as we like to keep the historical event data.
There was no planned_outage_open.json file for the test device.