
Configuration for escalate.XYZ()

Escalation of open issues is handled flexibly in opEvents: you can specify which events should be potentially escalated, and you can formulate different policies for those escalations. Escalations in opEvents apply only to unacknowledged events.

Writing escalate.somepolicy() in a THEN clause marks the matched event for future escalation according to the rules of the policy somepolicy. An event can be subject to multiple escalation policies at the same time, and every policy that an event is marked for is applied independently. When a policy is inapplicable because of its time and day restrictions, it is ignored, but only until the time and day match again. Escalation for an event ceases only when the event is acknowledged.

Escalation Policies

To formulate an escalation policy, you need to decide on your preferred escalation steps and their respective time thresholds and actions, and express them in the escalate section of the configuration file conf/EventActions.nmis. Here is an example configuration fragment:

'escalate' => {
 'weekday' => {
 	'name' => 'weekday',
 	'IF' => {
 		priority => '>= 0',
 		days => 'Monday,Tuesday,Wednesday,Thursday,Friday',
 		begin => '9:00',
 		end => '19:00', 
 	},
 	'60' => 'log.problem() AND script.ping_node()',
 	'300' => 'email(operations)',
 	'1200' => 'email(operations_pager) AND script.disaster()',
 	'2400' => 'email(operations_manager)',
 	'3600' => 'email(it_manager)',
 },

 'afterhours' => {
...

Every escalation policy needs a name; the example uses weekday and afterhours. The other two components of an escalation policy are the IF clause, which sets the scope of the policy, and the list of escalation steps.
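To make the three components concrete, the following minimal Python sketch models the weekday policy from the example as a dictionary and separates the name and IF clause from the escalation steps. It is purely illustrative: opEvents reads the Perl structure in conf/EventActions.nmis, and the helper name split_policy is hypothetical, not part of opEvents.

```python
# Illustrative model only; opEvents itself reads the Perl structure in
# conf/EventActions.nmis. split_policy is a hypothetical helper name.

weekday = {
    "name": "weekday",
    "IF": {
        "priority": ">= 0",
        "days": "Monday,Tuesday,Wednesday,Thursday,Friday",
        "begin": "9:00",
        "end": "19:00",
    },
    "60": "log.problem() AND script.ping_node()",
    "300": "email(operations)",
    "1200": "email(operations_pager) AND script.disaster()",
    "2400": "email(operations_manager)",
    "3600": "email(it_manager)",
}


def split_policy(policy):
    """Return (name, if_clause, steps) with steps sorted by threshold."""
    # Every purely numeric key is treated as an escalation threshold in seconds.
    steps = sorted(
        (int(key), action) for key, action in policy.items() if key.isdigit()
    )
    return policy["name"], policy.get("IF", {}), steps


name, if_clause, steps = split_policy(weekday)
print(name)                                  # weekday
print([threshold for threshold, _ in steps]) # [60, 300, 1200, 2400, 3600]
```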

Escalation Time Restrictions

The IF clause determines whether a particular escalation policy is active at a given time and for events of a given priority. The priority setting is required and consists of a comparison operator, a space and a number; if your policy is to be unrestricted, simply use >= 0 (priorities range from 0 to 10). The days setting is optional and should contain a comma-separated list of the days of the week on which the policy should be active; if days is not given, the policy works on all days. begin and end set the daily time range for the policy. If begin is earlier than end (as in the example above), the policy is active in the interval between begin and end. To invert the interval, e.g. to cover the time outside work hours, simply swap begin and end: a policy with begin 18:00 and end 05:00 is active after 18:00 and before 05:00. A missing begin means "starts at midnight", and a missing end is interpreted as "ends at midnight".
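The rules above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the opEvents implementation: the function names are hypothetical, the set of accepted comparison operators is assumed, begin is treated as inclusive and end as exclusive, and for an inverted interval the days check is assumed to use the current calendar day.

```python
import operator
from datetime import datetime

# Hypothetical model of the IF clause rules; not opEvents code.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Assumption: the exact operator set accepted by opEvents is not stated.
OPERATORS = {
    ">=": operator.ge, "<=": operator.le, ">": operator.gt,
    "<": operator.lt, "==": operator.eq, "!=": operator.ne,
}


def to_minutes(hhmm):
    """Convert 'H:MM' or 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def policy_active(if_clause, priority, when):
    """Return True if the IF clause matches this priority and moment."""
    op, number = if_clause["priority"].split(" ", 1)
    if not OPERATORS[op](priority, int(number)):
        return False

    days = if_clause.get("days")  # optional; missing means all days
    if days:
        allowed = [day.strip() for day in days.split(",")]
        if DAY_NAMES[when.weekday()] not in allowed:
            return False

    # A missing begin or end is read as midnight.
    begin = to_minutes(if_clause.get("begin", "0:00"))
    end = to_minutes(if_clause.get("end", "24:00"))
    now = when.hour * 60 + when.minute
    if begin < end:
        return begin <= now < end      # normal interval
    return now >= begin or now < end   # inverted interval, wraps past midnight


weekday_if = {"priority": ">= 0", "begin": "9:00", "end": "19:00",
              "days": "Monday,Tuesday,Wednesday,Thursday,Friday"}
overnight_if = {"priority": ">= 0", "begin": "18:00", "end": "05:00"}

monday_morning = datetime(2024, 1, 8, 10, 30)   # a Monday
print(policy_active(weekday_if, 3, monday_morning))                 # True
print(policy_active(overnight_if, 3, monday_morning))               # False
print(policy_active(overnight_if, 3, datetime(2024, 1, 8, 23, 0)))  # True
```

The overnight_if clause is a made-up example of the swapped begin and end described above; it is not taken from the shipped configuration.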

Escalation Step Definition

The remaining components of the escalation policy are the definitions of the escalation steps; these consist of the escalation threshold, and the actions to take. The escalation threshold (in seconds) specifies the minimum age of the unacknowledged event for this escalation step to activate, and the action part works the same as the THEN expression in the action policy.

When escalations are processed, the highest new escalation step is determined based on the age of the event, the associated actions are performed and the event state is updated. The next time escalations are processed, only escalation steps higher than the most recently active one will be considered for this event. Please note that different escalation policies are applied independently, and each has its own highest active escalation step.

With the example weekday policy above, an event would be acted upon after 60 unacknowledged seconds, then again once it reaches 300 unacknowledged seconds, and so on. Each action would be taken at most once: if the policy becomes active for the first time when the event has already been unacknowledged for 5900 seconds, then only the highest escalation step reached (3600) would be applied.
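The step selection described above can be sketched as a small Python function. This is an illustrative model, not opEvents code: the function name next_step and the last_step bookkeeping argument are assumptions, standing in for the per-policy state that opEvents keeps for each event.

```python
# Hypothetical model of escalation step selection; not opEvents code.

def next_step(thresholds, age_seconds, last_step=None):
    """Return the highest threshold reached that is above last_step, or None.

    thresholds  -- escalation thresholds of one policy, in seconds
    age_seconds -- how long the event has been unacknowledged
    last_step   -- most recently applied threshold for this policy, if any
    """
    reached = [
        threshold for threshold in thresholds
        if threshold <= age_seconds
        and (last_step is None or threshold > last_step)
    ]
    return max(reached) if reached else None


thresholds = [60, 300, 1200, 2400, 3600]  # the weekday policy's steps

print(next_step(thresholds, 45))                    # None
print(next_step(thresholds, 70))                    # 60
print(next_step(thresholds, 310, last_step=60))     # 300
print(next_step(thresholds, 5900))                  # 3600
print(next_step(thresholds, 7000, last_step=3600))  # None
```

In the fourth call the policy sees the event for the first time at 5900 seconds, so the steps at 60, 300, 1200 and 2400 seconds are skipped and only the 3600 step is selected. Because each policy applies independently, a caller would keep one last_step value per policy and per event.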

The action part of the step definition has the same syntax and interpretation as the THEN expressions of the main action policy described earlier in this document, except that the action escalate.anypolicy() makes no sense within an escalation policy and is therefore disabled.