Comment: added documentation for inhibit

...

  • an event name, which specifies the name of the newly created event,
  • a list of events (more precisely, their names), which are the events to consider for correlation,
  • a (minimum) count of events that have to be detected to trigger the rule,
  • an optional list of groupby clauses, which define whether the count is interpreted globally for all named events, or separately within smaller groups,
  • optional delayedaction and autoacknowledge clauses, which define how the triggering events should be handled,
  • an optional enrich clause, which adjusts the content of the newly created event,
  • from version 2.2.0a onwards, optional copy_first, copy_last, copy_highest and copy_groupby clauses, which further control the contents of the newly created event,
  • from version 2.2 onwards, an optional inhibit parameter, which disables correlation temporarily after a rule has fired,
  • and finally a window parameter, which defines the time window to examine.

...

In version 2.2 this limitation has been removed, and much more precise control of the event content is possible.

Content Control Directives (Version 2.2.0a and newer)

When a synthesis rule creates a new event, the following steps are performed:

...

Code Block
'1' => {
   name => "Very Sick Node",
   events => [ "Node Down", "SNMP Down", "Interface Down", "Service Down",
               "Service Degraded", "Interface Flap", "Node Flap", "WMI Down" ],
   window => 120,
   count => 3,
   groupby => [ 'node.name' ], # we want separate events for each node of course
   enrich => { stateful => "Very Sick Node", priority => 5, state => 'down', element => undef }, # new event is stateful only if stateful is set or copied by name
   copy_last => [ qr//, 'node' ], # can set from node here (all events share it)
   copy_groupby => [ 'node' ], # or from here; must set it explicitly somewhere, or the event goes to opevents_correlation_node
},

Stateful Synthetic Events (Version 2.2.0a and newer)

By default, synthetic events are not stateful events, i.e. they are not subject to deduplication and they cannot be acknowledged (or 'closed') by any future 'opposite' event.

...

The net effect is that the current events view would show only the new synthetic event as 'current' and all the underlying triggering events would be categorized as closed (and optionally acknowledged), and thus be mostly hidden.

 

Synthetic Events and Storm Control

All synthesis rules are applied independently, thus a single event could be a trigger for multiple synthetic events. This is desirable for example for detecting both per-customer problems and global issues at the same time: a few problem events can trigger a customer-specific action, while the same events could be counted together with others for detecting and reacting to a major outage.

Great care has been taken to avoid event storms caused by synthetic events: when a synthesis rule fires because at least count matching events were seen in the time window, all the matching events are marked as consumed and will not be considered for any future synthesis for this rule. In other words, there is no overlap between successful synthesis time windows.

Here is a practical example of the consequences of this design: Let's assume a rule that specifies 5 event matches in 120 seconds as its trigger. At some time T1 we count 25 such recent events, therefore the rule fires, a new event is created, and the 25 matches are consumed (not just the 5 that the trigger requires!). The count of triggering events thus starts from scratch at time T1. Let's assume that at T2, four seconds later, event correlation is performed next, and now only the events since T1 are considered as potential triggers. Assuming there were 3 bad events in these four seconds, no synthetic event will be created. If another 4 arrive in the next few seconds, the count is at 7 and the rule fires. On the other hand, if there had been 200 events between T1 and T2, then only one synthetic event would be created at time T2.
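As a rough sketch, the rule in this example might be written as follows. The rule key, the rule name and the listed event names are placeholders chosen for illustration; only the window and count values come from the example, and the trailing comments simply restate the timeline described above.

```perl
# Hypothetical rule: key, name and event names are placeholders.
'2' => {
   name => "Repeated Link Trouble",
   events => [ "Interface Down", "Interface Flap" ],
   window => 120, # seconds to look back
   count => 5,    # minimum number of matching events
},
# T1:                 25 recent matches -> rule fires once, all 25 are consumed
# T2 (T1 + 4s):       3 new matches     -> fewer than count, nothing is created
# a few seconds on:   4 more matches    -> 7 unconsumed matches, rule fires again
```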

...

If a rule does not trigger because there are fewer than count trigger events, then naturally these events remain potential triggers until the time window moves past them.

However, synthetic event creation currently happens as soon as a sufficient number of triggers is detected: assuming a trigger of a minimum of 20 events in 60 seconds, receiving 100 events in that time frame will cause a new synthetic event for each batch of 20 triggers.

Inhibiting Correlation (Version 2.2 and newer)

Version 2.2 provides a new capability for fine-tuning storm control: the inhibit timer.

If a correlation rule fires, and if that rule contains a numeric inhibit parameter greater than zero, then opEvents will temporarily disable the rule with its particular groupby context for that many seconds.

The primary application of this feature is to stop 'nuisance' repeat synthetics if a very large number of triggers arrives in a very short time frame: it lets you tell opEvents to generate at most one instance of a particular event every inhibit seconds.

Here is an example scenario: let's assume a rule for raising a 'Group Outage' event if 20 instances of a particular event are seen within a window of 60 seconds. A major outage happens, and 100 such trigger events for group A arrive within just a few seconds, along with a further 25 triggers for group B.

  1. Without inhibit, after the first 20 events for group A you'll get one synthetic event for group A; another after the next 20 and so on.
    For group B, one synthetic event will be generated for the first 20; the remaining 5 are too few to trigger anything.
  2. With inhibit set to 40 seconds (for example), you'll get the very first group A synthetic event as before, but then no synthetic events for this rule and group A for the next 40 seconds.
    After that, correlation for group A resumes 'from scratch' and any events received from then onwards are counted and correlated as normal.
    For group B with its fewer triggers the inhibit behaviour doesn't change anything visibly: there's still just one synthetic event.
    Note that the inhibit timer for group A is totally independent of any inhibit for group B: inhibit applies to a particular rule and its full groupby context.
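The scenario above might be configured roughly as shown below. This is a sketch under assumptions: the rule key, the triggering event name and the 'node.group' groupby property are illustrative guesses modelled on the earlier 'Very Sick Node' example, not confirmed settings; window, count, groupby and inhibit are the parameters described in this section.

```perl
# Hypothetical rule: key, event name and groupby property are assumptions.
'3' => {
   name => "Group Outage",
   events => [ "Node Down" ],  # placeholder for the 'particular event'
   window => 60,               # examine the last 60 seconds
   count => 20,                # at least 20 triggers needed to fire
   groupby => [ 'node.group' ], # assumed property; count separately per group
   inhibit => 40,              # after firing, pause this rule for that group for 40s
},
```

Because inhibit applies per rule and groupby context, a rule like this would pause correlation for group A only, while group B continues to be counted independently.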