...

The delta baseline is only concerned with the amount of change relative to the baseline. For example, from a sample of data from the last 4 hours we see that the average of a metric is 100. We then take the current value, for example the spike of 145 shown below, and calculate the change as a percentage of that average. Here the change is 45%, which results in a Critical event level.

[Image: graph of the metric showing the spike to 145]

The delta baseline configuration defines the level of the event based on the percentage of change. The table below visualizes the default levels, which also appear in the configuration example further down.

Change %    Resulting Event Level
10          Warning
20          Minor
30          Major
40          Critical
50          Fatal

If the change is below 10% the level will be Normal, between 10% and 20% Warning, between 20% and 30% Minor, and so on, until over 50% it is considered Fatal.
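The calculation and the level table above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the function names are hypothetical, and it assumes that the magnitude of the change is what counts and that a change exactly on a threshold reaches that level.

```python
# Default levels from the table above: (minimum change %, event level).
LEVELS = [(10, "Warning"), (20, "Minor"), (30, "Major"), (40, "Critical"), (50, "Fatal")]


def percent_change(baseline_average, current):
    """Change of the current value relative to the baseline average, in percent."""
    return (current - baseline_average) * 100.0 / baseline_average


def event_level(change_percent, levels=LEVELS):
    """Return the highest level whose threshold the change reaches, else Normal."""
    level = "Normal"
    # Assumption: only the size of the change matters (not its direction),
    # and a change equal to a threshold counts as reaching it.
    for threshold, name in sorted(levels):
        if abs(change_percent) >= threshold:
            level = name
    return level


# Baseline average of 100 over the last 4 hours, current value 145.
change = percent_change(100, 145)
print(change, event_level(change))
```

With the values from the example (an average of 100 and a current value of 145), the change is 45%, which falls between the Critical and Fatal thresholds and is therefore reported as Critical.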

In practice, this spike was brief. Using the 15 minute threshold period (the current value is the average of the last 15 minutes), the value used for calculating change would be 136 and the resulting change would be 36%, so a Major event. The threshold period dampens spikes, filtering out brief changes so that you see the changes which last longer.
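The dampening effect can be shown with a small Python sketch. The three samples are hypothetical values chosen so that their average is 136, as in the example. The real number of samples inside the 15 minute window depends on the polling interval, which is not stated here.

```python
baseline_average = 100  # average of the metric over the last 4 hours

# Hypothetical samples collected during the last 15 minutes,
# including the brief spike to 145.
recent_samples = [123, 145, 140]

# Without a threshold period, the spike itself is the current value.
spike_change = (max(recent_samples) - baseline_average) * 100.0 / baseline_average

# With the threshold period, the current value is the average of the window.
current = sum(recent_samples) / len(recent_samples)
dampened_change = (current - baseline_average) * 100.0 / baseline_average

print(spike_change)     # change measured on the raw spike
print(dampened_change)  # change measured on the 15 minute average
```

Averaging over the window lowers the measured change from 45% to 36%, which is why the event is raised as Major rather than Critical.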

Working with the Dynamic Baseline and Thresholding Tool

The Dynamic Baseline and Threshold Tool includes various configuration options so that you can tune the algorithm to learn differently depending on the metric being used. The tool comes with several metrics already configured. The system requires that the stats modeling is completed for any metric you want to baseline, because this is how the NMIS API extracts statistical information from the performance database.

...

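  'hrSystemProcesses' => {
    'baseline' => 'delta',                  # use the delta baseline
    'active' => 'true',
    'metric' => 'hrSystemProcesses',
    'type' => 'Host_Health',
    'nodeModel' => 'net-snmp',
    'indexed' => 'false',
    'hours' => 4,                           # sample the last 4 hours for the average
    'threshold_period' => "-15 minutes",    # current value is the average of the last 15 minutes
    'levels' => {                           # change % at which each event level is raised
      'Warning' => 10,
      'Minor' => 20,
      'Major' => 30,
      'Critical' => 40,
      'Fatal' => 50
    }
  },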

...