DOCUMENT DEPRECATED, see opCharts Dynamic Baseline and Threshold Tool

Why we need a Dynamic Baseline and Thresholding Tool

...

The delta baseline configuration then defines the level of the event from the percentage of change. With the default levels, this would result in a Major event. You can see the configuration in the example below; the following table is one way to visualize it.

Change %    Resulting Event Level
10          Warning
20          Minor
30          Major
40          Critical
50          Fatal

If the change is below 10% the level will be Normal, between 10% and 20% Warning, between 20% and 30% Minor, and so on; a change of 50% or more is considered Fatal.
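The mapping from change percentage to event level can be sketched in a few lines of Python. This is a minimal illustration that follows the table, not the tool's actual code: the function name `event_level` is hypothetical, and treating each boundary as inclusive (exactly 10% is already Warning) is an assumption.

```python
# Thresholds taken from the table: minimum change % -> event level,
# ordered from the highest threshold down so the first match wins.
LEVELS = [
    (50, "Fatal"),
    (40, "Critical"),
    (30, "Major"),
    (20, "Minor"),
    (10, "Warning"),
]


def event_level(change_percent: float) -> str:
    """Return the event level for a given percentage of change.

    Assumption: boundaries are inclusive, so exactly 10% is Warning.
    """
    for threshold, level in LEVELS:
        if change_percent >= threshold:
            return level
    return "Normal"


if __name__ == "__main__":
    for change in (5, 15, 36, 75):
        print(change, event_level(change))
```

A change of 36% falls between the 30 and 40 thresholds, so the sketch reports Major.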

In practice this spike was brief. Using the 15 minute threshold period (the current value is the average of the last 15 minutes), the value used for calculating change would be 136 and the resulting change would be 36%, so a Major event. The threshold period dampens spikes, filtering out brief changes so that you can see the changes which last longer.
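The dampening effect of the threshold period can be illustrated with a short Python sketch. The figures here are hypothetical: a baseline of 100 and three 5 minute polls chosen so that their 15 minute average is 136, matching the value quoted above. The function name `percent_change` is also an assumption made for illustration.

```python
from statistics import mean


def percent_change(current: float, baseline: float) -> float:
    """Percentage change of the current value relative to the baseline."""
    return (current - baseline) / baseline * 100.0


# Hypothetical data: a baseline of 100 and three 5-minute polls,
# the last of which is a brief spike.
baseline = 100.0
polls = [100.0, 100.0, 208.0]

# Change computed from the last poll only (a 5 minute threshold period).
spike_change = percent_change(polls[-1], baseline)

# Change computed from the 15 minute average, which dampens the spike.
damped_change = percent_change(mean(polls), baseline)

if __name__ == "__main__":
    print(f"last poll only:    {spike_change:.0f}%")
    print(f"15 minute average: {damped_change:.0f}%")
```

With these numbers the last poll alone shows a 108% change, while the 15 minute average shows 36%.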

...

Installing the Baseline Tool

Copy the file to the server and run the following commands. Upgrading uses the same process.

Code Block
tar xvf Baseline-X.Y.tgz 
cd Baseline/
sudo ./install_baseline.sh

Working with the Dynamic Baseline and Thresholding Tool

The Dynamic Baseline and Threshold Tool includes various configuration options so that you can tune the algorithm to learn differently depending on the metric being used. The tool comes with several metrics already configured. The stats modeling must be completed for any metric you want to baseline, because this is how the NMIS API extracts statistical information from the performance database.

Dynamic Baseline Configuration Options

Configuration of the baseline tool is done in the file /usr/local/omk/conf/Baseline.nmis. The default configuration should be installed when the tool is installed.

Configuration Option | Description | Example
baseline | Which type of baseline to use, "dynamic" or "delta". The default is dynamic; if undefined, dynamic will be used. | delta
active | Whether baselining this metric is active or not; values are true or false. | true
metric | Which NMIS data point or variable; equates to an RRD DS. | RouteNumber
type | Which NMIS model section or metric. | RouteNumber
use_index | For use with certain types where the type is not how the index is stored, e.g. the index for pkts_hc is interface, so when type=pkts_hc then use_index=interface. A rarely used option. | interface (where applicable)
section | The section name in the node info. If not set, the baseline just runs; otherwise the section must exist. |
nodeModel | A regex which defines which NMIS models should be matched. | CiscoRouter
event | The name of the event to use; defaults to Proactive Baseline type metric if none is provided. | Proactive Route Number Change
indexed | Whether this variable is indexed or not. | false
threshold_exceeds | Ignored if undef; otherwise the value must ALSO exceed this threshold to raise an event. | undef
threshold_period | The period over which the value to be baselined is averaged, e.g. -5 minutes is the last poll, -15 minutes is the average of the last 15 minutes, -1 hour is the average of the last 60 minutes. | -5 minutes
multiplier | How many standard deviations to vary the baseline by. | 1
weeks | The number of weeks to look back. | 0
hours | The number of hours to include in the baseline metrics. | 8
levels | Used by the delta baseline method to define when an amount of change will trigger an event and what level that event will be. |
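To make the roles of multiplier and threshold_exceeds concrete, here is a minimal Python sketch of a dynamic baseline check. It is an illustration under stated assumptions, not the tool's implementation: the function names `baseline_band` and `breaches_baseline` are hypothetical, the band is assumed to be the mean plus or minus multiplier standard deviations, and the use of the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence, Tuple


def baseline_band(samples: Sequence[float], multiplier: float = 1.0) -> Tuple[float, float]:
    """Return (low, high): the mean -/+ multiplier standard deviations.

    Assumption: population standard deviation; the real tool may differ.
    """
    centre = mean(samples)
    spread = pstdev(samples) * multiplier
    return centre - spread, centre + spread


def breaches_baseline(
    current: float,
    samples: Sequence[float],
    multiplier: float = 1.0,
    threshold_exceeds: Optional[float] = None,
) -> bool:
    """True if the current value falls outside the baseline band.

    If threshold_exceeds is set, the value must ALSO exceed it.
    """
    low, high = baseline_band(samples, multiplier)
    outside = current < low or current > high
    if threshold_exceeds is not None:
        return outside and current > threshold_exceeds
    return outside


if __name__ == "__main__":
    history = [100.0, 102.0, 98.0, 100.0]  # hypothetical baseline samples
    print(baseline_band(history))
    print(breaches_baseline(105.0, history))
    print(breaches_baseline(105.0, history, threshold_exceeds=200.0))
```

A larger multiplier widens the band and so raises fewer events, while threshold_exceeds suppresses events for values that leave the band but remain operationally insignificant.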

Same-Day Dynamic Baseline Configuration Example

Here is what the configuration file would look like; this example is a Same-Day Baseline:

Code Block
  'RouteNumber' => {
    'active' => 'true',
    'metric' => 'RouteNumber',
    'type' => 'RouteNumber',
    'nodeModel' => 'CiscoRouter',
    'event' => 'Proactive Route Number Change',
    'indexed' => 'false',
    'threshold_exceeds' => undef,
    'threshold_period' => "-5 minutes",
    'multiplier' => 1,
    'weeks' => 0,
    'hours' => 8,
  },

Multi-Day Dynamic Baseline Configuration Example

Another configuration example uses the BGP prefixes being exchanged with BGP peers. This metric comes from systemHealth modelling, and this is a multi-day baseline:

Code Block
  'cbgpAcceptedPrefix' => {
    'active' => 'true',
    'metric' => 'cbgpAcceptedPrefix',
    'type' => 'bgpPrefix',
    'section' => 'bgpPrefix',
    'nodeModel' => 'CircuitMonitor|CiscoRouter',
    'event' => 'Proactive BGP Peer Prefix Change',
    'indexed' => 'true',
    'multiplier' => 1,
    'weeks' => 4,
    'hours' => 1,
  },
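The difference between a same-day baseline (weeks 0, hours 8) and a multi-day baseline (weeks 4, hours 1) can be pictured as a set of look-back time windows. The Python sketch below is one plausible interpretation and is an assumption, not confirmed behaviour of the tool: with weeks set to 0 it uses a single window covering the last `hours` hours, and otherwise it uses one window of `hours` hours ending at the same time of day in each of the previous `weeks` weeks. The function name `baseline_windows` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def baseline_windows(now: datetime, weeks: int, hours: int) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) windows whose samples would feed the baseline.

    Assumed semantics (not confirmed by the documentation):
      weeks == 0 -> one window covering the last `hours` hours;
      weeks > 0  -> one `hours`-long window per previous week,
                    each ending at the same time of day as `now`.
    """
    span = timedelta(hours=hours)
    if weeks == 0:
        return [(now - span, now)]
    windows = []
    for week in range(1, weeks + 1):
        end = now - timedelta(weeks=week)
        windows.append((end - span, end))
    return windows


if __name__ == "__main__":
    now = datetime(2024, 1, 29, 12, 0)
    print(baseline_windows(now, weeks=0, hours=8))  # same-day baseline
    print(baseline_windows(now, weeks=4, hours=1))  # multi-day baseline
```

Under this reading, a multi-day baseline compares the current value with the same hour on the same weekday in earlier weeks, which suits metrics with a weekly pattern.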

...

Delta Baseline Configuration Example

Delta baselines currently do not support multi-day, but the hours value can be very large if required.

Code Block
  'hrSystemProcesses' => {
    'baseline' => 'delta',
    'active' => 'true',
    'metric' => 'hrSystemProcesses',
    'type' => 'Host_Health',
    'nodeModel' => 'net-snmp',
    'indexed' => 'false',
    'hours' => 4,
    'threshold_period' => "-15 minutes",
    'levels' => {
      'Warning' => 10,
      'Minor' => 20,
      'Major' => 30,
      'Critical' => 40,
      'Fatal' => 50
    }
  },

Delta Baseline for Output Packets Discarded Configuration Example

Currently delta baselines do not support multi-day, but the hours value can be very large if required.

Code Block
  'ifOutDiscards' => {
    'baseline' => 'delta',
    'active' => 'true',
    'metric' => 'ifOutDiscards',
    'type' => 'pkts_hc',
    'use_index' => 'interface',
    'nodeModel' => 'CiscoRouter',
    'event' => 'Proactive Output Discards (Delta)',
    'indexed' => 'true',
    'hours' => 1,
    'threshold_period' => "-15 minutes",
    'levels' => {
      'Warning' => 1,
      'Minor' => 2,
      'Major' => 3,
      'Critical' => 4,
      'Fatal' => 7
    }
  },

Running the Baseline Tool

...