Purpose

Demonstrate the practical application of event summarization based on location.

Related Pages:

Event Correlation - Highly recommended, a must read!

Deduplication and storm control in opEvents

Scenario

Suppose you are tasked with managing a large network that is either geographically separated or has a topology in which 'fault domains' are easy to recognize.  With this in mind, it is desirable to receive a single alert saying that site "X" is experiencing a problem, rather than many (10 to 500+) alerts from individual nodes.  This not only cuts down on the noise, it also automates part of the troubleshooting process, letting operations home in on a common symptom and resolve the underlying problem.

Prerequisites

There needs to be a common way to identify nodes, such as location, group, customer, or business service.  The common attribute is assigned when the node is provisioned in NMIS.  For example, if it is determined that all the nodes at the San Jose data center can be grouped into a single fault domain, then they should all have the same location attribute of 'San_Jose_Data_Center'.  This gives opEvents something to grip onto for correlation.
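To make the idea concrete, here is a minimal Python sketch of summarizing events by a shared location attribute. It is an illustration only, not opEvents' own correlation rule syntax: the function name `summarize_by_location`, the event fields `node` and `location`, the `threshold` parameter, and the second location name are all assumptions made up for this example.

```python
from collections import defaultdict


def summarize_by_location(events, threshold=3):
    """Collapse per-node events into one summary per location.

    `events` is a list of dicts with hypothetical keys 'node' and 'location'.
    A summary is produced only when at least `threshold` distinct nodes at
    the same location have reported an event.
    """
    nodes_by_location = defaultdict(set)
    for event in events:
        # Count distinct nodes, so repeated events from one node do not inflate the total.
        nodes_by_location[event["location"]].add(event["node"])

    summaries = []
    for location, nodes in sorted(nodes_by_location.items()):
        if len(nodes) >= threshold:
            summaries.append(
                {
                    "location": location,
                    "node_count": len(nodes),
                    "nodes": sorted(nodes),
                }
            )
    return summaries


# Illustrative data: three nodes down at one site, one node down at another.
events = [
    {"node": "sj-core-1", "location": "San_Jose_Data_Center"},
    {"node": "sj-core-2", "location": "San_Jose_Data_Center"},
    {"node": "sj-edge-1", "location": "San_Jose_Data_Center"},
    {"node": "sj-core-1", "location": "San_Jose_Data_Center"},  # repeat from the same node
    {"node": "branch-sw-1", "location": "Example_Branch_Office"},
]

summaries = summarize_by_location(events, threshold=3)
for summary in summaries:
    print(f"{summary['location']}: {summary['node_count']} nodes affected")
```

With this sample data, only the San Jose site crosses the threshold, so operations would see one site-level summary instead of several node-level alerts. The single event at the other location is left to be handled as an ordinary node event.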

 

 
