Performance data export pipeline

All performance data exported from NMIS runs through the same pipeline from the NMIS server to the MySQL database:

  1. nmis_performance_export.pl gathers data from different parts of NMIS (the -node files, RRDs, etc.) and creates one file per node per table.  One file per 5 minute period (per node per table) is created; running the command more than once simply replaces the existing files.  These files are placed in a directory corresponding to the 5 minute period.

  2. An opExport pull request is made on the MySQL server
  3. omkd on the NMIS server finds the oldest export files (up to 'opexport_max_performance_datasets_per_load' => 3 of them; see the example after this list), loads them and sends them back to omkd on the MySQL server.
  4. omkd on the MySQL server saves the contents to MySQL
  5. If no errors occurred, omkd on the MySQL server sends a receipt back to omkd on the NMIS server saying all is OK
  6. omkd on the NMIS server logs the response to opExport.log and removes the successfully inserted files
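
For example, on the NMIS (poller) server you can confirm the batch size used in step 3 and watch the log written in step 6. The config and log paths below are the usual omk defaults and may differ on your install:

  # batch size used in step 3 (config file location assumed to be the default)
  grep opexport_max_performance_datasets_per_load /usr/local/omk/conf/opCommon.nmis

  # watch for step 6 receipts and any errors (log location assumed to be the default)
  tail -f /usr/local/omk/log/opExport.log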

Null data in performance tables

Given the definition of the performance data export pipeline above, the logical place to start looking when null values appear is step 1.  Find a file in /usr/local/omk/var/perf/<time_period>/<table>-<time_period>-node.nmis and view it in a text editor (such as vi).
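
For example, to locate and open the most recent export file for a given table and node (the table nodeStatus and the node name router1 are only illustrative):

  # newest 5 minute period directories first
  ls -lt /usr/local/omk/var/perf/ | head

  # open the export file for the table and node in question
  vi /usr/local/omk/var/perf/<time_period>/nodeStatus-<time_period>-router1.nmis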

Is the column name with the null values defined in the file? (Note: column names in this file can be re-mapped to different column names by the schema, so if you can't find the name, double-check the schema for the name you should actually be looking for.)
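
A quick way to check is to grep for the column name in the export file, and for any re-mapped name in the schema. The column name avgBusy5 is only an example, and the schema location depends on your install:

  # does the column appear in the export file at all?
  grep -c avgBusy5 /usr/local/omk/var/perf/<time_period>/nodeStatus-<time_period>-router1.nmis

  # look for a re-mapped name in the schema (adjust the path for your install)
  grep -r avgBusy5 /usr/local/omk/conf/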

If the data is present in the export file, step 1 is working and the problem lies further along the pipeline; the checks below may help:

Some Data Not Updating

The symptom seen was that some data was not updating, e.g. the data for a particular table (nodeStatus on server1).  The data is copied from the opExport pollers to the opExport DB server and stored temporarily in /usr/local/omk/var/save_queue; before a file is loaded it is cached in the sub-directory /usr/local/omk/var/save_queue/data
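
For example, on the opExport DB server you can see what has been received from the pollers and what is currently being loaded:

  # files received from the pollers, waiting to be loaded
  ls -l /usr/local/omk/var/save_queue/

  # files currently cached for loading
  ls -l /usr/local/omk/var/save_queue/data/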

You can check the receipts for the data in /usr/local/omk/var; running ls -l *SERVERIP* in that directory will show something like this:

-rw------- 1 root 505 160533 Dec 19 17:53 receipt-SERVERIP_cbqosPerformance
-rw------- 1 root 505 111295 Dec 19 17:52 receipt-SERVERIP_ciscoConfig
-rw------- 1 root 505 111386 Dec 19 17:51 receipt-SERVERIP_diskIOTable
-rw------- 1 root 505 111292 Dec 19 17:55 receipt-SERVERIP_interface
-rw------- 1 root 505 171639 Dec 19 17:51 receipt-SERVERIP_interfacePerformance
-rw------- 1 root 505 111298 Dec 19 17:51 receipt-SERVERIP_interfaceStatus
-rw------- 1 root 505    300 Dec 19 17:53 receipt-SERVERIP_ipslaPerformance
-rw------- 1 root 505 111294 Dec 19 17:52 receipt-SERVERIP_nmisConfig
-rw------- 1 root 505 111298 Dec 19 17:52 receipt-SERVERIP_nodeProperties
-rw------- 1 root 505    205 Dec 19 17:52 receipt-SERVERIP_nodes
-rw------- 1 root 505 222386 Dec 19 17:54 receipt-SERVERIP_nodeStatus
-rw------- 1 root 505 111383 Dec 19 17:53 receipt-SERVERIP_services
-rw------- 1 root 505 111382 Dec 19 17:50 receipt-SERVERIP_storage
-rw------- 1 root 505 111290 Dec 19 17:52 receipt-SERVERIP_system
-rw------- 1 root 505 494335 Dec 19 17:53 receipt-SERVERIP_systemPerformance
-rw------- 1 root 505    298 Dec 19 17:52 receipt-SERVERIP_upsPerformance

If any of these dates is not the current time (i.e. within the last 5 minutes), that table has stopped updating.
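
A quick way to spot stale receipts without reading the dates by hand is to look for receipt files older than 5 minutes; any output indicates a table that has stopped updating:

  find /usr/local/omk/var -maxdepth 1 -name 'receipt-SERVERIP_*' -mmin +5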

In this case, files were found in this folder which prevented opExport from streaming new data; moving the offending file (e.g. stream-nodeStatus-SERVERIP-localhost.data) out of the way means that opExport can get back to business.
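
A sketch of that check and fix; the exact directory holding the stream file can vary, so locate it first, and move it rather than delete it so it can be inspected later:

  # find any leftover stream files under /usr/local/omk/var
  find /usr/local/omk/var -name 'stream-*'

  # move the offending file out of the way (use the path reported by the find above)
  mv /usr/local/omk/var/<sub_dir>/stream-nodeStatus-SERVERIP-localhost.data /tmp/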

It is likely these files are left behind when a process times out or fails before the stream data has been processed.