
I'm running the NMIS 8.6.7G appliance and trying to find out how to restart NMIS without restarting the entire VM. Searching turned up nothing. Thank you.


    3 answers

    1.  

      Hello Phil,

      The process outlined below should resolve the CPU issue you're seeing and improve overall polling performance.

      If you are running NMIS 8.6.7G on a busy server, it is important to modify the cron entry for NMIS. The setting is found in /etc/cron.d/nmis and the default is this:

       

      * * * * * root /usr/local/nmis8/bin/nmis.pl type=collect mthread=true ; /usr/local/nmis8/bin/nmis.pl type=services mthread=true

       

      Opmantek recommend changing that to:

       

      */1 * * * * root /usr/local/nmis8/bin/nmis.pl type=collect mthread=true
      
      */2 * * * * root /usr/local/nmis8/bin/nmis.pl type=services mthread=true
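To make the difference between the two schedules concrete, here is a small Python sketch that expands the minute field of a cron entry. It only handles the `*` and `*/N` forms used above, and the function name `minutes_fired` is made up for illustration. It is not part of NMIS or cron.

```python
def minutes_fired(minute_field):
    """Return the minutes of the hour (0-59) on which a cron minute field matches.

    Only the "*" and "*/N" forms are handled, which is all the entries above use.
    """
    if minute_field == "*":
        return list(range(60))
    if minute_field.startswith("*/"):
        step = int(minute_field[2:])
        if step < 1:
            raise ValueError("step must be at least 1")
        return list(range(0, 60, step))
    raise ValueError("unsupported minute field: " + minute_field)


collect_minutes = minutes_fired("*/1")   # collect entry: every minute
services_minutes = minutes_fired("*/2")  # services entry: every second minute
print(len(collect_minutes), len(services_minutes))
```

In other words, the recommended entries keep the collect starting every minute, start the services run half as often, and schedule the two as separate cron jobs instead of running them one after the other on a single line.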

       

      It is important to note that the NMIS polling engine was overhauled in NMIS 8.6.6 and NMIS 8.6.7 to improve how parallel threads are handled and to keep polling up to date. When NMIS starts a poll every 1 minute, not all nodes will be polled. It polls as many as it can in that time and leaves the others for the next poll cycle, which spreads the polling (and the load on the server) out over 5 minutes.

      If you are not getting all nodes polled in 5 minutes, you will need more threads.

      If you are polling many nodes every 1 minute, then you will need to size the server accordingly.
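As a rough way to reason about "more threads", the sketch below estimates a lower bound on the number of parallel collects needed to get through every node inside one 5 minute window. It assumes every node takes about the same time to collect and ignores scheduling overhead, so treat the result as a starting point only. It does not model how NMIS actually schedules its threads, and the function name and sample numbers are hypothetical.

```python
import math


def min_threads(node_count, avg_collect_seconds, window_seconds=300):
    """Lower bound on parallel workers needed to poll all nodes within the window.

    Assumes uniform collect times and perfect parallelism (a simplification).
    """
    total_work_seconds = node_count * avg_collect_seconds
    return max(1, math.ceil(total_work_seconds / window_seconds))


# Hypothetical example: 130 nodes averaging 10 seconds per collect.
print(min_threads(130, 10))  # prints 5
```

A single slow device (for example one that takes several minutes to respond) ties up a thread for that whole time, so the real requirement can be higher than this estimate.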

      Best,

      Mark H

      1. Phil

        thank you, I just made these changes!

    2.  

      The "metrics" section of the NMIS dashboard keeps showing a warning that CPU is over 60%, and when I run top I see an nmis.pl process using a large chunk of CPU, up to 99-100%. Otherwise everything works fine. I was just trying to find a way to get the metrics section working again...

      1. Mark Henry

        Phil, how much CPU and RAM do you have assigned to the server, what OS is installed, and how many devices are you polling? Also, check System -> Host Diagnostics -> NMIS Runtime Graph. You want to make sure your Collect time is less than the polling cycle, so if you're polling every 5 minutes the Collect time needs to be < 300s. Next, check Reports -> Current -> Collect/Update Time. First sort this by the Collect Time column and look for devices with the highest Collect times. Anything higher than 30s or so needs to be investigated: what's the latency, how many hops are there to the device, and is the device itself overloaded? After that, sort on the Update Time column. Are any devices not updating?

      2. Phil

        Collect Time = 31.33 seconds. I found one device higher than 30s; it's at 164+ seconds. It's a physical server running Docker. I will look at that system. Thanks.

      3. Phil

        Oh, and the virtual appliance is running CentOS 6 with 6 vCPU and 10GB of memory, with about 130 systems in NMIS.

      4. Mark Henry

        You should not be having ANY CPU issues with those specs and only 130 devices in NMIS. Has CentOS been patched and updated? It should be running 6.9, I think. How are storage space (df -h) and swap space on the server?

      5. Phil

        The 1 system with the high Collect time is under heavy load and may be causing perf issues in NMIS. OS is 6.9.

        Filesystem                          Size  Used Avail Use% Mounted on
        /dev/mapper/vg_nmis64-lv_root        16G  4.1G   11G  28% /
        tmpfs                               4.9G     0  4.9G   0% /dev/shm
        /dev/sda1                           477M  182M  270M  41% /boot
        /dev/mapper/vg_nmis64_data-lv_data   40G   23G   15G  60% /data
        /dev/mapper/vg_nmis64-lv_var         20G  2.6G   16G  14% /var
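For anyone following the same checklist from the Collect/Update Time report, the 30 second rule of thumb mentioned above can be sketched in a few lines of Python. This assumes the collect times have already been copied out of the report by hand into a dictionary. The device names and most of the numbers are made up, and nothing here reads from NMIS itself.

```python
def slow_devices(collect_times, threshold_seconds=30.0):
    """Return (device, seconds) pairs above the threshold, slowest first."""
    slow = [
        (name, seconds)
        for name, seconds in collect_times.items()
        if seconds > threshold_seconds
    ]
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Hypothetical sample values copied by hand from the report.
sample = {"docker-host": 164.0, "switch-a": 4.2, "router-b": 12.7}

for name, seconds in slow_devices(sample):
    print(name, seconds)
```

Anything this prints is a candidate for the latency, hop count, and device load checks described in the thread.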

    3.  

      The processes NMIS uses to collect device fault and performance data are all started and stopped by a cron job located in /etc/cron.d/nmis.

      These processes should not need to be restarted as NMIS monitors their performance and kills processes that overrun collection time or have stopped responding.

      What are you seeing that leads you to want to stop or restart these?

      Best,

      Mark H
