...

Start with these basic checks:

  • Ensure that SNMPd is running.
  • If you’re using SNMPv1 or v2: Is the device configured with the correct community string in LogicMonitor (at the global, group or device level)? If no community string is set, LogicMonitor defaults to using public. Note: Some Linux distributions significantly restrict which metrics are exposed if the community string is set to “public”. Therefore, we recommend you set your community string to something else. See the section below to verify that your device has the correct community string set.
  • If you’re using SNMPv3: Is the device configured with the correct authpass, privpass and username (either at the global, group or device level)? See the section below to verify that your device has the correct v3 credentials set.
  • Can queries from the collector device reach the monitored device? You can check this by running tcpdump on the monitored host. If the queries are not reaching the device, there may be a firewall issue.
  • Is the monitored device replying to the queries from the collector?
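The last two checks can be sketched with tcpdump on the monitored host. This is a minimal sketch, assuming the agent listens on the default UDP port 161; `snmp_capture_filter` is a hypothetical helper name and 192.0.2.10 is a placeholder for the collector address.

```shell
# Build a tcpdump filter that matches SNMP traffic to or from one collector.
# Assumption: the agent listens on the default UDP port 161.
snmp_capture_filter() {
    collector_ip="$1"
    printf 'udp port 161 and host %s\n' "$collector_ip"
}

# Example (run as root on the monitored host; replace the placeholder address):
#   tcpdump -n -i any "$(snmp_capture_filter 192.0.2.10)"
```

If requests arrive but no replies leave, snmpd is probably stopped or rejecting the credentials. If no requests arrive at all, a firewall or routing issue between the collector and the device is more likely.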

...

  • If you are receiving the common error message “Agent did not return variable bindings in lexicographic order”, set the snmp.ignore.lexicographic.order Collector setting to TRUE. As discussed in Editing the Collector Config Files, this setting must be updated from the Collector’s agent.conf file.
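Before and after editing agent.conf, it can help to confirm what the setting currently holds. The helper below is a minimal sketch, assuming agent.conf stores one key=value pair per line; `conf_value` is a hypothetical name, and the path to agent.conf depends on where the Collector is installed.

```shell
# Print the value of a key from a key=value configuration file.
# Assumption: settings are stored one per line as key=value.
conf_value() {
    key="$1"
    file="$2"
    # Match the key exactly, then print everything after the first "=".
    awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}

# Example (replace the path with the real location of agent.conf):
#   conf_value snmp.ignore.lexicographic.order /path/to/agent.conf
```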

Ports/rules required by the snmpd service.

SNMP operates at the application layer of the Internet protocol suite (layer 7 of the OSI model).

The ports commonly used for SNMP are as follows:

Number  Description
161     SNMP
162     SNMP-trap
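SNMP uses UDP on both ports, so a quick local check is whether a UDP socket is bound to port 161. The sketch below parses the output of `ss -lun`; `udp_port_listening` is a hypothetical helper name, and the firewalld commands in the comments assume a CentOS host that uses firewalld.

```shell
# Read `ss -lun` output on stdin and succeed if any socket is bound to the given port.
udp_port_listening() {
    port="$1"
    awk -v p=":$port" '
        {
            # A listening socket shows up as a field ending in ":<port>".
            for (i = 1; i <= NF; i++)
                if (substr($i, length($i) - length(p) + 1) == p) found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Example:
#   ss -lun | udp_port_listening 161 && echo "a UDP socket is bound to port 161"
# Opening the port with firewalld (assumption: firewalld is the active firewall):
#   firewall-cmd --permanent --add-port=161/udp
#   firewall-cmd --reload
```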




snmpd daemon status validation

Procedure to validate that the snmpd daemon is running correctly on the NMIS server.
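One way to validate the daemon is sketched below. It assumes a systemd-based server and falls back to the process table when systemctl is not available; `daemon_state` is a hypothetical helper name.

```shell
# Report whether a daemon is running: prefer systemd, fall back to the process table.
daemon_state() {
    name="$1"
    if command -v systemctl >/dev/null 2>&1 &&
       systemctl is-active --quiet "$name" 2>/dev/null; then
        echo "running"
    elif command -v pgrep >/dev/null 2>&1 &&
         pgrep -x "$name" >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}

# Example:
#   daemon_state snmpd
# For more detail on a systemd host:
#   systemctl status snmpd
```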

NMIS server snmp configuration

This tutorial shows how to configure SNMP to monitor our server. We focus on CentOS because it is one of the most widespread server distributions. Except for the installation, the steps are similar on other distributions.

Configuration steps.
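As an illustration of what such a configuration can look like, the sketch below writes a minimal read-only SNMPv2c configuration. It assumes the net-snmp packages; `write_snmpd_conf` is a hypothetical helper, and the community string, source address, location, and contact are placeholders.

```shell
# Write a minimal read-only SNMPv2c configuration to the given file.
write_snmpd_conf() {
    conf_file="$1"
    community="$2"
    allowed_source="$3"   # address or subnet of the polling server
    cat > "$conf_file" <<EOF
# Read-only access for one community, restricted to the polling server
rocommunity $community $allowed_source
syslocation Server room
syscontact admin@example.com
EOF
}

# Typical sequence on CentOS (run as root; all values are placeholders):
#   yum install -y net-snmp net-snmp-utils
#   cp /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.bak
#   write_snmpd_conf /etc/snmp/snmpd.conf MyCommunity 192.0.2.10
#   systemctl enable snmpd
#   systemctl start snmpd
```

The helper overwrites the target file, which is why the sequence backs up the existing snmpd.conf first.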


SNMP queries to devices

The most widely used SNMP versions are SNMP version 1 (SNMPv1) and SNMP version 2 (SNMPv2). SNMP version 3 (SNMPv3) introduces important changes with respect to previous versions, especially in security; however, its adoption has been low due to implementation problems and incompatibilities.
The snmpwalk command is used for these queries.

Examples of command execution.
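The sketch below prints an snmpwalk command line for SNMPv2c or SNMPv3 so that it can be reviewed before it is run. `snmpwalk_cmd` is a hypothetical helper; the SHA and AES protocol choices and the system subtree OID (1.3.6.1.2.1.1) are assumptions and should match what the device is actually configured for.

```shell
# Print an snmpwalk command line that walks the system subtree of a device.
# Usage: snmpwalk_cmd 2c <host> <community>
#        snmpwalk_cmd 3  <host> <username> <authpass> <privpass>
snmpwalk_cmd() {
    version="$1"
    host="$2"
    case "$version" in
        2c) printf 'snmpwalk -v2c -c %s %s 1.3.6.1.2.1.1\n' "$3" "$host" ;;
        3)  printf 'snmpwalk -v3 -l authPriv -u %s -a SHA -A %s -x AES -X %s %s 1.3.6.1.2.1.1\n' \
                "$3" "$4" "$5" "$host" ;;
        *)  echo "unsupported version: $version" >&2; return 1 ;;
    esac
}

# Example (placeholders for the community string and device address):
#   snmpwalk_cmd 2c 192.0.2.20 MyCommunity
```

A timeout from the resulting command usually points at reachability or credentials, while a successful walk returns values such as sysDescr and sysName.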





  1. Identify the problem. The first step in troubleshooting a device issue is to identify the problem and to establish whether it affects an NMIS8 or an NMIS9 installation.
    1. Add the product version and the servers, devices, and models involved to the support case.
  2. Determine what kind of problem you are observing. A device issue can have any of the following causes:
    1. Network performance: latency in the network, or layer 1, 2, and 3 issues.
    2. Device configuration: connectivity, SNMP configuration, and others.
    3. Server hardware requirements: high resource utilization on the server.
    4. Server configuration options: missing configuration items for server tuning.
    5. Disk performance: slow read/write times during device collection.
  3. Gather information. Collect all the graphs, images, and observed behaviors that help explain the problem.
    1. Collect the support tool files (see The Opmantek Support Tool).
      1. Execute the collect command for the support tool:

        Code Block
        # General collection.
        /usr/local/nmis8/admin/support.pl action=collect

        # If the file is big, add the maxzipsize parameter.
        /usr/local/nmis8/admin/support.pl action=collect maxzipsize=900000000

        # Collection for a single device.
        /usr/local/nmis8/admin/support.pl action=collect node=<node_name>


    2. If you are using NMIS8, provide the files from /usr/local/nmis8/var.
      1. Go to the /usr/local/nmis8/var directory and collect the following files:

        Code Block
        -rw-rw----   1 nmis   nmis    4292 Apr  5 18:26 <node_name>-node.json
        -rw-rw----   1 nmis   nmis    2695 Apr  5 18:26 <node_name>-view.json


      2. Obtain the update and collect outputs, and upload them to the support case:

        Code Block
        /usr/local/nmis8/bin/nmis.pl type=update node=<node_name> model=true debug=9 force=true > /tmp/node_name_update_$(hostname).log
        /usr/local/nmis8/bin/nmis.pl type=collect node=<node_name> model=true debug=9 force=true > /tmp/node_name_collect_$(hostname).log


    3. If you are using NMIS9, include the dump files.


      Code Block
      /usr/local/nmis9/admin/node_admin.pl act=dump {node=nodeX|uuid=nodeUUID} file=<MY PATH> everything=1


  4. Replicate the problem. If possible, define the steps needed to replicate the problem.
  5. Identify symptoms. At this point, you should be able to see a specific problem and its symptoms.
  6. Determine whether something has changed. It is important to verify with your team whether something has changed; a good way to spot a change is to monitor the performance graphs for the devices and the server.
  7. Is it an individual problem? Verify whether this behavior is happening on a single device or server.
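When several nodes are affected, the NMIS8 update and collect commands from step 3 can be wrapped in a small helper and run once per node. This is a minimal sketch: `nmis8_debug_logs` and the `DRY_RUN` variable are hypothetical, the log file name uses the node name in place of the literal `node_name` prefix, and `uname -n` stands in for `hostname`.

```shell
# Print or run the NMIS8 update and collect commands for one node,
# writing each output to its own log file under /tmp.
# Set DRY_RUN=1 to print the commands without running them.
nmis8_debug_logs() {
    node="$1"                 # node names are assumed to contain no spaces
    server="$(uname -n)"      # same value that `hostname` normally prints
    for action in update collect; do
        log="/tmp/${node}_${action}_${server}.log"
        cmd="/usr/local/nmis8/bin/nmis.pl type=$action node=$node model=true debug=9 force=true"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd > $log"
        else
            $cmd > "$log"
        fi
    done
}

# Example: review the commands for two nodes before running them.
#   DRY_RUN=1
#   for n in router1 switch2; do nmis8_debug_logs "$n"; done
```

Running the same helper against a healthy node and a failing node gives two sets of logs to compare, which helps answer whether the problem is limited to a single device.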

...