Overview of the Major Components of NMIS9

The NMIS9 daemon bin/nmisd

In NMIS9 almost all work is controlled, scheduled and executed by the nmis daemon and its worker child processes.

The nmis daemon is controlled through the standard service interface under the service name "nmis9d", e.g. sudo service nmis9d restart.
The daemon should be running by the end of the initial NMIS9 installation.

The primary CLI tool bin/nmis-cli

The nmis-cli tool is your primary tool for interacting with NMIS on the command line, e.g. for querying the status of the nmis daemons, for scheduling new operations and for scheduling outages.
Besides these administrative duties, the cli tool is currently the only entity that can create saved reports (report creation is scheduled by a minimal NMIS9 cron job).

The Node administration CLI tool admin/node_admin.pl

Like with NMIS8, in NMIS9 nodes can be administered using the GUI or with the node_admin cli tool. NMIS9's version has a few extra features over NMIS8's but otherwise doesn't differ excessively.
The node admin tool is described in more detail on the Node Administration Tools page.

Because of its reliance on a database NMIS9 is more strict about identifying objects, which means that nodes for example are identified exclusively by UUIDs. Node names are of course still present, but as informal properties only. The relationship between these is queried most easily by the node_admin tool using the act=list_uuid operation.

The GUI

The administrative capabilities of the NMIS9 GUI are almost identical to those of NMIS8; the only major exception is that "Edit and Update Node" cannot display any logs of the Node Update operation, because that operation is scheduled asynchronously. The NMIS9 GUI plays a slightly more passive and limited role: it only schedules certain operations for the nmis daemon to pick up, whereas in NMIS8 some of these were executed directly by GUI components.

The Database

NMIS9 makes extensive use of MongoDB behind the scenes; most of the time that should be invisible to you past the initial installation stage, where you will have to interact with setup_mongodb.pl to prime the environment.

NMIS9 is much more powerful than NMIS8 when it comes to clustering; amongst other things that also means that each NMIS9 installation has to be uniquely identified by what we call its cluster_id configuration setting (which is automatically generated for you during the initial installation).

Interacting with the daemon directly

The NMIS9 daemon accepts only a small number of command line arguments, which are shown when you run it with -h or --help:

./bin/nmisd -?
Usage: nmisd [option=value...] [act=command]

 act=version: print version of this daemon and exit
 act=stop: shut down running daemon and workers
 act=abort: terminate all workers and kill running daemon

if no act argument is present: daemon starts

option foreground=1: stay in the foreground, don't daemonize
option max_workers=N: overrides the configuration
option debug=0/1: print extra debug information

option confdir=path: path to configuration files

The most commonly used commands are act=stop and act=abort:

With stop you're instructing a running nmis daemon and all its workers to terminate gracefully, i.e. when any operations that were in progress are completed.
With abort a running nmis daemon and its workers are stopped immediately and without regard to operations that are in progress.

In both of these cases no new nmis daemon is started.

Interacting with the daemon using nmis-cli

Just like all other NMIS9 command line tools nmis-cli shows an overview of its arguments and capabilities when you run it with -h or --help (or without any arguments whatsoever):

./bin/nmis-cli 
Usage: nmis-cli [option=value...] <act=command>

 act=fixperms
 act=config-backup
 act=noderefresh
 act=daemon-status (or act=status)

 act=schedule [at=time] <job.type=activity> [job.priority=0..1] [job.X=....]
  act=schedule-help for more detailed help
 act=list-schedules [verbose=t/f] [only=active|queued] [job.X=...]
 act=delete-schedule id=<schedule_id|ALL> [job.X=...]
 act=abort id=<schedule_id>

 act=purge [simulate=t/f] [info=t/f]
 act=dbcleanup [simulate=t/f] [info=t/f]

 act=run-reports period=<day|week|month> type=<all|times|health|top10|outage|response|avail|port>

 act=list-outages [filter=X...]
 act=create-outage [outage.A=B... outage.X.Y=Z...]
 act=update-outage id=<outid> [outage.A=B... outage.X.Y=Z...]
 act={delete-outage|show-outage} id=<outid>
 act=check-outages [node=X|uuid=Y] [time=T]
  act=outage-help for more detailed help

Process Status

Queue Status

Queue Status Details

Scheduling of jobs

Aborting jobs, automatic aborts, automatic scheduling

Logging and Verbosity

Standard Log Files

logs/fping.log: the fping worker process (managed by the nmis daemon) logs all its operations to this log file.
logs/auth.log: contains all authentication-related logging that the NMIS9 GUI produces, in the same format that NMIS8 used.
logs/event.log: contains all nmis node events in a machine-consumable format, identical to NMIS8.
logs/nmis.log: all log data that isn't directed elsewhere goes into this log file.

Please note that in NMIS9 all logs are written in buffered form: information may reach the disk a few seconds late, but at a much lower performance cost than NMIS8 incurred.

Log files are now also kept open permanently, until the nmis daemon is instructed to reopen them (by sending a SIGHUP signal to the nmis daemon process).

The format of the log files fping.log and nmis.log has changed:

[Thu Jul 25 10:38:09 2019] [info] nmisd[1325] Found 7 nodes due for services operation

All log messages are now prefixed by a time tag, the severity level, and the process name/role and process identifier of the process in question. In the example above the supervisor component of the nmis daemon has logged this informational announcement.
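For scripts that need to consume the new format, a line can be split into its parts with a regular expression. The following Python sketch is derived only from the single sample line above; the pattern and the field names (timestamp, level, process, pid, message) are assumptions made for illustration, not a format specification.

```python
import re
from datetime import datetime

# Pattern derived from the one sample line only; not an official format definition.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "               # [Thu Jul 25 10:38:09 2019]
    r"\[(?P<level>\w+)\] "                       # [info]
    r"(?P<process>[^\[\]]+)\[(?P<pid>\d+)\] "    # nmisd[1325]
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match the pattern."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    # Day and month names are parsed with the current locale (assumed English/C).
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%a %b %d %H:%M:%S %Y"
    )
    return fields
```

Lines that do not fit the pattern return None, so a caller can skip them rather than fail.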

What gets logged?

NMIS9 can log somewhat more detail than NMIS8, and gives you much finer control over what is included and when.
There are 13 verbosity levels (in increasing order of noisiness): fatal, error, warn, info, debug (or debug1), debug2, debug3 and so on up to debug9.
All messages with severities debug1 to debug9 are logged with the tag "[debug]".

When you set a particular verbosity level then all messages of higher verbosity are suppressed; e.g. at level info, messages of severity fatal, error, warn and info are logged but messages belonging to severities debug1 to debug9 are suppressed.
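The suppression rule can be modelled as a comparison of positions in the ordered list of levels. This is a minimal Python sketch of the rule as described here, not NMIS9's own code; the function names are hypothetical, and the tags for the non-debug levels are assumed to follow the "[info]" pattern seen in the sample log line.

```python
# The 13 levels, quietest first. "debug" is treated as an alias for "debug1".
LEVELS = ["fatal", "error", "warn", "info"] + [f"debug{n}" for n in range(1, 10)]


def normalise(level):
    """Map the alias "debug" to "debug1"; leave other level names unchanged."""
    return "debug1" if level == "debug" else level


def is_logged(message_level, configured_level):
    """True if a message of message_level is emitted at configured_level."""
    return LEVELS.index(normalise(message_level)) <= LEVELS.index(
        normalise(configured_level)
    )


def log_tag(message_level):
    """Tag written to the log; all of debug1 to debug9 share "[debug]"."""
    level = normalise(message_level)
    return "[debug]" if level.startswith("debug") else f"[{level}]"
```

For example, is_logged("warn", "info") is true, while is_logged("debug3", "debug2") is false.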

By default the configuration property log_level controls all logging. The default value for this is info.
If you start the nmis daemon with a debug=<level> command line argument, then that will be used for this daemon and its workers.
For node_admin and nmis-cli invocations the same debug=<level> command line argument is available.
A manually scheduled daemon job can have custom verbosity and output properties, which control the verbosity and the target log file for the processing of this job only.
All NMIS daemon instances can be instructed to change their verbosity levels on the fly while the processes remain running, by sending particular UNIX signals to those processes.

Per-job verbosity and custom log file

If a job schedule includes the property job.verbosity=<level>, then the job will be processed with that verbosity level. At the end of processing the previous verbosity level is restored.

The related but independent property job.output=<prefixtext> instructs the NMIS daemon to divert all logging for this one job to a different log file. The log data is saved in the normal logs directory, and the file is named <prefixtext>-<highprecision-timestamp>.log, e.g. logs/quicktest-1564031667.44838.log. When processing completes, log output reverts to the standard log file.
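Because every diverted job gets its own timestamped file, finding the most recent log for a given prefix means parsing the file names. The Python sketch below assumes the name layout is exactly what the example shows (prefix, hyphen, numeric timestamp, ".log"); the helper names are hypothetical.

```python
import re
from pathlib import Path

# Layout taken from the example name: <prefixtext>-<seconds>[.<fraction>].log
JOB_LOG_NAME = re.compile(r"^(?P<prefix>.+)-(?P<stamp>\d+(?:\.\d+)?)\.log$")


def split_job_log_name(filename):
    """Return (prefix, timestamp) for a per-job log file name, else None."""
    match = JOB_LOG_NAME.match(Path(filename).name)
    if match is None:
        return None
    return match.group("prefix"), float(match.group("stamp"))


def latest_job_log(logdir, prefix):
    """Return the Path of the newest per-job log for prefix in logdir, or None."""
    candidates = []
    for path in Path(logdir).glob("*.log"):
        parsed = split_job_log_name(path.name)
        if parsed is not None and parsed[0] == prefix:
            candidates.append((parsed[1], path))
    if not candidates:
        return None
    # Pick the entry with the largest timestamp.
    return max(candidates, key=lambda item: item[0])[1]
```

Standard files such as nmis.log or fping.log contain no hyphenated timestamp and are therefore ignored.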

Adjusting process verbosity levels on the fly

All NMIS daemon processes listen for two particular UNIX signals:

When a daemon process instance receives the SIGUSR1 signal, it increments its verbosity by one level, e.g. from warn to info, or from debug2 to debug3.
When a daemon receives the SIGUSR2 signal, it decrements its verbosity by one level.

In both cases a message is logged at the new verbosity level, e.g.:

[Thu Jul 25 12:05:06 2019] [debug] nmisd[1325] received SIGUSR1, incremented verbosity level to debug, debug to 2
[Thu Jul 25 12:05:34 2019] [info] nmisd[1325] received SIGUSR2, decremented verbosity level to info, debug to 0
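The stepping behaviour can be pictured as moving one position up or down the ordered list of levels. The Python sketch below models only that; it is not taken from the daemon's code, the function name is hypothetical, and the clamping at both ends of the list is an assumption, since this page does not say what happens at fatal or debug9.

```python
# The 13 levels, quietest first. "debug" is treated as an alias for "debug1".
LEVELS = ["fatal", "error", "warn", "info"] + [f"debug{n}" for n in range(1, 10)]


def step_verbosity(current, signal_name):
    """Return the level after SIGUSR1 (one step noisier) or SIGUSR2 (one step quieter)."""
    if current == "debug":
        current = "debug1"
    position = LEVELS.index(current)
    if signal_name == "SIGUSR1":
        position += 1
    elif signal_name == "SIGUSR2":
        position -= 1
    else:
        raise ValueError(f"unexpected signal: {signal_name}")
    # Assumption: the level stays put at either end of the list.
    position = max(0, min(position, len(LEVELS) - 1))
    return LEVELS[position]
```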

How to determine which process to signal?

Use nmis-cli act=status to see the list of active daemon processes, then use kill with the correct process id;
or use a smarter kill replacement like pkill and select by the full daemon command line,
e.g. pkill -ef -USR2 "nmisd fping"
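From a script, the same signals can be sent with Python's standard os.kill. This is a minimal sketch: the function name is hypothetical, and the process id has to be supplied by the caller (for instance read from the nmis-cli act=status output, whose format is not shown on this page).

```python
import os
import signal


def adjust_verbosity(pid, noisier=True):
    """Send SIGUSR1 (one level noisier) or SIGUSR2 (one level quieter) to pid."""
    os.kill(pid, signal.SIGUSR1 if noisier else signal.SIGUSR2)
```

Calling adjust_verbosity(1325) would raise the verbosity of the process with id 1325 by one level, and adjust_verbosity(1325, noisier=False) would lower it again. The caller needs permission to signal that process, so this typically has to run as the same user as the daemon or as root.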