...

Plain string values indicate string comparison, but regular expressions can be given too. A node is only considered for this particular command set if all filtering expressions match. You can filter on any node property, not just properties from os_info (but the default command sets only use os_info).

Please note that, prior to version 3.1.1, opConfig treated a command set without any filter blocks as disabled. For those versions you may want a 'wildcard' filter matching anything at all, which can be achieved by adding an os_info filter block with a 'match-everything' regular expression for os, e.g. 'os' => '/.*/'. From opConfig 3.1.1 onwards, a command set without filters is interpreted as applying to all nodes without any restriction.
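
For example, a 'match-everything' filter block for a command set would look like the following sketch (the set name is illustrative):

Code Block
languageperl
'my_command_set' => {
  # the regular expression /.*/ matches any value of os, so every node qualifies
  'os_info' => {
    'os' => '/.*/',
  },
  # ... commands, scheduling_info and so on as usual
},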

Grouping commands into sessions

By default all commands for a node are run sequentially, one after another, within the same session/connection to that device. This can save time, as connecting to the device (and authenticating) often takes longer than running all the commands.

But this is not always desirable: some devices have quirky shell command processors which don't take well to this kind of scripted remote control, and certain commands run for a long time (for example system statistics gatherers like vmstat or top). Such commands would cause long delays in a sequential run, so opConfig lets you adjust this behaviour to your liking.

Separate sessions for some commands

You can specify which commands should have their own separate connections. Separate connections can be opened in parallel, if opconfig-cli.pl is run with mthread=true.

If you would like to change the default behavior and have opConfig run all commands in all command sets in separate sessions/connections, simply modify opCommon.nmis like this:

Code Block
languageperl
# to run each command in its own session, change this to true
 'opconfig_run_commands_on_separate_connection' => 'false',

It makes a lot more sense to tell opConfig to run a set of commands on their own or, in most cases, just a single command on its own.  Here is how to make a command or command set have their own sessions/connections:

Code Block
languageperl
# for all commands in a set, define this in the command set - note the S in run_commandS_...!
'scheduling_info' => {
 'run_commands_on_separate_connection' => 'true'
},

# for just a specific command, set this for the command in question - no S in run_command_...!
commands => [
 {
 'command' => 'show version',
 'run_command_on_separate_connection' => 'true',
 }]

Dealing with particularly slow commands: Recovering from timeouts

opConfig doesn't block indefinitely if a command on a device doesn't return a response; instead the session times out after a configurable period (see the config item opconfig_command_timeout, default 20 seconds) and the command is marked as failed.
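
To change the timeout, adjust the corresponding entry in opCommon.nmis; the 60-second value below is just an example:

Code Block
languageperl
# maximum time (in seconds) opConfig waits for a command's output
'opconfig_command_timeout' => 60,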

From version 3.0.7 onwards, opConfig handles timeouts in long-lived sessions robustly; earlier versions could, under certain circumstances, get confused into associating the wrong outputs with commands if a timeout had occurred earlier during that session.

Version 3.0.7 lets you decide how to react to a command timing out:

  • If your command set activates the attempt_timeout_recovery option, then opConfig will attempt to re-synchronise the session state for a little while longer.
    It does so by looking for your device's prompt; if none is forthcoming (i.e. the problematic command is still blocking progress), opConfig sends a newline to "wake" the device. This is repeated a few times or until a prompt is found.
  • If that option is off or the resynchronisation is unsuccessful, then opConfig will close the session (and open a new one, if further commands are to be run for this node).

The value of the attempt_timeout_recovery option must be a nonnegative integer; it defines how many wait-and-wake-up cycles opConfig should perform. Each such cycle takes up to opconfig_command_timeout seconds.

You may set attempt_timeout_recovery for all commands belonging to a set or for individual commands. An individual command's setting overrides the one for the set. The default, if neither is given, is zero, i.e. no extra recovery grace period.

Here is an example command set snippet whose commands all get one extra timeout cycle for recovery, except some_brittle_command, which gets up to five extra timeout periods.

Code Block
languageperl
%hash = (
  'my_first_config_set' => {
    'os_info' => {
      # ...node conditions
    },
    # add one extra timeout period for all commands
    'scheduling_info' => {
      'attempt_timeout_recovery' => 1,
    },
    'commands' => [
      { 'command' => "uptime" },
      # any number of commands, treated equally wrt. timeouts
      # but the following command is more prone to timing out, so we give it up to 5 x timeout
      { 'command' => 'some_brittle_command',
        'attempt_timeout_recovery' => 5,
      },
    ],
  },
);

Privileged Mode

Many types of devices distinguish between a normal and a privileged/superuser/elevated mode, and allow certain commands only in privileged mode. opConfig needs to know whether that applies to your device and which commands are affected.

In opConfig 3.0.2 and newer, every credential set indicates whether it grants direct and permanent access to privileged mode (always_privileged true), or whether opConfig needs to switch between normal and privileged mode (possibly repeatedly). Older versions only support the dual, mode-switching setup.

Commands affected by this distinction need to be marked with the privileged property; opConfig will then attempt to gain elevated/superuser privileges before running the command.
When connected to a device in always-privileged mode, opConfig ignores the privileged property.
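
As a minimal sketch (the command shown is just an illustration), marking a command as privileged looks like this:

Code Block
languageperl
'commands' => [
  {
    # this command only works in privileged/enable mode, so opConfig
    # elevates privileges first (ignored if the credential set is
    # marked always_privileged)
    'command' => 'show running-config',
    'privileged' => 'true',
    'tags' => [ 'configuration', 'detect-change' ],
  },
],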

Non-interactive commands

Before version 3.0.3 opConfig would always open an interactive session to the device, then issue commands one by one. Amongst other things this requires a working phrasebook for the type of device, and imposes certain constraints - but when it works it's very efficient.

Recently we've run into a few devices (types and versions) where remote-controlling the shell processor is unreliable, which can cause opConfig to fail to communicate properly with the device, e.g. if the prompt cannot be determined reliably.

Version 3.0.3 introduces an alternative, for SSH only, which bypasses the shell/CLI command processor on the device as much as possible.

You can adjust this behavior for a whole command set (or an individual command) with the command set property run_commands_noninteractively. Default is false; if set to true, opConfig will issue every single command 'blindly' and independently in a new SSH connection just for this command. For transport type Telnet this property is ignored.

This option is similar to what run_commands_on_separate_connection does, except interactive prompts and the like are not relevant here: opConfig starts a new ssh client for each command, and each ssh client terminates after returning the output of the command execution on the node.

  • The advantage here is that opConfig doesn't have to interact with the node's command processor in a command-response-prompt-command-response... cycle; as such it's more robust.
  • The disadvantage is that a new SSH connection must be opened for every single command, which is a relatively costly operation in terms of processing.
    Furthermore this cannot work with Telnet (which is interactive by design).
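
As a sketch, enabling this for a whole command set might look like the following; the placement inside scheduling_info (alongside the other set-wide options shown above) is an assumption, and the property can equally be set on an individual command:

Code Block
languageperl
'scheduling_info' => {
  # assumption: set-wide scheduling options live here, as with
  # run_commands_on_separate_connection above
  'run_commands_noninteractively' => 'true',
},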

How long should revisions (and their captured command output) be kept

opConfig 2.2 (and newer) has a flexible purging subsystem which is described in detail on a separate page. The example above shows roughly how it's controlled: a command or command set can have a section called purging_policy which controls whether and when a revision should be removed from the database.
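
As a rough sketch only (the authoritative option names are on the purging page, so treat the keys below as illustrative assumptions), such a section might look like this:

Code Block
languageperl
'purging_policy' => {
  # illustrative keys - check the purging documentation for your version
  'purge_older_than' => 365 * 86400,   # seconds; revisions older than this may be removed
  'keep_last' => 100,                  # but always keep this many recent revisions
},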

What constitutes a change, and when should opConfig create new revisions

Not all output is worthy of change detection: configuration information generally is, while performance information generally is not. The configuration of a router, for example, is worth tracking, but the usage counters on an interface likely are not. As mentioned above, opConfig can deal with both of these types of commands, the "one-shot" ones as well as the change-indicating ones.

In a command set you tell opConfig which commands belong to what class by setting (or omitting) the tag detect-change.

If the tag is present, then change detection is run on this command's output: if and only if the output is different from the previously stored revision, a new revision is stored along with detailed information about what the changes were. If the tag is not defined, then a new revision is created if the output is different - but no detailed changes are tracked, and the opConfig GUI will only report the command's revision in the list of "commands" but not in the list of "configuration changes".

Code Block
languageperl
'tags' => ['detect-change'],

Related to this is the question of which detected changes should be considered important or relevant. opConfig can "ignore" unimportant differences in a command's output if you provide a set of command_filters for the command:

Code Block
languageperl
'privileged' => 'false',
'command' => 'ifconfig -a',
'tags' => [ 'DAILY', 'configuration', 'detect-change' ],
  'command_filters' => [
    '/RX packets/',
    '/TX packets/',
    '/RX bytes/',
    '/TX bytes/'
  ]

In the example above, the output of the "ifconfig -a" command would be checked and any changed lines that match TX/RX packets or TX/RX bytes (i.e. the interface counters) are ignored. Any other changes that remain after applying the filters are used to figure out whether to create a new revision or not.

Please note that command filters are possible for both one-shot commands and commands with detect-change enabled, and behave the same for both.

Raising Events when changes are detected

opConfig 2.2 and newer can raise an event with NMIS if a change was detected for a node and a particular command. To activate this feature you have to  give the command in question both the tags detect-change and report-change. You may also give a fixed event severity (values as defined in NMIS), like in this example:

Code Block
'privileged' => 'false',
'command' => 'chkconfig',
'tags' => [ 'DAILY', 'configuration', 'detect-change', 'report-change' ],
'report_level' => 'Minor',
Info

To enable or disable this feature in general edit /usr/local/nmis8/conf/Config.nmis. 

Code Block
title/usr/local/nmis8/conf/Config.nmis
    'log_node_configuration_events' => 'true',

If set to true the feature is enabled; if set to false the feature is disabled.

 

In this case, the Redhat/Centos command chkconfig (= list of system services to automatically start on boot) will be checked for changes, and if any are found, a "Node Configuration Change" event with the node in question as context, the element "chkconfig" and the severity "Minor" will be raised in the local NMIS.

If you want a more dynamic event severity, then you can use report_level_min_changes which selects a severity based on the number of changes that were found:

Code Block
{        
 'privileged' => 'true',
 'command' => 'vgdisplay',
 'tags' => [ 'DAILY', 'configuration', 'detect-change', 'report-change' ],
 'report_level_min_changes' => {
   1 => "Normal",
   3 => "Minor",
   10 => "Major" },
}

In this example, changes in the vgdisplay command output would result in an event of severity Normal if there are 1 or 2 changes, Minor for 3 to 9 changes, and Major for 10 or more.

Controlling how the command output is stored
Anchor
Shadow_File

In opConfig 3.1.1 an option for shadowing command output on disk was added: if you set the property shadow_file to 1 or true (in the command's block, or in the command set's scheduling_info section), then opConfig stores the data both in the database and on disk, in the same location and fashion as documented in the Tracking Files section below.
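
As a minimal sketch (the command shown is illustrative), shadowing a single command's output to disk looks like this:

Code Block
languageperl
'commands' => [
  {
    'command' => 'show tech-support',
    # store the output in the database and also as a file on disk
    'shadow_file' => 'true',
    'tags' => [ 'configuration', 'detect-change' ],
  },
],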

Tracking Files

opConfig version 3.0.3 introduces a new capability: Arbitrary files can now be downloaded from a node (with SCP), stored and tracked by opConfig.
Here is a snippet from the example command set named file_store.nmis that ships with opConfig:

Code Block
# ...other command set structure
scheduling_info => 
{
  # indicates work to be performed by and on the opConfig host
  run_local => 'true',        
},
commands => [
  {
    command => '_download_file_ scp:///var/log/secure', 
    store_internal => 'false',
    tags => [ 'detect-change', 'other', 'custom', 'tags' ], 
  },
  {
    command => '_download_file_ scp://file_in_user_homedir',
    store_internal => 'true', # 'true' is the default
    tags => [ 'detect-change' ],
  },
],...

To instruct opConfig to track one or more files, you have to

  1. set up a command set with the scheduling property run_local set to true,
  2. and add a separate special _download_file_ command for every file you want  to track.
  3. If you want the file data to be treated as binary, set store_internal to false.

The run_local option indicates that all commands in this command set are to be run on the opConfig server, instead of on the node in question.

The special/pseudo-command _download_file_ requires an scp URI to be given. Note that the first two "/" characters in scp:///some/abs/path.txt belong to the URI, and the path in question is /some/abs/path.txt. In the second example above, the path is a plain relative file_in_user_homedir which means scp will look for this file in the user's home directory.

If you leave the store_internal option set to true (or omit it altogether), then the normal storage behaviour is selected: opConfig assumes your file contents are text, treats them as the 'command output' for this special command, and stores the output in the database. Hence, you'll see the whole file contents in the GUI, and change detection will be performed line by line. This does not work for binary files, and cannot work for large files (above 16 megabytes) either.

On the other hand, with this option set to false, opConfig stores a separate copy of the file for each revision (under the directory configured with the config option opconfig_external_store, usually /usr/local/omk/var/opconfig/external). The 'command output' is made up of the size and the SHA256 checksum of the file contents, and change detection (and the GUI) uses this data instead of the (binary or huge) file contents. This produces much coarser change detection, but works with binary files. In the GUI you'll see the made-up 'command output', and a button to download the actual file data.

All other opConfig capabilities work normally for file tracking commands; e.g. scheduling, tags, purging of old revisions, revision creation itself and so on.

The resulting  files are stored in the directory /usr/local/omk/var/opconfig/external/<node name>/<command name>/<revision>. opConfig 3.1.1 and newer also maintain a symbolic link latest that points to the most recent revision.

Furthermore, in opConfig 3.1.1 and newer, any command output that is larger than 16 megabytes is automatically demoted to being stored on disk.

How to categorize command sets (and why)

...