Architect your solution, determine the initial configuration/setup, and create processes for deployment and rollback.

Identify Your Products and Solutions

  • NMIS - Deep visibility of an IT environment, providing valuable information about infrastructure performance and faults.
    • ICMP/Ping
    • SNMP Collection
    • SNMPTraps
    • WMI Collection
    • Service Monitoring (DNS/NTP/etc.)
    • Application Monitoring
  • opHA - Scale your solution horizontally and add high availability to ensure business continuity.
  • Open-AudIT - Agentless device discovery and auditing
    • Device Baselining
    • File/Folder Change Monitoring
    • Software License Usage
  • opCharts - Delivers interactive charts, custom dashboards, and network diagrams.
  • opEvents - Event Management processing Syslog, SNMP trap, NMIS events, and others.
    • Syslog Parsing
    • SNMPTraps
  • opConfig - Provides configuration backup, archiving, and change detection.
    • Configuration Backup
    • Compliance Monitoring
  • opTrend - Identify abnormal behavior and predict resource exhaustion before it happens.
  • opReports - Detailed, actionable engineering and business-related reports
  • opFlow/opFlowSP - NetFlow and IPFIX analysis for enterprise-class businesses and service providers

System Architecture

Product Architecture

Opmantek's solutions are 3-tier applications composed of a (shared) backend database, an application layer, and a presentation layer. While these can be decoupled and hosted on separate platforms, this is generally not necessary. In some rare situations, clients may benefit from hosting MongoDB on a separate server from the other applications.

Single-Server

Opmantek's solutions are designed to be deployed on a single server, as demonstrated by the Opmantek VM. However, how well this deployment will scale is dependent on several factors, including:

  • Which Opmantek products are installed and licensed on the server
  • Number and type of devices being polled
  • Number of interfaces being collected, and related interface services (e.g. BGP, QoS)
  • Polling frequency
  • Hardware resources (CPU, memory, IO speed)

In most situations, the deciding factor limiting server scaling is the ability to read/write data to the storage device. Opmantek strongly recommends fast, local SSD over traditional hard drives.

Multiple Server, No High Availability

Opmantek's licensing is based on the total number of devices/interfaces being collected and not the number of servers deployed to do the collection. As a result, you can install Opmantek's products on multiple servers in order to spread the load. This lowers the resource requirements for each virtual machine, which in turn reduces the likelihood of resource contention and oversubscription.

A common architecture for this is to split products onto separate servers by feature:

  • Server 1 - Open-AudIT
  • Server 2 - opFlow/opFlowSP
  • Server 3 - NMIS/opCharts/opEvents/opConfig/opTrend/opReports
  • Server 4 - MongoDB (in support of server 3, and only if necessary)

Multiple Server, w/High Availability

While this is the most flexible architectural option, it does require a license for Opmantek's opHA module. This option lets you scale collection horizontally across multiple Pollers, which then report up to one or more Primary servers at the top tier.

This architectural option should be considered in the following situations:

  • High lag times for remote locations
  • Support for remote locations/offices or for customers (e.g. deployment for a Managed Service Provider)
  • Desire to control devices in a very granular way
  • Need to provide a Client Portal (requires NMIS, opCharts, and opHA)

Document Architecture Decision

Now that you've selected your system architecture, take a moment to document your decision. Include all relevant factors considered in making the decision, which subnets each server will cover, a generalized list of devices by manufacturer and model that each server will monitor, and any network or device configurations required (e.g. SNMP/WMI credentials, network security changes).
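
One way to keep this record consistent across servers is to capture it as structured data. The following is only a sketch in Python; the class names, fields (`ServerPlan`, `ArchitectureDecision`), and every example value are hypothetical and are not an Opmantek format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ServerPlan:
    """One server in the planned deployment (hypothetical structure)."""
    name: str
    role: str            # e.g. "standalone", "poller", "primary"
    products: list       # Opmantek products installed on this server
    subnets: list        # subnets this server will cover
    device_types: list   # generalized manufacturer/model list


@dataclass
class ArchitectureDecision:
    """The architecture decision and the factors behind it."""
    architecture: str
    factors: list
    servers: list = field(default_factory=list)
    required_changes: list = field(default_factory=list)

    def to_json(self):
        # asdict() also converts the nested ServerPlan entries to dicts.
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    # All values below are made-up placeholders.
    decision = ArchitectureDecision(
        architecture="Multiple Server, No High Availability",
        factors=["disk IO on a single VM", "flow volume"],
        servers=[
            ServerPlan("srv1", "standalone", ["Open-AudIT"],
                       ["10.0.0.0/16"], ["Vendor A switches"]),
        ],
        required_changes=["SNMP credentials", "firewall rules for polling"],
    )
    print(decision.to_json())
```

Storing the output alongside the rest of your deployment notes makes it easier to revisit the decision when device counts or subnets change.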


Server Sizing

Server sizing is both an art and a science. Minimum storage requirements, however, can be worked out with a fair amount of arithmetic.

Server Resources (vCPU and RAM)

Open-AudIT (Windows or Linux installation)

  • Minimum 4vCPU and 8GB-RAM
  • Individual server installs scale well up to 50k devices with increases to 24vCPU and 64GB-RAM (for scanning all devices 1x/day)
  • Installs >10k devices should make use of Open-AudIT Enterprise and Collectors in order to scale effectively

NMIS8 (with no other modules on the server)

  • Minimum 4vCPU and 4GB-RAM, 8GB-RAM recommended
  • Scales well up through 5k devices, but highly depends on latency, response, and the number of interfaces/elements collected.
  • vCPU and RAM will need to be increased as required to support multithreaded collection and meet polling times.

opCharts or opReports (with NMIS only)

  • Minimum 4vCPU and 8GB-RAM recommended
  • The more opCharts users you have the more RAM will be needed to support Apache in preparing charts/dashboards for viewing
  • opReports will need a good bit of CPU and RAM available to generate reports. The more reports you create, the more resources are needed so as not to impact the server in other areas.

opEvents (with NMIS only)

  • Minimum 4vCPU and 8GB-RAM recommended
  • Resource requirements will depend heavily on the number of events being processed and any event actions being run. The more RAM available to the system, the more event actions can run multithreaded and in parallel.

opConfig (with NMIS only)

  • Minimum 4vCPU and 8GB-RAM recommended
  • Resource requirements will depend heavily on how often device configurations are being collected, and what (if any) additional commands are being run (e.g. mtr, ping).

opTrend (with NMIS only)

  • Minimum 6vCPU and 8GB-RAM, 16GB-RAM recommended
  • Processing of trend data is highly dependent on math processing (vCPU cycles), RAM, and the type and number of parameters being processed by opTrend.

opFlow/opFlowSP

  • Must be installed on a stand-alone server, separate from NMIS
  • Minimum 6vCPU and 8GB-RAM, 16GB-RAM recommended
  • Processing of flow data requires both math processing and system memory. Performance can be monitored within the application by reviewing the time required to process the inbound flows.

Single Server, All-In-One Deployment (Open-AudIT, NMIS, opCharts, opEvents, opConfig, opReports, opTrend)

  • Not recommended for mission-critical production deployments over 1.5k devices, or 15k interfaces/elements, per server.
  • Minimum 8vCPU and 16GB-RAM

Storage Requirements

The biggest limiting factor in server performance is disk IO. The system must have unrestricted read/write access to the storage medium. Opmantek highly recommends the use of local SSD storage over traditional hard drives.

NMIS

  • The key issue in calculating the amount of storage required by NMIS is understanding how many elements, and of what type, are being collected, and how often (polling policy).
  • An individual interface with default data retention might only consume 2.8MB for background information and 11.5MB for pkts_hc. However, adding BGP could add 5.8MB per peer, and CBQoS another 5.8MB PER CLASS.
  • Here is a handy reference: Estimating NMIS Storage Requirements
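
As a worked example of the arithmetic, the per-element figures quoted above can be combined into a rough estimate. This is a minimal sketch in Python, assuming default data retention and that every interface incurs both the background and pkts_hc figures; `estimate_nmis_storage_mb` is a hypothetical helper, not an Opmantek tool, and the example inventory is made up.

```python
# Per-element storage figures quoted above (default data retention), in MB.
INTERFACE_BASE_MB = 2.8       # background information per interface
INTERFACE_PKTS_HC_MB = 11.5   # pkts_hc per interface
BGP_PEER_MB = 5.8             # per BGP peer
CBQOS_CLASS_MB = 5.8          # per CBQoS class


def estimate_nmis_storage_mb(interfaces, bgp_peers=0, cbqos_classes=0):
    """Rough NMIS storage estimate in MB for the given element counts."""
    per_interface = INTERFACE_BASE_MB + INTERFACE_PKTS_HC_MB
    return (interfaces * per_interface
            + bgp_peers * BGP_PEER_MB
            + cbqos_classes * CBQOS_CLASS_MB)


if __name__ == "__main__":
    # Made-up inventory: 10,000 interfaces, 200 BGP peers, 1,500 CBQoS classes.
    total_mb = estimate_nmis_storage_mb(10_000, bgp_peers=200, cbqos_classes=1_500)
    print(f"Estimated NMIS storage: {total_mb / 1024:.1f} GB")
```

Other collected element types and non-default retention are not covered here; the referenced page remains the place to look for those.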

Open-AudIT

  • Even very large deployments of Open-AudIT require relatively little storage. 40GB is usually enough with default settings.
  • However, if you are storing detailed change information across all tables this should be doubled.

opCharts, opReports, opEvents, opConfig, and opTrend

  • Storage requirements cannot be estimated for these products, as usage dictates storage used.
  • Storage for these products is managed by MongoDB, which applies a considerable amount of compression via the WiredTiger storage engine.
  • The best approach is to allocate a minimum amount of space, say 100GB, then monitor the production of artifacts (dashboards, reports, stored configurations, etc.) and adjust retention periods or increase storage as needed.

opHA

  • Storage for opHA is relatively negligible and can be included under the estimation for NMIS.

opFlow/opFlowSP

  • Storage for opFlow/opFlowSP can be capped to a maximum size; either a defined size or % of available space.
  • Capping affects how long data is retained: with a fixed amount of storage, the retention window shrinks as more flows are processed and stored.
  • Under no circumstances should the opFlow/opFlowSP database be left uncapped: once the /data partition fills, the mongod service will crash, bringing down all Opmantek modules.
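
Independently of the cap, it can help to keep an eye on how full the /data partition is. The sketch below uses only Python's standard library; the function names and the 80% warning threshold are assumptions chosen for illustration, and this is not a replacement for configuring the opFlow cap itself.

```python
import os
import shutil


def usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def check_partition(path="/data", warn_at=80.0):
    """Return ("OK" or "WARNING", percent_used) for the given path."""
    pct = usage_percent(path)
    status = "WARNING" if pct >= warn_at else "OK"
    return status, pct


if __name__ == "__main__":
    # Fall back to the current directory when /data does not exist on this host.
    target = "/data" if os.path.isdir("/data") else "."
    status, pct = check_partition(target)
    print(f"{target}: {pct:.1f}% used [{status}]")
```

A check like this could be run on a schedule and fed into whatever alerting you already use, so that storage is increased or retention reduced before the partition fills.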

Opmantek Virtual Machine

  • If you have deployed the Opmantek Virtual Machine (VM), you will notice that separate partitions are used for the OS and applications and for application data (/data).
  • This makes it incredibly easy to manage storage requirements, as you can simply adjust the size of the /data partition to increase storage as requirements change.
  • Steps to resize the VM's /data partition can be found here: Resizing the Opmantek Virtual Machine (VM)

Polling Server vs Primary Server

  • Generally speaking, simply determine which applications are running on each server, then apply the estimates from above for each.
  • The only caveat is NMIS on a Primary server: because NMIS there only processes and stores a relatively small amount of time-related performance data, you do NOT need to apply the NMIS calculations. Simply allocate 40GB for NMIS and then add storage for any other applications on the Primary server.
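
The per-server totals described above can be sketched as a small calculation. This Python example is illustrative only: `server_storage_gb` and its parameters are hypothetical, the 40GB and 100GB figures come from the estimates earlier in this section, and the Poller's NMIS figure is whatever your own NMIS estimate produces.

```python
PRIMARY_NMIS_GB = 40       # flat NMIS allocation on a Primary server
OPEN_AUDIT_GB = 40         # usual Open-AudIT allocation with default settings
OMK_MODULES_MIN_GB = 100   # suggested starting point for the MongoDB-backed modules


def server_storage_gb(role, nmis_estimate_gb=0, open_audit=False,
                      detailed_changes=False, omk_modules=False):
    """Rough storage total in GB for one server, by role and installed products."""
    if role == "primary":
        total = PRIMARY_NMIS_GB    # the NMIS calculations are not applied here
    elif role == "poller":
        total = nmis_estimate_gb   # full NMIS estimate applies on a Poller
    else:
        raise ValueError(f"unknown role: {role!r}")
    if open_audit:
        # Doubled when detailed change information is stored across all tables.
        total += OPEN_AUDIT_GB * (2 if detailed_changes else 1)
    if omk_modules:
        total += OMK_MODULES_MIN_GB
    return total


if __name__ == "__main__":
    # Made-up example: one Primary with the OMK modules, one Poller with NMIS only.
    print("primary:", server_storage_gb("primary", omk_modules=True), "GB")
    print("poller:", server_storage_gb("poller", nmis_estimate_gb=150), "GB")
```

opFlow/opFlowSP is left out on purpose, since its storage is set by the cap you configure rather than estimated.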

Planning Deployment

Configuration

Common files and directory structures

NMIS

  • /usr/local/nmis8/conf
  • /usr/local/nmis8/conf/Config.nmis

Open-AudIT, all OMK Modules

  • /usr/local/omk/conf
  • /usr/local/omk/conf/opCommon.nmis

Thoughts

  • Who will select the initial configurations for each product?
  • Who will manage/maintain product configurations?
  • How will product configurations be maintained across multiple servers?

Product Upgrades

Opmantek uses an Agile development methodology and releases several product updates every quarter. How often you install these updates is up to you, but we recommend doing so at least semiannually, if not quarterly, in order to obtain the most recent features and bug fixes.

If you started with the Opmantek Virtual Machine (VM), check the VM's /omk page for upgrade availability.

Server Maintenance

Product upgrades only affect Opmantek products and any components they directly require. They do NOT update or patch the server OS or third-party applications like Apache or MongoDB. You should make arrangements with your IT or server support team to manage server patching in accordance with your company's guidelines and expectations.

Next Up

Implement - Execute your plan, test, and validate the deployment.


