Overview

NetFlow traffic analysis commonly involves very large amounts of data, which poses a number of scalability challenges. Meeting these challenges requires trade-offs in any NetFlow analyzer, not just opFlow. This page documents the configuration options and behaviours that help opFlow scale well.

opFlow Daemon

Database Size Capping

opFlow has supported database size capping for a long time, and it's proven a vital feature to ensure that you don't exhaust your available disk space.

In opFlow 3, the installer guides you through the initial database configuration steps and allows the database size to be capped. We highly recommend that you perform this step!

The default cap sizes are set in the configuration file (opCommon.nmis), in the entries opflow_db_flows_collection_size (for the raw flows, default 5GB) and opflow_db_conversations_collection_size (for conversations, default 15GB). Once the database reaches those limits, old records are automatically removed.

If these defaults are not suitable for your environment, you can either choose a disk-percentage-based formula during the installation procedure, or leave the database uncapped, adjust the parameters and perform the capping manually using opflow-cli.pl act=setup-db.

If necessary, the capping can be repeated later with different parameters, again using opflow-cli.pl act=setup-db. Note that a re-capping operation may take quite some time.
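To illustrate what size capping means for data retention, here is a minimal Python sketch of a capped store that discards its oldest records once a byte budget is exceeded. It is a toy model only: the class name CappedStore and the record sizes are hypothetical, and it does not reflect how the database implements capping internally.

```python
from collections import deque


class CappedStore:
    """Toy model of a size-capped collection: oldest records go first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.records = deque()  # (record, size_in_bytes), oldest on the left

    def insert(self, record, size_bytes):
        if size_bytes > self.max_bytes:
            raise ValueError("record is larger than the cap")
        self.records.append((record, size_bytes))
        self.used_bytes += size_bytes
        # Evict from the oldest end until the store is back under the cap.
        while self.used_bytes > self.max_bytes:
            _, old_size = self.records.popleft()
            self.used_bytes -= old_size


# Hypothetical example: a 100-byte cap and five 40-byte records.
store = CappedStore(max_bytes=100)
for i in range(5):
    store.insert(f"conversation-{i}", 40)
remaining = [rec for rec, _ in store.records]
```

The higher the insertion rate, the shorter the time span that survives under a given cap, which is why the cap sizes should be chosen with your flow volume in mind.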

High-Volume Mode

By default, opFlow 3 comes with the config option opflow_high_volume enabled.

In this mode, opflowd reads pre-aggregated conversation data from the collector application nfdump every opflow_summarisation_interval seconds (default: 120 seconds, i.e. 2 minutes).

Important aspects of High-Volume Mode

  • nfdump must be configured with a rotation interval that is the same as opflow_summarisation_interval.
    This is done by setting DATA_ROTATE_INTERVAL=120 in the config file /etc/sysconfig/nfdump (CentOS/RedHat) or /etc/default/nfdump (Debian/Ubuntu).
  • In this mode, no raw flows are stored in MongoDB, and the capping size of the flows collection is unimportant. Only the capping on the conversations collection is relevant.
  • The pre-aggregation combines all flows within the respective interval and groups them by the involved endpoints, the communication protocol and application in question.
    Some granularity present in the raw flow records is sacrificed for scalability: a conversation includes a list of the port numbers involved, and cumulative counters for packets and bytes for the whole summarisation interval. All network interactions between the same endpoints that use the same application and fall into the same summarisation period are combined into one conversation record.
  • If you use a short summarisation interval, the pre-aggregation will be less efficient at combining multiple flows into conversations.
    Because of that, you will experience higher database load and may hit insertion speed limits at a lower volume of incoming NetFlow records.
  • If you use a long summarisation interval, the summarisation will be maximally efficient, but the opFlow GUI will lag behind and show somewhat more outdated data.
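The pre-aggregation described above can be sketched roughly as follows. This is an illustration in Python, not the nfdump or opflowd implementation: the field names (time, src, dst, protocol, application, src_port, dst_port, packets, bytes) are hypothetical, and treating the endpoints as an ordered source/destination pair is an assumption.

```python
def aggregate_flows(flows, interval=120):
    """Group raw flows into per-interval conversation records."""
    conversations = {}
    for flow in flows:
        # Start of the summarisation interval this flow falls into.
        bucket = flow["time"] - flow["time"] % interval
        key = (bucket, flow["src"], flow["dst"],
               flow["protocol"], flow["application"])
        conv = conversations.setdefault(
            key, {"ports": set(), "packets": 0, "bytes": 0})
        # Individual ports are kept only as a list; counters are cumulative.
        conv["ports"].update((flow["src_port"], flow["dst_port"]))
        conv["packets"] += flow["packets"]
        conv["bytes"] += flow["bytes"]
    return conversations


# Hypothetical sample data: three flows between the same two endpoints.
sample_flows = [
    {"time": 0, "src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "tcp",
     "application": "http", "src_port": 50000, "dst_port": 80,
     "packets": 10, "bytes": 1000},
    {"time": 60, "src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "tcp",
     "application": "http", "src_port": 50001, "dst_port": 80,
     "packets": 5, "bytes": 500},
    {"time": 130, "src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "tcp",
     "application": "http", "src_port": 50002, "dst_port": 80,
     "packets": 1, "bytes": 60},
]
result = aggregate_flows(sample_flows, interval=120)
first = result[(0, "10.0.0.1", "10.0.0.2", "tcp", "http")]
```

With a 120-second interval, the first two flows collapse into one conversation and the third starts a new one. A longer interval would merge all three, which is the efficiency gain (and the loss of time resolution) mentioned in the list above.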

Low-Volume Mode

If you set opflow_high_volume to 0 or "false", opflowd runs in low-volume mode. In this mode, individual "raw" flow records are initially stored in the flows collection. Periodically, these raw individual flow records are then summarised into conversations in the same manner as in high-volume mode.

Important aspects of Low-Volume Mode

  • The summarisation interval should be a multiple of the nfdump rotation interval for optimal performance, but those parameters are not as closely tied in this mode.
  • Both flows and conversations collections will be used. The size capping on flows must be sized to retain records for at least the most recent summarisation interval.
  • Whenever nfdump rotates its flow collection file, opflowd picks that up and starts collection and insertion of the raw flows contained therein.
  • Inserting lots of raw flows requires more database performance (and possibly fine-tuning of the opflow_batch_insert_size config parameter), and you will hit database limitations much earlier than in high-volume mode.
  • As long as you don't run into the size capping limits on the raw flows collection, the raw flow data remains available at full precision.
  • However, as of version 3.0.2, the opFlow GUI does not expose the raw flows to the user.
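As a rough aid for the sizing and interval constraints above, the following Python sketch estimates a lower bound for the flows collection cap and checks whether the summarisation interval is a multiple of the nfdump rotation interval. The function names, the average record size and the safety factor are all assumptions made for illustration; measure your own record sizes before relying on such an estimate.

```python
def min_flows_cap_bytes(flows_per_second, summarisation_interval,
                        avg_record_bytes, safety_factor=2.0):
    """Estimate the bytes needed to hold one summarisation interval of raw flows."""
    return int(flows_per_second * summarisation_interval
               * avg_record_bytes * safety_factor)


def interval_is_aligned(summarisation_interval, rotation_interval):
    """True if the summarisation interval is a multiple of the rotation interval."""
    return summarisation_interval % rotation_interval == 0


# Hypothetical numbers: 1000 flows/s, 120 s interval, 500 bytes per record.
estimate = min_flows_cap_bytes(1000, 120, 500)
aligned = interval_is_aligned(240, 120)
misaligned = interval_is_aligned(300, 120)
```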

Parallel Processing

In both low- and high-volume modes, opflowd can make use of parallel processing to cope with high flow volumes: the config item opflowd_max_processes (default: 4) sets the maximum number of opflowd worker processes that can be run in parallel.

opflowd will start up to that many flow consumer and summariser processes. With the default settings, your opFlow installation can thus keep up with inbound flow volumes until the processing of each nfdump flow file takes more than four times the file's time period.

If opflowd detects resource exhaustion of this kind, an Operational Status record is created (along with suitable log messages) to notify you of the problem. opflowd also generates statistics for every processing run, which can be viewed on the Operational Status page.
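The capacity rule above can be expressed as simple arithmetic. The Python sketch below assumes an idealised model in which each worker handles one nfdump file at a time and workers do not slow each other down; the function names are hypothetical and real throughput will also depend on database and disk performance.

```python
import math


def workers_needed(processing_seconds, file_period_seconds):
    """Workers required so that files are finished as fast as they arrive."""
    return max(1, math.ceil(processing_seconds / file_period_seconds))


def keeps_up(processing_seconds, file_period_seconds, max_processes=4):
    """True if max_processes workers can sustain the incoming file rate."""
    return workers_needed(processing_seconds, file_period_seconds) <= max_processes


# Hypothetical example: 120-second files that each take 300 seconds to process.
needed = workers_needed(300, 120)
at_limit = keeps_up(480, 120)
over_limit = keeps_up(481, 120)
```

In this model, a file that takes 300 seconds to process needs three workers, and the default of four workers is exhausted once processing exceeds 480 seconds per 120-second file.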

Long-Term Summarisation Stages

Database capping is likely to limit long-term data availability, which matters especially when traffic overview reports need to be created or re-created retrospectively. To address this, opFlow 3 also supports an arbitrary number of optional longer-term summarisation stages. These reside in separate database collections and can be capped independently.

By default, a one-hour summary stage is enabled, which additionally collapses and combines all conversations that produced less than 1024 bytes or fewer than 5 packets during the respective hour. In our tests these settings have provided a very high degree of compression efficiency without much loss of detail.

Both the opFlow GUI and the report generation code look for the 'best available' source of data and fall back to using summarisation stage data where required. This means that even if your main conversations have been purged after just a few hours due to high incoming flow volume and size capping, you can still access historic data reaching back to the oldest summarisation stage result (but you may have to select a longer Summarise Interval in the Advanced menu).

You can define summary stages in the configuration file, under opflow_summary_stages; a stage definition requires a name (allowed characters A-Z, a-z, 0-9, _ and -), and a period (in seconds). The summarised data will be stored in a collection named summary_<stagename>. You can optionally set up database capping for this collection (with the collection_size property, in bytes), and collapsing of unimportant conversations (with the collapse_min_bytes and/or collapse_min_pkts settings - zero or not set disables collapsing, and collapsing happens if either of the two criteria is met).
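The naming rule and the collapsing criteria can be sketched in Python as follows. This is an illustration under assumptions, not opFlow's implementation: the function names and the conversation fields (bytes, packets) are hypothetical, and merging all collapsed conversations into a single combined record is an assumption about how the combining works.

```python
import re

# Allowed stage name characters: A-Z, a-z, 0-9, _ and -
STAGE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def valid_stage_name(name):
    return STAGE_NAME.fullmatch(name) is not None


def should_collapse(conv, min_bytes=0, min_pkts=0):
    """Collapse if either criterion is met; zero or unset disables a criterion."""
    too_few_bytes = bool(min_bytes) and conv["bytes"] < min_bytes
    too_few_pkts = bool(min_pkts) and conv["packets"] < min_pkts
    return too_few_bytes or too_few_pkts


def collapse(conversations, min_bytes=0, min_pkts=0):
    """Return (kept conversations, one combined record for the collapsed ones)."""
    kept = []
    combined = {"bytes": 0, "packets": 0, "count": 0}
    for conv in conversations:
        if should_collapse(conv, min_bytes, min_pkts):
            combined["bytes"] += conv["bytes"]
            combined["packets"] += conv["packets"]
            combined["count"] += 1
        else:
            kept.append(conv)
    return kept, combined


# Hypothetical sample data, using the default one-hour stage thresholds.
sample = [
    {"bytes": 5000, "packets": 40},
    {"bytes": 800, "packets": 10},   # below 1024 bytes
    {"bytes": 2000, "packets": 3},   # fewer than 5 packets
    {"bytes": 1024, "packets": 5},   # meets both thresholds, kept
]
kept, combined = collapse(sample, min_bytes=1024, min_pkts=5)
```

Calling collapse without thresholds keeps every conversation, which corresponds to collapsing being disabled.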

opFlow GUI Modes

The opFlow GUI in version 3 includes two different dashboard pages, one optimised for high-volume and one optimised for high precision.

By default, the high-volume mode is active and the dashboard page shows one traffic summary section (default is Top Applications, sorted by traffic volume in bytes). You can select the summary to display using the Advanced menu (Summary Type and Summary Field). Changing Summary Type selects a different summary section, and affects the Flows over Time chart (i.e. the charted data is grouped according to your selection).

If you set opflow_gui_high_flow_volume to 0 or "false", the opFlow GUI switches to low-volume mode.

In this mode the dashboard shows three summary sections, Top Talkers, Top Applications and Top Applications plus Sources, again sorted by traffic volume in bytes. As in high-volume mode, the Advanced menu lets you select the sort field (Summary Field), but changing Summary Type changes only the Flows over Time chart.

 

 
