...

If these defaults are not suitable for your environment, then you can either choose to use a disk-percentage based formula during the installation procedure, or you can leave the database uncapped, adjust the parameters and manually perform the capping using opflow-cli.pl act=setup-db.

...

  • nfdump must be configured with a rotation interval that is the same as opflow_summarisation_interval.
    This is done by setting DATA_ROTATE_INTERVAL=120 in the config file /etc/sysconfig/nfdump (CentOS/RedHat) or /etc/default/nfdump (Debian/Ubuntu).
  • In this mode, no raw flows are stored in MongoDB, and the capping size of the flows collection is unimportant. Only the capping on the conversations collection is relevant.
  • The pre-aggregation combines all flows within the respective interval and groups them by the involved endpoints, the communication protocol and application in question.
    Some granularity present in the raw flow records is sacrificed for scalability: a conversation includes a list of the port numbers involved, and cumulative counters for packets and bytes for the whole summarisation interval. Any number of network interactions between the same endpoints that use the same application and fall into one summarisation period are combined into one conversation record.
  • If you use a short summarisation interval, the pre-aggregation will be less efficient at combining multiple flows into conversations.
    You will also likely experience higher database loads and may hit insertion speed limits at a lower volume of incoming netflow records.
  • If you use a long summarisation interval then the summarisation will be maximally efficient, but the opFlow GUI will exhibit time lag and show somewhat more outdated data.
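The pre-aggregation described above can be illustrated with a minimal Python sketch. This is not opFlow's actual implementation: the function name, the record field names (src, dst, proto, app, sport, dport, packets, bytes) and the choice to treat the endpoints as an ordered pair are all assumptions made for illustration.

```python
def summarise(flows):
    """Combine the raw flow records of one summarisation interval
    into conversation records (illustrative sketch only)."""
    conversations = {}
    for flow in flows:
        # Group by endpoints, protocol and application (assumed field names).
        key = (flow["src"], flow["dst"], flow["proto"], flow["app"])
        conv = conversations.get(key)
        if conv is None:
            conv = {
                "src": flow["src"],
                "dst": flow["dst"],
                "proto": flow["proto"],
                "app": flow["app"],
                "ports": set(),   # list of port numbers involved
                "packets": 0,     # cumulative packet counter
                "bytes": 0,       # cumulative byte counter
            }
            conversations[key] = conv
        conv["ports"].update((flow["sport"], flow["dport"]))
        conv["packets"] += flow["packets"]
        conv["bytes"] += flow["bytes"]
    return list(conversations.values())
```

With a short interval, fewer flows share a key within each interval, so more conversation records are produced per unit of time; that is the efficiency trade-off noted above.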

...

  • The summarisation interval should be a multiple of the nfdump rotation interval for optimal performance, but those parameters are not as closely tied in this mode.
  • Both flows and conversations collections will be used. The size capping on flows must be sized to retain records for at least the most recent summarisation interval.
  • Whenever nfdump rotates its flow collection file, opflowd picks that up and starts collection and insertion of the raw flows contained therein.
  • Inserting lots of raw flows requires more database performance (and possibly fine-tuning of the opflow_batch_insert_size config parameter) and you will hit database limitations much earlier than in high-volume mode.
  • As long as you don't run into the size capping limits on the raw flows collection, full data of the utmost precision remains available.
  • As of version 3.0.2, the opFlow GUI does not expose the raw flows to the user.

...

opflowd will start up to that many flow consumer and summariser processes. With the default settings your opFlow installation would thus keep up with inbound flow volumes until the processing of each nfdump flow file takes four times the file's time period. 
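The arithmetic behind this statement can be sketched as follows. This is an illustrative calculation only, assuming that each nfdump flow file is handled by a single process and that one new file arrives per rotation interval; the function and parameter names are hypothetical and are not opFlow configuration items.

```python
import math

def processes_needed(processing_seconds, file_period_seconds):
    """Concurrent processes required so that flow files do not pile up,
    assuming one file per period and one process per file."""
    return max(1, math.ceil(processing_seconds / file_period_seconds))

def keeps_up(processing_seconds, file_period_seconds, max_processes):
    """True if the allowed number of processes can absorb the inbound files."""
    return processes_needed(processing_seconds, file_period_seconds) <= max_processes
```

For example, with a 120-second file period and a limit of four processes, a processing time of 300 seconds per file needs three processes and is sustainable, whereas 600 seconds per file would need five and would fall behind.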

If opFlow detects resource exhaustion of this kind, an Operational Status record (and suitable log messages) is created to notify you of the problem. Additionally, opflowd generates statistics for every processing run which can be viewed on the Operational Status page.

Long-Term Summarisation Stages

Database capping is likely to limit long-term data availability. To address this point, opFlow 3 also supports an arbitrary number of optional longer-term summarisation stages. These reside in separate database collections and can be capped independently. This functionality is used for (re-)creating traffic overview reports retrospectively.

By default a one-hour summary stage is enabled, which also collapses and combines all conversations that produced less than 1024 bytes or fewer than 5 packets during the respective hour. In our tests these settings have proven to provide a very high degree of compression efficiency without much loss of detail.

Both the opFlow GUI and the report generation code look for the 'best available' source of data and fall back to using summarisation stage data where required. This means that even though your main conversations may have been purged due to high incoming flow volume and size capping after just a few hours, you would still be able to access historic data reaching back to the oldest summarisation stage result (but you may have to select a longer Summarise Interval in the Advanced menu).

You can define summary stages in the configuration file, under opflow_summary_stages; a stage definition requires a name (allowed characters A-Z, a-z, 0-9, _ and -), and a period (in seconds). The summarised data will be stored in a collection named summary_<stagename>. You can optionally set up database capping for this collection (with the collection_size property, in bytes), and collapsing of unimportant conversations (with the collapse_min_bytes and/or collapse_min_pkts settings - zero or not set disables collapsing, and collapsing happens if either of the two criteria is met).
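The naming and collapsing rules above can be expressed as a short Python sketch. It does not show the real configuration file syntax, which is not reproduced here: the dictionary layout, the stage name "hourly", the function names and the requirement that the period be a positive integer are assumptions for illustration. Only the property names and the rules themselves come from the description above.

```python
import re

# Allowed characters for a stage name: A-Z, a-z, 0-9, _ and -
STAGE_NAME = re.compile(r"[A-Za-z0-9_-]+")

def collection_name(stage):
    """Validate a stage definition and return its collection name."""
    if not STAGE_NAME.fullmatch(stage.get("name", "")):
        raise ValueError("invalid stage name")
    period = stage.get("period")
    # Assumption: the period must be a positive whole number of seconds.
    if not isinstance(period, int) or period <= 0:
        raise ValueError("stage requires a period in seconds")
    return "summary_" + stage["name"]

def should_collapse(conversation, stage):
    """Collapse if either threshold is set (non-zero) and not reached."""
    min_bytes = stage.get("collapse_min_bytes") or 0
    min_pkts = stage.get("collapse_min_pkts") or 0
    return (min_bytes > 0 and conversation["bytes"] < min_bytes) or \
           (min_pkts > 0 and conversation["packets"] < min_pkts)

# Hypothetical stage mirroring the default one-hour settings described earlier.
hourly = {"name": "hourly", "period": 3600,
          "collapse_min_bytes": 1024, "collapse_min_pkts": 5}
```

Under these assumptions, collection_name(hourly) yields summary_hourly, and a conversation with 500 bytes and 10 packets would be collapsed because the byte criterion alone is met.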

...

By default, the high-volume mode is active and the dashboard page shows traffic summarised in one way (default is Top Applications, sorted by traffic volume in bytes). You can change the summary displayed using the Advanced menu (Summary Type and Summary Field). Changing Summary Type selects a different summary section, and affects the Flows over Time chart (i.e. the charted data is grouped according to your selection).

...

In this mode the dashboard shows the data summarised in three different ways: Top Talkers, Top Applications and Top Applications plus Sources, again sorted by traffic volume in bytes. Again, the Advanced menu lets you select the sort field (Summary Field), but changing Summary Type changes only the Flows over Time chart in this mode.