Bug #2726: writing large number of json events on high speed traffic results in packet drops - Suricata - Open Information Security Foundation

Actions

Copy link

Bug #2726

closed

writing large number of json events on high speed traffic results in packet drops

Added by Peter Manev over 6 years ago. Updated about 5 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Jason Ish

Target version:

6.0.0beta1

Affected Versions:

Effort:

Difficulty:

Label:

Description

Using the same set up with 4.1.0-dev (rev 683be948) on high speed live traffic i can consistently reproduce the following:

First run - eve.json log - logging only alerts (getting about 500-1000 alerts per second):

[73931] 5/12/2018 -- 00:09:23 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete                                                     
[73931] 5/12/2018 -- 00:09:23 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0':  pkts: 1178508988, drop: 4658980 (0.40%), invalid chksum: 0

Second run - eve.json log and eve-flows.json - logging alerts and flows into separate files (getting about 500-100 alerts per second and about 30 000 flows written per second):

[78187] 5/12/2018 -- 00:29:38 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete
[78187] 5/12/2018 -- 00:29:38 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0':  pkts: 1160865740, drop: 206750638 (17.81%), invalid chksum: 0

If there are a lot of events that need to be written even if log files are separated - it results in packet drops.
The test system is using SSDs - it can take high volume log writing.

Some more info about Suricata used:

This is Suricata version 4.1.0-dev (rev 683be948)
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS MAGIC RUST 
SIMD support: SSE_4_2 SSE_4_1 SSE_3 
Atomic intrisics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.2.1 Compatible Clang 6.0.0 (tags/RELEASE_600/final), C version 199901
compiled with _FORTIFY_SOURCE=0
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.28, linked against LibHTP v0.5.28

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            yes
  XDP support:                             yes
  PF_RING support:                         no
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  liblzma support:                         no
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes, through luajit
  libluajit:                               yes
  libgeoip:                                yes
  Non-bundled htp:                         no
  Old barnyard2 support:                   no
  Hyperscan support:                       yes
  Libnet support:                          yes
  liblz4 support:                          yes

  Rust support:                            yes (default)
  Rust strict mode:                        yes
  Rust debug mode:                         no
  Rust compiler:                           rustc 1.24.1
  Rust cargo:                              cargo 0.25.0

  Suricatasc install:                      yes

  Profiling enabled:                       no
  Profiling locks enabled:                 no

Development settings:
  Coccinelle / spatch:                     yes
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr
  Configuration directory:                 /etc/suricata/
  Log directory:                           /var/log/suricata/

  --prefix                                 /usr
  --sysconfdir                             /etc
  --localstatedir                          /var

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                clang-6.0 (exec name) / clang (real)
  GCC Protect enabled:                     no
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -march=native -I${srcdir}/../rust/gen/c-headers
  PCAP_CFLAGS                               -I/usr/include
  SECCFLAGS

Related issues 3 (2 open — 1 closed)

Actions

Copy link

Updated by Peter Manev over 6 years ago

Subject changed from writng large number of json events on high speed traffic results in packet drops to writing large number of json events on high speed traffic results in packet drops

Actions

Copy link

Updated by Victor Julien over 6 years ago

Status changed from New to Assigned
Assignee set to Jason Ish
Target version set to 70

Overall, the generating and writing of so many events is simply and expensive operation. We can try to optimize things of course. I'm testing a branch for reducing heap allocations in logging certain fields.

Assigning ticket to Jason as he's looking into speeding up creating the JSON records.

Actions

Copy link

Updated by Andreas Herz about 6 years ago

As I also have some setups now where more and more logging result in higher drop rates is there any information that could help?

@Peter Pan did you try using sockets or even redis and compare the performance?

Actions

Copy link

Updated by Peter Manev about 6 years ago

There are other means - via socket/redis etc - the main point here is that (as Victor mentions above) - it is an expensive operation and there is a lot of contention (imagine 40 threads writing in the same file lots of events). There were a few ideas discussed earlier optimization/logfile per thread/ all writing being handled by diff set of threads etc I am guessing optimization is the first step. From what I see there is no such penalty when using Redis for example.

Actions

Copy link

Updated by Victor Julien almost 6 years ago

Target version changed from 70 to 6.0.0beta1

Actions

Copy link

Updated by Victor Julien almost 6 years ago

This was shared in a Zeek ticket: https://github.com/miloyip/nativejson-benchmark#stringify-time
Some Rust numbers https://github.com/serde-rs/json-benchmark

Actions

Copy link

Updated by Victor Julien almost 6 years ago

Related to Feature #3293: eve: per thread output files added

Actions

Copy link

Updated by Jason Ish over 5 years ago

Status changed from Assigned to In Review

Pull request: https://github.com/OISF/suricata/pull/4634

Actions

Copy link

Updated by Jason Ish about 5 years ago

Status changed from In Review to Closed

Closing. JsonBuilder is now in master. We have a tracking ticket to track conversions of sub-systems from Jansson to JsonBuilder here:
https://redmine.openinfosecfoundation.org/issues/3707

Actions

Copy link

#10