Project

General

Profile

Bug #2726

writing large number of json events on high speed traffic results in packet drops

Added by Peter Manev 9 months ago. Updated 3 months ago.

Status:
Assigned
Priority:
Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Using the same set up with 4.1.0-dev (rev 683be948) on high speed live traffic i can consistently reproduce the following:

First run - eve.json log - logging only alerts (getting about 500-1000 alerts per second):

[73931] 5/12/2018 -- 00:09:23 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete                                                     
[73931] 5/12/2018 -- 00:09:23 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0':  pkts: 1178508988, drop: 4658980 (0.40%), invalid chksum: 0  

Second run - eve.json log and eve-flows.json - logging alerts and flows into separate files (getting about 500-100 alerts per second and about 30 000 flows written per second):

[78187] 5/12/2018 -- 00:29:38 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete
[78187] 5/12/2018 -- 00:29:38 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0':  pkts: 1160865740, drop: 206750638 (17.81%), invalid chksum: 0

If there are a lot of events that need to be written even if log files are separated - it results in packet drops.
The test system is using SSDs - it can take high volume log writing.

Some more info about Suricata used:

This is Suricata version 4.1.0-dev (rev 683be948)
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS MAGIC RUST 
SIMD support: SSE_4_2 SSE_4_1 SSE_3 
Atomic intrisics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.2.1 Compatible Clang 6.0.0 (tags/RELEASE_600/final), C version 199901
compiled with _FORTIFY_SOURCE=0
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.28, linked against LibHTP v0.5.28

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            yes
  XDP support:                             yes
  PF_RING support:                         no
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  liblzma support:                         no
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes, through luajit
  libluajit:                               yes
  libgeoip:                                yes
  Non-bundled htp:                         no
  Old barnyard2 support:                   no
  Hyperscan support:                       yes
  Libnet support:                          yes
  liblz4 support:                          yes

  Rust support:                            yes (default)
  Rust strict mode:                        yes
  Rust debug mode:                         no
  Rust compiler:                           rustc 1.24.1
  Rust cargo:                              cargo 0.25.0

  Suricatasc install:                      yes

  Profiling enabled:                       no
  Profiling locks enabled:                 no

Development settings:
  Coccinelle / spatch:                     yes
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr
  Configuration directory:                 /etc/suricata/
  Log directory:                           /var/log/suricata/

  --prefix                                 /usr
  --sysconfdir                             /etc
  --localstatedir                          /var

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                clang-6.0 (exec name) / clang (real)
  GCC Protect enabled:                     no
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -march=native -I${srcdir}/../rust/gen/c-headers
  PCAP_CFLAGS                               -I/usr/include
  SECCFLAGS                                

History

#1

Updated by Peter Manev 9 months ago

  • Subject changed from writng large number of json events on high speed traffic results in packet drops to writing large number of json events on high speed traffic results in packet drops
#2

Updated by Victor Julien 6 months ago

  • Status changed from New to Assigned
  • Assignee set to Jason Ish
  • Target version set to Soon

Overall, the generating and writing of so many events is simply and expensive operation. We can try to optimize things of course. I'm testing a branch for reducing heap allocations in logging certain fields.

Assigning ticket to Jason as he's looking into speeding up creating the JSON records.

#3

Updated by Andreas Herz 3 months ago

As I also have some setups now where more and more logging result in higher drop rates is there any information that could help?

@Peter did you try using sockets or even redis and compare the performance?

#4

Updated by Peter Manev 3 months ago

There are other means - via socket/redis etc - the main point here is that (as Victor mentions above) - it is an expensive operation and there is a lot of contention (imagine 40 threads writing in the same file lots of events). There were a few ideas discussed earlier optimization/logfile per thread/ all writing being handled by diff set of threads etc I am guessing optimization is the first step. From what I see there is no such penalty when using Redis for example.

Also available in: Atom PDF