Project

General

Profile

Actions

Bug #2726

closed

writing large number of json events on high speed traffic results in packet drops

Added by Peter Manev almost 4 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Using the same set up with 4.1.0-dev (rev 683be948) on high speed live traffic i can consistently reproduce the following:

First run - eve.json log - logging only alerts (getting about 500-1000 alerts per second):

[73931] 5/12/2018 -- 00:09:23 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete                                                     
[73931] 5/12/2018 -- 00:09:23 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0':  pkts: 1178508988, drop: 4658980 (0.40%), invalid chksum: 0  

Second run - eve.json log and eve-flows.json - logging alerts and flows into separate files (getting about 500-100 alerts per second and about 30 000 flows written per second):

[78187] 5/12/2018 -- 00:29:38 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete
[78187] 5/12/2018 -- 00:29:38 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0':  pkts: 1160865740, drop: 206750638 (17.81%), invalid chksum: 0

If there are a lot of events that need to be written even if log files are separated - it results in packet drops.
The test system is using SSDs - it can take high volume log writing.

Some more info about Suricata used:

This is Suricata version 4.1.0-dev (rev 683be948)
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS MAGIC RUST 
SIMD support: SSE_4_2 SSE_4_1 SSE_3 
Atomic intrisics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.2.1 Compatible Clang 6.0.0 (tags/RELEASE_600/final), C version 199901
compiled with _FORTIFY_SOURCE=0
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.28, linked against LibHTP v0.5.28

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            yes
  XDP support:                             yes
  PF_RING support:                         no
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  liblzma support:                         no
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes, through luajit
  libluajit:                               yes
  libgeoip:                                yes
  Non-bundled htp:                         no
  Old barnyard2 support:                   no
  Hyperscan support:                       yes
  Libnet support:                          yes
  liblz4 support:                          yes

  Rust support:                            yes (default)
  Rust strict mode:                        yes
  Rust debug mode:                         no
  Rust compiler:                           rustc 1.24.1
  Rust cargo:                              cargo 0.25.0

  Suricatasc install:                      yes

  Profiling enabled:                       no
  Profiling locks enabled:                 no

Development settings:
  Coccinelle / spatch:                     yes
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr
  Configuration directory:                 /etc/suricata/
  Log directory:                           /var/log/suricata/

  --prefix                                 /usr
  --sysconfdir                             /etc
  --localstatedir                          /var

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                clang-6.0 (exec name) / clang (real)
  GCC Protect enabled:                     no
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -march=native -I${srcdir}/../rust/gen/c-headers
  PCAP_CFLAGS                               -I/usr/include
  SECCFLAGS                                


Related issues 3 (2 open1 closed)

Related to Feature #3293: eve: per thread output filesClosedJeff LucovskyActions
Related to Optimization #3707: Convert JSON Loggers to JsonBuilderAssignedJason IshActions
Related to Documentation #4557: Add document about JsonBuilderIn ProgressJuliana Fajardini ReichowActions
Actions #1

Updated by Peter Manev almost 4 years ago

  • Subject changed from writng large number of json events on high speed traffic results in packet drops to writing large number of json events on high speed traffic results in packet drops
Actions #2

Updated by Victor Julien almost 4 years ago

  • Status changed from New to Assigned
  • Assignee set to Jason Ish
  • Target version set to 70

Overall, the generating and writing of so many events is simply and expensive operation. We can try to optimize things of course. I'm testing a branch for reducing heap allocations in logging certain fields.

Assigning ticket to Jason as he's looking into speeding up creating the JSON records.

Actions #3

Updated by Andreas Herz over 3 years ago

As I also have some setups now where more and more logging result in higher drop rates is there any information that could help?

@Peter Pan did you try using sockets or even redis and compare the performance?

Actions #4

Updated by Peter Manev over 3 years ago

There are other means - via socket/redis etc - the main point here is that (as Victor mentions above) - it is an expensive operation and there is a lot of contention (imagine 40 threads writing in the same file lots of events). There were a few ideas discussed earlier optimization/logfile per thread/ all writing being handled by diff set of threads etc I am guessing optimization is the first step. From what I see there is no such penalty when using Redis for example.

Actions #5

Updated by Victor Julien about 3 years ago

  • Target version changed from 70 to 6.0.0beta1
Actions #7

Updated by Victor Julien about 3 years ago

Actions #8

Updated by Jason Ish almost 3 years ago

  • Status changed from Assigned to In Review
Actions #9

Updated by Jason Ish over 2 years ago

  • Status changed from In Review to Closed

Closing. JsonBuilder is now in master. We have a tracking ticket to track conversions of sub-systems from Jansson to JsonBuilder here:
https://redmine.openinfosecfoundation.org/issues/3707

Actions #10

Updated by Victor Julien over 2 years ago

Actions #11

Updated by Juliana Fajardini Reichow over 1 year ago

Actions

Also available in: Atom PDF