Bug #2726
closedwriting large number of json events on high speed traffic results in packet drops
Description
Using the same set up with 4.1.0-dev (rev 683be948) on high speed live traffic i can consistently reproduce the following:
First run - eve.json log - logging only alerts (getting about 500-1000 alerts per second):
[73931] 5/12/2018 -- 00:09:23 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete [73931] 5/12/2018 -- 00:09:23 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0': pkts: 1178508988, drop: 4658980 (0.40%), invalid chksum: 0
Second run - eve.json log and eve-flows.json - logging alerts and flows into separate files (getting about 500-100 alerts per second and about 30 000 flows written per second):
[78187] 5/12/2018 -- 00:29:38 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete [78187] 5/12/2018 -- 00:29:38 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0': pkts: 1160865740, drop: 206750638 (17.81%), invalid chksum: 0
If there are a lot of events that need to be written even if log files are separated - it results in packet drops.
The test system is using SSDs - it can take high volume log writing.
Some more info about Suricata used:
This is Suricata version 4.1.0-dev (rev 683be948)
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS MAGIC RUST
SIMD support: SSE_4_2 SSE_4_1 SSE_3
Atomic intrisics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.2.1 Compatible Clang 6.0.0 (tags/RELEASE_600/final), C version 199901
compiled with _FORTIFY_SOURCE=0
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.28, linked against LibHTP v0.5.28
Suricata Configuration:
AF_PACKET support: yes
eBPF support: yes
XDP support: yes
PF_RING support: no
NFQueue support: no
NFLOG support: no
IPFW support: no
Netmap support: no
DAG enabled: no
Napatech enabled: no
WinDivert enabled: no
Unix socket enabled: yes
Detection enabled: yes
Libmagic support: yes
libnss support: yes
libnspr support: yes
libjansson support: yes
liblzma support: no
hiredis support: no
hiredis async with libevent: no
Prelude support: no
PCRE jit: yes
LUA support: yes, through luajit
libluajit: yes
libgeoip: yes
Non-bundled htp: no
Old barnyard2 support: no
Hyperscan support: yes
Libnet support: yes
liblz4 support: yes
Rust support: yes (default)
Rust strict mode: yes
Rust debug mode: no
Rust compiler: rustc 1.24.1
Rust cargo: cargo 0.25.0
Suricatasc install: yes
Profiling enabled: no
Profiling locks enabled: no
Development settings:
Coccinelle / spatch: yes
Unit tests enabled: no
Debug output enabled: no
Debug validation enabled: no
Generic build parameters:
Installation prefix: /usr
Configuration directory: /etc/suricata/
Log directory: /var/log/suricata/
--prefix /usr
--sysconfdir /etc
--localstatedir /var
Host: x86_64-pc-linux-gnu
Compiler: clang-6.0 (exec name) / clang (real)
GCC Protect enabled: no
GCC march native enabled: yes
GCC Profile enabled: no
Position Independent Executable enabled: no
CFLAGS -g -O2 -march=native -I${srcdir}/../rust/gen/c-headers
PCAP_CFLAGS -I/usr/include
SECCFLAGS
Updated by Peter Manev almost 7 years ago
- Subject changed from writng large number of json events on high speed traffic results in packet drops to writing large number of json events on high speed traffic results in packet drops
Updated by Victor Julien over 6 years ago
- Status changed from New to Assigned
- Assignee set to Jason Ish
- Target version set to 70
Overall, the generating and writing of so many events is simply and expensive operation. We can try to optimize things of course. I'm testing a branch for reducing heap allocations in logging certain fields.
Assigning ticket to Jason as he's looking into speeding up creating the JSON records.
Updated by Andreas Herz over 6 years ago
As I also have some setups now where more and more logging result in higher drop rates is there any information that could help?
@Peter did you try using sockets or even redis and compare the performance?
Updated by Peter Manev over 6 years ago
There are other means - via socket/redis etc - the main point here is that (as Victor mentions above) - it is an expensive operation and there is a lot of contention (imagine 40 threads writing in the same file lots of events). There were a few ideas discussed earlier optimization/logfile per thread/ all writing being handled by diff set of threads etc I am guessing optimization is the first step. From what I see there is no such penalty when using Redis for example.
Updated by Victor Julien about 6 years ago
- Target version changed from 70 to 6.0.0beta1
Updated by Victor Julien almost 6 years ago
This was shared in a Zeek ticket: https://github.com/miloyip/nativejson-benchmark#stringify-time
Some Rust numbers https://github.com/serde-rs/json-benchmark
Updated by Victor Julien almost 6 years ago
- Related to Feature #3293: eve: per thread output files added
Updated by Jason Ish over 5 years ago
- Status changed from Assigned to In Review
Pull request: https://github.com/OISF/suricata/pull/4634
Updated by Jason Ish over 5 years ago
- Status changed from In Review to Closed
Closing. JsonBuilder is now in master. We have a tracking ticket to track conversions of sub-systems from Jansson to JsonBuilder here:
https://redmine.openinfosecfoundation.org/issues/3707
Updated by Victor Julien over 5 years ago
- Related to Optimization #3707: Convert JSON Loggers to JsonBuilder added
Updated by Juliana Fajardini Reichow over 4 years ago
- Related to Documentation #4557: Add document about JsonBuilder added