Bug #2726
closedwriting large number of json events on high speed traffic results in packet drops
Description
Using the same set up with 4.1.0-dev (rev 683be948) on high speed live traffic i can consistently reproduce the following:
First run - eve.json log - logging only alerts (getting about 500-1000 alerts per second):
[73931] 5/12/2018 -- 00:09:23 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete [73931] 5/12/2018 -- 00:09:23 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0': pkts: 1178508988, drop: 4658980 (0.40%), invalid chksum: 0
Second run - eve.json log and eve-flows.json - logging alerts and flows into separate files (getting about 500-100 alerts per second and about 30 000 flows written per second):
[78187] 5/12/2018 -- 00:29:38 - (detect-engine-build.c:1733) <Info> (SigAddressCleanupStage1) -- cleaning up signature grouping structure... complete [78187] 5/12/2018 -- 00:29:38 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'enp59s0': pkts: 1160865740, drop: 206750638 (17.81%), invalid chksum: 0
If there are a lot of events that need to be written even if log files are separated - it results in packet drops.
The test system is using SSDs - it can take high volume log writing.
Some more info about Suricata used:
This is Suricata version 4.1.0-dev (rev 683be948) Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS MAGIC RUST SIMD support: SSE_4_2 SSE_4_1 SSE_3 Atomic intrisics: 1 2 4 8 16 byte(s) 64-bits, Little-endian architecture GCC version 4.2.1 Compatible Clang 6.0.0 (tags/RELEASE_600/final), C version 199901 compiled with _FORTIFY_SOURCE=0 L1 cache line size (CLS)=64 thread local storage method: __thread compiled with LibHTP v0.5.28, linked against LibHTP v0.5.28 Suricata Configuration: AF_PACKET support: yes eBPF support: yes XDP support: yes PF_RING support: no NFQueue support: no NFLOG support: no IPFW support: no Netmap support: no DAG enabled: no Napatech enabled: no WinDivert enabled: no Unix socket enabled: yes Detection enabled: yes Libmagic support: yes libnss support: yes libnspr support: yes libjansson support: yes liblzma support: no hiredis support: no hiredis async with libevent: no Prelude support: no PCRE jit: yes LUA support: yes, through luajit libluajit: yes libgeoip: yes Non-bundled htp: no Old barnyard2 support: no Hyperscan support: yes Libnet support: yes liblz4 support: yes Rust support: yes (default) Rust strict mode: yes Rust debug mode: no Rust compiler: rustc 1.24.1 Rust cargo: cargo 0.25.0 Suricatasc install: yes Profiling enabled: no Profiling locks enabled: no Development settings: Coccinelle / spatch: yes Unit tests enabled: no Debug output enabled: no Debug validation enabled: no Generic build parameters: Installation prefix: /usr Configuration directory: /etc/suricata/ Log directory: /var/log/suricata/ --prefix /usr --sysconfdir /etc --localstatedir /var Host: x86_64-pc-linux-gnu Compiler: clang-6.0 (exec name) / clang (real) GCC Protect enabled: no GCC march native enabled: yes GCC Profile enabled: no Position Independent Executable enabled: no CFLAGS -g -O2 -march=native -I${srcdir}/../rust/gen/c-headers PCAP_CFLAGS -I/usr/include SECCFLAGS
Updated by Peter Manev almost 6 years ago
- Subject changed from writng large number of json events on high speed traffic results in packet drops to writing large number of json events on high speed traffic results in packet drops
Updated by Victor Julien over 5 years ago
- Status changed from New to Assigned
- Assignee set to Jason Ish
- Target version set to 70
Overall, the generating and writing of so many events is simply and expensive operation. We can try to optimize things of course. I'm testing a branch for reducing heap allocations in logging certain fields.
Assigning ticket to Jason as he's looking into speeding up creating the JSON records.
Updated by Andreas Herz over 5 years ago
As I also have some setups now where more and more logging result in higher drop rates is there any information that could help?
@Peter Pan did you try using sockets or even redis and compare the performance?
Updated by Peter Manev over 5 years ago
There are other means - via socket/redis etc - the main point here is that (as Victor mentions above) - it is an expensive operation and there is a lot of contention (imagine 40 threads writing in the same file lots of events). There were a few ideas discussed earlier optimization/logfile per thread/ all writing being handled by diff set of threads etc I am guessing optimization is the first step. From what I see there is no such penalty when using Redis for example.
Updated by Victor Julien about 5 years ago
- Target version changed from 70 to 6.0.0beta1
Updated by Victor Julien about 5 years ago
This was shared in a Zeek ticket: https://github.com/miloyip/nativejson-benchmark#stringify-time
Some Rust numbers https://github.com/serde-rs/json-benchmark
Updated by Victor Julien about 5 years ago
- Related to Feature #3293: eve: per thread output files added
Updated by Jason Ish over 4 years ago
- Status changed from Assigned to In Review
Pull request: https://github.com/OISF/suricata/pull/4634
Updated by Jason Ish over 4 years ago
- Status changed from In Review to Closed
Closing. JsonBuilder is now in master. We have a tracking ticket to track conversions of sub-systems from Jansson to JsonBuilder here:
https://redmine.openinfosecfoundation.org/issues/3707
Updated by Victor Julien over 4 years ago
- Related to Optimization #3707: Convert JSON Loggers to JsonBuilder added
Updated by Juliana Fajardini Reichow over 3 years ago
- Related to Documentation #4557: Add document about JsonBuilder added