Bug #1806

Packet loss performance is worse in 3.1RC1 vs 3.0

Added by Chris Beverly almost 8 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
Affected Versions:
Effort:
Difficulty:
Label:

Description

On a test server inspecting between 5.5 and 7 Gb/s, we've upgraded from Suricata v3.0 to v3.1RC1 and noticed that packet loss went from ~25% on v3.0 to ~45% on v3.1RC1. Suricata was built with the same compile options on both versions, which are:

--prefix=/usr --sysconfdir=/etc --localstatedir=/var --enable-gccprotect --disable-gccmarch-native --disable-coccinelle --enable-nfqueue --enable-af-packet --enable-jemalloc --with-libnspr-includes=/usr/include/nspr4 --with-libnss-includes=/usr/include/nss3 --enable-jansson --enable-geoip --enable-luajit --enable-hiredis

The exact same ruleset and config are used for both versions, and the test has been repeated over and over again. The config file being used is attached. The system being used for this test is a Dell R610 with 4 physical processors of model "Intel(R) Xeon(R) CPU L5506 @ 2.13GHz" (8 with Hyperthreading, which we have enabled) and 24 GB of RAM.

Aside from the noticeable increase in packet loss, we have noticed a drastic reduction in the amount of time Suricata takes to start inspecting traffic after the process starts, from ~60 seconds down to less than 3 seconds. It should also be noted that Suricata is running within a Docker container for both 3.0 and 3.1RC1, each based on the same CentOS 7.2 base image.
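
For reference, a containerized AF_PACKET capture setup generally needs host networking and raw-socket capabilities; the image name, mount and flags below are illustrative assumptions, not our exact invocation:

    docker run --net=host --cap-add=NET_ADMIN --cap-add=NET_RAW \
        -v /etc/suricata:/etc/suricata:ro \
        suricata-centos7 \
        /usr/bin/suricata -c /etc/suricata/suricata.yaml --af-packet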


Files

suricata.3.0.yaml (38 KB) - Chris Beverly, 06/10/2016 09:36 PM
startup-3.0.log (28.8 KB) - Suricata 3.0 startup - Chris Beverly, 06/13/2016 11:00 AM
startup-3.1RC1.log (28 KB) - Suricata 3.1RC1 startup - Chris Beverly, 06/13/2016 11:00 AM
Actions #1

Updated by Victor Julien almost 8 years ago

Could you attach the startup logs from 3.0 with -v and 3.1RC1 with -vvv?

The shorter startup time is expected. It's the result of a rewrite of part of the detection engine.
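
For clarity, the extra verbosity is just added to the normal startup line, along these lines (config path assumed):

    /usr/bin/suricata -c /etc/suricata/suricata.yaml --af-packet -v     (3.0)
    /usr/bin/suricata -c /etc/suricata/suricata.yaml --af-packet -vvv   (3.1RC1)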

Actions #2

Updated by Chris Beverly almost 8 years ago

Yeah, we saw the shorter startup time in the changelog (which is awesome!). Attached are the two logs, one for each version.

Actions #3

Updated by Victor Julien almost 8 years ago

A couple of things I noticed so far:

- please see if you can address this warning {"log":"13/6/2016 -- 15:49:29 - \u003cWarning\u003e - [ERRCODE: SC_WARN_POOR_RULE(276)] - rule 8000000: SYN-only to port(s) 13337:13337 w/o direction specified, disabling for toclient direction\n","stream":"stdout","time":"2016-06-13T15:49:29.716235751Z"}

- it seems in 3.1RC1 you're using AF_PACKET v3. Please try v2 (the default in Suricata 3.0) by adding "tpacket-v3: no" to your afpacket config (see the sketch after this list)

- you're using very few rules (149), so I wouldn't expect the detection engine to have a large effect on the perf
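
A minimal sketch of that af-packet change (the interface name is just an example; keep your other per-interface settings as they are):

    af-packet:
      - interface: bond1
        tpacket-v3: no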

Actions #4

Updated by Chris Beverly almost 8 years ago

Turning off tpacket-v3 did make a noticeable impact on the rate of dropped packets, but there's still definitely a considerable difference between v3.0 and 3.1RC1. Traffic rates have changed a bit from the initial testing (different time of day), and here are the numbers out of "suricatasc iface-stat" for each of the versions (after letting the process run for ~300M packets):

v3.0:

iface-stat bond1

Success: {
"drop": 140401761,
"invalid-checksums": 0,
"pkts": 300891030
}

v3.1RC1

iface-stat bond1

Success: {
"drop": 169177519,
"invalid-checksums": 0,
"pkts": 303806025
}

This puts v3.0 at around 46% of packets dropped vs total, while 3.1RC1 is up closer to 56%.

Not sure if it helps or not, but just about every rule we have enabled is a thresholding rule, which looks a lot like the following:

alert tcp $EXTERNAL_NET any -> $DSTVAR any (msg:"DDoS syn_by_dst [dst-protect]"; flow:stateless; flags:S; threshold:type both, track by_dst, count 15000, seconds 5; classtype:attempted-dos; sid:3000000; rev:1;)

These are the rules that caused us some very long startup times in 3.0, but 3.1RC1 definitely starts up much faster with these rules in place. I've previously confirmed that with these rules disabled in 3.0, it would start up in under 5 seconds as opposed to 60 to 120 seconds.

Actions #5

Updated by Victor Julien almost 8 years ago

Are you able to (privately) share your ruleset?

Actions #6

Updated by Chris Beverly almost 8 years ago

Absolutely, just need to know how and where to send it.

Actions #7

Updated by Victor Julien almost 8 years ago

Can you email me at ?

Actions #8

Updated by Chris Beverly almost 8 years ago

Email is sent.

Actions #9

Updated by Peter Manev almost 8 years ago

I was looking at the previously provided suricata.yaml. I noticed two things that may be affecting performance:

1.
In the af-packet section you have both

    ring-size: 300000

    # On busy system, this could help to set it to yes to recover from a packet drop
    # phase. This will result in some packets (at max a ring flush) being non treated.
    use-emergency-flush: no
    # recv buffer size, increase value could improve performance
    # buffer-size: 32768

    buffer-size: 32768

I think with kernel > 3.2 you should have only ring-size enabled, not both at the same time.
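
In other words, something along these lines (interface name taken from the iface-stat output earlier in the ticket):

    af-packet:
      - interface: bond1
        ring-size: 300000
        # buffer-size: 32768    # leave commented out on kernel > 3.2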

2.
Some yaml config adjustments - for example, the stream.reassembly.segments section in the provided yaml differs from the default style -
https://redmine.openinfosecfoundation.org/projects/suricata/repository/revisions/master/entry/suricata.yaml.in#L1145

I have seen a few changes like those that have a "silent" unintended effect on the configuration and hence performance.

Actions #10

Updated by Chris Beverly almost 8 years ago

Changing those settings didn't seem to make any discernible difference in performance. Here are the packet drop rates for each version with and without those config options commented out; no other changes were made to the suricata.yaml config:

3.0 - 41.0% loss as is, 40.7% with buffer-size and the segment section commented out
3.1RC1 - 52.6% loss as is, 51.9% with buffer-size and the segment section commented out

While there seems to be a minor difference for each version with the config change, this may also just be variability in traffic load. There is still a very noticeable difference in performance between the two versions, both with and without the config options you cited.

Victor provided me with an update via email regarding worse performance due to the rules we're mostly utilizing (packet threshold by destination) and the detection engine rewrite in 3.1RC1. I'm currently waiting to hear more info back on that.

Actions #11

Updated by Peter Manev almost 8 years ago

Thanks for trying it out.

How do you run your tests exactly btw? What is your suricata start line?

Actions #12

Updated by Chris Beverly almost 8 years ago

The tests are literally just starting up Suricata, waiting for the engine to receive a total of ~300 million packets, and then doing the math on dropped vs total packets. Our startup line is:

/usr/bin/suricata -c /etc/suricata/suricata.yaml --af-packet
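
The drop percentage itself is just the drop counter over the total packet counter from iface-stat, e.g. with the numbers from #4 above:

    100 * drop / pkts
    100 * 140401761 / 300891030 ≈ 46.7%   (3.0)
    100 * 169177519 / 303806025 ≈ 55.7%   (3.1RC1)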

Actions #13

Updated by Peter Manev almost 8 years ago

Could you try out something -

Change the value of max-pending-packets from the current 4096 to 65534, run the tests again with 3.0 and 3.1RC1 (v2/v3 afpacket), and see how that affects the test results when inspecting at the 5.5-7 Gbps speeds you mentioned initially.
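
That is, in suricata.yaml, something like:

    max-pending-packets: 65534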

Actions #14

Updated by Chris Beverly almost 8 years ago

It doesn't seem to have made any difference. I retested both versions after making the change and let each process run until around 300 million packets before calculating the drop rate. Traffic rates are currently at 7 Gb/s; the drop rates were 44.2% on 3.0 and 53.8% on 3.1RC1.

Actions #15

Updated by Victor Julien almost 8 years ago

  • Status changed from New to Assigned
  • Assignee set to Victor Julien
  • Target version set to 70

I'm hoping to address this for 3.2. I have some ideas about adding more prefilter steps/engines.

Actions #16

Updated by Alexander Gozman over 7 years ago

Actually, 3.0.1 also seems to suffer from performance loss if compared to 2.1b4. We've seen a huge performance drop in afpacket inline mode with a large set of signatures (full Emerging Threats Pro set, ~26000 sigs) with mpm-algo set to 'ac' and sgh-mpm-context set to 'auto'. Without signatures everything worked fine with no performance decrease. Playing with max-pending-packets or af-packet params hasn't changed anything.
Probably the roots of evil are the same here ;)

PS: I can provide the ruleset and/or config file and more detailed description privately, if needed.
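
For context, a rough sketch of where those two settings sit in suricata.yaml (the exact layout differs a bit between versions, so treat this as illustrative):

    mpm-algo: ac

    detect-engine:
      - sgh-mpm-context: auto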

Actions #17

Updated by Victor Julien over 7 years ago

@Alex Lasky I'd be surprised if it's the same issue. Chris is running a highly specialized ruleset, and I've identified the cause for the slow down for that set between 3.0 and 3.1. Whatever you are seeing must have happened much earlier during the 3.0 development. Would be interesting if you can pinpoint what it is. Perhaps a git bisect would be useful.

Actions #18

Updated by Victor Julien over 6 years ago

  • Status changed from Assigned to Closed
  • Assignee deleted (Victor Julien)
  • Target version deleted (70)

The more generic prefilter engines that were added in 3.2 should address this.
