Optimization #1242
Closed
Huge performance decrease with /dev/zero traffic
Added by Andreas Herz over 11 years ago. Updated over 9 years ago.
Description
There is a huge performance decrease with /dev/zero traffic when certain rules are activated.
Suricata is used in inline mode:
suricata -c /etc/suricata/suricata.yaml -q 0
The setup consists of 4 machines: 2 clients and 2 servers that connect the 2 clients, with Suricata running on one of the servers.
The rules used are:
- botcc.rules
- ciarmy.rules
- compromised.rules
- drop.rules
- dshield.rules
- emerging-current_events.rules
- emerging-malware.rules
- emerging-mobile_malware.rules
- emerging-scan.rules
- emerging-trojan.rules
- emerging-user_agents.rules
- emerging-worm.rules
Some profiling output:
http://paste.geekosphere.org/TEb2ePqsSueyCSVu
http://paste.geekosphere.org/4HvILUUgf0IMMBq9
The test is made by creating 2 different testfiles:
dd if=/dev/zero of=testfile bs=1M count=4096
dd if=/dev/urandom of=testfile2 bs=1M count=4096
The transfer is made with netcat:
Client A:
nc -v -v -l -n -p 2222 >/dev/null
Client B:
pv -t -r -a -b testfile | nc -v -v -n $IP 2222 >/dev/null
Throughput was 160Mbit/s with testfile2 (/dev/urandom) but only 40Mbit/s with testfile (/dev/zero).
The same rules within snort don't decrease the performance like that.
Files
| packet_stats.log.nc (3.67 KB) | TCP (slow) traffic with zeros | Andreas Herz, 07/22/2014 05:08 AM |
| packet_stats.log.http (4.48 KB) | HTTP (fast) traffic with zeros | Andreas Herz, 07/22/2014 05:08 AM |
| zero.rules (69.6 KB) | | Andreas Herz, 07/30/2014 04:06 AM |
| foobar.rules (1.71 KB) | | Andreas Herz, 07/30/2014 06:58 AM |
Updated by Victor Julien over 11 years ago
I suspect that one or more of these rules have |00| or |00 00| as a fast pattern. As each data byte of the stream is 00, it will trigger the more expensive 'match' path for each packet (and even more often in the AC matcher I think).
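To illustrate the suspicion, here is a minimal Python sketch (not Suricata's actual matcher) that counts how many offsets a short all-zero pattern hits in an all-zero payload versus a random one. The 1448-byte payload size and the helper name `count_matches` are assumptions made for illustration only.

```python
import os

def count_matches(payload: bytes, pattern: bytes) -> int:
    """Count every offset (overlaps included) at which pattern occurs in payload."""
    count = 0
    pos = payload.find(pattern)
    while pos != -1:
        count += 1
        pos = payload.find(pattern, pos + 1)
    return count

PAYLOAD_LEN = 1448  # assumed TCP payload size, for illustration only

zero_payload = bytes(PAYLOAD_LEN)         # resembles the /dev/zero transfer
random_payload = os.urandom(PAYLOAD_LEN)  # resembles the /dev/urandom transfer

for name, payload in (("zero", zero_payload), ("random", random_payload)):
    print(name, count_matches(payload, b"\x00\x00"))
```

In the all-zero payload the two-byte pattern is found at every offset except the last one, while in random data a hit is rare, so a matcher that does extra work per hit would be far busier on the zero traffic.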
Updated by Andreas Herz over 11 years ago
Victor Julien wrote:
I suspect that one or more of these rules have |00| or |00 00| as a fast pattern. As each data byte of the stream is 00, it will trigger the more expensive 'match' path for each packet (and even more often in the AC matcher I think).
I parsed several of the rules that were called (see pastes) and didn't see those patterns.
In the meantime I could also confirm that the test via FTP has the same 4:1 difference, but downloading the files via HTTP didn't show such a difference.
Updated by Andreas Herz over 11 years ago
I don't understand why it's so much better with HTTP:
while true; do { echo -e 'HTTP/1.1 200 OK\r\n'; cat testfile; } | nc -l -p 8000; done
	and
wget http://10.0.20.89:8000/
This doesn't show the gap, although in the sniffer it's always just zeros :/ (except for the initial connect). It's also unrelated to the port I use, since I also tried ports not in $HTTP_PORTS. Using a port from $HTTP_PORTS for a simple TCP connection still results in the decrease. So it looks like the throughput decreases with every TCP connection except HTTP.
In Profiling i also see, that the rules are still read and have kinda the same ticks.
Updated by Andreas Herz over 11 years ago
Andreas Herz wrote:
In Profiling i also see, that the rules are still read and have kinda the same ticks.
I deactivated emerging-current_events.rules and emerging-scan.rules and now i see no rule ticks in the rule_perf.log but the decrease is still there, so i doubt that it's related to specific rules.
Updated by Andreas Herz over 11 years ago
- File packet_stats.log.http packet_stats.log.http added
- File packet_stats.log.nc packet_stats.log.nc added
As asked, the packet_stats for "fast" HTTP and "slow" TCP with the same file.
Updated by Andreas Herz about 11 years ago
I did the same test with real hardware now and this is what i got:
CPU: Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz
RAM: 2GB
NETWORK: intel e1000e and igb cards
I get 7.5MB/s (60Mbit/s) with the /dev/zero traffic and 48MB/s (384Mbit/s) with the /dev/urandom traffic.
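As a quick check of the figures above, a small Python sketch converting MB/s to Mbit/s and computing the ratio between the two runs. The helper name `mbyte_to_mbit` is made up for this example.

```python
def mbyte_to_mbit(mbytes_per_s: float) -> float:
    """Convert MB/s to Mbit/s (1 byte = 8 bits)."""
    return mbytes_per_s * 8

zero_rate = mbyte_to_mbit(7.5)   # /dev/zero traffic
random_rate = mbyte_to_mbit(48)  # /dev/urandom traffic
ratio = random_rate / zero_rate

print(zero_rate, random_rate, ratio)
```

On this hardware the gap works out to 6.4:1, larger than the roughly 4:1 seen in the first setup.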
Updated by Victor Julien about 11 years ago
Can you try narrowing down the ruleset? It could be that there are just a few rules causing this.
Updated by Andreas Herz about 11 years ago
Victor Julien wrote:
Can you try narrowing down the ruleset? It could be that there are just a few rules causing this.
I narrowed it down to "emerging-scan.rules" and "emerging-trojan.rules": with those 2 active, the huge gap is still there. Using only one of them doesn't result in the gap, though, so it only occurs with both active.
In the next step i would start with deleting some rule block within the files, unless you have a better idea :)
Updated by Andreas Herz about 11 years ago
- File zero.rules zero.rules added
Andreas Herz wrote:
In the next step i would start with deleting some rule block within the files, unless you have a better idea :)
See attached rule file with 200 rules with "00" in them.
Updated by Andreas Herz about 11 years ago
- File foobar.rules foobar.rules added
And after some annoying testing I can narrow it down to 4 rules:
2017935 2010827 2014600 2018167
I attached the 4 rules as foobar.rules. With only those 4 rules active, the test above (with /dev/zero) still shows the huge difference.
But now I would like to know why Suricata has such problems dealing with those rules.
Updated by Victor Julien about 11 years ago
Engine analysis results
Rules:
-------------------------------------------------------------------
Date: 30/7/2014 -- 14:02:27
-------------------------------------------------------------------
== Sid: 2014600 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Win32/Nitol.A Checkin"; flow:from_client,established; dsize:1028; content:"|01 00 00 00|"; depth:4; content:!"|00|"; distance:0; within:1; content:"|00|"; distance:1; within:1; content:"|00|"; distance:61; within:1; content:"Windows|20|"; distance:0; content:"|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|"; distance:0; content:"|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|"; distance:12; within:20; classtype:trojan-activity; sid:2014600; rev:5;)
    Rule matches on packets.
    Rule contains 7 content options, 0 http content options, 0 pcre options, and 0 pcre options with http modifiers.
    Fast Pattern "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" on "payload" buffer.
    No warnings for this rule.
== Sid: 2017935 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Backdoor family PCRat/Gh0st CnC traffic (OUTBOUND) 12 SET"; flow:to_server,established; dsize:8; content:"|00 00|"; offset:2; depth:2; content:"|00 00|"; distance:2; within:2; flowbits:set,ET.gh0stFmly; flowbits:noalert; reference:url,www.securelist.com/en/descriptions/10155706/Trojan-GameThief.Win32.Magania.eogz; reference:url,www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Backdoor%3AWin32%2FPcClient.ZR&ThreatID=-2147325231; reference:md5,3b1abb60bafbab204aeddf8acdf58ac9; classtype:trojan-activity; sid:2017935; rev:3;)
    Rule matches on packets.
    Rule contains 2 content options, 0 http content options, 0 pcre options, and 0 pcre options with http modifiers.
    Fast Pattern "\x00\x00" on "payload" buffer.
    No warnings for this rule.
== Sid: 2010827 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET 8392 (msg:"ET TROJAN Torpig CnC Connect on port 8392"; flowbits:isset,ET.torpig.init; flow:established,to_server; content:"|00 00|"; depth:2; content:"|00 00 00|"; distance:2; within:5; flowbits:set,ET.torpig.fosure; reference:url,doc.emergingthreats.net/2010827; classtype:trojan-activity; sid:2010827; rev:3;)
    Rule matches on packets.
    Rule matches on reassembled stream.
    Rule contains 2 content options, 0 http content options, 0 pcre options, and 0 pcre options with http modifiers.
    Fast Pattern "\x00\x00\x00" on "payload and reassembled stream" buffer.
    Warning: Rule has depth/offset with raw content keywords.  Please note the offset/depth will be checked against both packet payloads and stream.  If you meant to have the offset/depth checked against just the payload, you can update the signature as "alert tcp-pkt..." 
== Sid: 2018167 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Generic CnC"; flow:established,to_server; content:" Mini BackDoor|00|"; offset:9; depth:20; reference:md5,398b6622a2c86d472a4340d3e79e654b; classtype:trojan-activity; sid:2018167; rev:1;)
    Rule matches on packets.
    Rule matches on reassembled stream.
    Rule contains 1 content options, 0 http content options, 0 pcre options, and 0 pcre options with http modifiers.
    Fast Pattern " Mini BackDoor\x00" on "payload and reassembled stream" buffer.
    Warning: Rule has depth/offset with raw content keywords.  Please note the offset/depth will be checked against both packet payloads and stream.  If you meant to have the offset/depth checked against just the payload, you can update the signature as "alert tcp-pkt..." 
	Fast-pattern:
-------------------------------------------------------------------
Date: 30/7/2014 -- 14:02:27
-------------------------------------------------------------------
== Sid: 2014600 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Win32/Nitol.A Checkin"; flow:from_client,established; dsize:1028; content:"|01 00 00 00|"; depth:4; content:!"|00|"; distance:0; within:1; content:"|00|"; distance:1; within:1; content:"|00|"; distance:61; within:1; content:"Windows|20|"; distance:0; content:"|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|"; distance:0; content:"|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|"; distance:12; within:20; classtype:trojan-activity; sid:2014600; rev:5;)
    Fast Pattern analysis:
        Fast pattern matcher: content
        Flags: Distance
        Fast pattern set: no
        Fast pattern only set: no
        Fast pattern chop set: no
        Original content: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
        Final content: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
== Sid: 2017935 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Backdoor family PCRat/Gh0st CnC traffic (OUTBOUND) 12 SET"; flow:to_server,established; dsize:8; content:"|00 00|"; offset:2; depth:2; content:"|00 00|"; distance:2; within:2; flowbits:set,ET.gh0stFmly; flowbits:noalert; reference:url,www.securelist.com/en/descriptions/10155706/Trojan-GameThief.Win32.Magania.eogz; reference:url,www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Backdoor%3AWin32%2FPcClient.ZR&ThreatID=-2147325231; reference:md5,3b1abb60bafbab204aeddf8acdf58ac9; classtype:trojan-activity; sid:2017935; rev:3;)
    Fast Pattern analysis:
        Fast pattern matcher: content
        Flags: Offset Depth
        Fast pattern set: no
        Fast pattern only set: no
        Fast pattern chop set: no
        Original content: \x00\x00
        Final content: \x00\x00 
== Sid: 2010827 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET 8392 (msg:"ET TROJAN Torpig CnC Connect on port 8392"; flowbits:isset,ET.torpig.init; flow:established,to_server; content:"|00 00|"; depth:2; content:"|00 00 00|"; distance:2; within:5; flowbits:set,ET.torpig.fosure; reference:url,doc.emergingthreats.net/2010827; classtype:trojan-activity; sid:2010827; rev:3;)
    Fast Pattern analysis:
        Fast pattern matcher: content
        Flags: Within Distance
        Fast pattern set: no
        Fast pattern only set: no
        Fast pattern chop set: no
        Original content: \x00\x00\x00
        Final content: \x00\x00\x00
== Sid: 2018167 ==
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Generic CnC"; flow:established,to_server; content:" Mini BackDoor|00|"; offset:9; depth:20; reference:md5,398b6622a2c86d472a4340d3e79e654b; classtype:trojan-activity; sid:2018167; rev:1;)
    Fast Pattern analysis:
        Fast pattern matcher: content
        Flags: Offset Depth
        Fast pattern set: no
        Fast pattern only set: no
        Fast pattern chop set: no
        Original content:  Mini BackDoor\x00
        Final content:  Mini BackDoor\x00
Updated by Victor Julien about 11 years ago
I suspect one of the things that is so costly here is that we store each AC match with an offset. For the 2-byte pattern that would mean we store it payloadlen-1 times. This has to be costly. Also, this mechanism currently doesn't take the depth of the pattern into account. It's probably not easy to have the AC matcher take depth into account, but I think it would be possible to not store these offsets beyond the depth/dsize.
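A rough Python sketch of the idea, not Suricata's AC code: it records every match offset and optionally stops storing matches that end past a cutoff. The function name `collect_offsets`, the 1448-byte payload, and the cutoff of 4 bytes (loosely modeled on offset:2; depth:2 in sid 2017935) are assumptions for illustration.

```python
def collect_offsets(payload: bytes, pattern: bytes, max_end=None) -> list:
    """Return every offset where pattern occurs (overlaps included).

    If max_end is given, matches that would end beyond that byte
    position are not stored.
    """
    offsets = []
    pos = payload.find(pattern)
    while pos != -1:
        if max_end is not None and pos + len(pattern) > max_end:
            break  # all later matches end even further out
        offsets.append(pos)
        pos = payload.find(pattern, pos + 1)
    return offsets

payload = bytes(1448)  # assumed all-zero TCP payload
all_offsets = collect_offsets(payload, b"\x00\x00")
capped = collect_offsets(payload, b"\x00\x00", max_end=4)

print(len(all_offsets), len(capped))
```

Without the cutoff the 2-byte pattern is stored payloadlen-1 times; with it, only the few offsets that could still satisfy the depth are kept.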
Updated by Andreas Herz about 11 years ago
Victor Julien wrote:
I suspect one of the things that is so costly here is that we store each AC match with an offset. For the 2-byte pattern that would mean we store it payloadlen-1 times. This has to be costly. Also, this mechanism currently doesn't take the depth of the pattern into account. It's probably not easy to have the AC matcher take depth into account, but I think it would be possible to not store these offsets beyond the depth/dsize.
So your suggestion for now is to take a look at the code?
Or is there anything else I could do? Changing the rules, for example.
Updated by Ken Steele about 11 years ago
For this signature:
Sid: 2014600
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Win32/Nitol.A Checkin"; flow:from_client,established; dsize:1028; content:"|01 00 00 00|";
Adding "fast_pattern;" after content:"|01 00 00 00|"; would use that as the pattern, which is more unique than all zeros.
So:
 Sid: 2014600 
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Win32/Nitol.A Checkin"; flow:from_client,established; dsize:1028; content:"|01 00 00 00|"; fast_pattern;
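To show why the choice of fast pattern matters on this traffic, a small Python sketch (illustrative only, not how Suricata selects or evaluates patterns) comparing how often each candidate occurs in an all-zero payload sized to the rule's dsize:1028. The helper `count_matches` is a made-up name.

```python
def count_matches(payload: bytes, pattern: bytes) -> int:
    """Count every offset (overlaps included) at which pattern occurs in payload."""
    count = 0
    pos = payload.find(pattern)
    while pos != -1:
        count += 1
        pos = payload.find(pattern, pos + 1)
    return count

payload = bytes(1028)  # all-zero payload, sized to match dsize:1028

candidates = {
    "20 zero bytes": bytes(20),            # fast pattern picked by default
    "|01 00 00 00|": b"\x01\x00\x00\x00",  # pattern forced with fast_pattern
}

hits = {name: count_matches(payload, pat) for name, pat in candidates.items()}
for name, n in hits.items():
    print(f"{name}: {n} candidate offsets")
```

On all-zero data the 20-byte zero pattern is found at over a thousand offsets, while the pattern starting with 01 is never found, so the rule is not considered further.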
Updated by Ken Steele about 11 years ago
Have you tried rule profiling? That should point out which rules are taking the most time.
Updated by Ken Steele about 11 years ago
This signature should not be firing with all zero data:
Sid: 2018167
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Generic CnC"; flow:established,to_server; content:" Mini BackDoor|00|"; offset:9; depth:20; reference:md5,398b6622a2c86d472a4340d3e79e654b; classtype:trojan-activity; sid:2018167; rev:1;)
Rule matches on packets.
Rule matches on reassembled stream.
Rule contains 1 content options, 0 http content options, 0 pcre options, and 0 pcre options with http modifiers.
Fast Pattern " Mini BackDoor\x00" on "payload and reassembled stream" buffer.
Updated by Ken Steele about 11 years ago
For this signature:
Sid: 2017935
drop tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"ET TROJAN Backdoor family PCRat/Gh0st CnC traffic (OUTBOUND) 12 SET"; flow:to_server,established; dsize:8; content:"|00 00|"; offset:2; depth:2; content:"|00 00|"; distance:2; within:2; flowbits:set,ET.gh0stFmly; flowbits:noalert; reference:url,www.securelist.com/en/descriptions/10155706/Trojan-GameThief.Win32.Magania.eogz; reference:url,www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Backdoor%3AWin32%2FPcClient.ZR&ThreatID=-2147325231; reference:md5,3b1abb60bafbab204aeddf8acdf58ac9; classtype:trojan-activity; sid:2017935; rev:3;)
Rule matches on packets.
Rule contains 2 content options, 0 http content options, 0 pcre options, and 0 pcre options with http modifiers.
Fast Pattern "\x00\x00" on "payload" buffer.
No warnings for this rule.
The content of "|00 00|" is not a good filter and on input data that is all zero, it will trigger on all but the first byte.
In this case, the dsize:8 would be a better filter to eliminate packets, as it should check the payload size is 8 bytes.
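A hedged Python sketch of that point: checking the payload size first is a cheap way to discard packets before any content matching. This only illustrates the filtering idea and is not Suricata's prefilter code; the traffic mix (1000 all-zero 1448-byte segments plus one 8-byte packet) and the name `candidate_packets` are assumptions.

```python
def candidate_packets(payloads, dsize=None):
    """Yield only the payloads that pass the optional exact-size gate."""
    for payload in payloads:
        if dsize is not None and len(payload) != dsize:
            continue  # rejected by the size check, no content matching needed
        yield payload

# Assumed traffic: 1000 full-size all-zero segments and one 8-byte packet.
payloads = [bytes(1448)] * 1000 + [bytes(8)]

without_gate = sum(1 for _ in candidate_packets(payloads))
with_gate = sum(1 for _ in candidate_packets(payloads, dsize=8))

print(without_gate, with_gate)
```

With the size gate only the single 8-byte packet is left for content inspection, whereas "|00 00|" alone would let every all-zero segment through.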
Updated by Andreas Herz about 11 years ago
Ken Steele wrote:
This signature should not be firing with all zero data:
Sid: 2018167
I thought so, but if I have just this rule active there is still a drop from 120mb/s down to 90mb/s, so it's at least checked.
Adding "fast_pattern;" after content:"|01 00 00 00|"; would use that as the pattern, which is more unique than all zeros.
Perfect, that made this rule much faster: with only this rule active, the rate had dropped to 20mb/s, and with this addition it is back to 120mb/s. I guess I will send this to ET, unless you want to, since you found this.
Have you tried rule profiling? That should point out which rules are taking the most time.
I tried, but that still brings me to those 4 rules. The rules are not good, as we see, but shouldn't Suricata still try to handle such rules faster? I'm not sure what Snort does differently with them.
Updated by Ken Steele about 11 years ago
Andreas Herz wrote:
Ken Steele wrote:
This signature should not be firing with all zero data:
Sid: 2018167
I thought so, but if i have just this rule active there is still a drop from 120mb/s down to 90mb/s, so it's at least checked.
Adding "fast_pattern;" after content:"|01 00 00 00|"; would use that as the pattern, which is more unique than all zeros.
Perfect, that helped to make this rule much faster, with this addition the performance drop down to 20mb/s (with only this rule active) got back to 120mb/s. I guess i will send this to ET, unless you want since you found this.
Have you tried rule profiling? That should point out which rules are taking the most time.
I tried but that still brings me to those 4 rules. The rules are not good as we see but still shouldn't suricata try to handle such rules faster? I'm not sure what snort does different with them.
Yes, please report it to ET. You are welcome to mention my name, since I know several of them.
Updated by Will Metcalf about 11 years ago
We can add a set of Nulls preceding the windows match which should improve perf. Seems present in all samples.
Updated by Andreas Herz about 11 years ago
Will Metcalf wrote:
We can add a set of Nulls preceding the windows match which should improve perf. Seems present in all samples.
Can you describe this a little bit more :)?
Updated by Andreas Herz almost 10 years ago
- Assignee set to Andreas Herz
- Target version set to TBD
Updated by Andreas Herz over 9 years ago
- Status changed from New to Closed
The difference is still there with 3.0, but the two rates are much, much closer, so IMHO this is just normal behaviour in this case. Closing for now.
Updated by Victor Julien over 9 years ago
- Assignee deleted (Andreas Herz)
- Target version deleted (TBD)