Project

General

Profile

Actions

Feature #4226

open

bsize: apply as depth to patterns

Added by Brandon Murphy about 4 years ago. Updated 9 months ago.

Status:
Assigned
Priority:
Normal
Assignee:
Target version:
Effort:
Difficulty:
Label:
Needs backport

Description

When reviewing rule profiling output of comparing the speed of bsize:x; checks to using depth:x; isdataat:!1,relative; in effort to achieve an "exact match" of a buffer, it has been observed that bsize variants of rules often consume considerably more ticks than alternative methods.

Tested on 7.0.0-dev (372fc2673 2020-12-11) with the default suricata.yaml, with minor adjustments for non-json output of the rule profiling output.

command used for testing

suricata -c suricata.yaml -S test.rules -r a564fbcf-e41d-494c-9ad8-7e44e22a03d9.pcap -l /tmp/perf_test/perf_1/

pcap from https://app.any.run/tasks/a564fbcf-e41d-494c-9ad8-7e44e22a03d9/#

test signatures

alert dns any any -> any any (msg:"bsize test"; dns.query; content:"yundol0727.kro.kr"; bsize:17; sid:1;)
alert dns any any -> any any (msg:"depth and isdataat test"; dns.query; content:"yundol0727.kro.kr"; depth:17; isdataat:!1,relative; sid:2;)
alert dns any any -> any any (msg:"depth and endswith test"; dns.query; content:"yundol0727.kro.kr"; depth:17; endswith; sid:3;)

It's been observed on other buffers as well, just using dns.query as a quick example.

I ran the above signatures through on the same pcap 3 times and have recorded the output of the

First Run

  --------------------------------------------------------------------------
  Date: 12/18/2020 -- 04:12:47. Sorted by: ticks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        0        45045        66.10  1        1        45045       45045.00    45045.00    0.00       
  2        2            1        0        12123        17.79  1        1        12123       12123.00    12123.00    0.00       
  3        3            1        0        10983        16.12  1        1        10983       10983.00    10983.00    0.00       

Second Run

  --------------------------------------------------------------------------
  Date: 12/18/2020 -- 04:17:23. Sorted by: ticks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        0        26637        53.87  1        1        26637       26637.00    26637.00    0.00       
  2        2            1        0        11772        23.81  1        1        11772       11772.00    11772.00    0.00       
  3        3            1        0        11040        22.33  1        1        11040       11040.00    11040.00    0.00       

Third Run

  --------------------------------------------------------------------------
  Date: 12/18/2020 -- 04:18:08. Sorted by: ticks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        0        27219        53.34  1        1        27219       27219.00    27219.00    0.00       
  2        2            1        0        12696        24.88  1        1        12696       12696.00    12696.00    0.00       
  3        3            1        0        11115        21.78  1        1        11115       11115.00    11115.00    0.00       

Build Info

uricata --build-info
This is Suricata version 7.0.0-dev (372fc2673 2020-12-11)
Features: PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON PROFILING TLS TLS_C11 MAGIC RUST 
SIMD support: SSE_4_2 SSE_4_1 SSE_3 
Atomic intrinsics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 7.5.0, C version 201112
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: _Thread_local
compiled with LibHTP v0.5.36, linked against LibHTP v0.5.36

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            no
  XDP support:                             no
  PF_RING support:                         no
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no 
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes, through luajit
  libluajit:                               yes
  GeoIP2 support:                          no
  Non-bundled htp:                         no
  Hyperscan support:                       no
  Libnet support:                          yes
  liblz4 support:                          yes

  Rust support:                            yes
  Rust strict mode:                        no
  Rust compiler path:                      /usr/bin/rustc
  Rust compiler version:                   rustc 1.43.0
  Cargo path:                              /usr/bin/cargo
  Cargo version:                           cargo 1.43.0
  Cargo vendor:                            yes

  Python support:                          no
  Python path:                             not set
  Python distutils                         no
  Python yaml                              no
  Install suricatactl:                     requires python
  Install suricatasc:                      requires python
  Install suricata-update:                 not bundled

  Profiling enabled:                       yes
  Profiling locks enabled:                 no

  Plugin support (experimental):           yes

Development settings:
  Coccinelle / spatch:                     no
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr/local
  Configuration directory:                 /usr/local/etc/suricata/
  Log directory:                           /usr/local/var/log/suricata/

  --prefix                                 /usr/local
  --sysconfdir                             /usr/local/etc
  --localstatedir                          /usr/local/var
  --datarootdir                            /usr/local/share

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                gcc (exec name) / g++ (real)
  GCC Protect enabled:                     no
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -std=c11 -march=native -I${srcdir}/../rust/gen -I${srcdir}/../rust/dist
  PCAP_CFLAGS                               -I/usr/include
  SECCFLAGS                                


Files


Related issues 2 (2 open0 closed)

Related to Suricata - Optimization #6375: detect: merge urilen and bsize implementationsNewOISF DevActions
Related to Suricata - Bug #6894: bsize validation FP on content negation with hex encoded 0d 0aNewOISF DevActions
Actions #1

Updated by Victor Julien almost 3 years ago

  • Tracker changed from Bug to Feature
  • Subject changed from bsize is considerably slower than depth:x; isdataat:!1,relative to bsize: apply as depth to patterns
  • Status changed from New to Assigned
  • Assignee set to Jeff Lucovsky
  • Target version set to 7.0.0-beta1
  • Affected Versions deleted (git master)
  • Label Needs backport added

If bsize setting is the exact length of a pattern, apply startwith/endswith logic. Otherwise, apply it as depth. For dsize (and urilen?) we already do this IIRC.

I think this could be backported as well. Technically its not a bug, but it would be a virtually "free" optimization that should be low risk.

Actions #2

Updated by Victor Julien about 2 years ago

  • Target version changed from 7.0.0-beta1 to 8.0.0-beta1
Actions #3

Updated by Brandon Murphy almost 2 years ago

Recently discovered that urilen is much faster than bsize applied to http.uri as well.

I'm not 100% sure if this feature, when implemented, would "solve" that.

urilen:9<>17 vs http.uri; bsize:9<>17; yielded drastically different and unexpected results in favor of urilen.

Using urilen even seems to impact the number of checks a rule experiences. Is it possible that urilen supports some prefilter element that bsize when applied to http.uri does not?

Feel free to let me know if this should be a separate request - it just seems like it might be related.

Actions #4

Updated by Jeff Lucovsky almost 2 years ago

These do seem related so let's not create a new issue yet.

Actions #5

Updated by Jeff Lucovsky almost 2 years ago

If bsize setting is the exact length of a pattern, apply startwith/endswith logic. Otherwise, apply it as depth. For dsize (and urilen?) we already do this IIRC.

Is this due to the logic in DetectUrilenApplyToContent?

Actions #6

Updated by Victor Julien about 1 year ago

Actions #7

Updated by Jeff Lucovsky 9 months ago

  • Related to Bug #6894: bsize validation FP on content negation with hex encoded 0d 0a added
Actions #8

Updated by Jeff Lucovsky 9 months ago

I've been investigating this and I think that the profiling is attributing too many ticks to the first rule.

I've modified the sids so that the rule using bsize is sid 35 -- from suricata master:

  --------------------------------------------------------------------------
  Date: 4/7/2024 -- 09:57:27. Sorted by: max ticks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- --------------
  1        15           1        0        83365        82.15  1        1        83365       83365.00    83365.00    0.00
  2        35           1        0        7501         7.39   1        1        7501        7501.00     7501.00     0.00
  3        20           1        0        6926         6.83   1        1        6926        6926.00     6926.00     0.00
  4        30           1        0        3682         3.63   1        1        3682        3682.00     3682.00     0.00

I added sid 15 -- only dns.query and content:

alert dns any any -> any any (msg:"w/out bsize test"; dns.query; content:"yundol0727.kro.kr";  sid:15;)
alert dns any any -> any any (msg:"depth and isdataat test"; dns.query; content:"yundol0727.kro.kr"; depth:17; isdataat:!1,relative; sid:20;)
alert dns any any -> any any (msg:"depth and endswith test"; dns.query; content:"yundol0727.kro.kr"; depth:17; endswith; sid:30;)
alert dns any any -> any any (msg:"bsize test"; dns.query; content:"yundol0727.kro.kr"; bsize:17; sid:35;)

With modifications to bsize that make it a content modifier under certain circumstances (sid 35):

 --------------------------------------------------------------------------
  Date: 4/7/2024 -- 10:02:41. Sorted by: max ticks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- --------------
  1        15           1        0        43219        68.17  1        1        43219       43219.00    43219.00    0.00
  2        20           1        0        7700         12.15  1        1        7700        7700.00     7700.00     0.00
  3        35           1        0        6618         10.44  1        1        6618        6618.00     6618.00     0.00
  4        30           1        0        5860         9.24   1        1        5860        5860.00     5860.00     0.00

Profiling using the same ruleset without sid 15

--------------------------------------------------------------------------
  Date: 4/7/2024 -- 10:14:31. Sorted by: max ticks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- --------------
  1        20           1        0        103625       90.32  1        1        103625      103625.00   103625.00   0.00
  2        30           1        0        6522         5.68   1        1        6522        6522.00     6522.00     0.00
  3        35           1        0        4588         4.00   1        1        4588        4588.00     4588.00     0.00

I am continuing to investigate but it appears that the first sid (numerical order) may have extra ticks associated with it.

Actions #9

Updated by Brandon Murphy 9 months ago

FWIW, we have observed similar "oddness" within the profiler since the creation of this ticket. We have noticed it enough that at this point we often individually test each rule when comparing variations to determine the fastest method.

I just tested these two rules within our QA environment

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"UA depth/isdataat test"; flow:established,to_server; http.user_agent; content:"Mozilla"; depth:7; isdataat:!1,relative; sid:1; classtype:unknown; rev:1;)
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"UA bsize test"; flow:established,to_server; http.user_agent; bsize:7; content:"Mozilla"; sid:1; classtype:unknown; rev:1;)

The total ticks were within ~5% of each other when tested individually, which seems an acceptable margin of error between runs given the scale of total ticks (9.4bn for bsize and and 8.9bn for depth/isdata)


I also went back and tested the urilen vs bsize applied to http.uri and tested them individually.

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"bsize on http.uri"; flow:established,to_server; http.uri; bsize:17; content:"/js/jquery.min.js"; sid:1; classtype:unknown; rev:1;)
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"urilen no bsize"; flow:established,to_server; urilen:17; http.uri; content:"/js/jquery.min.js"; sid:1; classtype:unknown; rev:1;)
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"urilen with bsize"; flow:established,to_server; urilen:17; http.uri; bsize:17; content:"/js/jquery.min.js"; sid:1; classtype:unknown; rev:1;)

urilen had considerably less checks (334 vs 1422), though the avg ticks is actually lower with only the bsize 19040 vs 27k/28k when using urilen. total ticks were in favor of urilen, mostly contributed to the decreased number of checks.


I will note though, that in this specific case where the string being searched for is common within the buffer, but uncommon when is the exact content of the buffer, it is often MUCH faster to use a more generic buffer and matching on a more complete string, than it is to use a more specific buffer.

consider this rule, which had about 4million less checks than the other two above and total ticks were a ~98% reduction.

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"UA header test"; flow:established,to_server; http.header; content:"User-Agent|3a 20|Mozilla|0d 0a|"; sid:1; classtype:unknown; rev:1;)

----
All of that completed, I do believe the following questions still stand. Perhaps we need another ticket to investigate profiler accuracy?

1. Can you confirm that the optimization that Victor referenced is already being applied when bsize is used?

If bsize setting is the exact length of a pattern, apply startwith/endswith logic. Otherwise, apply it as depth. For dsize (and urilen?) we already do this IIRC.

2. have you been able to find anything that would explain why urilen impacts the number of checks and speed compared to bsize.

Actions #10

Updated by Victor Julien 9 months ago

I think the first entry being seemly extra expensive is probably just a start up cost. Cold data and/or instruction caches, a memory lookup, that kind of thing. Reordering rules moves the cost to the first rule being evaluated.

Actions #11

Updated by Jeff Lucovsky 9 months ago

"All of that completed, I do believe the following questions still stand. Perhaps we need another ticket to investigate profiler accuracy?

1. Can you confirm that the optimization that Victor referenced is already being applied when bsize is used?

If bsize setting is the exact length of a pattern, apply startwith/endswith logic. Otherwise, apply it as depth. For dsize (and urilen?) we already do this IIRC.

2. have you been able to find anything that would explain why urilen impacts the number of checks and speed compared to bsize."

The bsize optimization is not yet applied in master/main-7.0.x

Do you have a pcap the urilen rule example will trigger on?

Actions #12

Updated by Brandon Murphy 9 months ago

Jeff Lucovsky wrote in #note-11:

If bsize setting is the exact length of a pattern, apply startwith/endswith logic. Otherwise, apply it as depth. For dsize (and urilen?) we already do this IIRC.

Sorry, just to be clear, this is currently happening with bsize in master/7.0.x?

Actions #13

Updated by Jeff Lucovsky 9 months ago

No, not yet applied (in development).

Actions #14

Updated by Brandon Murphy 9 months ago

Jeff Lucovsky wrote in #note-11:

Do you have a pcap the urilen rule example will trigger on?

I do! Took me awhile to find one, but I was able to.

*Edit - I should say, it shows the condition, it does not alert on the example rules.

Using the 3 rules above and enabling rule profiling, the following result snippet shows that only the rule without urilen had checks (i did have to renumber the sids)

  --------------------------------------------------------------------------
  Date: 4/9/2024 -- 15:27:28. Sorted by: number of checks.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        1        26541        100.00 1        0        26541       26541.00    0.00        26541.00   

  --------------------------------------------------------------------------
  Date: 4/9/2024 -- 15:27:28. Sorted by: number of matches.
  --------------------------------------------------------------------------
   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        1        26541        100.00 1        0        26541       26541.00    0.00        26541.00   
Actions #15

Updated by Jeff Lucovsky 9 months ago

Thanks; the urilen is 42; with that change, all 3 rules are evaluated.

Actions #16

Updated by Brandon Murphy 9 months ago

Jeff Lucovsky wrote in #note-15:

Thanks; the urilen is 42; with that change, all 3 rules are evaluated.

Correct, to that pcap demonstrates that there is some prefilter logic within urilen that, despite the fast_pattern being the same in all the rules, prevents the rules from being evaluated when the urilen condition is not met.

Actions

Also available in: Atom PDF