Bug #4187

rs_dcerpc_udp_get_tx takes out unusual amount of CPU

Added by Peter Manev 7 months ago. Updated 4 months ago.

Target version:
Affected Versions:
Needs backport to 6.0


I noticed something unusual while chasing

It seems rs_dcerpc_udp_get_tx keeps every thread busy with about 0.5-2.5% of the total CPU, which seems a lot for a function like that.
(was having side discussion with Eric about it too)
Please see the Flame Graph (svg) attached as well for a 12 hr run.
The purple squares in the screenshot attached are from the rs_dcerpc_udp_get_tx function.


 /opt/suritest-tmp/bin/suricata --build-info 
This is Suricata version 6.0.1-dev (f3c59ef8a 2020-12-01)
SIMD support: SSE_4_2 SSE_4_1 SSE_3 
Atomic intrinsics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.2.1 Compatible Clang 7.0.1 (tags/RELEASE_701/final), C version 201112
compiled with _FORTIFY_SOURCE=0
L1 cache line size (CLS)=64
thread local storage method: _Thread_local
compiled with LibHTP v0.5.35, linked against LibHTP v0.5.35

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            yes
  XDP support:                             yes
  PF_RING support:                         no
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no 
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes, through luajit
  libluajit:                               yes
  GeoIP2 support:                          yes
  Non-bundled htp:                         no
  Hyperscan support:                       yes
  Libnet support:                          yes
  liblz4 support:                          yes

  Rust support:                            yes
  Rust strict mode:                        yes
  Rust compiler path:                      /root/.cargo/bin/rustc
  Rust compiler version:                   rustc 1.48.0 (7eac88abb 2020-11-16)
  Cargo path:                              /root/.cargo/bin/cargo
  Cargo version:                           cargo 1.48.0 (65cbdd2dc 2020-10-14)
  Cargo vendor:                            yes

  Python support:                          yes
  Python path:                             /usr/bin/python3
  Python distutils                         yes
  Python yaml                              yes
  Install suricatactl:                     yes
  Install suricatasc:                      yes
  Install suricata-update:                 not bundled

  Profiling enabled:                       no
  Profiling locks enabled:                 no

  Plugin support (experimental):           yes

Development settings:
  Coccinelle / spatch:                     no
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /opt/suritest-tmp
  Configuration directory:                 /opt/suritest-tmp/etc/suricata/
  Log directory:                           /opt/suritest-tmp/var/log/suricata/

  --prefix                                 /opt/suritest-tmp
  --sysconfdir                             /opt/suritest-tmp/etc
  --localstatedir                          /opt/suritest-tmp/var
  --datarootdir                            /opt/suritest-tmp/share

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                clang (exec name) / g++ (real)
  GCC Protect enabled:                     no
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -ggdb3 -O0  -Wchar-subscripts -Wshadow -Wall -Wextra -Wno-unused-parameter -Wno-unused-function -Wno-unused-parameter -Wno-unused-function  -std=c11 -march=native -I${srcdir}/../rust/gen -I${srcdir}/../rust/dist
  PCAP_CFLAGS                               -I/usr/include


Related issues

Copied to Bug #4353: rs_dcerpc_udp_get_tx takes out unusual amount of CPUClosedShivani BhardwajActions

Updated by Peter Manev 7 months ago

Another perf top attempt to trace (attached)


Updated by Peter Manev 7 months ago

  • File deleted (Screenshot from 2020-12-02 23-41-52.png)

Updated by Peter Manev 7 months ago

  • Affected Versions git master added

Updated by Shivani Bhardwaj 6 months ago

  • Status changed from New to Assigned
  • Assignee set to Shivani Bhardwaj
  • Target version set to 7.0rc1

Updated by Peter Manev 6 months ago

On certain CPUs i can observe the following:


Updated by Shivani Bhardwaj 5 months ago

  • Priority changed from Normal to High

Updated by Peter Manev 5 months ago

It does not seem related to traditional ports traffic for dcerpc.
When the traditional dcerpc ports are negated, what is left is very few dcerpc logs per second - really low - it could be 40-50 events per minute.
Interesting observation is that the performance hit is related to a number of specific events appearing.
So the percentage of the function (rs_dcerpc_udp_get_tx) usage goes up and down and disappears with time and only climbs up after seeing a few - 5/15 "UNREPLIED" responses. The performance hit can be up to 70% of that thread usage it seems.



pcap shared privately.


Updated by Shivani Bhardwaj 4 months ago

  • Status changed from Assigned to Closed

Updated by Shivani Bhardwaj 4 months ago

  • Label Needs backport to 6.0 added

Updated by Shivani Bhardwaj 4 months ago

  • Copied to Bug #4353: rs_dcerpc_udp_get_tx takes out unusual amount of CPU added

Also available in: Atom PDF