Project

General

Profile

Actions

Bug #2993

closed
VJ

Suricata 5.0.0beta1 memory allocation of 4294966034 bytes failed

Bug #2993: Suricata 5.0.0beta1 memory allocation of 4294966034 bytes failed

Added by Anonymous almost 7 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Today suricata stopped after running for almost 9h:
May 27 12:05:22 server suricata28762: memory allocation of 4294966034 bytes failed
May 27 12:05:25 server systemd1: suricata.service: main process exited, code=killed, status=6/ABRT
May 27 12:05:25 server systemd1: Unit suricata.service entered failed state.
May 27 12:05:25 server systemd1: suricata.service failed.

Could not find a core file though I configured a home directory and core dump in suricata.yaml
As far as I can see the system was not running out of any resources.

  1. suricata -V
    This is Suricata version 5.0.0-beta1 RELEASE
  1. ldd `which suricata`
    linux-vdso.so.1 => (0x00007ffd769c7000)
    libhtp.so.2 => /lib/libhtp.so.2 (0x00007fd563c32000)
    librt.so.1 => /lib64/librt.so.1 (0x00007fd563a2a000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fd563728000)
    liblz4.so.1 => /lib64/liblz4.so.1 (0x00007fd563513000)
    libGeoIP.so.1 => /lib64/libGeoIP.so.1 (0x00007fd5632e3000)
    libmagic.so.1 => /lib64/libmagic.so.1 (0x00007fd5630c6000)
    libcap-ng.so.0 => /lib64/libcap-ng.so.0 (0x00007fd562ec0000)
    libpfring.so.1 => /usr/local/lib/libpfring.so.1 (0x00007fd562c28000)
    libpcap.so.1 => /usr/local/lib/libpcap.so.1 (0x00007fd562955000)
    libnet.so.1 => /lib64/libnet.so.1 (0x00007fd56273b000)
    libjansson.so.4 => /lib64/libjansson.so.4 (0x00007fd56252e000)
    libyaml-0.so.2 => /lib64/libyaml-0.so.2 (0x00007fd56230e000)
    libhs.so.5 => /usr/local/lib64/libhs.so.5 (0x00007fd5619f5000)
    libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fd561793000)
    libz.so.1 => /lib64/libz.so.1 (0x00007fd56157d000)
    liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fd561357000)
    libssl3.so => /lib64/libssl3.so (0x00007fd561105000)
    libsmime3.so => /lib64/libsmime3.so (0x00007fd560ede000)
    libnss3.so => /lib64/libnss3.so (0x00007fd560bb1000)
    libnssutil3.so => /lib64/libnssutil3.so (0x00007fd560981000)
    libplds4.so => /lib64/libplds4.so (0x00007fd56077d000)
    libplc4.so => /lib64/libplc4.so (0x00007fd560578000)
    libnspr4.so => /lib64/libnspr4.so (0x00007fd56033a000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fd56011e000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007fd55ff1a000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fd55fd04000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fd55f937000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fd563e51000)
    libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fd55f630000)

Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 7.6 (Maipo)
Release: 7.6
Codename: Maipo
Kernel 3.10.0-957.12.1.el7.x86_64

pfring : 7.5.0 dev


Files

suricata.yaml (75.8 KB) suricata.yaml Anonymous, 05/28/2019 08:25 AM

PM Updated by Peter Manev almost 7 years ago Actions #1

Hi,
Thank you for the report!
Was this a default config run or there were changes made to the memcap/ring settings ?

Updated by Anonymous almost 7 years ago Actions #2

Hi,
  1. cat /etc/modprobe.d/pf_ring.conf
    options pf_ring transparent_mode=0 min_num_slots=32767 enable_tx_capture=0

AH Updated by Andreas Herz almost 7 years ago Actions #3

  • Assignee set to OISF Dev
  • Target version set to Support

Can you share some more details? Like suricata --build-info, the configuration file and it would be perfect if you can reproduce it with a coredump.

At https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Debugging you can find more information how to debug this.

Updated by Anonymous almost 7 years ago Actions #4

Config file attached, maybe the settings are not correct to create a core dump because I was not able to find one after the crash yesterday?


# suricata --build-info
This is Suricata version 5.0.0-beta1 RELEASE
Features: PCAP_SET_BUFF PF_RING AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LIBJANSSON TLS MAGIC RUST 
SIMD support: SSE_4_2 SSE_4_1 SSE_3 
Atomic intrisics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.8.5 20150623 (Red Hat 4.8.5-36), C version 199901
compiled with -fstack-protector
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.30, linked against LibHTP v0.5.30

Suricata Configuration:
  AF_PACKET support:                       yes
  eBPF support:                            no
  XDP support:                             no
  PF_RING support:                         yes
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no 
  DAG enabled:                             no
  Napatech enabled:                        no
  WinDivert enabled:                       no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  liblzma support:                         yes
  hiredis support:                         no
  hiredis async with libevent:             no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             no
  libluajit:                               no
  libgeoip:                                yes
  Non-bundled htp:                         no
  Old barnyard2 support:                   no
  Hyperscan support:                       yes
  Libnet support:                          yes
  liblz4 support:                          yes

  Rust support:                            yes
  Rust strict mode:                        no
  Rust debug mode:                         no
  Rust compiler:                           rustc 1.34.0
  Rust cargo:                              cargo 1.34.0

  Python support:                          yes
  Python path:                             /bin/python2.7
  Python version:                          
  Python distutils                         yes
  Python yaml                              yes
  Install suricatactl:                     yes
  Install suricatasc:                      yes
  Install suricata-update:                 yes

  Profiling enabled:                       no
  Profiling locks enabled:                 no

Development settings:
  Coccinelle / spatch:                     no
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr
  Configuration directory:                 /etc/suricata/
  Log directory:                           /var/log/suricata/

  --prefix                                 /usr
  --sysconfdir                             /etc
  --localstatedir                          /var
  --datarootdir                            /usr/share

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                gcc (exec name) / gcc (real)
  GCC Protect enabled:                     yes
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -march=native -I${srcdir}/../rust/gen/c-headers
  PCAP_CFLAGS                               -I/usr/local/include
  SECCFLAGS                                -fstack-protector -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security

PM Updated by Peter Manev almost 7 years ago Actions #5

How much memory you have on the box btw in total? (and is it only suricata running or something else that can be resource intensive ?)

Updated by Anonymous almost 7 years ago Actions #6

64Gb RAM available and indeed an ELK stack consumes almost half of the available RAM, but this setup has been running for years quite stable. On the other we want/need to move the Elasticsearch setup to a newly to build cluster.

PM Updated by Peter Manev almost 7 years ago Actions #7

I think it is a good candidate reason in some conditions (aka an intense ES search) for not being able to have that needed extra allocation. In general yes - could be better if things are separated as i suspect resource contention would also occur(if not yet) in terms of CPU too.

VJ Updated by Victor Julien almost 7 years ago Actions #8

I don't think this has anything to do with the system or what else is running on it. The error suggests suri tried to alloc almost 4GB in a single malloc, which is certainly a bug. It smells like an int underflow.

Updated by Anonymous almost 7 years ago Actions #9

Indeed, but I had a look at some system resources when the crash occured, but neither Zabbix nor sar did mention lack of memory or cpu. Oke is is a busy system, but that keeps the dust away ;)

PM Updated by Peter Manev almost 7 years ago Actions #10

Does it happen repeatedly ?

Updated by Anonymous almost 7 years ago Actions #11

Running v5 since May second, it happend twice so far:

2019-05-27T12:05:22.331122+02:00 server suricata: memory allocation of 4294966034 bytes failed
2019-05-21T09:44:12.279565+02:00 server suricata: memory allocation of 4294926257 bytes failed

Suricata is restarted daily to rotate logs.

PM Updated by Peter Manev almost 7 years ago Actions #12

Is it possible to correlate the sys/ram utilization at that time ? Just to make sure it is not sys/mem related I was thinking.

Updated by Anonymous almost 7 years ago Actions #13

May 27 crash at 12:05 :


                CPU     %user     %nice   %system   %iowait    %steal     %idle
11:30:02 AM     all     38.47      5.57     12.01      3.43      0.00     40.53
11:40:02 AM     all     37.36      5.61     11.57      4.06      0.00     41.40
11:50:01 AM     all     38.84      5.73     11.63      4.01      0.00     39.79
12:00:02 PM     all     38.99      5.63     12.03      3.78      0.00     39.57
12:10:01 PM     all     17.96      2.53      5.88      2.80      0.00     70.82
12:20:01 PM     all      1.27      0.08      0.46      0.17      0.00     98.02
12:30:01 PM     all      1.06      0.07      0.43      0.05      0.00     98.39

            kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
11:30:02 AM    429940  65502432     99.35      2784   2184880  42437912     57.10  27393028   3474244     87552
11:40:02 AM    434328  65498044     99.34      2732   2262996  42416296     57.07  27314036   3541212     26828
11:50:01 AM    533528  65398844     99.19      2568   1451616  42286528     56.90  27520568   3257140     27148
12:00:02 PM    598556  65333816     99.09      2700   1085900  42580564     57.29  27543920   3160296     78272
12:10:01 PM  26531992  39400380     59.76      6572   4556032  35589444     47.89   2030348   4324928      2220
12:20:01 PM  25655172  40277200     61.09     23952   5361284  35609344     47.91   2549072   4669684      3396
12:30:01 PM  25304540  40627832     61.62     24840   5701996  35586908     47.88   2782596   4777068      2944

             kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
11:30:02 AM   2970544   5418060     64.59    232280      4.29
11:40:02 AM   2714704   5673900     67.64    229024      4.04
11:50:01 AM   2390440   5998164     71.50    249900      4.17
12:00:02 PM   2020524   6368080     75.91    220356      3.46
12:10:01 PM   3570620   4817984     57.43     91680      1.90
12:20:01 PM   3576252   4812352     57.37    121428      2.52
12:30:01 PM   3579580   4809024     57.33    124324      2.59

May 21 crash at 09:44 :


                CPU     %user     %nice   %system   %iowait    %steal     %idle
09:00:01 AM     all     33.32      7.42      9.71      2.61      0.00     46.94
09:10:01 AM     all     32.51      6.73     10.10      2.17      0.00     48.49
09:20:02 AM     all     32.09      6.75     10.27      2.50      0.00     48.40
09:30:02 AM     all     32.20      6.41      9.55      2.37      0.00     49.46
09:40:02 AM     all     34.08      6.48     10.17      4.98      0.00     44.29
09:50:01 AM     all     16.93      2.75      5.31      2.47      0.00     72.54
10:00:02 AM     all      0.94      0.08      0.45      0.05      0.00     98.49

            kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
09:00:01 AM    684396  65247976     98.96      6620  12489868  39250032     52.81  21870676   8354172    313188
09:10:01 AM    768752  65163620     98.83      6884  10145636  40047656     53.88  22893100   7287660    417328
09:20:02 AM    560148  65372224     99.15      6848   7709440  40094496     53.95  24282280   6187248    390432
09:30:02 AM    669136  65263236     98.99      5820   6772564  39834916     53.60  24672960   5765980    261164
09:40:02 AM    436924  65495448     99.34      2392    774276  40230372     54.13  27868580   3037652     32116
09:50:01 AM  29049272  36883100     55.94     25652   2155684  35554456     47.84   2226176   1659824      1304
10:00:02 AM  28781336  37151036     56.35     26612   2405776  35554344     47.84   2534504   1611564        96

Updated by Anonymous almost 7 years ago Actions #14


             kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
09:00:01 AM   3479724   4908880     58.52    301672      6.15
09:10:01 AM   3473240   4915364     58.60    300224      6.11
09:20:02 AM   3468888   4919716     58.65    278480      5.66
09:30:02 AM   3457368   4931236     58.78    240940      4.89
09:40:02 AM   2628548   5760056     68.67    122640      2.13
09:50:01 AM   3331300   5057304     60.29    125436      2.48
10:00:02 AM   3338468   5050136     60.20    128456      2.54

AH Updated by Andreas Herz almost 7 years ago Actions #15

Might be worth to test AF_PACKET instead of PF_RING and a more recent kernel maybe?

VJ Updated by Victor Julien over 6 years ago Actions #16

I think it would be good to set a maximum to what a single allocation can try to get. Determining what that max would be is of course the trick.

VJ Updated by Victor Julien over 6 years ago Actions #17

Possibly related:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007fb803076801 in __GI_abort () at abort.c:79
#2  0x00005619886e5367 in std::sys::unix::abort_internal ()
#3  0x00005619886f39fd in rust_oom ()
#4  0x0000561988708b67 in alloc::alloc::handle_alloc_error ()
#5  0x000056198860a181 in alloc::raw_vec::RawVec<T,A>::allocate_in (cap=4294965836, zeroed=<optimized out>, a=...) at /usr/src/rustc-1.37.0/src/liballoc/raw_vec.rs:99
#6  alloc::raw_vec::RawVec<T>::with_capacity_zeroed (cap=4294965836) at /usr/src/rustc-1.37.0/src/liballoc/raw_vec.rs:145
#7  <u8 as alloc::vec::SpecFromElem>::from_elem (elem=0, n=4294965836) at /usr/src/rustc-1.37.0/src/liballoc/vec.rs:1635
#8  alloc::vec::from_elem (elem=0, n=4294965836) at /usr/src/rustc-1.37.0/src/liballoc/vec.rs:1614
#9  suricata::smb::smb::SMBState::parse_tcp_data_tc_gap (self=0x7fb7f47b6630, gap_size=<optimized out>) at src/smb/smb.rs:1734
#10 rs_smb_parse_response_tcp_gap (state=state@entry=0x7fb7f47b6630, input_len=<optimized out>, input_len@entry=4294965836) at src/smb/smb.rs:1864
#11 0x00005619883f766e in SMBTCPParseResponse (f=0x56198a88b380, state=0x7fb7f47b6630, pstate=0x7fb7f4611760, input=0x0, input_len=4294965836, local_data=0x0, flags=24 '\030') at app-layer-smb.c:68
#12 0x00005619883f6e06 in AppLayerParserParse (tv=tv@entry=0x56199f507920, alp_tctx=0x7fb7f429eeb0, f=f@entry=0x56198a88b380, alproto=8, flags=<optimized out>, input=input@entry=0x0,
    input_len=input_len@entry=4294965836) at app-layer-parser.c:1225
#13 0x00005619883b1cd6 in AppLayerHandleTCPData (tv=tv@entry=0x56199f507920, ra_ctx=ra_ctx@entry=0x7fb7f429ea90, p=p@entry=0x7fb7f4268180, f=0x56198a88b380, ssn=ssn@entry=0x7fb7f4359620,
    stream=stream@entry=0x7fb7fd3ae748, data=<optimized out>, data@entry=0x0, data_len=<optimized out>, flags=<optimized out>) at app-layer.c:601
#14 0x00005619885483d0 in ReassembleUpdateAppLayer (dir=<optimized out>, p=<optimized out>, stream=<optimized out>, ssn=<optimized out>, ra_ctx=<optimized out>, tv=<optimized out>)
    at stream-tcp-reassemble.c:1012
#15 StreamTcpReassembleAppLayer (tv=tv@entry=0x56199f507920, ra_ctx=ra_ctx@entry=0x7fb7f429ea90, ssn=ssn@entry=0x7fb7f4359620, stream=<optimized out>, stream@entry=0x7fb7f4359630, p=p@entry=0x7fb7f4268180,
    dir=dir@entry=UPDATE_DIR_PACKET) at stream-tcp-reassemble.c:1123
#16 0x000056198854905c in StreamTcpReassembleHandleSegment (tv=tv@entry=0x56199f507920, ra_ctx=0x7fb7f429ea90, ssn=ssn@entry=0x7fb7f4359620, stream=0x7fb7f4359630, p=p@entry=0x7fb7f4268180,
    pq=pq@entry=0x7fb7f429e738) at stream-tcp-reassemble.c:1788
#17 0x000056198853e33f in HandleEstablishedPacketToClient (stt=<optimized out>, pq=<optimized out>, p=<optimized out>, ssn=<optimized out>, tv=<optimized out>) at stream-tcp.c:2408
#18 StreamTcpPacketStateEstablished (tv=0x56199f507920, p=0x7fb7f4268180, stt=stt@entry=0x7fb7f429e730, ssn=0x7fb7f4359620, pq=0x7fb7f429e738) at stream-tcp.c:2645
#19 0x0000561988541870 in StreamTcpStateDispatch (tv=0x56199f507920, p=0x7fb7f4268180, stt=0x7fb7f429e730, ssn=0x7fb7f4359620, pq=0x7fb7f429e738, state=<optimized out>) at stream-tcp.c:4650
#20 0x0000561988543bd9 in StreamTcpPacket (tv=0x56199f507920, p=0x7fb7f4268180, stt=0x7fb7f429e730, pq=0x7fb7f428d5f8) at stream-tcp.c:4838
#21 0x0000561988544505 in StreamTcp (tv=tv@entry=0x56199f507920, p=p@entry=0x7fb7f4268180, data=<optimized out>, pq=pq@entry=0x7fb7f428d5f8, postpq=postpq@entry=0x0) at stream-tcp.c:5174
#22 0x00005619884cc4b2 in FlowWorker (tv=0x56199f507920, p=0x7fb7f4268180, data=0x7fb7f428d5d0, preq=0x56199f507d00, unused=<optimized out>) at flow-worker.c:245
#23 0x000056198855308c in TmThreadsSlotVarRun (tv=tv@entry=0x56199f507920, p=p@entry=0x7fb7f4268180, slot=slot@entry=0x56199f507b80) at tm-threads.c:130
#24 0x000056198852ae4e in TmThreadsSlotProcessPkt (p=0x7fb7f4268180, s=0x56199f507b80, tv=0x56199f507920) at tm-threads.h:162
#25 AFPReadFromRing (ptv=<optimized out>) at source-af-packet.c:1025
#26 0x000056198852ecd6 in ReceiveAFPLoop (tv=0x56199f507920, data=<optimized out>, slot=<optimized out>) at source-af-packet.c:1589
#27 0x000056198855430d in TmThreadsSlotPktAcqLoop (td=0x56199f507920) at tm-threads.c:335
#28 0x00007fb803cf36db in start_thread (arg=0x7fb7fd3b0700) at pthread_create.c:463
#29 0x00007fb80315788f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

VJ Updated by Victor Julien over 6 years ago Actions #18

  • Status changed from New to Assigned
  • Assignee changed from OISF Dev to Victor Julien
  • Target version changed from Support to 5.0.2
  • Affected Versions 5.0.0, 5.0.1 added

VJ Updated by Victor Julien over 6 years ago Actions #19

  • Status changed from Assigned to Closed
Actions

Also available in: PDF Atom