Project

General

Profile

Actions

Bug #6782

open

Crasher in HTTP chunked / StreamingBuffer

Added by Gianni Tedesco 3 months ago. Updated 5 days ago.

Status:
New
Priority:
Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Seeing similar crashes on multiple sites, looks like a heap corruption somewhere:

SIGSEGV
Thread 2 (Thread 0x7ffff5d5c6c0 (LWP 130) "W#01-eno1"):
#0  0x00007ffff7990507 in __memmove_avx_unaligned_erms_rtm () from /lib64/libc.so.6
#1  0x000055555587a2eb in StreamingBufferAppend ()
#2  0x000055555588d75a in HtpBodyAppendChunk ()
#3  0x0000555555792d28 in HTPCallbackRequestBodyData ()
#4  0x00007ffff7fa3779 in htp_hook_run_all () from /lib64/libhtp.so.2
#5  0x00007ffff7fb20eb in htp_req_run_hook_body_data () from /lib64/libhtp.so.2
#6  0x00007ffff7fac63d in htp_tx_req_process_body_data_ex () from /lib64/libhtp.so.2
#7  0x00007ffff7facacf in htp_connp_REQ_BODY_IDENTITY () from /lib64/libhtp.so.2
#8  0x00007ffff7fa844d in htp_connp_req_data () from /lib64/libhtp.so.2
#9  0x000055555578f7d2 in HTPHandleRequestData ()
#10 0x0000555555799066 in AppLayerParserParse ()
#11 0x0000555555786b0c in AppLayerHandleTCPData ()
#12 0x0000555555862335 in StreamTcpReassembleAppLayer ()
#13 0x0000555555863004 in StreamTcpReassembleHandleSegment ()
#14 0x00005555558510e6 in StreamTcpPacketStateEstablished.lto_priv.0 ()
#15 0x00005555558574b8 in StreamTcpStateDispatch ()
#16 0x000055555585a439 in StreamTcpPacket ()
#17 0x000055555581b9bf in FlowWorkerStreamTCPUpdate ()
#18 0x000055555581bd20 in FlowWorker.lto_priv.0 ()
#19 0x000055555576dab3 in TmThreadsSlotVarRun ()
#20 0x00005555558473af in AFPReadFromRingV3 ()
#21 0x0000555555848de4 in ReceiveAFPLoop.lto_priv.0 ()
#22 0x000055555576fd74 in TmThreadsSlotPktAcqLoop ()
#23 0x00007ffff78b7897 in start_thread () from /lib64/libc.so.6
#24 0x00007ffff793e674 in clone () from /lib64/libc.so.6
SIGSEGV
Thread 2 (Thread 0x7ffff5d5b6c0 (LWP 104) "W#01-eno1"):
#0  0x00007ffff798f3ed in __memmove_avx_unaligned_erms_rtm () from /lib64/libc.so.6
#1  0x00005555557dd712 in StreamingBufferAppendNoTrack ()
#2  0x00005555557c9c40 in AppendData ()
#3  0x00005555557ca2bb in FileOpenFile.lto_priv.0 ()
#4  0x00005555556f856c in HTPFileOpen ()
#5  0x00005555556f262e in HTPCallbackRequestBodyData ()
#6  0x00007ffff7fa2899 in htp_hook_run_all () from /lib64/libhtp.so.2
#7  0x00007ffff7fb1d0b in htp_req_run_hook_body_data () from /lib64/libhtp.so.2
#8  0x00007ffff7fab97d in htp_tx_req_process_body_data_ex () from /lib64/libhtp.so.2
#9  0x00007ffff7fabe0f in htp_connp_REQ_BODY_IDENTITY () from /lib64/libhtp.so.2
#10 0x00007ffff7fa73ed in htp_connp_req_data () from /lib64/libhtp.so.2
#11 0x00005555556ed082 in HTPHandleRequestData ()
#12 0x00005555556fa7d6 in AppLayerParserParse ()
#13 0x00005555556efcdc in AppLayerHandleTCPData ()
#14 0x00005555557c53c5 in StreamTcpReassembleAppLayer ()
#15 0x00005555557c6094 in StreamTcpReassembleHandleSegment ()
#16 0x00005555557b4176 in StreamTcpPacketStateEstablished.lto_priv.0 ()
#17 0x00005555557ba548 in StreamTcpStateDispatch ()
#18 0x00005555557bd4c9 in StreamTcpPacket ()
#19 0x000055555577d40f in FlowWorkerStreamTCPUpdate ()
#20 0x000055555577d770 in FlowWorker.lto_priv.0 ()
#21 0x00005555556ce998 in TmThreadsSlotVarRun ()
#22 0x00005555557aa9bf in AFPReadFromRingV3 ()
#23 0x00005555557ac414 in ReceiveAFPLoop.lto_priv.0 ()
#24 0x00005555556d0cc4 in TmThreadsSlotPktAcqLoop ()
#25 0x00007ffff78b6897 in start_thread () from /lib64/libc.so.6
#26 0x00007ffff793d674 in clone () from /lib64/libc.so.6
SIGABRT
Thread 2 (Thread 0x7ffff5d5b6c0 (LWP 104) "W#01-eno1"):
#0  0x00007ffff78b8834 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007ffff78668ee in raise () from /lib64/libc.so.6
#2  0x00007ffff784e8ff in abort () from /lib64/libc.so.6
#3  0x00007ffff784f7d0 in __libc_message.cold () from /lib64/libc.so.6
#4  0x00007ffff78c27a5 in malloc_printerr () from /lib64/libc.so.6
#5  0x00007ffff78c5abc in _int_malloc () from /lib64/libc.so.6
#6  0x00007ffff78c7dae in calloc () from /lib64/libc.so.6
#7  0x00005555557b389c in StreamTcpPacketStateSynSent.lto_priv.0 ()
#8  0x00005555557ba4a0 in StreamTcpStateDispatch ()
#9  0x00005555557bd4c9 in StreamTcpPacket ()
#10 0x000055555577d40f in FlowWorkerStreamTCPUpdate ()
#11 0x000055555577d770 in FlowWorker.lto_priv.0 ()
#12 0x00005555556ce998 in TmThreadsSlotVarRun ()
#13 0x00005555557aa9bf in AFPReadFromRingV3 ()
#14 0x00005555557ac414 in ReceiveAFPLoop.lto_priv.0 ()
#15 0x00005555556d0cc4 in TmThreadsSlotPktAcqLoop ()
#16 0x00007ffff78b6897 in start_thread () from /lib64/libc.so.6
#17 0x00007ffff793d674 in clone () from /lib64/libc.so.6

Files

buildinfo.txt (4.28 KB) buildinfo.txt Gianni Tedesco, 02/15/2024 09:48 AM
suricata.conf (45.3 KB) suricata.conf Gianni Tedesco, 02/16/2024 04:21 AM
StreamingBufferAppend_crash.txt (10.6 KB) StreamingBufferAppend_crash.txt Gianni Tedesco, 04/05/2024 02:41 AM
StreamingBufferInit_crash.txt (10.5 KB) StreamingBufferInit_crash.txt Gianni Tedesco, 04/05/2024 02:41 AM
Suricata HTTP Crasher (Bug #6782).docx (219 KB) Suricata HTTP Crasher (Bug #6782).docx Richard McConnell, 05/15/2024 08:42 AM
Actions #1

Updated by Gianni Tedesco 3 months ago

  • Affected Versions 7.0.1 added
Actions #2

Updated by Victor Julien 3 months ago

Do you have a reproducer? Which libhtp version do you see this with? Do you see this only with libhtp 0.5.46?

Actions #3

Updated by Gianni Tedesco 3 months ago

Ran with:

/usr/sbin/suricata -c suricata.conf --init-errors-fatal -k none -l /var/run --af-packet=eno1

Nothing in the log except the usual startup guff. Working on getting the config.

We don't have a reproducer, our best guess at the moment is to maybe get a pcap of HTTP traffic, but not sure if that's the way to go? It involves reaching out to affected customers and asking them to do it, so we can't directly get that.

The version of libhtp we built with is just whatever shipped with the release, we're building from release tags on github, first time we saw this was with 7.0.1, and we've just upgraded to 7.0.3 to repro it.

Actions #4

Updated by Victor Julien 3 months ago

Ok, thank you. That means it's not a new issue with libhtp 0.5.46.

Actions #5

Updated by Gianni Tedesco 3 months ago

Actions #6

Updated by Gianni Tedesco 2 months ago

A bit of extra context here. The systems this is happening on, it's happening pretty regularly (eg. every 10 minutes), the issue is that they're on 10GB NICs, which are almost fully saturated with traffic, single threaded (due to broadcom NICs having no symmetric RSS), and the drop rate for suri is often around 10-15% so i expect there's a lot of missing packets in the stream and stuff like that. Not sure if that's helpful or not.

Actions #7

Updated by Victor Julien 2 months ago

I think it would be useful to do a symbols enabled build and get a full bt, or even an ASAN enabled build.

Updated by Gianni Tedesco about 2 months ago

Ran with ASAN and debug compile and got the following output, not sure much more helpful it is than previous backtrace:

-Og -fno-inline-functions -fno-lto

Actions #9

Updated by Richard McConnell 6 days ago · Edited

edit by Victor: putting the contents of the file here.

I have attached a file with a summary of an investigation I completed regarding this crasher. Could you review this and offer your opinions on the proposed fix? It may not be sufficient or desired by you guys! I have tested this in isolation, with a single pcap from a customer, so I am unsure if it's acceptable given the complexity of the engine. If this fix or a combination of this with adjustments are required, we are happy to complete the work and submit a PR!

Introduction
As described in https://redmine.openinfosecfoundation.org/issues/6782 we have a number of customers experiencing Suricata crashes. The backtrace from these crashes point towards HTTP traffic. We have been supplied with a customer pcap as a reproducer and have conducted an investigation into the reason for this crash. Suricata was built and installed with AddressSanitizer and run under gdb and valgrind during this investigation.
Findings
The following output was produced during the segfault:

==83833==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x627000145100 at pc 0x7ffff786e9ef bp 0x7fffe29fe6e0 sp 0x7fffe29fdea0
WRITE of size 2920 at 0x627000145100 thread T1 (W#01)
    #0 0x7ffff786e9ee in __interceptor_memcpy (/lib64/libasan.so.8+0x6e9ee) (BuildId: 2b657470ea196ba4342e3bd8a3cc138b1e200599)
    #1 0x7c1295 in memcpy /usr/include/bits/string_fortified.h:29
    #2 0x7c1295 in StreamingBufferAppend /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/util-streaming-buffer.c:1087
    #3 0x7fc694 in HtpBodyAppendChunk /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/app-layer-htp-body.c:71
    #4 0x579cb4 in HTPCallbackResponseBodyData /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/app-layer-htp.c:2026
    #5 0x7ffff7f67e43 in htp_hook_run_all /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/libhtp/htp/htp_hooks.c:127
    #6 0x7ffff7f8223d in htp_tx_res_process_body_data_ex /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/libhtp/htp/htp_transaction.c:1005
    #7 0x7ffff7f7611d in htp_connp_RES_BODY_IDENTITY_CL_KNOWN /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/libhtp/htp/htp_response.c:490
    #8 0x7ffff7f7bcb4 in htp_connp_res_data (/lib64/libhtp.so.2+0x2ecb4) (BuildId: 01dcb6a8931edaab9119620e495b29935cfceab9)
    #9 0x5803d9 in HTPHandleResponseData /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/app-layer-htp.c:970
    #10 0x58dad4 in AppLayerParserParse /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/app-layer-parser.c:1403
    #11 0x56514c in AppLayerHandleTCPData /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/app-layer.c:787
    #12 0x772c67 in ReassembleUpdateAppLayer /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp-reassemble.c:1328
    #13 0x775ed9 in StreamTcpReassembleAppLayer /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp-reassemble.c:1391
    #14 0x776008 in StreamTcpReassembleHandleSegmentUpdateACK /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp-reassemble.c:1949
    #15 0x777602 in StreamTcpReassembleHandleSegment /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp-reassemble.c:1997
    #16 0x752450 in HandleEstablishedPacketToServer /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp.c:2666
    #17 0x75545b in StreamTcpPacketStateEstablished /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp.c:3209
    #18 0x766bf0 in StreamTcpStateDispatch /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp.c:5236
    #19 0x767670 in StreamTcpPacket /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp.c:5433
    #20 0x767c51 in StreamTcp /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/stream-tcp.c:5745
    #21 0x6cc42c in FlowWorkerStreamTCPUpdate /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/flow-worker.c:391
    #22 0x6cda1e in FlowWorker /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/flow-worker.c:607
    #23 0x52c653 in TmThreadsSlotVarRun /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/tm-threads.c:135
    #24 0x73ba89 in TmThreadsSlotProcessPkt /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/tm-threads.h:200
    #25 0x73be4d in PcapFileCallbackLoop /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/source-pcap-file-helper.c:108
    #26 0x7ffff74b15ae in pcap_offline_read (/lib64/libpcap.so.1+0x2d5ae) (BuildId: b2c27ef72665c5895f38a2abd1ea7e63b2962439)
    #27 0x73c7a7 in PcapFileDispatch /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/source-pcap-file-helper.c:153
    #28 0x737e0e in ReceivePcapFileLoop /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/source-pcap-file.c:180
    #29 0x52f73f in TmThreadsSlotPktAcqLoop /usr/src/debug/suricata-rapid7-7.0.3-1.fc39.x86_64/src/tm-threads.c:318
    #30 0x7ffff70ac896 in start_thread (/lib64/libc.so.6+0x8e896) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)


The following conclusions were made from investigating further:
• Comparing http.memcap between our customer suricata.yaml and the installed suricata.yaml showed we do not use the default (unlimited) http.memcap, rather cap this to 512mb.
• HTPConfigure() defines:
◦ htp_sbcfg.Calloc = HTPCalloc()
◦ htp_sbcfg.Realloc = HTPRealloc()
◦ htp_sbcfg.Free = HTPFree()
• The above functions return NULL if there is no memory left or malloc/calloc fails
• GrowRegionToSize() returns sc_errno if REALLOC (a macro that uses the above functions) returns NULL
• InitBuffer() returns sc_errno if CALLOC (a macro that uses the above functions) returns NULL
• The following three functions do not set sc_errno upon error return:
◦ HTPCalloc()
◦ HTPRealloc()
◦ HTPFree()

Given the above findings, I was able to test and confirm there is a scenario where callers to the above functions return ‘0’ where coincidentally SC_OK = 0; The two functions of concern here are:
• GrowRegionToSize()
• InitBuffer()

And therefore the callers to these receive SC_OK and continue, resulting in the segmentation fault shown above. The scenario is simple:
1. Set a low http.memcap (1mb for my test, but also failed at 512mb)
2. Pass HTTP traffic that requires reassembly to Suricata until there is no memory left
3. Pass more HTTP traffic that requires reassembly to Suricata

Once a new packet/message arrives it will fail to allocate memory resulting in a return NULL; from the allocator(s). Given the above findings, a return NULL; from these functions results in the caller (in this case StreamingBufferAppend()) being provided with a ‘0’ or SC_OK.

To test this theory, I updated the above allocators to set sc_errno = SC_ENOMEM; upon return NULL; (see the diff below). With these changes Suricata was able to process the file without segmentation faults with http.memcap = 1mb.

diff --git a/src/app-layer-htp-mem.c b/src/app-layer-htp-mem.c
index bd9b79f67..d61b42cb2 100644
--- a/src/app-layer-htp-mem.c
+++ b/src/app-layer-htp-mem.c
@@ -136,13 +136,17 @@ void *HTPMalloc(size_t size)
 {
     void *ptr = NULL;

-    if (HTPCheckMemcap((uint32_t)size) == 0)
-        return NULL;
+    if (HTPCheckMemcap((uint32_t)size) == 0) {
+       sc_errno = SC_ENOMEM;
+       return NULL;
+    }

     ptr = SCMalloc(size);

-    if (unlikely(ptr == NULL))
+    if (unlikely(ptr == NULL)) {
+        sc_errno = SC_ENOMEM;
         return NULL;
+    }

     HTPIncrMemuse((uint64_t)size);

@@ -153,13 +157,17 @@ void *HTPCalloc(size_t n, size_t size)
 {
     void *ptr = NULL;

-    if (HTPCheckMemcap((uint32_t)(n * size)) == 0)
+    if (HTPCheckMemcap((uint32_t)(n * size)) == 0) {
+        sc_errno = SC_ENOMEM;
         return NULL;
+    }

     ptr = SCCalloc(n, size);

-    if (unlikely(ptr == NULL))
+    if (unlikely(ptr == NULL)) {
+        sc_errno = SC_ENOMEM;
         return NULL;
+    }

     HTPIncrMemuse((uint64_t)(n * size));

@@ -169,13 +177,17 @@ void *HTPCalloc(size_t n, size_t size)
 void *HTPRealloc(void *ptr, size_t orig_size, size_t size)
 {
     if (size > orig_size) {
-        if (HTPCheckMemcap((uint32_t)(size - orig_size)) == 0)
+        if (HTPCheckMemcap((uint32_t)(size - orig_size)) == 0) {
+            sc_errno = SC_ENOMEM;
             return NULL;
+       }
     }

     void *rptr = SCRealloc(ptr, size);
-    if (rptr == NULL)
+    if (rptr == NULL) {
+        sc_errno = SC_ENOMEM;
         return NULL;
+    }

Conclusion
These findings reduce to a failure of setting sc_errno upon error return by the HTPAllocators (above). This fix may not be a complete fix for Suricata to correctly operate given additional functionality (functionality I’m unfamiliar with) within the system. These changes were completed to prove the above theory only. However, by reading ReallocFunc() and CallocFunc() in util-streaming-buffer.c - the default allocators if none are defined - they set sc_errno = SC_ENOMEM; upon error, so it’s possible this fix may be acceptable?

Options moving forward are:
1. We set http.memcap to default (unlimited)
2. A fix is produced for the above issue

Actions #10

Updated by Victor Julien 6 days ago · Edited

Thanks @Richard McConnell this is really helpful. I've now quite easily been able to reproduce the issue. I think the diff is a good start, but would also need updates to other users of the StreamingBuffer API. E.g. FTPCalloc/FTPRealloc would need updates as well. I haven't done a full scan.

Btw, a quick and dirty way to address this, appears to be

diff --git a/src/util-streaming-buffer.c b/src/util-streaming-buffer.c
index 7ef6f58e2..892db7068 100644
--- a/src/util-streaming-buffer.c
+++ b/src/util-streaming-buffer.c
@@ -721,6 +721,8 @@ static inline int WARN_UNUSED GrowRegionToSize(StreamingBuffer *sb,

     void *ptr = REALLOC(cfg, region->buf, region->buf_size, grow);
     if (ptr == NULL) {
+           if (sc_errno == SC_OK)
+                   sc_errno = SC_ENOMEM;
         return sc_errno;
     }
     /* for safe printing and general caution, lets memset the


I would prefer the updates to the various allocators, but we may need this as a safety net as well.

Can you cook up a PR?

Actions #11

Updated by Richard McConnell 6 days ago

Hey @Victor Julien, glad I was able to help you reproduce this error. I'll look into this and create the PR! My aim will be to keep what I proposed but additionally add the same functionality to the functions you mentioned. I'll check if similar functions require this update too, and update as appropriate. Lastly, to be safe I'll add the safety net you proposed above, thanks.

Actions #12

Updated by Richard McConnell 5 days ago

Hi @Victor Julien, I've added the PR: https://github.com/OISF/suricata/pull/11090 for your viewing. I believe the only two allocation functions that require updating are HTP/FTP, but I could be wrong! I have added the safety net as you suggested as well. Finally, I have written a unit test to go alongside the changes to show the error reporting is as we expect now. Hopefully it's good to go!

Actions

Also available in: Atom PDF