Bug #3075

RX thread hang in pcap-file mode

Added by WenTan Liu 4 months ago. Updated 28 days ago.

Status: Feedback
Priority: Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Based on Suricata 4.1.4. The RX thread sometimes hangs (usually after about two days), after which it can no longer read pcap files.

gstack RX_thread_id

#0 0x00007f0a98fe8945 in pthread_cond_wait@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005a19c2 in PacketPoolWait () at tmqh-packetpool.c:155
#2 0x000000000058afdd in PcapFileDispatch (ptv=ptv@entry=0x7f0a8f38f2b0) at source-pcap-file-helper.c:135
#3 0x0000000000588a1f in PcapDirectoryDispatchForTimeRange (older_than=0x7f0a09150a0, pv=0x7f0a8c030e70) at source-pcap-file-directory-helper.c:462
#4 PcapDirectoryDispatch (ptv=0x7f0a8c030e70) at source-pcap-file-directory-helper.c:530
#5 0x00000000005860c6 in ReceivePcapFileLoop (tv=<optimized out>, data=0x7f0a8c030db0, slot=<optimized out>) at source-pcap-file.c:177
#6 0x00000000005a5b26 in TmThreadsSlotPktAcqLoop (td=0x9deedc0) at tm-threads.c:356
#7 0x00007f0a98fe4e25 in start_thread () from /lib64/libpthread.so.0
#8 0x00007f0a9869834d in clone () from /lib64/libc.so.6


Files

suricata.yaml (73 KB), WenTan Liu, 07/08/2019 10:22 AM

History

#1

Updated by Andreas Herz 3 months ago

  • Status changed from New to Feedback
  • Assignee set to OISF Dev
  • Target version changed from 4.1.5 to TBD

Can you give us more details about your setup?
(Linux, NIC, configuration, runmode, parameter)

#2

Updated by WenTan Liu 3 months ago

1. CentOS 7.2
2. NIC Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
3. runmode <autofp>
4. suricata -c suricata.yaml -r pcap_file_directory --pcap-file-continuous -l log_dir

#3

Updated by Victor Julien 3 months ago

Are you able to test the current git master? I made some fixes some time ago that might be related.

#4

Updated by Victor Julien 3 months ago

  • Priority changed from High to Normal

#5

Updated by Victor Julien 3 months ago

  • Description updated (diff)

#6

Updated by Feng Dai about 1 month ago

I hit a similar issue in Suricata 4.0.6 during a load test at 400 Mbps for 20 minutes. The RX thread stops receiving packets.

#0  0x00007fb60082f945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000561bdf192b7a in PacketPoolWait ()
#2  0x0000561bdf178bd5 in ReceivePcapLoop ()
#3  0x0000561bdf1975e7 in TmThreadsSlotPktAcqLoop ()
#4  0x00007fb60082be25 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fb60013e34d in clone () from /lib64/libc.so.6

I have a fix that passes my load test. In my opinion, the condition variable was not being used correctly. Please review whether my fix makes sense. Thanks.

diff -Naurp --exclude tags suricata-4.0.6/src/tmqh-packetpool.c suricata-4.0.6-twutm1605/src/tmqh-packetpool.c
--- suricata-4.0.6/src/tmqh-packetpool.c    2018-11-06 03:01:46.000000000 -0600
+++ suricata-4.0.6-twutm1605/src/tmqh-packetpool.c  2019-09-13 15:41:20.673513665 -0500
@@ -149,10 +149,13 @@ void PacketPoolWait(void)
 {
     PktPool *my_pool = GetThreadPacketPool();

-    if (PacketPoolIsEmpty(my_pool)) {
+    if (!my_pool->head) {
+        /* local stack is empty */
         SCMutexLock(&my_pool->return_stack.mutex);
-        SC_ATOMIC_ADD(my_pool->return_stack.sync_now, 1);
-        SCCondWait(&my_pool->return_stack.cond, &my_pool->return_stack.mutex);
+        while (PacketPoolIsEmpty(my_pool)) {
+            SC_ATOMIC_ADD(my_pool->return_stack.sync_now, 1);
+            SCCondWait(&my_pool->return_stack.cond, &my_pool->return_stack.mutex);
+        }
         SCMutexUnlock(&my_pool->return_stack.mutex);
     }

@@ -323,8 +326,8 @@ void PacketPoolReturnPacket(Packet *p)
                 my_pool->pending_tail->next = pool->return_stack.head;
                 pool->return_stack.head = my_pool->pending_head;
                 SC_ATOMIC_RESET(pool->return_stack.sync_now);
-                SCMutexUnlock(&pool->return_stack.mutex);
                 SCCondSignal(&pool->return_stack.cond);
+                SCMutexUnlock(&pool->return_stack.mutex);
                 /* Clear the list of pending packets to return. */
                 my_pool->pending_pool = NULL;
                 my_pool->pending_head = NULL;
@@ -337,8 +340,8 @@ void PacketPoolReturnPacket(Packet *p)
             p->next = pool->return_stack.head;
             pool->return_stack.head = p;
             SC_ATOMIC_RESET(pool->return_stack.sync_now);
-            SCMutexUnlock(&pool->return_stack.mutex);
             SCCondSignal(&pool->return_stack.cond);
+            SCMutexUnlock(&pool->return_stack.mutex);
         }
     }
 }
@@ -395,8 +398,8 @@ void PacketPoolInit(void)
         PacketPoolStorePacket(p);
     }

-    //SCLogInfo("preallocated %"PRIiMAX" packets. Total memory %"PRIuMAX"",
-    //        max_pending_packets, (uintmax_t)(max_pending_packets*SIZE_OF_PACKET));
+    SCLogInfo("preallocated %"PRIiMAX" packets. Total memory %"PRIuMAX"",
+            max_pending_packets, (uintmax_t)(max_pending_packets*SIZE_OF_PACKET));
 }

 void PacketPoolDestroy(void)
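
For reference, the general pattern the patch moves toward can be sketched outside Suricata with plain pthreads. This is only a minimal standalone illustration, not Suricata code: DemoPool, DemoPoolTake, DemoPoolReturn and RunDemo are hypothetical names, and the integer counter stands in for the return stack. The waiter re-checks its predicate in a while loop with the mutex held, because POSIX allows spurious wakeups and a signal sent while nobody is waiting is not queued. Whether this is the actual cause of the hang reported here is not confirmed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool's return stack (not Suricata's PktPool). */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int available; /* predicate state; only read or written with mutex held */
} DemoPool;

static void DemoPoolInit(DemoPool *pool)
{
    pthread_mutex_init(&pool->mutex, NULL);
    pthread_cond_init(&pool->cond, NULL);
    pool->available = 0;
}

/* Producer side: change the predicate and signal while holding the mutex. */
static void DemoPoolReturn(DemoPool *pool)
{
    pthread_mutex_lock(&pool->mutex);
    pool->available++;
    pthread_cond_signal(&pool->cond);
    pthread_mutex_unlock(&pool->mutex);
}

/* Consumer side: loop on the predicate, so a spurious wakeup or a signal
 * that arrived before we started waiting cannot leave us stuck or let us
 * proceed with an empty pool. */
static void DemoPoolTake(DemoPool *pool)
{
    pthread_mutex_lock(&pool->mutex);
    while (pool->available == 0) {
        pthread_cond_wait(&pool->cond, &pool->mutex);
    }
    pool->available--;
    pthread_mutex_unlock(&pool->mutex);
}

typedef struct {
    DemoPool *pool;
    int count;
} ProducerArgs;

static void *Producer(void *arg)
{
    ProducerArgs *a = arg;
    for (int i = 0; i < a->count; i++) {
        DemoPoolReturn(a->pool);
    }
    return NULL;
}

/* Returns `count` items from one thread and takes them from the caller.
 * Returns the number of items taken, or -1 if the thread could not start. */
int RunDemo(int count)
{
    DemoPool pool;
    DemoPoolInit(&pool);

    ProducerArgs args = { &pool, count };
    pthread_t tid;
    if (pthread_create(&tid, NULL, Producer, &args) != 0) {
        return -1;
    }

    int taken = 0;
    for (int i = 0; i < count; i++) {
        DemoPoolTake(&pool);
        taken++;
    }
    pthread_join(tid, NULL);

    assert(pool.available == 0); /* every returned item was consumed */
    pthread_mutex_destroy(&pool.mutex);
    pthread_cond_destroy(&pool.cond);
    return taken;
}
```

Calling RunDemo(1000) from a small main() and building with -pthread should return once all items have been taken. Replacing the while in DemoPoolTake with an if is the variant that can misbehave under load.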

#7

Updated by Andreas Herz 28 days ago

First of all, please test again with a current version; 4.0.6 is rather old.
If you want to submit your patch, please follow the steps at https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Contributing, thanks!
