Bug #3075

closed

RX thread hang in pcap-file mode

Added by WenTan Liu about 3 years ago. Updated 8 months ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Based on Suricata 4.1.4. The RX thread sometimes hangs (consistently after about two days of running), and once it hangs it can no longer read pcap files.

gstack RX_thread_id

#0 0x00007f0a98fe8945 in pthread_cond_wait@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005a19c2 in PacketPoolWait () at tmqh-packetpool.c:155
#2 0x000000000058afdd in PcapFileDispatch (ptv=ptv@entry=0x7f0a8f38f2b0) at source-pcap-file-helper.c:135
#3 0x0000000000588a1f in PcapDirectoryDispatchForTimeRange (older_than=0x7f0a09150a0, pv=0x7f0a8c030e70) at source-pcap-file-directory-helper.c:462
#4 PcapDirectoryDispatch (ptv=0x7f0a8c030e70) at source-pcap-file-directory-helper.c:530
#5 0x00000000005860c6 in ReceivePcapFileLoop (tv=<optimized out>, data=0x7f0a8c030db0, slot=<optimized out>) at source-pcap-file.c:177
#6 0x00000000005a5b26 in TmThreadsSlotPktAcqLoop (td=0x9deedc0) at tm-threads.c:356
#7 0x00007f0a98fe4e25 in start_thread () from /lib64/libpthread.so.0
#8 0x00007f0a9869834d in clone () from /lib64/libc.so.6


Files

suricata.yaml (73 KB) WenTan Liu, 07/08/2019 10:22 AM
Actions #1

Updated by Andreas Herz about 3 years ago

  • Status changed from New to Feedback
  • Assignee set to OISF Dev
  • Target version changed from 4.1.5 to TBD

Can you give us more details about your setup?
(Linux, NIC, configuration, runmode, parameters)

Actions #2

Updated by WenTan Liu about 3 years ago

1. CentOS 7.2
2. NIC Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
3. runmode <autofp>
4. suricata -c suricata.yaml -r pcap_file_directory --pcap-file-continuous -l log_dir

Actions #3

Updated by Victor Julien about 3 years ago

Are you able to test the current git master? I made some fixes some time ago that might be related.

Actions #4

Updated by Victor Julien about 3 years ago

  • Priority changed from High to Normal
Actions #5

Updated by Victor Julien about 3 years ago

  • Description updated (diff)
Actions #6

Updated by Feng Dai about 3 years ago

I hit a similar issue in Suricata 4.0.6 during a load test at 400 Mbps for 20 minutes. The RX thread stops receiving packets.

#0  0x00007fb60082f945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000561bdf192b7a in PacketPoolWait ()
#2  0x0000561bdf178bd5 in ReceivePcapLoop ()
#3  0x0000561bdf1975e7 in TmThreadsSlotPktAcqLoop ()
#4  0x00007fb60082be25 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fb60013e34d in clone () from /lib64/libc.so.6

I have a fix that lets my load test pass. In my opinion, the condition variable was not being used correctly. Please review whether my fix makes sense. Thanks.

diff -Naurp --exclude tags suricata-4.0.6/src/tmqh-packetpool.c suricata-4.0.6-twutm1605/src/tmqh-packetpool.c
--- suricata-4.0.6/src/tmqh-packetpool.c    2018-11-06 03:01:46.000000000 -0600
+++ suricata-4.0.6-twutm1605/src/tmqh-packetpool.c  2019-09-13 15:41:20.673513665 -0500
@@ -149,10 +149,13 @@ void PacketPoolWait(void)
 {
     PktPool *my_pool = GetThreadPacketPool();

-    if (PacketPoolIsEmpty(my_pool)) {
+    if (!my_pool->head) {
+        /* local stack is empty */
         SCMutexLock(&my_pool->return_stack.mutex);
-        SC_ATOMIC_ADD(my_pool->return_stack.sync_now, 1);
-        SCCondWait(&my_pool->return_stack.cond, &my_pool->return_stack.mutex);
+        while (PacketPoolIsEmpty(my_pool)) {
+            SC_ATOMIC_ADD(my_pool->return_stack.sync_now, 1);
+            SCCondWait(&my_pool->return_stack.cond, &my_pool->return_stack.mutex);
+        }
         SCMutexUnlock(&my_pool->return_stack.mutex);
     }

@@ -323,8 +326,8 @@ void PacketPoolReturnPacket(Packet *p)
                 my_pool->pending_tail->next = pool->return_stack.head;
                 pool->return_stack.head = my_pool->pending_head;
                 SC_ATOMIC_RESET(pool->return_stack.sync_now);
-                SCMutexUnlock(&pool->return_stack.mutex);
                 SCCondSignal(&pool->return_stack.cond);
+                SCMutexUnlock(&pool->return_stack.mutex);
                 /* Clear the list of pending packets to return. */
                 my_pool->pending_pool = NULL;
                 my_pool->pending_head = NULL;
@@ -337,8 +340,8 @@ void PacketPoolReturnPacket(Packet *p)
             p->next = pool->return_stack.head;
             pool->return_stack.head = p;
             SC_ATOMIC_RESET(pool->return_stack.sync_now);
-            SCMutexUnlock(&pool->return_stack.mutex);
             SCCondSignal(&pool->return_stack.cond);
+            SCMutexUnlock(&pool->return_stack.mutex);
         }
     }
 }
@@ -395,8 +398,8 @@ void PacketPoolInit(void)
         PacketPoolStorePacket(p);
     }

-    //SCLogInfo("preallocated %"PRIiMAX" packets. Total memory %"PRIuMAX"",
-    //        max_pending_packets, (uintmax_t)(max_pending_packets*SIZE_OF_PACKET));
+    SCLogInfo("preallocated %"PRIiMAX" packets. Total memory %"PRIuMAX"",
+            max_pending_packets, (uintmax_t)(max_pending_packets*SIZE_OF_PACKET));
 }

 void PacketPoolDestroy(void)
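
For reference, the condition-variable pattern the patch moves toward can be sketched outside Suricata. This is only a minimal illustration using plain pthreads, not Suricata code: DemoPool, DemoPoolTake, DemoPoolReturn, DemoReturner, DemoRun and DEMO_ITEMS are hypothetical names made up for the example. It shows the two changes the diff makes: the waiter tests its predicate in a while loop while holding the mutex, and the returning side signals before releasing the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_ITEMS 1000 /* hypothetical item count for the demo */

/* Hypothetical stand-in for a pool's return stack; not Suricata's PktPool. */
typedef struct DemoPool {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int available; /* items ready to be taken; protected by mutex */
} DemoPool;

/* Block until at least one item is available, then take it. */
static void DemoPoolTake(DemoPool *pool)
{
    pthread_mutex_lock(&pool->mutex);
    /* The predicate is tested under the mutex and re-tested after every
     * wakeup, so a spurious wakeup or a signal that arrives before the
     * wait starts cannot leave this thread sleeping on a non-empty pool. */
    while (pool->available == 0) {
        pthread_cond_wait(&pool->cond, &pool->mutex);
    }
    assert(pool->available > 0);
    pool->available--;
    pthread_mutex_unlock(&pool->mutex);
}

/* Return one item and wake a waiter. The predicate changes under the
 * mutex, and the signal is sent before unlocking, as in the diff above. */
static void DemoPoolReturn(DemoPool *pool)
{
    pthread_mutex_lock(&pool->mutex);
    pool->available++;
    pthread_cond_signal(&pool->cond);
    pthread_mutex_unlock(&pool->mutex);
}

/* Thread body: hands DEMO_ITEMS items back to the pool. */
static void *DemoReturner(void *arg)
{
    DemoPool *pool = arg;
    for (int i = 0; i < DEMO_ITEMS; i++) {
        DemoPoolReturn(pool);
    }
    return NULL;
}

/* Take DEMO_ITEMS items while another thread returns them.
 * Returns the number of items left in the pool, or -1 on error. */
int DemoRun(void)
{
    DemoPool pool;
    pthread_t returner;
    int left;

    pthread_mutex_init(&pool.mutex, NULL);
    pthread_cond_init(&pool.cond, NULL);
    pool.available = 0;

    if (pthread_create(&returner, NULL, DemoReturner, &pool) != 0) {
        return -1;
    }
    for (int i = 0; i < DEMO_ITEMS; i++) {
        DemoPoolTake(&pool);
    }
    pthread_join(returner, NULL);

    left = pool.available;
    pthread_cond_destroy(&pool.cond);
    pthread_mutex_destroy(&pool.mutex);
    return left;
}
```

Calling DemoRun() from a small main() (built with -pthread) should finish without hanging and return 0. POSIX allows pthread_cond_signal() to be called with or without the mutex held, provided the predicate itself is changed under the mutex, so the part that matters most for avoiding a lost wakeup is the locked while loop on the waiting side.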

Actions #7

Updated by Andreas Herz about 3 years ago

First of all, please test it again with a current version; 4.0.6 is rather old.
If you want to submit your patch, please follow the steps at https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Contributing, thanks!

Actions #8

Updated by Andreas Herz 8 months ago

  • Status changed from Feedback to Closed

Hi, we're closing this issue since there have been no further responses.
If you think this issue is still relevant, try to test it again with the
most recent version of Suricata and reopen the issue. If you want to
improve the bug report please take a look at
https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Reporting_Bugs
