Engine unable to disable detect thread, Killing engine. (in libpcap mode)
When I terminate Suricata (libpcap mode, --pcap=iface) while the packet capture thread is busy (100% CPU), the main thread has to kill the capture thread after a timeout.
It would be better to implement the PktAcqBreakLoop handler so that it calls pcap_breakloop internally; I will propose a pull request later.
The error message is attached.
 20/12/2018 -- 10:41:58 - (runmode-pcap.c:257) <Info> (RunModeIdsPcapSingle) -- RunModeIdsPcapSingle initialised
 20/12/2018 -- 10:41:58 - (tm-threads.c:2172) <Notice> (TmThreadWaitOnThreadInit) -- all 1 packet processing threads, 4 management threads initialized, engine started.
^C23694 20/12/2018 -- 10:42:13 - (suricata.c:2847) <Notice> (SuricataMainLoop) -- Signal Received. Stopping engine.
[a long time to wait, 1 minute]
 20/12/2018 -- 10:43:14 - (tm-threads.c:1578) <Error> (TmThreadDisableReceiveThreads) -- [ERRCODE: SC_ERR_FATAL(171)] - Engine unable to disable detect thread - "W#01-*****". Killing engine
The main reason is that the PktAcqBreakLoop handler is set to NULL at src/source-pcap.c:117.
I will submit a pull request to fix it.
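To illustrate the idea (this is only a model, not the patch itself), the sketch below shows how a break-loop handler lets a busy capture loop exit promptly. It does not link against libpcap: FakeCapture, FakeBreakLoop, FakeDispatch, FakePcapThreadVars and FakePcapBreakLoop are hypothetical stand-ins. The real handler would call pcap_breakloop() on the pcap_t handle, and the callback signature used here is an assumption, not Suricata's actual PktAcqBreakLoop prototype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the value of libpcap's PCAP_ERROR_BREAK. */
#define FAKE_ERROR_BREAK (-2)

/* Hypothetical stand-in for an opaque pcap_t handle. */
typedef struct {
    volatile bool break_requested; /* the flag pcap_breakloop() would set */
    int packets_seen;              /* packets handled so far */
} FakeCapture;

/* Hypothetical stand-in for the per-thread capture state. */
typedef struct {
    FakeCapture *handle;
} FakePcapThreadVars;

/* Stand-in for pcap_breakloop(): it only sets a flag. */
void FakeBreakLoop(FakeCapture *cap)
{
    cap->break_requested = true;
}

/* Assumed shape of a break-loop handler: ask the capture loop to stop. */
int FakePcapBreakLoop(void *data)
{
    FakePcapThreadVars *ptv = data;
    if (ptv == NULL || ptv->handle == NULL)
        return -1;
    FakeBreakLoop(ptv->handle);
    return 0;
}

/* Stand-in for pcap_dispatch() on a busy-polling backend: packets are
 * always available, so without a break request the loop only returns
 * after max_cnt packets. */
int FakeDispatch(FakeCapture *cap, int max_cnt)
{
    assert(cap != NULL);
    int handled = 0;
    while (handled < max_cnt) {
        if (cap->break_requested) {
            cap->break_requested = false; /* flag is cleared on return */
            return handled > 0 ? handled : FAKE_ERROR_BREAK;
        }
        cap->packets_seen++;
        handled++;
    }
    return handled;
}
```

With the handler left as NULL there is nothing that can set the flag, so the loop in this model can only end by reaching max_cnt.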
#2 Updated by Victor Julien about 1 month ago
- Affected Versions deleted (
The pcap dispatch function should not block for long because we set a timeout using pcap_set_timeout. The 100% CPU also suggests it's not libpcap blocking, but something else. It would be interesting to check what this thread is doing during shutdown that is taking so long. Perhaps attaching to it with perf or gdb could give some more insight.
#3 Updated by jingyu YANG about 1 month ago
Thank you for your reply.
I agree that pcap_set_timeout() could avoid a hang on quit, and it is OK to reject this pull request.
However, I would still prefer to have PktAcqBreakLoop call pcap_breakloop so that the thread quits explicitly instead of waiting for a timeout, because in my situation libpcap is not running in its default mode but in DPDK mode.
More background information follows.
1. The 100% CPU usage is not the problem here. The main issue is that the main thread has to wait a long time (1 minute) before it kills the capture thread.
2. The background is that I would like to enable DPDK (https://www.dpdk.org/) for Suricata. Instead of implementing src/source-dpdk.c directly in Suricata, I prefer to implement DPDK support in libpcap first, and then use libpcap mode (--pcap=dpdk:0) in Suricata. This is the pull request: https://github.com/the-tcpdump-group/libpcap/pull/790
3. In this case (--pcap=dpdk:0), DPDK binds one CPU lcore to the capture thread and keeps that core at 100%. This is normal for a DPDK use case, and if the main thread wants to quit, pcap_breakloop() needs to be called explicitly.
4. Currently, in the DPDK mode of libpcap, pcap_dispatch() returns only once max_cnt packets have been processed.
5. Regarding the pcap_set_timeout() call that you mentioned: according to the libpcap documentation (https://linux.die.net/man/3/pcap_set_timeout), this parameter only affects the read timeout. In Suricata's libpcap mode, pcap_dispatch() returns if no further packet is received within LIBPCAP_COPYWAIT (500 ms). Since Suricata calls pcap_dispatch() with a max_cnt of 64, we may have to wait up to 500 ms * 64 = 32 s if packets arrive one at a time at intervals just under 500 ms. I think 32 s is also too long to wait.
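The estimate in item 5 can be written out as a small helper. This is only the arithmetic under the model described above (each packet arrives just before the read timeout expires, so every one of the max_cnt packets costs almost a full timeout). WorstCaseDispatchWaitMs is a hypothetical name, not a Suricata or libpcap function.

```c
#include <assert.h>

/* Upper bound, in milliseconds, on how long one dispatch call can take
 * when it needs max_cnt packets and each packet arrives just before the
 * read timeout of read_timeout_ms expires. */
long WorstCaseDispatchWaitMs(int max_cnt, long read_timeout_ms)
{
    assert(max_cnt >= 0 && read_timeout_ms >= 0);
    return (long)max_cnt * read_timeout_ms;
}
```

For the values mentioned above, WorstCaseDispatchWaitMs(64, 500) gives 32000 ms, i.e. the 32 s quoted in item 5.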
Thank you for your review.
Any feedback is welcome.