Bug #2220

When running on a single-CPU machine, pcap processing takes a long time

Added by Ralph Broenink about 5 years ago. Updated 10 months ago.

Status: In Review
Priority: Normal
Assignee: -
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

I was attempting to batch-process a list of PCAPs, but I noticed the speed was very low on one of my worker machines. It is a virtual machine with only a single CPU. Increasing the number of CPUs resolves the issue.

I started Suricata in socket mode:
suricata --unix-socket=/tmp/suricata/socket --pidfile=/tmp/suricata/pid -S /tmp/suricata/rules

The rules file was either empty or contained a single simple rule.

Then I submit a lot of PCAPs to process using a very simple Python script, interrupting it with Ctrl+C after a while:

import suricatasc
s = suricatasc.SuricataSC("/tmp/suricata/socket")
s.connect()
while True:
    s.send_command("pcap-file", {"filename": "empty.pcap", "output-dir": "/tmp/suricata/"})

On a machine with 1 CPU and 1 core, it takes almost exactly 1 second to process a single PCAP file, even if it is empty. The script above manages to send about two commands per second to the socket, so after running it for 30 s, 60 pcaps will have been submitted (and 30 will still need processing).

On a machine with more CPU threads, the above works very fast.
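To quantify the per-call latency described above, each send_command call can be timed individually. Below is a small measuring helper (my own sketch, not part of suricatasc); the lambda stands in for the real `s.send_command(...)` call so the snippet runs without a live socket:

```python
import time

def time_calls(fn, n):
    """Call fn() n times, returning per-call latencies in seconds."""
    durations = []
    for _ in range(n):
        start = time.monotonic()
        fn()
        durations.append(time.monotonic() - start)
    return durations

# Stand-in for s.send_command(...): a 10 ms sleep. On the affected
# single-CPU machine, the real call clocks in at roughly 1 s instead.
samples = time_calls(lambda: time.sleep(0.01), 5)
print("max latency: %.3f s" % max(samples))
```

Replacing the lambda with `lambda: s.send_command("pcap-file", ...)` reproduces the ~1 s-per-pcap measurement on the single-CPU machine.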

This seems to be an issue with the threading library or something similar. Since my testing showed processing taking almost exactly 1 second per PCAP, I suspected a stray sleep somewhere, but couldn't find it.

This issue occurs with at least Suricata 3.0, 3.2 and 4.0.


Related issues (1 open, 0 closed)

Related to Bug #2221: Suricata batch processing slowed down by 0.2s intervals (Status: New; Assignee: OISF Dev)
#1

Updated by Peter Manev about 5 years ago

Do you experience the same if you try runmode single?

#2

Updated by Mathijs van de Nes about 5 years ago

Peter Manev wrote:

Do you experience the same if you try runmode single?

That does not seem to change anything.

It appears that the flow manager is delaying the analysis. When I reduce the FLOW_NORMAL_MODE_UPDATE_DELAY_* timeouts in flow-manager.c to 1 µs instead of the default of 1 s, my setup runs 33 times as fast on a single CPU (1000 samples in 30 seconds instead of just 30).

This is just a workaround, but I hope it can point someone who is more familiar with Suricata in the right direction.
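The wait Mathijs shortened can be modelled with an ordinary condition variable. The snippet below is a Python sketch of the pattern (an analogy, not the real C code in flow-manager.c): with nothing signalling the condition, every pass through the management loop costs the full FLOW_NORMAL_MODE_UPDATE_DELAY-style timeout.

```python
import threading
import time

# Rough Python model (an analogy, not Suricata's C code) of the flow
# manager's loop: each pass waits on a condition variable with a fixed
# timeout, playing the role of FLOW_NORMAL_MODE_UPDATE_DELAY_*.
cond = threading.Condition()

def timed_pass(delay):
    """One management pass: wait up to `delay` seconds for a signal."""
    start = time.monotonic()
    with cond:
        cond.wait(timeout=delay)  # nothing signals, so the full delay elapses
    return time.monotonic() - start

print("pass took %.2f s" % timed_pass(0.2))  # roughly the whole timeout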

#3

Updated by Andreas Herz about 5 years ago

  • Assignee set to OISF Dev
  • Target version set to TBD
#4

Updated by Danny Browning almost 5 years ago

The flow manager will encounter this wait at https://github.com/OISF/suricata/blob/master/src/flow-manager.c#L790

Normally the condition variable would be woken up by https://github.com/OISF/suricata/blob/master/src/flow-manager.c#L143; however, the unix socket doesn't hit this code path, as a result of https://github.com/OISF/suricata/blob/master/src/suricata.c#L2290.

This means that at the end of a unix-socket run, the flow manager has to wait out the time remaining on the condition variable. I suggest adding a WakeupFlowManagerThread function to the flow manager and calling it from https://github.com/OISF/suricata/blob/master/src/runmode-unix-socket.c#L379 or https://github.com/OISF/suricata/blob/master/src/suricata.c#L2290, so that processing is not stuck waiting for the flow manager's condition variable to time out.
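The suggested wakeup function would effectively signal that condition variable so the timed wait returns early. A minimal Python sketch of the idea (names and structure are illustrative, not Suricata's actual API):

```python
import threading
import time

cond = threading.Condition()
elapsed = []

def flow_manager_wait():
    """Stand-in for the flow manager's 1 s timed wait."""
    start = time.monotonic()
    with cond:
        cond.wait(timeout=1.0)
    elapsed.append(time.monotonic() - start)

def wakeup_flow_manager():
    """Sketch of the proposed wakeup: just signal the condition."""
    with cond:
        cond.notify()

t = threading.Thread(target=flow_manager_wait)
t.start()
time.sleep(0.05)          # give the waiter time to block on cond.wait()
wakeup_flow_manager()     # the wait should now return well before 1 s
t.join()
print("woke after %.2f s" % elapsed[0])
```

Because the wait is cut short by the signal, the waiter returns after roughly 0.05 s instead of the full 1 s timeout.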

#5

Updated by Danny Browning almost 5 years ago

  • Status changed from New to Assigned
  • Assignee changed from OISF Dev to Danny Browning
#7

Updated by Danny Browning almost 5 years ago

  • Assignee changed from Danny Browning to Ralph Broenink

Ralph, can you verify that the slowness is not seen with this PR?

#8

Updated by Mathijs van de Nes almost 5 years ago

Danny Browning wrote:

Ralph can you verify that slowness is not seen in this PR?

Unfortunately this does not seem to affect the behaviour.
With one CPU core the interval is still 1 second, even though the wake-up function is called.
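One possible explanation (speculative, not confirmed in this thread) for why the wake-up call didn't help is the classic lost-wakeup pitfall: if the signal arrives while the flow manager is not yet blocked in its wait, a bare notify is discarded, and the next wait still sleeps for the full timeout. A Python sketch of that behaviour:

```python
import threading
import time

cond = threading.Condition()

# Signal while nobody is waiting: a bare notify on a condition
# variable is simply discarded.
with cond:
    cond.notify()

start = time.monotonic()
with cond:
    cond.wait(timeout=0.3)  # the earlier notify is lost; full timeout elapses
elapsed = time.monotonic() - start
print("waited %.2f s despite the earlier notify" % elapsed)
```

If this is what happens in Suricata, the wakeup would need to pair the notify with a flag (or counter) that the waiter checks before blocking.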

#9

Updated by Andreas Herz over 3 years ago

  • Related to Bug #2221: Suricata batch processing slowed down by 0.2s intervals added
#10

Updated by Andreas Herz 10 months ago

  • Status changed from Assigned to Closed

More or less the exact same thing as #2221

#11

Updated by Ralph Broenink 10 months ago

Though the issues look very similar, the issue described in #2221 is that the socket has a 0.2 s read timeout. This issue concerns the fact that processing in general is very slow; it is not about the socket integration, though it can be reproduced in a similar way.

I feel that two separate issues are warranted, especially given that some work was already done on this one, and that it should be reopened.

#12

Updated by Ralph Broenink 10 months ago

  • Status changed from Closed to In Review
  • Assignee deleted (Ralph Broenink)