Hi, when Suricata runs offline pcap analysis and is configured to write eve-log events to a UNIX stream socket, some events can be lost.
Find attached a pcap with a lot of DNS events (malware generated). First I wrote a UNIX stream socket server in Python to read the eve-log events and, surprisingly, some events were lost: Suricata was writing to the socket a lot quicker than my code was reading from it, so the "send()" call returned an EAGAIN error.
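For reference, a reader of this kind can be sketched roughly as follows. This is not the server I actually used, just a minimal illustration; it assumes the events arrive as newline-delimited JSON, and the names serve_eve_events, read_events and handle_event are made up for the example.

```python
import json
import os
import socket


def read_events(conn, handle_event):
    """Read newline-delimited JSON records from conn until the peer closes."""
    buffer = b""
    while True:
        chunk = conn.recv(65536)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                handle_event(json.loads(line))


def serve_eve_events(path, handle_event):
    """Listen on a UNIX stream socket and process a single writer connection."""
    if os.path.exists(path):
        os.unlink(path)
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(path)
        server.listen(1)
        conn, _ = server.accept()
        with conn:
            read_events(conn, handle_event)
```

If handle_event is slow (JSON parsing plus whatever processing follows), the socket buffer fills up and the writer starts seeing EAGAIN, which is the situation described above.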
After that, instead of coding a UNIX server in C, I used the socat utility, and unfortunately the same behaviour was observed.
IMHO a minimal wait would be sufficient when using a UNIX socket for eve log events.
I'm going to send a PR to the GitHub repository with a new eve-log option named 'unix-retry-wait', which sets the number of microseconds to wait before retrying a write on a UNIX stream socket whose write queue is full.
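To illustrate the idea behind 'unix-retry-wait', here is a rough Python sketch of a retry-with-wait write on a non-blocking socket. It is not the code from the PR; send_with_retry and the max_retries bound are assumptions added for the example.

```python
import socket
import time


def send_with_retry(sock, data, retry_wait_us, max_retries=1000):
    """Send all of data on a non-blocking socket, sleeping on EAGAIN.

    Returns True if everything was written, False if the retry budget ran
    out. max_retries is a hypothetical safety bound, not part of the option.
    """
    view = memoryview(data)
    retries = 0
    while view:
        try:
            sent = sock.send(view)
            view = view[sent:]
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK: write queue is full
            if retries >= max_retries:
                return False
            retries += 1
            time.sleep(retry_wait_us / 1_000_000)
    return True
```

Note that giving up after a partial write would leave a truncated record on the stream, so a real implementation has to decide how to handle that case.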
I think a better fix would be to allow blocking when not running live. This would mean no lost events due to a blocking socket when using pcaps and no configuration change required between live and file reading.
I don't think any wait is acceptable in live mode, as it does hold up processing packets.
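The mode-dependent blocking suggested here could look something like the sketch below. It is only an illustration in Python, not Suricata code; connect_eve_socket and the live flag are hypothetical names.

```python
import socket


def connect_eve_socket(path, live):
    """Connect to the reader's UNIX stream socket.

    live=True  -> non-blocking: a full write queue raises BlockingIOError
                  (EAGAIN) and the caller drops the event instead of stalling
                  packet processing.
    live=False -> blocking: send() waits for the reader, so reading a pcap
                  loses no events.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    sock.setblocking(not live)
    return sock
```

With this approach the same configuration works for both live capture and pcap reading, since the run mode alone decides the socket behaviour.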
I get your point and it's a really good one but I'm not totally convinced. :)
Why not let the user choose if waiting is acceptable or not?
Some will prefer to wait while others won't. In any case, I think the user is the most appropriate one to decide what they really need, and that's the point of the configuration option.
I think a better fix would be to keep using non-blocking sockets and apply the wait only when not in live mode. IMHO it's a more balanced choice.
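As a rough sketch of that compromise (again in Python and purely illustrative; write_event, live and retry_wait_us are assumed names), the socket stays non-blocking in both modes and only the reaction to EAGAIN changes:

```python
import socket
import time


def write_event(sock, data, live, retry_wait_us):
    """Write one event on a non-blocking socket.

    In live mode a full write queue drops the event immediately. In offline
    mode the writer sleeps retry_wait_us microseconds and tries again until
    the whole event has been written.
    Returns True if the event was written, False if it was dropped.
    """
    view = memoryview(data)
    while view:
        try:
            view = view[sock.send(view):]
        except BlockingIOError:  # EAGAIN: write queue is full
            if live:
                return False
            time.sleep(retry_wait_us / 1_000_000)
    return True
```

In offline mode this loops for as long as the reader does not drain the socket, so a real implementation would probably want an upper bound or a shutdown check.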