Support #2720
Status: closed
Suricata + Netmap + FreeBSD -- source of bad pkt netmap errors?
Description
Apologies in advance for my limited understanding of C, memory management, and network programming.
I'm trying to nail down the source of repeated "bad pkt" errors that netmap throws when used by Suricata on FreeBSD. I have resolved this issue in my case by setting appropriate system tunables. However, the issue persists for others, and I cannot explain that given the data/examples/settings they provide.
In these situations netmap_grab_packets() is regularly throwing "bad pkt" with slot->len > 2048 (which is the default dev.netmap.buf_size). The part I don't understand is these examples all have default MTUs of 1500. It would make sense if they had jumbo packets with MTUs of 9000, but they never do. As I read the netmap code, this means that the slot->len should never be above something like 1518 -- certainly lower than 2048. I can't explain how they get slot->len numbers like 2200 or sometimes in the 4000s with an MTU of 1500 (on my machines I can lower the default dev.netmap.buf_size to 1920 without running into these errors).
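The size relationship described above can be sketched in C. This is a simplified model, not netmap source code: `struct model_slot`, `slot_len_fits`, and `count_oversized` are hypothetical names, and the only rule modeled is the one stated in this report, namely that a slot whose length exceeds dev.netmap.buf_size is rejected as a "bad pkt".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one ring slot: only the length field matters here. */
struct model_slot {
    uint16_t len;
};

/* True when the slot's payload fits in one buffer of buf_size bytes. */
static bool slot_len_fits(const struct model_slot *slot, uint32_t buf_size)
{
    return slot->len <= buf_size;
}

/* Count the slots that would be rejected for exceeding buf_size. */
static size_t count_oversized(const struct model_slot *slots, size_t n,
                              uint32_t buf_size)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (!slot_len_fits(&slots[i], buf_size))
            bad++;
    }
    return bad;
}
```

With the lengths mentioned in this report, a 1514-byte frame passes at buf_size 2048, while the reported 2200-byte and 4200-byte slots do not; raising buf_size to 4096 still leaves the 4200-byte slot oversized.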
So maybe this seems like a netmap issue. But when I look into that I see that netmap_grab_packets() is called by netmap_txsync_to_host() which is called by a "user process context" (which I read to be the application using netmap: "Suricata" in this case?).
If I'm following (unlikely), Suricata is calling netmap_txsync_to_host() and providing the ring with the slot that is oversized. Netmap then chokes on a packet that is too large for its buf_size.
I hope the question makes sense. I suppose this could be a driver issue or a FreeBSD issue - maybe the packet is oversized when Suricata gets it -- I can't tell, as I have trouble tracing the flow of data between OS/Driver/Suricata/Netmap.
To head off a few questions:
- users who get oversized packets while running Suricata+netmap are using the same driver and NIC chipsets as me
- checksum offloading and other required settings have been compared and are all "correct"
If you're confident that the oversized packets are coming from elsewhere, I'll try to take the question there. Netmap devs say this isn't their issue (increase dev.netmap.buf_size). That works, but it doesn't explain why the MTU and packet sizes aren't respected. A user moved dev.netmap.buf_size to 4096 only to have bad pkt errors with Suricata running with slot->len > 4200.
Updated by Victor Julien almost 6 years ago
Can you give a bit more background on how you run suri (netmap section from your yaml, commandline) and what your network setup is?
Probably already checked, but when I read packet sizes > mtu, the first thing I think of is offloading like gro. Did you disable these types of offloading?
Updated by booble tins almost 6 years ago
Victor Julien wrote:
> Can you give a bit more background on how you run suri (netmap section from your yaml, commandline) and what your network setup is?
from suricata.yaml:
    # IPS Mode Configuration
    # Netmap
    netmap:
     - interface: default
       threads: auto
       copy-mode: ips
       disable-promisc: no
       checksum-checks: auto
     - interface: igb0
       copy-iface: igb0+
     - interface: igb0+
       copy-iface: igb0
Command Line:
/usr/local/bin/suricata --netmap -D -c /usr/local/etc/suricata/suricata_igb0/suricata.yaml --pidfile /var/run/suricata_igb0.pid
Network Setup:
igb0 is the WAN interface and is configured up with:
    ifconfig igb0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
output from ifconfig igb0:
    igb0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 *mtu 1500* options=1000b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,NETMAP>
Flow control on all interfaces is disabled.
Using "route get [any WAN/LAN ip]" I see an mtu of 1500 everywhere except on link-local (which shows as 16384).
> Probably already checked, but when I read packet sizes > mtu, the first thing I think of is offloading like gro. Did you disable these types of offloading?
I believe so, yes.
This isn't a huge priority. Setting dev.netmap.buf_size higher than the slot->len reported in the bad pkt errors will resolve the netmap errors, but I can't figure out how and where the slot->len that netmap receives is getting to be greater than 2048 for these users.
As I read source_netmap.c from Suricata, moving packets around in copy mode seems to be pretty straightforward:
    else {
        unsigned char *slot_data = (unsigned char *)NETMAP_BUF(txring->tx, ts->buf_idx);
        memcpy(slot_data, GET_PKT_DATA(p), GET_PKT_LEN(p));
        ts->len = GET_PKT_LEN(p);
        ts->flags |= NS_BUF_CHANGED;
    }
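For comparison, here is a minimal sketch of what a length-checked version of that copy could look like. It is an illustration only, not Suricata or netmap code: `struct model_tx_slot`, `MODEL_BUF_CHANGED`, and `copy_pkt_checked` are hypothetical stand-ins, and the buffer size is passed in explicitly rather than read from netmap. Whether Suricata performs an equivalent check elsewhere is not established in this thread.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a TX slot; not the real struct netmap_slot. */
struct model_tx_slot {
    uint16_t len;
    uint16_t flags;
};

/* Hypothetical flag value, used only for this illustration. */
#define MODEL_BUF_CHANGED 0x1

/*
 * Copy pkt into slot_data only if it fits in a buffer of buf_size bytes
 * and its length can be stored in the 16-bit slot length field.
 * Returns false and leaves the slot untouched otherwise.
 */
static bool copy_pkt_checked(struct model_tx_slot *ts, unsigned char *slot_data,
                             uint32_t buf_size, const unsigned char *pkt,
                             uint32_t pkt_len)
{
    if (pkt_len > buf_size || pkt_len > UINT16_MAX)
        return false;

    memcpy(slot_data, pkt, pkt_len);
    ts->len = (uint16_t)pkt_len;
    ts->flags |= MODEL_BUF_CHANGED;
    return true;
}
```

With a check like this, an oversized packet would be refused at the copy step instead of reaching the ring with a slot length larger than the buffer.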
Maybe if I ask the question this way: if I have an MTU of 1500 and add Ethernet framing of 18 bytes (14-byte header plus 4-byte FCS; 22 bytes with a VLAN tag), then should I be able to set dev.netmap.buf_size=1518 and expect Suricata+Netmap to work if offloading is disabled? Or is Suricata adding data anywhere?
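The frame-size arithmetic behind that question can be written out as a small C sketch. The constants are the standard Ethernet values (14-byte header, 4-byte 802.1Q tag, 4-byte FCS); `max_frame_len` and the `MODEL_` names are hypothetical, and whether the FCS is counted in slot->len depends on the driver, so it is left as a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    MODEL_ETH_HEADER_LEN = 14, /* destination MAC + source MAC + EtherType */
    MODEL_VLAN_TAG_LEN   = 4,  /* one 802.1Q tag */
    MODEL_ETH_FCS_LEN    = 4   /* frame check sequence, often stripped by the NIC */
};

/* Largest frame, in bytes, expected on a link with the given MTU. */
static uint32_t max_frame_len(uint32_t mtu, bool vlan, bool with_fcs)
{
    uint32_t len = mtu + MODEL_ETH_HEADER_LEN;
    if (vlan)
        len += MODEL_VLAN_TAG_LEN;
    if (with_fcs)
        len += MODEL_ETH_FCS_LEN;
    return len;
}
```

For an MTU of 1500 this gives 1514 bytes without the FCS, 1518 with it, and 1522 with both a VLAN tag and the FCS. All three are well under the default buf_size of 2048, which is why slot lengths of 2200 or more are unexpected without some form of offloading or reassembly.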
Or maybe asking the question a slightly different way: do we know why the default dev.netmap.buf_size=2048 and not something closer to the typical MTU of 1500? I recognize that's a question for the netmap or FreeBSD team, but I'm trying to get a handle on the general idea.
Updated by Victor Julien over 5 years ago
There have been fixes to Suricata 4.1.x, and in the master branch netmap support has been rewritten. Are you still seeing this with 4.1.3 or the master branch?
Updated by Andreas Herz over 5 years ago
- Assignee set to booble tins
- Target version set to Support
Updated by Victor Julien almost 5 years ago
- Status changed from Feedback to Closed