Bug #1805

closed

pfring: zero copy broken

Added by Victor Julien almost 9 years ago. Updated almost 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
Affected Versions:
Effort:
Difficulty:
Label:

Description

It appears that in certain setups, using PF_RING with multiple threads and zero copy mode is broken.

My test is simple: I blast ~9.6Gbps at the affected system, and in some runs it eventually crashes.

I have made a test that triggers the issue very quickly. In our 'Packet' structure we have a pointer to the position in the packet where the ethernet header starts. I can see that the data there gets corrupted in some cases.

So the test I added does this:

Next to the pointer, I added a static data structure that holds the contents of the ethernet header. During ethernet layer decoding I copy the data from the pointer into the static struct. Then, just before the end of the packet's life inside Suricata (so before the next pfring_recv call on that thread), I compare the data the pointer points to with my static copy. If they differ, I abort.

This test can be found here https://github.com/inliniac/suricata/pull/2144/files
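
The check described above can be sketched roughly as follows. This is not the code from the pull request, only a minimal illustration of the idea. The DemoPacket type, the demo_* function names and the 14-byte ETH_HDR_LEN constant are hypothetical stand-ins and are not Suricata identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed header size: dst MAC (6) + src MAC (6) + EtherType (2). */
#define ETH_HDR_LEN 14

/* Hypothetical stand-in for the real Packet structure; only the
 * two fields needed for the check are shown. */
typedef struct DemoPacket {
    const uint8_t *ethh;             /* points into the capture buffer (zero copy) */
    uint8_t ethh_copy[ETH_HDR_LEN];  /* private snapshot taken at decode time */
} DemoPacket;

/* At ethernet decode time: remember the pointer and snapshot the header. */
static void demo_decode_ethernet(DemoPacket *p, const uint8_t *frame)
{
    assert(p != NULL && frame != NULL);
    p->ethh = frame;
    memcpy(p->ethh_copy, frame, ETH_HDR_LEN);
}

/* Just before the packet is released: returns true if the bytes behind
 * the pointer still match the snapshot taken at decode time. */
static bool demo_ethernet_intact(const DemoPacket *p)
{
    assert(p != NULL && p->ethh != NULL);
    return memcmp(p->ethh, p->ethh_copy, ETH_HDR_LEN) == 0;
}
```

In the test described here a mismatch leads to an abort. The sketch returns a bool instead, so the caller decides how to react.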

When using more than one thread, it blows up within a minute. With one thread it appears to work correctly, even when running for a long time.

On manual inspection I can see that the 'static' copy of the ethernet header is correct: it contains the proper eth_type. The packet has also been decoded correctly at the higher levels, which proves that the data behind the pointer was correct at one point in time as well. However, in this test the pointer to the ethernet header shows junk values.

I suspect there is some synchronization issue in the kernel/pfring module/driver.

On the same hardware, running the same test, both AF_PACKET (v3) and NETMAP behave correctly.

Setup:

Intel X710:

# ethtool -i ens2f1
driver: i40e
version: 1.4.25-k
firmware-version: 4.53 0x8000206e 0.0.0
expansion-rom-version:
bus-info: 0000:0f:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

It's an older (Nehalem) 4-core Xeon with Hyper-Threading:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 26
Model name:            Intel(R) Xeon(R) CPU           W3550  @ 3.07GHz

8 RSS queues:

[    0.869890] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.4.25-k
[    0.869892] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[    0.885006] i40e 0000:0f:00.0: fw 4.40.35115 api 1.4 nvm 4.53 0x8000206e 0.0.0
[    0.989150] i40e 0000:0f:00.0: MAC address: xxx
[    0.993134] i40e 0000:0f:00.0: SAN MAC: xxx
[    1.673081] i40e 0000:0f:00.0: PCI-Express: Speed 5.0GT/s Width x8
[    1.673084] i40e 0000:0f:00.0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[    1.673086] i40e 0000:0f:00.0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[    1.679122] i40e 0000:0f:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 8 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[    1.693104] i40e 0000:0f:00.1: fw 4.40.35115 api 1.4 nvm 4.53 0x8000206e 0.0.0
[    1.795281] i40e 0000:0f:00.1: MAC address: xxx
[    1.799253] i40e 0000:0f:00.1: SAN MAC: xxx
[    2.043232] i40e 0000:0f:00.1: PCI-Express: Speed 5.0GT/s Width x8
[    2.043237] i40e 0000:0f:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[    2.043240] i40e 0000:0f:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[    2.074505] i40e 0000:0f:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 8 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[    2.075630] i40e 0000:0f:00.1 ens2f1: renamed from eth2
[    2.093337] i40e 0000:0f:00.0 ens2f0: renamed from eth0
[ 3953.702730] i40e 0000:0f:00.1 ens2f1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None
[ 3957.127461] i40e 0000:0f:00.1 ens2f1: NIC Link is Down
[ 3959.517008] i40e 0000:0f:00.1 ens2f1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None

Using PF_RING 6.4.0

[18827] 9/6/2016 -- 11:01:34 - (runmode-pfring.c:343) <Info> (ParsePfringConfig) -- Using flow cluster mode for PF_RING (iface ens2f1)
[18827] 9/6/2016 -- 11:01:34 - (util-runmodes.c:295) <Info> (RunModeSetLiveCaptureWorkersForDevice) -- Going to use 2 thread(s)
[New Thread 0x7ffff3e18700 (LWP 18859)]
[18859] 9/6/2016 -- 11:01:34 - (source-pfring.c:472) <Info> (ReceivePfringThreadInit) -- Enabling zero-copy for ens2f1
[18859] 9/6/2016 -- 11:01:34 - (source-pfring.c:537) <Info> (ReceivePfringThreadInit) -- (W#01-ens2f1) Using PF_RING v.6.4.0, interface ens2f1, cluster-id 99
[New Thread 0x7ffff2f54700 (LWP 18860)]
[18860] 9/6/2016 -- 11:01:34 - (source-pfring.c:472) <Info> (ReceivePfringThreadInit) -- Enabling zero-copy for ens2f1
[18860] 9/6/2016 -- 11:01:34 - (source-pfring.c:537) <Info> (ReceivePfringThreadInit) -- (W#02-ens2f1) Using PF_RING v.6.4.0, interface ens2f1, cluster-id 99
[18827] 9/6/2016 -- 11:01:34 - (runmode-pfring.c:521) <Info> (RunModeIdsPfringWorkers) -- RunModeIdsPfringWorkers initialised

$ cat /proc/net/pf_ring/info
PF_RING Version          : 6.4.0 (unknown)
Total rings              : 2

Standard (non ZC) Options
Ring slots               : 4096
Slot version             : 16
Capture TX               : Yes [RX+TX]
IP Defragment            : No
Socket Mode              : Standard
Total plugins            : 0
Cluster Fragment Queue   : 0
Cluster Fragment Discard : 0

$ cat /proc/net/pf_ring/19136-ens2f1.37
Bound Device(s)    : ens2f1
Active             : 1
Breed              : Standard
Appl. Name         : Suricata
Socket Mode        : RX+TX
Capture Direction  : RX+TX
Sampling Rate      : 1
IP Defragment      : No
BPF Filtering      : Disabled
Sw Filt Hash Rules : 0
Sw Filt WC Rules   : 0
Hw Filt Rules      : 0
Sw Filt Hash Match : 0
Sw Filt Hash Miss  : 0
Poll Pkt Watermark : 128
Num Poll Calls     : 2
Channel Id Mask    : 0xFFFFFFFFFFFFFFFF
Cluster Id         : 99
Slot Version       : 16 [6.4.0]
Min Num Slots      : 4098
Bucket Len         : 1524
Slot Len           : 1728 [bucket+header]
Tot Memory         : 7090176
Tot Packets        : 9680214
Tot Pkt Lost       : 9220222
Tot Insert         : 458907
Tot Read           : 448573
Insert Offset      : 294608
Remove Offset      : 297888
Num Free Slots     : 0
TX: Send Ok        : 0
TX: Send Errors    : 0
Reflect: Fwd Ok    : 0
Reflect: Fwd Errors: 0