Bug #1805

Updated by Victor Julien almost 9 years ago

It appears that in certain setups, using PF_RING with multiple threads and zero copy mode is broken. 

 My test is simple: I blast ~9.6Gbps at the affected system. Sometimes it crashes at some point during the run. 

 I have made a test that triggers the issue very quickly: in our 'Packet' structure we have a pointer to the position in the packet where the ethernet header starts. I can see that the data it points to gets corrupted in some cases. 

 So the test I added does this: 

 Next to the pointer, I added a static data structure for holding the contents of the ethernet header. On ethernet layer decoding I copy the data from the pointer into the static struct. Then, just before the end of the life of the packet inside suricata (so before the next pfring_recv call on that thread), I compare the data the pointer points to with my static copy. If they differ, I abort. 

 This test can be found here https://github.com/inliniac/suricata/pull/2144/files 
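
 The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the `SketchPacket` struct and the `Sketch*` function names are hypothetical stand-ins for Suricata's real 'Packet' structure and decode/release paths, and the 14-byte length assumes a plain ethernet header without VLAN tags.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* dst MAC (6) + src MAC (6) + eth_type (2); assumes no VLAN tag */
#define SKETCH_ETH_HDR_LEN 14

/* Hypothetical, heavily simplified stand-in for Suricata's Packet. */
typedef struct SketchPacket_ {
    const uint8_t *ethh;                    /* points into the capture buffer */
    uint8_t ethh_copy[SKETCH_ETH_HDR_LEN];  /* private snapshot of the header */
} SketchPacket;

/* Ethernet decode step: remember where the header lives in the capture
 * buffer and take a private copy of its contents. */
static void SketchDecodeEthernet(SketchPacket *p, const uint8_t *pkt, size_t len)
{
    assert(p != NULL && pkt != NULL && len >= SKETCH_ETH_HDR_LEN);
    p->ethh = pkt;
    memcpy(p->ethh_copy, pkt, SKETCH_ETH_HDR_LEN);
}

/* Returns 1 if the buffer still holds the bytes seen at decode time,
 * 0 if something has overwritten them since. */
static int SketchEthernetIntact(const SketchPacket *p)
{
    return memcmp(p->ethh, p->ethh_copy, SKETCH_ETH_HDR_LEN) == 0;
}

/* End of the packet's life on this thread, i.e. before the next receive
 * call hands out a new buffer: abort if the header changed underneath us. */
static void SketchReleasePacket(const SketchPacket *p)
{
    if (!SketchEthernetIntact(p))
        abort();
}
```

 The idea is that the thread that received the packet is the only one that should touch the buffer until it asks for the next packet, so any difference between the snapshot and the buffer means something else wrote to it in the meantime.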

 When using more than one thread, it blows up within a minute. When I use one thread, it appears to work correctly, even when running for a long time. 

 On manual inspection I can see that the 'static' copy of the ethernet header is correct. It contains the proper eth_type. The packet has also been decoded correctly at the higher levels, which proves that the data behind the pointer was correct at one point in time as well. However, by the time of the check the pointer to the ethernet header shows junk values. 

 I suspect there is some synchronization issue in the kernel/pfring module/driver. 

 On the same hardware, running the same test, both AF_PACKET(v3) and NETMAP behave correctly. 


 Setup: 

 Intel X710: 
 <pre> 
 # ethtool -i ens2f1 
 driver: i40e 
 version: 1.4.25-k 
 firmware-version: 4.53 0x8000206e 0.0.0 
 expansion-rom-version: 
 bus-info: 0000:0f:00.1 
 supports-statistics: yes 
 supports-test: yes 
 supports-eeprom-access: yes 
 supports-register-dump: yes 
 supports-priv-flags: yes 
 </pre> 

 It's an older (Nehalem) 4-core Xeon with Hyper-Threading: 
 <pre> 
 Architecture:            x86_64 
 CPU op-mode(s):          32-bit, 64-bit 
 Byte Order:              Little Endian 
 CPU(s):                  8 
 On-line CPU(s) list:     0-7 
 Thread(s) per core:      2 
 Core(s) per socket:      4 
 Socket(s):               1 
 NUMA node(s):            1 
 Vendor ID:               GenuineIntel 
 CPU family:              6 
 Model:                   26 
 Model name:              Intel(R) Xeon(R) CPU             W3550    @ 3.07GHz 
 </pre> 

 8 RSS queues: 

 <pre> 
 [      0.869890] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.4.25-k 
 [      0.869892] i40e: Copyright (c) 2013 - 2014 Intel Corporation. 
 [      0.885006] i40e 0000:0f:00.0: fw 4.40.35115 api 1.4 nvm 4.53 0x8000206e 0.0.0 
 [      0.989150] i40e 0000:0f:00.0: MAC address: xxx 
 [      0.993134] i40e 0000:0f:00.0: SAN MAC: xxx 
 [      1.673081] i40e 0000:0f:00.0: PCI-Express: Speed 5.0GT/s Width x8 
 [      1.673084] i40e 0000:0f:00.0: PCI-Express bandwidth available for this device may be insufficient for optimal performance. 
 [      1.673086] i40e 0000:0f:00.0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate. 
 [      1.679122] i40e 0000:0f:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 8 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA 
 [      1.693104] i40e 0000:0f:00.1: fw 4.40.35115 api 1.4 nvm 4.53 0x8000206e 0.0.0 
 [      1.795281] i40e 0000:0f:00.1: MAC address: xxx 
 [      1.799253] i40e 0000:0f:00.1: SAN MAC: xxx 
 [      2.043232] i40e 0000:0f:00.1: PCI-Express: Speed 5.0GT/s Width x8 
 [      2.043237] i40e 0000:0f:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance. 
 [      2.043240] i40e 0000:0f:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate. 
 [      2.074505] i40e 0000:0f:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 8 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA 
 [      2.075630] i40e 0000:0f:00.1 ens2f1: renamed from eth2 
 [      2.093337] i40e 0000:0f:00.0 ens2f0: renamed from eth0 
 [ 3953.702730] i40e 0000:0f:00.1 ens2f1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None 
 [ 3957.127461] i40e 0000:0f:00.1 ens2f1: NIC Link is Down 
 [ 3959.517008] i40e 0000:0f:00.1 ens2f1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None 
 </pre> 

 Using PF_RING 6.4.0 

 <pre> 
 [18827] 9/6/2016 -- 11:01:34 - (runmode-pfring.c:343) <Info> (ParsePfringConfig) -- Using flow cluster mode for PF_RING (iface ens2f1) 
 [18827] 9/6/2016 -- 11:01:34 - (util-runmodes.c:295) <Info> (RunModeSetLiveCaptureWorkersForDevice) -- Going to use 2 thread(s) 
 [New Thread 0x7ffff3e18700 (LWP 18859)] 
 [18859] 9/6/2016 -- 11:01:34 - (source-pfring.c:472) <Info> (ReceivePfringThreadInit) -- Enabling zero-copy for ens2f1 
 [18859] 9/6/2016 -- 11:01:34 - (source-pfring.c:537) <Info> (ReceivePfringThreadInit) -- (W#01-ens2f1) Using PF_RING v.6.4.0, interface ens2f1, cluster-id 99 
 [New Thread 0x7ffff2f54700 (LWP 18860)] 
 [18860] 9/6/2016 -- 11:01:34 - (source-pfring.c:472) <Info> (ReceivePfringThreadInit) -- Enabling zero-copy for ens2f1 
 [18860] 9/6/2016 -- 11:01:34 - (source-pfring.c:537) <Info> (ReceivePfringThreadInit) -- (W#02-ens2f1) Using PF_RING v.6.4.0, interface ens2f1, cluster-id 99 
 [18827] 9/6/2016 -- 11:01:34 - (runmode-pfring.c:521) <Info> (RunModeIdsPfringWorkers) -- RunModeIdsPfringWorkers initialised 

 


 $ cat /proc/net/pf_ring/info 
 PF_RING Version            : 6.4.0 (unknown) 
 Total rings                : 2 

 Standard (non ZC) Options 
 Ring slots                 : 4096 
 Slot version               : 16 
 Capture TX                 : Yes [RX+TX] 
 IP Defragment              : No 
 Socket Mode                : Standard 
 Total plugins              : 0 
 Cluster Fragment Queue     : 0 
 Cluster Fragment Discard : 0 

 $ cat /proc/net/pf_ring/19136-ens2f1.37 
 Bound Device(s)      : ens2f1 
 Active               : 1 
 Breed                : Standard 
 Appl. Name           : Suricata 
 Socket Mode          : RX+TX 
 Capture Direction    : RX+TX 
 Sampling Rate        : 1 
 IP Defragment        : No 
 BPF Filtering        : Disabled 
 Sw Filt Hash Rules : 0 
 Sw Filt WC Rules     : 0 
 Hw Filt Rules        : 0 
 Sw Filt Hash Match : 0 
 Sw Filt Hash Miss    : 0 
 Poll Pkt Watermark : 128 
 Num Poll Calls       : 2 
 Channel Id Mask      : 0xFFFFFFFFFFFFFFFF 
 Cluster Id           : 99 
 Slot Version         : 16 [6.4.0] 
 Min Num Slots        : 4098 
 Bucket Len           : 1524 
 Slot Len             : 1728 [bucket+header] 
 Tot Memory           : 7090176 
 Tot Packets          : 9680214 
 Tot Pkt Lost         : 9220222 
 Tot Insert           : 458907 
 Tot Read             : 448573 
 Insert Offset        : 294608 
 Remove Offset        : 297888 
 Num Free Slots       : 0 
 TX: Send Ok          : 0 
 TX: Send Errors      : 0 
 Reflect: Fwd Ok      : 0 
 Reflect: Fwd Errors: 0 
 </pre> 
