Bug #1805

closed

pfring: zero copy broken

Added by Victor Julien almost 8 years ago. Updated almost 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
Affected Versions:
Effort:
Difficulty:
Label:

Description

It appears that in certain setups, using PF_RING with multiple threads and zero copy mode is broken.

My test is simple: I blast ~9.6Gbps at the affected system. After a while it sometimes crashes.

I have made a test that triggers the issue very quickly. In our 'Packet' structure we have a pointer to the position in the packet where the ethernet header starts. I can see that the data it points to gets corrupted in some cases.

So the test I added does this:

Next to the pointer, I added a static data structure for holding the contents of the ethernet header. During ethernet layer decoding I copy the data from the pointer into the static struct. Then, just before the end of the packet's life inside Suricata (so before the next pfring_recv call on that thread), I compare the data the pointer points to with my static copy. If they differ, I abort.

This test can be found here https://github.com/inliniac/suricata/pull/2144/files
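
For readers who don't want to open the PR, the idea of the check can be sketched roughly as follows. This is not the code from the PR: DemoPacket, DemoDecodeEthernet, DemoEthernetIntact and DemoPacketRelease are hypothetical names made up for illustration, and the real Packet structure and decode path in Suricata look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ETH_HDR_LEN 14 /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Hypothetical stand-in for the relevant part of a packet structure. */
typedef struct DemoPacket_ {
    const uint8_t *ethh;            /* points into the capture buffer */
    uint8_t ethh_copy[ETH_HDR_LEN]; /* private snapshot of the header */
} DemoPacket;

/* Ethernet decoding: keep the pointer and take a snapshot of the header. */
void DemoDecodeEthernet(DemoPacket *p, const uint8_t *pkt)
{
    assert(p != NULL && pkt != NULL);
    p->ethh = pkt;
    memcpy(p->ethh_copy, pkt, ETH_HDR_LEN);
}

/* True if the bytes behind the pointer still match the snapshot. */
bool DemoEthernetIntact(const DemoPacket *p)
{
    return memcmp(p->ethh, p->ethh_copy, ETH_HDR_LEN) == 0;
}

/* End of the packet's life, before the next receive call on this thread:
 * abort if the capture buffer changed underneath us. */
void DemoPacketRelease(const DemoPacket *p)
{
    if (!DemoEthernetIntact(p)) {
        fprintf(stderr, "ethernet header changed while packet was in use\n");
        abort();
    }
}
```

With a working zero copy implementation DemoPacketRelease should never abort, because the buffer behind the pointer is not supposed to be reused until the thread asks for the next packet.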

When using more than one thread, it blows up within a minute. When I use one thread, it appears to work correctly, even when running for a long time.

On manual inspection I can see that the 'static' copy of the ethernet header is correct: it contains the proper eth_type. The packet has also been decoded correctly at the higher levels, which proves that the data behind the pointer was correct at one point in time as well. At the time of the check, however, the pointer to the ethernet header shows junk values.

I suspect there is a synchronization issue in the kernel/pfring module/driver.

On the same hardware and running the same test both AF_PACKET(v3) and NETMAP behave correctly.

Setup:

Intel X710:

# ethtool -i ens2f1
driver: i40e
version: 1.4.25-k
firmware-version: 4.53 0x8000206e 0.0.0
expansion-rom-version:
bus-info: 0000:0f:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

It's an older (Nehalem) 4-core Xeon with Hyper-Threading:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 26
Model name:            Intel(R) Xeon(R) CPU           W3550  @ 3.07GHz

8 RSS queues:

[    0.869890] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.4.25-k
[    0.869892] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[    0.885006] i40e 0000:0f:00.0: fw 4.40.35115 api 1.4 nvm 4.53 0x8000206e 0.0.0
[    0.989150] i40e 0000:0f:00.0: MAC address: xxx
[    0.993134] i40e 0000:0f:00.0: SAN MAC: xxx
[    1.673081] i40e 0000:0f:00.0: PCI-Express: Speed 5.0GT/s Width x8
[    1.673084] i40e 0000:0f:00.0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[    1.673086] i40e 0000:0f:00.0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[    1.679122] i40e 0000:0f:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 8 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[    1.693104] i40e 0000:0f:00.1: fw 4.40.35115 api 1.4 nvm 4.53 0x8000206e 0.0.0
[    1.795281] i40e 0000:0f:00.1: MAC address: xxx
[    1.799253] i40e 0000:0f:00.1: SAN MAC: xxx
[    2.043232] i40e 0000:0f:00.1: PCI-Express: Speed 5.0GT/s Width x8
[    2.043237] i40e 0000:0f:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[    2.043240] i40e 0000:0f:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[    2.074505] i40e 0000:0f:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 8 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[    2.075630] i40e 0000:0f:00.1 ens2f1: renamed from eth2
[    2.093337] i40e 0000:0f:00.0 ens2f0: renamed from eth0
[ 3953.702730] i40e 0000:0f:00.1 ens2f1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None
[ 3957.127461] i40e 0000:0f:00.1 ens2f1: NIC Link is Down
[ 3959.517008] i40e 0000:0f:00.1 ens2f1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None

Using PF_RING 6.4.0

[18827] 9/6/2016 -- 11:01:34 - (runmode-pfring.c:343) <Info> (ParsePfringConfig) -- Using flow cluster mode for PF_RING (iface ens2f1)
[18827] 9/6/2016 -- 11:01:34 - (util-runmodes.c:295) <Info> (RunModeSetLiveCaptureWorkersForDevice) -- Going to use 2 thread(s)
[New Thread 0x7ffff3e18700 (LWP 18859)]
[18859] 9/6/2016 -- 11:01:34 - (source-pfring.c:472) <Info> (ReceivePfringThreadInit) -- Enabling zero-copy for ens2f1
[18859] 9/6/2016 -- 11:01:34 - (source-pfring.c:537) <Info> (ReceivePfringThreadInit) -- (W#01-ens2f1) Using PF_RING v.6.4.0, interface ens2f1, cluster-id 99
[New Thread 0x7ffff2f54700 (LWP 18860)]
[18860] 9/6/2016 -- 11:01:34 - (source-pfring.c:472) <Info> (ReceivePfringThreadInit) -- Enabling zero-copy for ens2f1
[18860] 9/6/2016 -- 11:01:34 - (source-pfring.c:537) <Info> (ReceivePfringThreadInit) -- (W#02-ens2f1) Using PF_RING v.6.4.0, interface ens2f1, cluster-id 99
[18827] 9/6/2016 -- 11:01:34 - (runmode-pfring.c:521) <Info> (RunModeIdsPfringWorkers) -- RunModeIdsPfringWorkers initialised

$ cat /proc/net/pf_ring/info
PF_RING Version          : 6.4.0 (unknown)
Total rings              : 2

Standard (non ZC) Options
Ring slots               : 4096
Slot version             : 16
Capture TX               : Yes [RX+TX]
IP Defragment            : No
Socket Mode              : Standard
Total plugins            : 0
Cluster Fragment Queue   : 0
Cluster Fragment Discard : 0

$ cat /proc/net/pf_ring/19136-ens2f1.37
Bound Device(s)    : ens2f1
Active             : 1
Breed              : Standard
Appl. Name         : Suricata
Socket Mode        : RX+TX
Capture Direction  : RX+TX
Sampling Rate      : 1
IP Defragment      : No
BPF Filtering      : Disabled
Sw Filt Hash Rules : 0
Sw Filt WC Rules   : 0
Hw Filt Rules      : 0
Sw Filt Hash Match : 0
Sw Filt Hash Miss  : 0
Poll Pkt Watermark : 128
Num Poll Calls     : 2
Channel Id Mask    : 0xFFFFFFFFFFFFFFFF
Cluster Id         : 99
Slot Version       : 16 [6.4.0]
Min Num Slots      : 4098
Bucket Len         : 1524
Slot Len           : 1728 [bucket+header]
Tot Memory         : 7090176
Tot Packets        : 9680214
Tot Pkt Lost       : 9220222
Tot Insert         : 458907
Tot Read           : 448573
Insert Offset      : 294608
Remove Offset      : 297888
Num Free Slots     : 0
TX: Send Ok        : 0
TX: Send Errors    : 0
Reflect: Fwd Ok    : 0
Reflect: Fwd Errors: 0
Actions #1

Updated by Victor Julien almost 8 years ago

If anyone is willing to run https://github.com/inliniac/suricata/pull/2144 and report back, I'd appreciate it very much! It will abort Suricata in case of this issue, or it will run happily w/o issues otherwise.

Actions #2

Updated by Victor Julien almost 8 years ago

  • Description updated (diff)
Actions #3

Updated by Victor Julien almost 8 years ago

It's still broken with these options:

# ethtool -k ens2f1
Features for ens2f1:
rx-checksumming: off
tx-checksumming: off
        tx-checksum-ipv4: off
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: off
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off
scatter-gather: off
        tx-scatter-gather: off
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off
        tx-tcp-ecn-segmentation: off
        tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off [fixed]
rx-vlan-offload: off
tx-vlan-offload: off
ntuple-filters: off
receive-hashing: off
highdma: off
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

# ethtool -a ens2f1
Pause parameters for ens2f1:
Autonegotiate:  off
RX:             off
TX:             off

# ethtool -c ens2f1
Coalesce parameters for ens2f1:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 25
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 256

tx-usecs: 25
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 256

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

Actions #4

Updated by Victor Julien almost 8 years ago

The problem also appears if I modify the 'single' runmode to use zero copy as well (normally it is only activated with workers). So the problem is also present with a single pfring processing thread.

Actions #5

Updated by Victor Julien almost 8 years ago

I can also reproduce this in a modified pfcount, so it appears not specific to Suricata.

Actions #6

Updated by Peter Manev almost 8 years ago

I can confirm the same. Some additional info from my test environment:
- using the latest pfring git master
- Ubuntu LTS Trusty with 3.19 kernel


PF_RING Version          : 6.5.0 (dev:9f358aa8dd5b43bb74f67304c10ff41915e2f562)
Total rings              : 0

Standard (non ZC) Options
Ring slots               : 65534
Slot version             : 16
Capture TX               : Yes [RX+TX]
IP Defragment            : No
Socket Mode              : Standard
Total plugins            : 0
Cluster Fragment Queue   : 0
Cluster Fragment Discard : 0

driver: ixgbe
version: 4.2.1
firmware-version: 0x800000cb
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

driver: ixgbe
version: 4.2.1
firmware-version: 0x800000cb
bus-info: 0000:04:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

vendor_id    : GenuineIntel
cpu family    : 6
model        : 45
model name    : Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
stepping    : 7
microcode    : 0x70b
cpu MHz        : 3186.105
cache size    : 20480 KB
physical id    : 0
siblings    : 16
core id        : 0
cpu cores    : 8
apicid        : 0
initial apicid    : 0
fpu        : yes
fpu_exception    : yes
cpuid level    : 13
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid xsaveopt
bugs        :
bogomips    : 5399.69
clflush size    : 64
cache_alignment    : 64
address sizes    : 46 bits physical, 48 bits virtual
power management:
Actions #7

Updated by Victor Julien almost 8 years ago

Also reproducible on a different (Intel) card: 82576 with igb driver.

PF_RING upstream is now convinced it's a PF_RING issue.

Actions #9

Updated by Victor Julien almost 8 years ago

It appears 6.4.1 has been released to fix this and other issues.
