Bug #8654
Status: open
CPU exhaustion related to app-layer QUIC protocol parsing observed in Suricata 8.0.3
Description
Issue Summary
In an IPS deployment with QUIC enabled, we are observing severe CPU spikes dominated by QUIC transaction iteration. At scale, this issue pins CPUs at 100%, introduces significant latency, and generates a flood of parser errors. (The parser errors are also present in Suricata 7.0.8, but we saw no CPU spikes on 7.0.8.) CPU usage returned to normal after rolling back to Suricata 7.0.8.
Reproduction Steps
We have attached a rules file (suricata.rules), a YAML config file (test.yaml), a packet capture file, and the scapy script used to generate that pcap (AI-generated).
When we run Suricata 8.0.3 in af_packet mode and replay the packets from the pcap file with tcpreplay in a continuous loop, we can observe the CPU growth in a test environment. We also captured perf top -p <suricata-pid> output, which shows that the functions consuming the most CPU are related to the app-layer and QUIC protocol parsing. See the perf output below:
+ 39.48% [.] suricata::quic::quic::quic_state_get_tx_iterator - -
+ 31.30% [.] AppLayerParserTransactionsCleanup - -
+ 12.27% [.] AppLayerParserGetStateProgress - -
+ 6.92% [.] AppLayerParserGetTxData - -
+ 3.86% [.] FlowGetProtoMapping - -
+ 1.28% [.] suricata::quic::quic::quic_get_tx_data - -
+ 0.95% [.] suricata::rdp::rdp::rdp_tx_get_progress - -
+ 0.93% [.] __pthread_mutex_trylock - -
+ 0.86% [.] __pthread_mutex_unlock_usercnt - -
0.14% [.] HostTimeoutHash - -
0.11% [.] AFPReadFromRing - -
0.08% [k] audit_filter_syscall.constprop.0.isra.0
We do not see the same CPU profile when we repeat this in Suricata 7.0.8.
6.90% suricata [.] AFPReadFromRing
5.53% [kernel] [k] copy_user_enhanced_fast_string
3.81% libc-2.26.so [.] __memmove_avx_unaligned_erms
3.71% libpthread-2.26.so [.] __pthread_mutex_unlock_usercnt
3.59% [vdso] [.] 0x0000000000000728
3.17% libpthread-2.26.so [.] __pthread_mutex_trylock
2.30% [kernel] [k] entry_SYSCALL_64
1.74% [kernel] [k] _raw_spin_lock_irqsave
1.68% suricata [.] suricata::quic::quic::QuicState::new_tx
1.61% [kernel] [k] audit_filter_syscall.constprop.0.isra.0
1.46% suricata [.] DecodeIPV4
1.31% libpthread-2.26.so [.] __pthread_disable_asynccancel
1.24% libpthread-2.26.so [.] __pthread_mutex_lock
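As a rough way to compare the two profiles above, the following Python sketch sums the percentage attributed to QUIC and app-layer parser symbols. It is not part of the original investigation: the helper names `parse_perf` and `app_layer_share`, and the rule for what counts as app-layer (any symbol containing `quic` or starting with `AppLayerParser`), are assumptions made for illustration. The sample lines are copied from the perf output in this report.

```python
import re

# Handles both layouts seen in this report:
#   "+ 39.48% [.] symbol - -"     (8.0.3 output, no object column)
#   "6.90% suricata [.] symbol"   (7.0.8 output, with object column)
LINE_RE = re.compile(r"^\+?\s*(\d+(?:\.\d+)?)%\s+(?:\S+\s+)?\[[.k]\]\s+(\S+)")

PERF_803 = """\
+ 39.48% [.] suricata::quic::quic::quic_state_get_tx_iterator - -
+ 31.30% [.] AppLayerParserTransactionsCleanup - -
+ 12.27% [.] AppLayerParserGetStateProgress - -
+ 6.92% [.] AppLayerParserGetTxData - -
+ 3.86% [.] FlowGetProtoMapping - -
+ 1.28% [.] suricata::quic::quic::quic_get_tx_data - -
+ 0.95% [.] suricata::rdp::rdp::rdp_tx_get_progress - -
+ 0.93% [.] __pthread_mutex_trylock - -
+ 0.86% [.] __pthread_mutex_unlock_usercnt - -
0.14% [.] HostTimeoutHash - -
0.11% [.] AFPReadFromRing - -
0.08% [k] audit_filter_syscall.constprop.0.isra.0
"""

PERF_708 = """\
6.90% suricata [.] AFPReadFromRing
5.53% [kernel] [k] copy_user_enhanced_fast_string
3.81% libc-2.26.so [.] __memmove_avx_unaligned_erms
3.71% libpthread-2.26.so [.] __pthread_mutex_unlock_usercnt
3.59% [vdso] [.] 0x0000000000000728
3.17% libpthread-2.26.so [.] __pthread_mutex_trylock
2.30% [kernel] [k] entry_SYSCALL_64
1.74% [kernel] [k] _raw_spin_lock_irqsave
1.68% suricata [.] suricata::quic::quic::QuicState::new_tx
1.61% [kernel] [k] audit_filter_syscall.constprop.0.isra.0
1.46% suricata [.] DecodeIPV4
1.31% libpthread-2.26.so [.] __pthread_disable_asynccancel
1.24% libpthread-2.26.so [.] __pthread_mutex_lock
"""


def parse_perf(text):
    """Return a list of (percent, symbol) tuples from perf-style lines."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((float(match.group(1)), match.group(2)))
    return rows


def app_layer_share(rows):
    """Sum the percentages of QUIC and AppLayerParser symbols (assumed grouping)."""
    total = sum(
        pct
        for pct, symbol in rows
        if "quic" in symbol.lower() or symbol.startswith("AppLayerParser")
    )
    return round(total, 2)


print("8.0.3 QUIC/app-layer share:", app_layer_share(parse_perf(PERF_803)))
print("7.0.8 QUIC/app-layer share:", app_layer_share(parse_perf(PERF_708)))
```

With the samples from this report, the grouped share is about 91% for 8.0.3 and under 2% for 7.0.8. Note that the two captures use different column layouts, so the absolute percentages are only loosely comparable.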
The high CPU utilization is blocking our upgrade to Suricata 8.0.3. Could you investigate this performance issue in Suricata 8's QUIC transaction handling with high priority? Because the issue was not observed in Suricata 7.0.8, it appears to be a regression that needs a fix in Suricata 8.0.3.
Thank you!
Full setup used:
# Setup: create a dummy interface pair
ip link add SFE_0_TX type dummy
ip link add SFE_0_RX type dummy
ip link set SFE_0_TX up mtu 9001
ip link set SFE_0_RX up mtu 9001

# Terminal 1: run suricata
suricata -c suricata.yaml -S rules.rules -k none --af-packet \
  --set af-packet.0.interface=SFE_0_TX \
  --set af-packet.1.interface=SFE_0_RX \
  -l logs/ -vvv

# Terminal 2: replay pcap
tcpreplay --intf1=SFE_0_TX --mbps=500 --loop=0 quic-cpu-repro.pcap

# Terminal 3: profile after 2-3 minutes
perf record -g -p $(pgrep suricata) -- sleep 60
perf report --no-children --sort=symbol
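To track the CPU growth over time while the replay loop runs, a small sampler can read the process's CPU counters from /proc. The sketch below is an illustration only, not part of the original setup: the function names, the 5-second interval, and the sample count are all assumptions. It relies on the Linux /proc/<pid>/stat layout, where utime and stime are fields 14 and 15.

```python
import os
import sys
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The comm field is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before indexing the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, hz):
    """CPU usage over the interval; can exceed 100 for a multi-threaded process."""
    return 100.0 * (ticks_after - ticks_before) / (hz * interval_s)


def sample(pid, interval_s=5.0, count=12):
    """Print one CPU usage figure per interval for the given pid."""
    hz = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        prev = cpu_ticks(f.read())
    for _ in range(count):
        time.sleep(interval_s)
        with open(path) as f:
            cur = cpu_ticks(f.read())
        print(f"{time.strftime('%H:%M:%S')} cpu={cpu_percent(prev, cur, interval_s, hz):.1f}%")
        prev = cur


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python3 cpu_sample.py $(pgrep suricata)
    sample(int(sys.argv[1]))
```

Running this alongside the tcpreplay loop on both 8.0.3 and 7.0.8 would give a simple time series to attach next to the perf output.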
Files