Bug #2435
Closed
Suricata 4.0.3 in IPS mode seems to discard some DNS requests
Added by Dan Rimal almost 7 years ago. Updated over 5 years ago.
Description
Suricata 4.0.3 on CentOS 7.2 with kernel 4.4.29 (COS7 ELRepo LT) has the very same problem as these closed bugs:
https://redmine.openinfosecfoundation.org/issues/1920
https://redmine.openinfosecfoundation.org/issues/1923
The SSH client sends an A and an AAAA request, but Suricata running as an inline IPS via NFQ drops the AAAA query addressed to the DNS server, which sits in a different VLAN than the client. Tcpdump shows the client's incoming query, but no outgoing query toward the DNS server.
Suricata has no rules loaded.
This issue slows down SSH logins to remote servers and is very easy to reproduce (it occurs every time).
Attached files:
dnsq-from-vlan12 (client vlan)
dnsq-from-vlan10 (server vlan)
Files
dnsq-from-vlan10 (1.51 KB), Dan Rimal, 02/05/2018 10:14 AM
dnsq-from-vlan12 (1.98 KB), Dan Rimal, 02/05/2018 10:14 AM
suricata.yaml (66.4 KB), Dan Rimal, 02/05/2018 10:15 AM
suricata.yaml (66.4 KB), Dan Rimal, 02/13/2018 06:16 AM
Updated by Victor Julien almost 7 years ago
Can you share your iptables/nftables rules?
Updated by Dan Rimal almost 7 years ago
This is the relevant part of a fairly tricky rule set (~1000 rules). There are two routing tables, two WAN links, and so on. But the IPS-relevant configuration is pretty straightforward. Traffic FROM and TO the selected user VLAN is passed to NFQ, except iSCSI traffic (heavy, uninteresting traffic) and except DNS (because of this bug):
Chain FORWARD (policy DROP 120 packets, 6550 bytes)
pkts bytes target prot opt in out source destination
24112 21M IPS all -- * enp1s0.12 0.0.0.0/0 0.0.0.0/0 /* =ids_out_vl12= IDS OUT VLAN12 */
20362 9451K IPS all -- enp1s0.12 * 0.0.0.0/0 0.0.0.0/0 /* =ids_in_vl12= IDS IN VLAN12 */
Chain IPS (2 references)
pkts bytes target prot opt in out source destination
23776 15M RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x500/0xff00
14 862 RETURN udp -- * * 0.0.0.0/0 0.0.0.0/0 udp dpt:53
6 304 RETURN tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:3260
0 0 RETURN tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:3260
20678 15M NFQUEUE all -- * * 0.0.0.0/0 0.0.0.0/0 /* =ids= IDS UP */ NFQUEUE balance 0:1 bypass
If I disable the DNS RETURN rule, DNS hits the bug immediately.
I can provide the full iptables rules from this firewall, but only privately, or I can build a test rig with a basic setup if needed.
Updated by Victor Julien almost 7 years ago
It's odd, as nothing in Suricata should 'drop' certain DNS packets automatically. Are you able to test w/o the queue balance option just to rule that out?
A simplified test rig is starting to sound good.
Updated by Jason Ish almost 7 years ago
Could you also disable the DNS app-layer, just to help rule it out? In suricata.yaml, under app-layer, look for:
dns:
  # memcaps. Globally and per flow/state.
  #global-memcap: 16mb
  #state-memcap: 512kb
  # How many unreplied DNS requests are considered a flood.
  # If the limit is reached, app-layer-event:dns.flooded; will match.
  #request-flood: 500
  tcp:
    enabled: yes
    detection-ports:
      dp: 53
  udp:
    enabled: yes
    detection-ports:
      dp: 53
And set the "enabled" fields to no.
Thanks.
Updated by Dan Rimal almost 7 years ago
So,
1) without queue balance -> bug still persists
2) disabled DNS app-layer -> bug still persists
Verify:
from client:
[root@router.zab.cldn.eu rc.d]# tcpdump -i enp1s0.12 proto UDP and host 10.0.10.36 and host 10.0.12.22 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp1s0.12, link-type EN10MB (Ethernet), capture size 65535 bytes
14:11:44.224405 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:44.224463 IP 10.0.12.22.60260 > 10.0.10.36.53: 31747+ AAAA? mail.belohrad.cz. (34)
14:11:44.226351 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)
14:11:49.228471 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:49.228871 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)
14:11:49.229052 IP 10.0.12.22.60260 > 10.0.10.36.53: 31747+ AAAA? mail.belohrad.cz. (34)
14:11:49.229530 IP 10.0.10.36.53 > 10.0.12.22.60260: 31747 0/1/0 (98)
to server:
[root@router.zab.cldn.eu ~]# tcpdump -i enp1s0.10 proto UDP and host 10.0.10.36 and host 10.0.12.22 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp1s0.10, link-type EN10MB (Ethernet), capture size 65535 bytes
14:11:44.224528 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:44.226087 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)
14:11:49.228546 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:49.228781 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)
14:11:49.229151 IP 10.0.12.22.60260 > 10.0.10.36.53: 31747+ AAAA? mail.belohrad.cz. (34)
14:11:49.229474 IP 10.0.10.36.53 > 10.0.12.22.60260: 31747 0/1/0 (98)
The AAAA request is dropped somewhere. Without Suricata there are no drops. The second query (after 5 seconds) does pass through Suricata.
OK, I will create a simpler setup for easier debugging next week and post the results.
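The comparison of the two captures above can be automated. The following is a minimal Python sketch, not part of the original report, that parses tcpdump text output (as printed with -nn) and lists DNS queries seen on the client-side interface more often than on the server-side interface. The function names and the regular expression are illustrative assumptions; it matches only queries carrying the recursion-desired flag (the "+" after the ID) and pairs them by DNS ID, query type and name rather than by timestamp.

```python
import re
from collections import Counter

# Matches query lines as tcpdump prints them with -nn, for example:
# 14:11:44.224405 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
# Responses have no "+ TYPE?" part and are therefore skipped.
QUERY_RE = re.compile(
    r"^\d\d:\d\d:\d\d\.\d+ IP \S+ > \S+: "
    r"(?P<id>\d+)\+ (?P<qtype>[A-Z]+)\? (?P<name>\S+)"
)


def parse_queries(capture_text):
    """Return a list of (dns_id, qtype, name) for every query line."""
    queries = []
    for line in capture_text.splitlines():
        match = QUERY_RE.match(line.strip())
        if match:
            queries.append((int(match["id"]), match["qtype"], match["name"]))
    return queries


def missing_on_egress(ingress_text, egress_text):
    """Return queries that appear more often on ingress than on egress."""
    remaining = Counter(parse_queries(egress_text))
    missing = []
    for query in parse_queries(ingress_text):
        if remaining[query] > 0:
            remaining[query] -= 1  # this ingress query has an egress twin
        else:
            missing.append(query)
    return missing
```

Fed with the enp1s0.12 and enp1s0.10 captures above, this reports a single missing query: AAAA for mail.belohrad.cz. with ID 31747. Because matching ignores timestamps, it shows how many queries were lost, not which retransmission was the lost one.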
Updated by Dan Rimal almost 7 years ago
- File suricata.yaml added
Hello,
I have just built a simple test rig on the latest Fedora (4.14.16-300.fc27.x86_64) with Suricata:
[root@fedorafw.zab.cldn.eu ~]# suricata --build-info
This is Suricata version 4.0.3 RELEASE
Features: NFQ PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LIBJANSSON TLS MAGIC
SIMD support: none
Atomic intrisics: 1 2 4 8 byte(s)
64-bits, Little-endian architecture
GCC version 7.2.1 20170915 (Red Hat 7.2.1-2), C version 199901
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.25, linked against LibHTP v0.5.25
Suricata Configuration:
  AF_PACKET support: yes
  PF_RING support: no
  NFQueue support: yes
  NFLOG support: no
  IPFW support: no
  Netmap support: no
  DAG enabled: no
  Napatech enabled: no
  Unix socket enabled: yes
  Detection enabled: yes
  Libmagic support: yes
  libnss support: yes
  libnspr support: yes
  libjansson support: yes
  hiredis support: yes
  hiredis async with libevent: yes
  Prelude support: no
  PCRE jit: yes
  LUA support: yes
  libluajit: no
  libgeoip: yes
  Non-bundled htp: no
  Old barnyard2 support: no
  CUDA enabled: no
  Hyperscan support: yes
  Libnet support: yes
  Rust support (experimental): no
  Experimental Rust parsers: no
  Rust strict mode: no
  Suricatasc install: yes
  Profiling enabled: no
  Profiling locks enabled: no
Development settings:
  Coccinelle / spatch: no
  Unit tests enabled: no
  Debug output enabled: no
  Debug validation enabled: no
Generic build parameters:
  Installation prefix: /usr
  Configuration directory: /etc/suricata/
  Log directory: /var/log/suricata/
  --prefix /usr
  --sysconfdir /etc
  --localstatedir /var
  Host: x86_64-redhat-linux-gnu
  Compiler: gcc (exec name) / gcc (real)
  GCC Protect enabled: yes
  GCC march native enabled: no
  GCC Profile enabled: no
  Position Independent Executable enabled: yes
  CFLAGS -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic
  PCAP_CFLAGS
  SECCFLAGS -fstack-protector -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security
Very simple ruleset:
[root@fedorafw.zab.cldn.eu ~]# iptables -L -n -v
Chain INPUT (policy ACCEPT 1241 packets, 171K bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 33 packets, 6292 bytes)
pkts bytes target prot opt in out source destination
35 6022 IPS all -- eno1 * 10.16.17.0/24 0.0.0.0/0
30 6510 IPS all -- * eno1 0.0.0.0/0 10.16.17.0/24
Chain OUTPUT (policy ACCEPT 1173 packets, 204K bytes)
pkts bytes target prot opt in out source destination
Chain IPS (2 references)
pkts bytes target prot opt in out source destination
33 6292 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x500/0xff00
32 6240 NFQUEUE all -- * * 0.0.0.0/0 0.0.0.0/0 NFQUEUE balance 0:1 bypass
[root@fedorafw.zab.cldn.eu ~]# iptables -L -n -v -t nat
Chain PREROUTING (policy ACCEPT 80 packets, 4380 bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 77 packets, 4196 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
3 184 MASQUERADE all -- * eno2 10.16.17.0/24 0.0.0.0/0
Suricata with no rules loaded still drops the AAAA request that is sent together with the A request to the DNS server (8.8.8.8 at the moment), as the SSH client typically does.
I can provide access to this rig; contact me off-list at dan@danrimal.net for further info if needed.
Any ideas what to try next?
Updated by Andreas Herz almost 7 years ago
- Assignee set to OISF Dev
- Target version set to TBD
Does this also happen when you do the DNS lookup with dig or other tools?
Updated by Dan Rimal almost 7 years ago
Dig works well, even when the A and AAAA queries are run almost simultaneously with the same source port (as the SSH client does):
dig -b 10.16.17.20#12000 root.cz @8.8.8.8 & dig -b 10.16.17.20#12000 aaaa root.cz @8.8.8.8 &
From time to time, dig reports ";; Warning: ID mismatch: expected ID 59101, got 13302", probably because the second response overtakes the first, but that is an unrelated issue. The important point is that both queries pass Suricata every time.
The SSH client does the same thing (I think):
LAN IN:
09:57:05.977876 IP 10.16.17.20.37087 > 8.8.8.8.53: 23122+ A? mail.belohrad.cz. (34)
09:57:05.977917 IP 10.16.17.20.37087 > 8.8.8.8.53: 26324+ AAAA? mail.belohrad.cz. (34)
but Suricata drops the AAAA query (maybe because it is the second query rather than because it is AAAA, but that is just a guess)
WAN OUT:
09:57:05.978043 IP 31.170.176.24.37087 > 8.8.8.8.53: 23122+ A? mail.belohrad.cz. (34)
09:57:05.994957 IP 8.8.8.8.53 > 31.170.176.24.37087: 23122 1/0/0 A 85.163.73.55 (50)
.. ..
09:57:10.981799 IP 31.170.176.24.37087 > 8.8.8.8.53: 23122+ A? mail.belohrad.cz. (34)
09:57:10.982336 IP 8.8.8.8.53 > 31.170.176.24.37087: 23122 1/0/0 A 85.163.73.55 (50)
09:57:10.982761 IP 31.170.176.24.37087 > 8.8.8.8.53: 26324+ AAAA? mail.belohrad.cz. (34)
09:57:11.035739 IP 8.8.8.8.53 > 31.170.176.24.37087: 26324 0/1/0 (98)
But the SSH client's second DNS query attempt works for both RR types, so the SSH client suffers a 5-second delay because of this.
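For a reproduction that does not depend on SSH, the traffic pattern in the LAN IN capture (an A and an AAAA query sent back to back from one source port) can be imitated from a single UDP socket. The following Python sketch is an assumption about what triggers the drop, not a confirmed reproducer; the function names and the transaction IDs 0x1001 and 0x1002 are made up for illustration. Unlike the two separate dig processes above, both queries here leave through the same socket.

```python
import socket
import struct

QTYPE_A = 1
QTYPE_AAAA = 28


def build_query(txid, name, qtype):
    """Build a minimal DNS query with the recursion-desired flag set."""
    # Header: ID, flags (RD), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question: QNAME, QTYPE, QCLASS=IN
    return header + qname + struct.pack("!HH", qtype, 1)


def send_parallel_queries(name, server, timeout=2.0):
    """Send A and AAAA from one socket; return the set of answered IDs."""
    answered = set()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_query(0x1001, name, QTYPE_A), (server, 53))
        sock.sendto(build_query(0x1002, name, QTYPE_AAAA), (server, 53))
        while len(answered) < 2:
            data, _ = sock.recvfrom(4096)
            answered.add(struct.unpack("!H", data[:2])[0])
    except socket.timeout:
        pass  # a missing ID means that query or its reply was lost
    finally:
        sock.close()
    return answered
```

Calling send_parallel_queries("mail.belohrad.cz", "8.8.8.8") from the client behind the IPS and checking whether both 0x1001 and 0x1002 come back would show whether the second query is lost. The query built for mail.belohrad.cz is 34 bytes long, matching the "(34)" in the captures.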
Updated by Andreas Herz almost 7 years ago
I tried to reproduce this with a small setup as well but ssh to hosts with IPv4 and IPv6 works fast and it resolves to IPv6 first (thus trying to connect via IPv6).
I did it with 4.0.4 and on ArchLinux.
Can anyone else reproduce this?
Updated by Dan Rimal almost 7 years ago
When I set the following in /etc/resolv.conf on the client:
options single-request-reopen
then everything works well and the queries are these:
11:10:17.225128 IP 10.16.17.20.56291 > 8.8.8.8.53: 1458+ A? mail.belohrad.cz. (34)
11:10:17.242036 IP 8.8.8.8.53 > 10.16.17.20.56291: 1458 1/0/0 A 85.163.73.55 (50)
11:10:17.242231 IP 10.16.17.20.41375 > 8.8.8.8.53: 35714+ AAAA? mail.belohrad.cz. (34)
11:10:17.278970 IP 8.8.8.8.53 > 10.16.17.20.41375: 35714 0/1/0 (98)
So the queries no longer use the same source port, and then it works well.
Also, this remote test host has no IPv6 (AAAA) record, but that is probably irrelevant.
Updated by Dan Rimal almost 7 years ago
After the upgrade to Suricata 4.0.4 the problem still persists:
19/2/2018 -- 11:36:36 - <Notice> - This is Suricata version 4.0.4 RELEASE
19/2/2018 -- 11:36:36 - <Info> - CPUs/cores online: 8
19/2/2018 -- 11:36:36 - <Info> - Protocol detection and parser disabled for dns protocol.
19/2/2018 -- 11:36:36 - <Info> - Protocol detection and parser disabled for dns protocol.
19/2/2018 -- 11:36:36 - <Info> - NFQ running in REPEAT mode with mark 1280/65280
19/2/2018 -- 11:36:36 - <Info> - Running in live mode, activating unix socket
19/2/2018 -- 11:36:36 - <Info> - No signatures supplied.
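The mark that Suricata logs in decimal here (1280/65280) is the same value/mask pair as the iptables rule "mark match 0x500/0xff00" shown earlier in this thread. A small Python sketch of the arithmetic, assuming the usual netfilter semantics where a packet matches when (mark & mask) == value; the helper name is illustrative:

```python
NFQ_REPEAT_MARK = 1280   # decimal value from the Suricata log line
NFQ_REPEAT_MASK = 65280  # decimal mask from the Suricata log line


def mark_matches(packet_mark, value, mask):
    """Return True when the masked packet mark equals the expected value."""
    return (packet_mark & mask) == value


# The decimal pair from the log is the hex pair used in the iptables rule.
print(hex(NFQ_REPEAT_MARK), hex(NFQ_REPEAT_MASK))  # 0x500 0xff00
```

With this pair, only bits 8 to 15 of the mark are compared, so a packet re-injected by Suricata with mark 0x500 hits the RETURN rule instead of being queued again, while an unmarked packet (mark 0) goes on to the NFQUEUE rule.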
Updated by Victor Julien almost 6 years ago
Can this issue be reproduced if nfq is in accept mode?
Updated by Victor Julien almost 6 years ago
- Related to Bug #2806: Parallel DNS queries dropped when using same socket added
Updated by Victor Julien over 5 years ago
I've been looking into this for #2806, but I can't reproduce it. Is it still possible to get remote access to analyze the issue?
Updated by Victor Julien over 5 years ago
- Status changed from New to Closed
- Assignee deleted (OISF Dev)
- Target version deleted (TBD)
This can be fixed by upgrading the kernel to 5.0 or stable >= v4.19.29. For more details see #2806.
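A quick way to check a host against the versions named above. This is a sketch under the assumption that only the numeric prefix of the release string (as reported by uname -r) matters; distribution kernels may carry backports, and kernels such as 4.20.x are not mentioned in this thread, so the helper reports them as not confirmed. The function names are hypothetical.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor, patch) from a string such as '4.14.16-300.fc27.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match[1]), int(match[2]), int(match[3] or 0)


def meets_stated_fix_versions(release):
    """True for 5.0 or later, or for the 4.19 stable series at .29 or later."""
    major, minor, patch = parse_kernel_release(release)
    if major >= 5:
        return True
    return (major, minor) == (4, 19) and patch >= 29
```

Both kernels used in this report, 4.4.29 and 4.14.16-300.fc27.x86_64, come out as not meeting the stated versions.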