Bug #2435

Suricata 4.0.3 in IPS mode seems to discard some DNS requests

Added by Dan Rimal about 6 years ago. Updated about 5 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
Affected Versions:
Effort:
Difficulty:
Label:

Description

Suricata 4.0.3 on CentOS 7.2 with kernel 4.4.29 (CentOS 7 ELRepo LT) shows the very same problem as these closed bugs:

https://redmine.openinfosecfoundation.org/issues/1920
https://redmine.openinfosecfoundation.org/issues/1923

The SSH client sends A and AAAA requests, but Suricata running as an inline IPS via NFQ drops the AAAA query addressed to the DNS server, which sits in a different VLAN than the client. Tcpdump shows the client's incoming query, but no outgoing query toward the DNS server.

Suricata has no rules loaded.

This issue slows down SSH logins to remote servers and is very easy to reproduce (it occurs every time).

Attached files:
dnsq-from-vlan12 (client vlan)
dnsq-from-vlan10 (server vlan)


Files

dnsq-from-vlan10 (1.51 KB), Dan Rimal, 02/05/2018 10:14 AM
dnsq-from-vlan12 (1.98 KB), Dan Rimal, 02/05/2018 10:14 AM
suricata.yaml (66.4 KB), Dan Rimal, 02/05/2018 10:15 AM
suricata.yaml (66.4 KB), Dan Rimal, 02/13/2018 06:16 AM

Related issues: 1 (0 open, 1 closed)

Related to Suricata - Bug #2806: Parallel DNS queries dropped when using same socket (Closed)

Actions #1

Updated by Victor Julien about 6 years ago

Can you share your iptables/nftables rules?

Actions #2

Updated by Dan Rimal about 6 years ago

This is the relevant part of a fairly tricky rule set (~1000 rules). There are two routing tables, two WAN links, and so on, but the IPS-relevant configuration is pretty straightforward. Traffic from and to the selected user VLAN is passed to NFQ, except iSCSI traffic (heavy, uninteresting traffic) and DNS (because of this bug):

Chain FORWARD (policy DROP 120 packets, 6550 bytes)
 pkts bytes target     prot opt in         out        source               destination
24112   21M IPS        all  --  *          enp1s0.12  0.0.0.0/0            0.0.0.0/0            /* =ids_out_vl12= IDS OUT VLAN12 */
20362 9451K IPS        all  --  enp1s0.12  *          0.0.0.0/0            0.0.0.0/0            /* =ids_in_vl12= IDS IN VLAN12 */

Chain IPS (2 references)
 pkts bytes target     prot opt in     out     source               destination
23776   15M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x500/0xff00
   14   862 RETURN     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:53
    6   304 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:3260
    0     0 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp spt:3260
20678   15M NFQUEUE    all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* =ids= IDS UP */ NFQUEUE balance 0:1 bypass

If I disable the DNS RETURN rule, DNS hits the bug immediately.

I can provide the full iptables rules from this firewall, but only privately, or I can build a test rig with a basic setup if needed.

Actions #3

Updated by Victor Julien about 6 years ago

It's odd, as nothing in Suricata should 'drop' certain DNS packets automatically. Are you able to test without the queue balance option, just to rule that out?

A simplified test rig is starting to sound good.

Actions #4

Updated by Jason Ish about 6 years ago

Could you also disable the DNS app-layer, just to help rule it out? In suricata.yaml, under app-layer, look for:

    dns:
      # memcaps. Globally and per flow/state.
      #global-memcap: 16mb
      #state-memcap: 512kb

      # How many unreplied DNS requests are considered a flood.
      # If the limit is reached, app-layer-event:dns.flooded; will match.
      #request-flood: 500

      tcp:
        enabled: yes
        detection-ports:
          dp: 53
      udp:
        enabled: yes
        detection-ports:
          dp: 53

And set the "enabled" fields to no.

Thanks.

Actions #5

Updated by Dan Rimal about 6 years ago

So,

1) without queue balance -> bug still persists

2) disabled DNS app-layer -> bug still persists

Verify:

from client:
[ rc.d]# tcpdump -i enp1s0.12 proto UDP and host 10.0.10.36 and host 10.0.12.22 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp1s0.12, link-type EN10MB (Ethernet), capture size 65535 bytes
14:11:44.224405 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:44.224463 IP 10.0.12.22.60260 > 10.0.10.36.53: 31747+ AAAA? mail.belohrad.cz. (34)
14:11:44.226351 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)

14:11:49.228471 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:49.228871 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)
14:11:49.229052 IP 10.0.12.22.60260 > 10.0.10.36.53: 31747+ AAAA? mail.belohrad.cz. (34)
14:11:49.229530 IP 10.0.10.36.53 > 10.0.12.22.60260: 31747 0/1/0 (98)

to server:
[ ~]# tcpdump -i enp1s0.10 proto UDP and host 10.0.10.36 and host 10.0.12.22 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp1s0.10, link-type EN10MB (Ethernet), capture size 65535 bytes
14:11:44.224528 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:44.226087 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)

14:11:49.228546 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)
14:11:49.228781 IP 10.0.10.36.53 > 10.0.12.22.60260: 5359 1/0/0 A 85.163.73.55 (50)
14:11:49.229151 IP 10.0.12.22.60260 > 10.0.10.36.53: 31747+ AAAA? mail.belohrad.cz. (34)
14:11:49.229474 IP 10.0.10.36.53 > 10.0.12.22.60260: 31747 0/1/0 (98)

The AAAA request is dropped somewhere. Without Suricata there are no drops. The second query attempt (after 5 seconds) does pass through Suricata.

OK, I will create a simpler setup for better debugging next week and post the results.
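
The comparison of the two captures above can be automated. The following is a minimal sketch in Python, not part of the original report; the helper names `count_queries` and `missing_queries` are hypothetical. It assumes tcpdump's one-line DNS output as shown above and only recognizes queries whose ID is followed by a plain `+` flag. Queries are keyed on ID, type and name rather than on addresses, so the comparison still works when NAT rewrites the source.

```python
import re
from collections import Counter

# Matches tcpdump's one-line DNS query format, for example:
# "14:11:44.224405 IP 10.0.12.22.60260 > 10.0.10.36.53: 5359+ A? mail.belohrad.cz. (34)"
# Responses are skipped because their destination is not port 53.
QUERY_RE = re.compile(
    r"^\S+ IP \S+ > \S+\.53: (?P<txid>\d+)\+ (?P<qtype>[A-Z]+)\? (?P<name>\S+)"
)


def count_queries(lines):
    """Count DNS queries per (transaction ID, query type, name)."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.match(line.strip())
        if match:
            key = (int(match["txid"]), match["qtype"], match["name"])
            counts[key] += 1
    return counts


def missing_queries(client_side, server_side):
    """Return queries seen on the client side more often than on the server side."""
    return count_queries(client_side) - count_queries(server_side)
```

Fed with the lines of the two tcpdump outputs above, `missing_queries` reports the one AAAA query with ID 31747 that never reached the server-side interface.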

Actions #6

Updated by Dan Rimal about 6 years ago

Hello,

I have just made a simple test rig on the latest Fedora (4.14.16-300.fc27.x86_64) with Suricata:

[root@fedorafw.zab.cldn.eu ~]# suricata --build-info
This is Suricata version 4.0.3 RELEASE
Features: NFQ PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LIBJANSSON TLS MAGIC 
SIMD support: none
Atomic intrisics: 1 2 4 8 byte(s)
64-bits, Little-endian architecture
GCC version 7.2.1 20170915 (Red Hat 7.2.1-2), C version 199901
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.25, linked against LibHTP v0.5.25

Suricata Configuration:
  AF_PACKET support:                       yes
  PF_RING support:                         no
  NFQueue support:                         yes
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no
  DAG enabled:                             no
  Napatech enabled:                        no

  Unix socket enabled:                     yes
  Detection enabled:                       yes

  Libmagic support:                        yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  hiredis support:                         yes
  hiredis async with libevent:             yes
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes
  libluajit:                               no
  libgeoip:                                yes
  Non-bundled htp:                         no
  Old barnyard2 support:                   no
  CUDA enabled:                            no
  Hyperscan support:                       yes
  Libnet support:                          yes

  Rust support (experimental):             no
  Experimental Rust parsers:               no
  Rust strict mode:                        no

  Suricatasc install:                      yes

  Profiling enabled:                       no
  Profiling locks enabled:                 no

Development settings:
  Coccinelle / spatch:                     no
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr
  Configuration directory:                 /etc/suricata/
  Log directory:                           /var/log/suricata/

  --prefix                                 /usr
  --sysconfdir                             /etc
  --localstatedir                          /var

  Host:                                    x86_64-redhat-linux-gnu
  Compiler:                                gcc (exec name) / gcc (real)
  GCC Protect enabled:                     yes
  GCC march native enabled:                no
  GCC Profile enabled:                     no
  Position Independent Executable enabled: yes
  CFLAGS                                   -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic
  PCAP_CFLAGS                               
  SECCFLAGS                                -fstack-protector -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security

Very simple ruleset:

[root@fedorafw.zab.cldn.eu ~]# iptables -L -n -v
Chain INPUT (policy ACCEPT 1241 packets, 171K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 33 packets, 6292 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   35  6022 IPS        all  --  eno1   *       10.16.17.0/24        0.0.0.0/0           
   30  6510 IPS        all  --  *      eno1    0.0.0.0/0            10.16.17.0/24       

Chain OUTPUT (policy ACCEPT 1173 packets, 204K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain IPS (2 references)
 pkts bytes target     prot opt in     out     source               destination         
   33  6292 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x500/0xff00
   32  6240 NFQUEUE    all  --  *      *       0.0.0.0/0            0.0.0.0/0            NFQUEUE balance 0:1 bypass

[root@fedorafw.zab.cldn.eu ~]# iptables -L -n -v -t nat
Chain PREROUTING (policy ACCEPT 80 packets, 4380 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 77 packets, 4196 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    3   184 MASQUERADE  all  --  *      eno2    10.16.17.0/24        0.0.0.0/0    

Suricata with no rules loaded still drops the AAAA request sent together with the A request to the DNS server (8.8.8.8 at the moment), typically from the SSH client.

I can provide access to this rig; contact me off the list for further info if needed.

Any ideas what to try next?

Actions #7

Updated by Andreas Herz about 6 years ago

  • Assignee set to OISF Dev
  • Target version set to TBD

Does this also happen when you do the DNS lookup with dig or other tools?

Actions #8

Updated by Dan Rimal about 6 years ago

Dig works well, even when running the A and AAAA queries almost simultaneously with the same source port (as the SSH client does):

dig -b 10.16.17.20#12000 root.cz @8.8.8.8 & dig -b 10.16.17.20#12000 aaaa root.cz @8.8.8.8 &

From time to time I get from dig: ";; Warning: ID mismatch: expected ID 59101, got 13302", probably because the second response overtakes the first, but that is an unrelated issue. The important point is that both queries pass Suricata every time.

The SSH client does the same thing (I think):

LAN IN:
09:57:05.977876 IP 10.16.17.20.37087 > 8.8.8.8.53: 23122+ A? mail.belohrad.cz. (34)
09:57:05.977917 IP 10.16.17.20.37087 > 8.8.8.8.53: 26324+ AAAA? mail.belohrad.cz. (34)

but Suricata drops the AAAA query (maybe because it is the second query rather than because it is AAAA, but that is just a guess)
WAN OUT:
09:57:05.978043 IP 31.170.176.24.37087 > 8.8.8.8.53: 23122+ A? mail.belohrad.cz. (34)
09:57:05.994957 IP 8.8.8.8.53 > 31.170.176.24.37087: 23122 1/0/0 A 85.163.73.55 (50)
..
..
09:57:10.981799 IP 31.170.176.24.37087 > 8.8.8.8.53: 23122+ A? mail.belohrad.cz. (34)
09:57:10.982336 IP 8.8.8.8.53 > 31.170.176.24.37087: 23122 1/0/0 A 85.163.73.55 (50)
09:57:10.982761 IP 31.170.176.24.37087 > 8.8.8.8.53: 26324+ AAAA? mail.belohrad.cz. (34)
09:57:11.035739 IP 8.8.8.8.53 > 31.170.176.24.37087: 26324 0/1/0 (98)

But the SSH client's second DNS query attempt works for both RR types. So the SSH client incurs a 5-second delay because of this.
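
For testing without an SSH client, the pattern in the LAN IN capture (an A and an AAAA query leaving the same source port within microseconds) can be approximated with a short Python script. This is only a sketch and is not confirmed to trigger the drop; `build_query` and `send_parallel` are hypothetical helper names, and the transaction IDs are arbitrary.

```python
import socket
import struct

QTYPE_A = 1
QTYPE_AAAA = 28


def build_query(name, qtype, txid):
    """Build a minimal DNS query: one question, class IN, recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def send_parallel(name, server, timeout=2.0):
    """Send A and AAAA back-to-back from ONE UDP socket (same source port).

    Returns the set of transaction IDs for which a reply arrived in time.
    """
    queries = {0x1111: QTYPE_A, 0x2222: QTYPE_AAAA}
    answered = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for txid, qtype in queries.items():
            sock.sendto(build_query(name, qtype, txid), (server, 53))
        try:
            while len(answered) < len(queries):
                data, _ = sock.recvfrom(4096)
                if len(data) >= 2:
                    txid = struct.unpack("!H", data[:2])[0]
                    if txid in queries:
                        answered.add(txid)
        except socket.timeout:
            pass  # a missing ID means that query or its reply was lost
    return answered
```

Calling `send_parallel("mail.belohrad.cz", "8.8.8.8")` from a client behind the IPS returns both IDs when both queries get through; a result containing only `0x1111` would match the behaviour seen in the WAN OUT capture.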

Actions #9

Updated by Andreas Herz about 6 years ago

I tried to reproduce this with a small setup as well, but SSH to hosts with IPv4 and IPv6 works fast and resolves to IPv6 first (thus trying to connect via IPv6).
I did it with 4.0.4 on Arch Linux.

Can anyone else reproduce this?

Actions #10

Updated by Dan Rimal about 6 years ago

When I set the following in /etc/resolv.conf on the client:

options single-request-reopen

then everything works well and the queries are these:

11:10:17.225128 IP 10.16.17.20.56291 > 8.8.8.8.53: 1458+ A? mail.belohrad.cz. (34)
11:10:17.242036 IP 8.8.8.8.53 > 10.16.17.20.56291: 1458 1/0/0 A 85.163.73.55 (50)
11:10:17.242231 IP 10.16.17.20.41375 > 8.8.8.8.53: 35714+ AAAA? mail.belohrad.cz. (34)
11:10:17.278970 IP 8.8.8.8.53 > 10.16.17.20.41375: 35714 0/1/0 (98)

So they don't use the same source port, and then it works well.

Also, this test remote host has no IPv6 (AAAA) record, but that is probably irrelevant.

Actions #11

Updated by Dan Rimal about 6 years ago

After the upgrade to Suricata 4.0.4 the problem still persists:

19/2/2018 -- 11:36:36 - <Notice> - This is Suricata version 4.0.4 RELEASE
19/2/2018 -- 11:36:36 - <Info> - CPUs/cores online: 8
19/2/2018 -- 11:36:36 - <Info> - Protocol detection and parser disabled for dns protocol.
19/2/2018 -- 11:36:36 - <Info> - Protocol detection and parser disabled for dns protocol.
19/2/2018 -- 11:36:36 - <Info> - NFQ running in REPEAT mode with mark 1280/65280
19/2/2018 -- 11:36:36 - <Info> - Running in live mode, activating unix socket
19/2/2018 -- 11:36:36 - <Info> - No signatures supplied.
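
The log prints the repeat-mode mark and mask in decimal, while the iptables listings earlier in this thread show `mark match 0x500/0xff00`. A small Python sketch confirms these are the same values; `is_reinjected` is a hypothetical helper that illustrates how a mark/mask match is evaluated, not Suricata or netfilter code.

```python
# Values from the log line "NFQ running in REPEAT mode with mark 1280/65280".
REPEAT_MARK = 1280
REPEAT_MASK = 65280

# The same values as iptables displays them.
assert hex(REPEAT_MARK) == "0x500"
assert hex(REPEAT_MASK) == "0xff00"


def is_reinjected(packet_mark):
    """True if the masked packet mark equals the repeat mark (mark/mask match)."""
    return (packet_mark & REPEAT_MASK) == REPEAT_MARK
```

A packet carrying mark `0x500` matches the first RETURN rule in the IPS chain and is not queued again, whereas an unmarked packet (mark 0) falls through to the NFQUEUE rule.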

Actions #12

Updated by Victor Julien about 5 years ago

Can this issue be reproduced if nfq is in accept mode?

Actions #13

Updated by Victor Julien about 5 years ago

  • Related to Bug #2806: Parallel DNS queries dropped when using same socket added
Actions #14

Updated by Victor Julien about 5 years ago

I've been looking into this for #2806, but I can't reproduce it. Is it still possible to get remote access to analyze the issue?

Actions #15

Updated by Victor Julien about 5 years ago

  • Status changed from New to Closed
  • Assignee deleted (OISF Dev)
  • Target version deleted (TBD)

This can be fixed by upgrading the kernel to 5.0 or stable >= v4.19.29. For more details see #2806.
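
A quick way to check a host against the versions named above is sketched below in Python. The function name `kernel_has_fix` is hypothetical, and the check only encodes what is stated here (5.0 or later, or the 4.19 stable series from 4.19.29). It returns False for every other series, which means "not covered by the stated versions" and does not rule out a distribution backport.

```python
import re


def kernel_has_fix(release):
    """Check a kernel release string against the versions named in this issue."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    if (major, minor) >= (5, 0):
        return True
    # Only the 4.19 stable series is mentioned explicitly.
    return (major, minor) == (4, 19) and patch >= 29
```

On a running system the release string can be taken from `platform.release()`; for example, `kernel_has_fix("4.4.29")` and `kernel_has_fix("4.14.16-300.fc27.x86_64")` (the two kernels used in this report) both return False.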
