Project

General

Profile

Actions

Bug #2806

closed

Parallel DNS queries dropped when using same socket

Added by Rob Mosher almost 6 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
Difficulty:
Label:

Description

Suricata is dropping DNS queries and not logging the drop when concurrent queries are performed using the same socket. This is standard behavior since glibc 2.10. This causes DNS resolution to be very slow.

The resolv.conf option 'single-request' can work around the issue, but that also delays DNS resolution slightly, though not nearly as bad.

The dns requests below are triggered using 'ping' which I believe calls gethostbyname

Initial queries sent concurrently using same socket:

 00:00:00.000000 IP 192.168.1.224.40887 > 192.168.1.1.53: 9647+ A? google.com. (28)
 00:00:00.000020 IP 192.168.1.224.40887 > 192.168.1.1.53: 52827+ AAAA? google.com. (28)

Only the A query receives a response. Glibc continues to wait for the AAAA response.

 00:00:00.001318 IP 192.168.1.1.53 > 192.168.1.224.40887: 9647 1/0/0 A 172.217.9.142 (44)

Attempt number 2 falls back to single request method, which seems to work just fine, but there's a 5 second delay before the retry.

 00:00:05.001888 IP 192.168.1.224.40887 > 192.168.1.1.53: 9647+ A? google.com. (28)
 00:00:05.003131 IP 192.168.1.1.53 > 192.168.1.224.40887: 9647 1/0/0 A 172.217.9.142 (44)
 00:00:05.003226 IP 192.168.1.224.40887 > 192.168.1.1.53: 52827+ AAAA? google.com. (28)
 00:00:05.004345 IP 192.168.1.1.53 > 192.168.1.224.40887: 52827 1/0/0 AAAA 2607:f8b0:4000:813::200e (56)

If we use the 'single-request' option in resolv.conf to not send multiple queries at the same time, there is minimal delay and no dropped query.

 00:00:00.000000 IP 192.168.1.224.53244 > 192.168.1.1.53: 11413+ A? google.com. (28)
 00:00:00.001201 IP 192.168.1.1.53 > 192.168.1.224.53244: 11413 1/0/0 A 172.217.9.142 (44)
 00:00:00.001295 IP 192.168.1.224.53244 > 192.168.1.1.53: 32038+ AAAA? google.com. (28)
 00:00:00.002325 IP 192.168.1.1.53 > 192.168.1.224.53244: 32038 1/0/0 AAAA 2607:f8b0:4000:813::200e (56)

Both concurrent queries are seen when logging from Suricata, but no AAAA reply, as the second query get's dropped, through Suricata definitely receives it and even logs it.

-- query
02/09/2019-23:57:09.942921 192.168.1.224 -> 192.168.1.1 A? google.com
-- query
02/09/2019-23:57:09.942927 192.168.1.224 -> 192.168.1.1 AAAA? google.com
-- query+answer
02/09/2019-23:57:09.943823 192.168.1.224 -> 192.168.1.1 A? google.com
02/09/2019-23:57:09.943823 192.168.1.1 -> 192.168.1.224 A: google.com: 216.58.194.110

5 seconds later on the single query retry...

-- query
02/09/2019-23:57:14.947594 192.168.1.224 -> 192.168.1.1 A? google.com
-- query+answer
02/09/2019-23:57:14.948331 192.168.1.224 -> 192.168.1.1 A? google.com
02/09/2019-23:57:14.948331 192.168.1.1 -> 192.168.1.224 A: google.com: 216.58.194.110
-- query
02/09/2019-23:57:14.948974 192.168.1.224 -> 192.168.1.1 AAAA? google.com
-- query+answer
02/09/2019-23:57:14.949637 192.168.1.224 -> 192.168.1.1 AAAA? google.com
02/09/2019-23:57:14.949637 192.168.1.1 -> 192.168.1.224 AAAA: google.com: 2607:f8b0:4002:0c06:0000:0000:0000:0064

From a remote dns server's point of view, I have confirmed the second query gets dropped, and only the A query makes it through until the single query retry 5 seconds later.

 00:00:00.000000 IP 24.191.114.206.32749 > 209.51.175.25.53: 15540+ A? test.somedomain. (31)
 00:00:00.000141 IP 209.51.175.25.53 > 24.191.114.206.32749: 15540 NXDomain* 0/1/0 (96)
 00:00:05.005913 IP 24.191.114.206.50912 > 209.51.175.25.53: 57854+ A? test.somedomain. (31)
 00:00:05.006070 IP 209.51.175.25.53 > 24.191.114.206.50912: 57854 NXDomain* 0/1/0 (96)
 00:00:05.021736 IP 24.191.114.206.44855 > 209.51.175.25.53: 40938+ AAAA? test.somedomain. (31)
 00:00:05.021960 IP 209.51.175.25.53 > 24.191.114.206.44855: 40938 NXDomain* 0/1/0 (96)

No other changes made aside from stopping Suricata, this begins working normally as it no longer drops the second query

 00:00:00.000000 IP 192.168.1.224.34752 > 192.168.1.1.53: 5322+ A? google.com. (28)
 00:00:00.000019 IP 192.168.1.224.34752 > 192.168.1.1.53: 59535+ AAAA? google.com. (28)
 00:00:00.032286 IP 192.168.1.1.53 > 192.168.1.224.34752: 59535 1/4/4 AAAA 2607:f8b0:4000:813::200e (240)
 00:00:00.033168 IP 192.168.1.1.53 > 192.168.1.224.34752: 5322 1/4/4 A 172.217.9.142 (228)


Related issues 1 (0 open1 closed)

Related to Suricata - Bug #2435: Suricata 4.0.3 in IPS mode seems to discard some DNS requestsClosedActions
Actions #1

Updated by Victor Julien almost 6 years ago

How are you running Suricata? Does Suricata log anything?

Actions #2

Updated by Rob Mosher almost 6 years ago

Victor Julien wrote:

How are you running Suricata? Does Suricata log anything?

IPS mode using NFQ. There is Suricata log output detailed in the bug report. DNS logging was done via LUA.

Actions #3

Updated by Victor Julien almost 6 years ago

What does your NFQ setup look like? Can you share the iptables/nftables rules as well as the Suricata nfq config?

Actions #4

Updated by Rob Mosher almost 6 years ago

Victor Julien wrote:

What does your NFQ setup look like? Can you share the iptables/nftables rules as well as the Suricata nfq config?

Just noting, this config works fine and has for some time with all other traffic. This DNS issue has been present for a very long time. I mentioned it a few times on IRC and email, but finally got around to creating a bug report for it.

These are the rules that match the DNS traffic destined to the router which is running Suricata.

iptables -t mangle -I INPUT -i br-lan -m mark ! --mark 0x1000/0x1000 \
-m connbytes ! --connbytes 1000000 --connbytes-dir both --connbytes-mode bytes \
-j NFQUEUE --queue-bypass

iptables -t mangle -I OUTPUT -o br-lan -m mark ! --mark 0x1000/0x1000 \
-m connbytes ! --connbytes 1000000 --connbytes-dir both --connbytes-mode bytes \
-j NFQUEUE --queue-bypass

general forwarding rule

iptables -t mangle -I FORWARD \
-m mark ! --mark 0x1000/0x1000 \
-m connbytes ! --connbytes 1000000 --connbytes-dir both --connbytes-mode bytes \
-j NFQUEUE --queue-bypass

This is the nfq config.

nfq:
  mode: repeat
  repeat-mark: 4096
  repeat-mask: 4096

The traffic is confirmed reaching Suricata based on log output, however Suricata is dropping the second dns query as it never makes it to the application (dnsmasq). It seems Suricata cannot handle multiple open queries on the same socket. Once one query is completed, it handles the second appropriately.

This is the output from dnsmasq with query logging enabled and Suricata running. You can see the delay here as the parallel AAAA query never makes it through.

parallel resolution.  There should be an A and AAAA query here.
Wed Feb 13 15:31:14 2019 daemon.info dnsmasq[2620]: 72 192.168.1.224/40269 query[A] google.com from 192.168.1.224
Wed Feb 13 15:31:14 2019 daemon.info dnsmasq[2620]: 72 192.168.1.224/40269 forwarded google.com to 209.51.175.25

5 seconds later, sequential resolution
Wed Feb 13 15:31:19 2019 daemon.info dnsmasq[2620]: 104 192.168.1.224/40269 query[A] google.com from 192.168.1.224
Wed Feb 13 15:31:19 2019 daemon.info dnsmasq[2620]: 104 192.168.1.224/40269 forwarded google.com to 209.51.175.25
Wed Feb 13 15:31:19 2019 daemon.info dnsmasq[2620]: 105 192.168.1.224/40269 query[AAAA] google.com from 192.168.1.224
Wed Feb 13 15:31:19 2019 daemon.info dnsmasq[2620]: 105 192.168.1.224/40269 forwarded google.com to 209.51.175.25
Wed Feb 13 15:31:19 2019 daemon.info dnsmasq[2620]: 104 192.168.1.224/40269 reply google.com is 172.217.9.174
Wed Feb 13 15:31:19 2019 daemon.info dnsmasq[2620]: 105 192.168.1.224/40269 reply google.com is 2607:f8b0:4002:c08::71

Simply stopping Suricata fixes this, and you can see the difference. The parallel AAAA query arrives as expected

Wed Feb 13 15:35:17 2019 daemon.info dnsmasq[2829]: 15 192.168.1.224/38629 query[A] google.com from 192.168.1.224
Wed Feb 13 15:35:17 2019 daemon.info dnsmasq[2829]: 15 192.168.1.224/38629 forwarded google.com to 209.51.175.25
Wed Feb 13 15:35:17 2019 daemon.info dnsmasq[2829]: 16 192.168.1.224/38629 query[AAAA] google.com from 192.168.1.224
Wed Feb 13 15:35:17 2019 daemon.info dnsmasq[2829]: 16 192.168.1.224/38629 forwarded google.com to 209.51.175.25
Wed Feb 13 15:35:17 2019 daemon.info dnsmasq[2829]: 15 192.168.1.224/38629 reply google.com is 172.217.12.206
Wed Feb 13 15:35:17 2019 daemon.info dnsmasq[2829]: 16 192.168.1.224/38629 reply google.com is 2607:f8b0:4002:c08::71

Actions #5

Updated by Eric Leblond almost 6 years ago

Is it possible to get a dump of the ruleset ? I'm wondering if you could have a CONNMARK rule somewhere that could mess thing up with the second packet.

Actions #6

Updated by Rob Mosher almost 6 years ago

Eric Leblond wrote:

Is it possible to get a dump of the ruleset ? I'm wondering if you could have a CONNMARK rule somewhere that could mess thing up with the second packet.

I commented out all the rule files except for one, which contains a single entry. The issue persists. Have you tried to reproduce? Should be fairly easy.

(DEBIAN)root@rooter:/etc/suricata# suricata -c suricata-rooter.yaml --dump-config | grep rule-files
rule-files = (null)
rule-files.0 = iprep.rules

(DEBIAN)root@rooter:/etc/suricata# cat rules/iprep.rules

# ----- Begin iprep Rules Category ----- #

# -- Begin GID:0 Based Rules -- #

alert ip any any -> any any (msg:"IPREP Blacklist Match"; metadata:policy balanced-ips alert; iprep:dst,Bad,=,99; classtype:bad-unknown; sid:99; rev:1;)
Actions #7

Updated by Victor Julien almost 6 years ago

  • Related to Bug #2435: Suricata 4.0.3 in IPS mode seems to discard some DNS requests added
Actions #8

Updated by Victor Julien almost 6 years ago

I think Eric was referring to the iptables rules.

Can you still reproduce this if Suricata+nfq is in accept mode instead of repeat mode?

Actions #9

Updated by Rob Mosher almost 6 years ago

Victor Julien wrote:

I think Eric was referring to the iptables rules.

Can you still reproduce this if Suricata+nfq is in accept mode instead of repeat mode?

Same behavior in accept mode

Actions #10

Updated by Victor Julien over 5 years ago

I'm unable to reproduce this. I've tried it on a local box (INPUT/OUTPUT NFQUEUE rules) and on a forwarding box (FORWARD NFQUEUE rules). Tried, ping, dig, dnsmasq, but not a packet was dropped. I'm not sure how to analyze this further.

Actions #11

Updated by Rob Mosher over 5 years ago

Victor Julien wrote:

I'm unable to reproduce this. I've tried it on a local box (INPUT/OUTPUT NFQUEUE rules) and on a forwarding box (FORWARD NFQUEUE rules). Tried, ping, dig, dnsmasq, but not a packet was dropped. I'm not sure how to analyze this further.

Are you sure you're sending parallel queries using the same src port? I confirm this is affected by both forward and input rule sets.

You should see requests like this if capturing with tcpdump with identical source ports before any response is received.

00:00:00.000000 IP 192.168.1.224.40887 > 192.168.1.1.53: 9647+ A? google.com. (28)
00:00:00.000020 IP 192.168.1.224.40887 > 192.168.1.1.53: 52827+ AAAA? google.com. (28)

If there's some way you'd like me to recreate this for you, let me know. I just tested this again on Suricata 4.1.2 and it is still bugged there.

Actions #12

Updated by Victor Julien over 5 years ago

Can you reproduce it with this scapy snippet?

from scapy.all import *
pkt_list = []
for x in range(0, 5):
    pkt = IP(dst="8.8.8.8")/UDP(sport=12345,dport=53)/DNS(rd=1,qd=DNSQR(qname="www.google.com",qtype="A"))
    pkt_list.append(pkt)
    pkt = IP(dst="8.8.8.8")/UDP(sport=12345,dport=53)/DNS(rd=1,qd=DNSQR(qname="www.google.com",qtype="AAAA"))
    pkt_list.append(pkt)

ans, unans = sr(pkt_list,verbose=0)
if len(ans) != 10:
    print ("FAILED: got: %u" % len(ans))
else:
    print ("OK")

Actions #13

Updated by Rob Mosher over 5 years ago

Victor Julien wrote:

Can you reproduce it with this scapy snippet?
[...]

For whatever reason, that does not reproduce it.

This is output form the scapy snippet, modified to use unique id's.

18:26:16.445085 IP 192.168.1.224.2345 > 8.8.8.8.53: 9000+ A? www.google.com. (32)
18:26:16.447313 IP 192.168.1.224.2345 > 8.8.8.8.53: 5000+ AAAA? www.google.com. (32)
18:26:16.469077 IP 8.8.8.8.53 > 192.168.1.224.2345: 9000 1/0/0 A 172.217.12.164 (48)
18:26:16.472318 IP 8.8.8.8.53 > 192.168.1.224.2345: 5000 1/0/0 AAAA 2607:f8b0:4006:819::2004 (60)

This is output from using ping which triggers getaddrinfo()

18:26:33.350093 IP 192.168.1.224.41038 > 8.8.8.8.53: 62930+ A? www.google.com. (32)
18:26:33.350099 IP 192.168.1.224.41038 > 8.8.8.8.53: 48460+ AAAA? www.google.com. (32)
18:26:33.365281 IP 8.8.8.8.53 > 192.168.1.224.41038: 62930 1/0/0 A 172.217.3.100 (48)

Going to look closer at the packets and see what the difference is and try to get you a scapy sample to reproduce.

Actions #14

Updated by Rob Mosher over 5 years ago

Can't quite figure out why scapy can't reproduce this, but this does reproduce it


import socket
HOST = socket.getaddrinfo("www.google.com", 0, 0, 0)

oringal query pair
18:44:35.978774 IP 192.168.1.224.48563 > 8.8.8.8.53: 8220+ A? www.google.com. (32)
18:44:35.978780 IP 192.168.1.224.48563 > 8.8.8.8.53: 12506+ AAAA? www.google.com. (32)

only one response received
18:44:35.994388 IP 8.8.8.8.53 > 192.168.1.224.48563: 8220 1/0/0 A 172.217.10.4 (48)

5 seconds later, retries one query at a time.
18:44:40.983729 IP 192.168.1.224.48563 > 8.8.8.8.53: 8220+ A? www.google.com. (32)
18:44:41.004814 IP 8.8.8.8.53 > 192.168.1.224.48563: 8220 1/0/0 A 172.217.10.4 (48)
18:44:41.005109 IP 192.168.1.224.48563 > 8.8.8.8.53: 12506+ AAAA? www.google.com. (32)
18:44:41.025106 IP 8.8.8.8.53 > 192.168.1.224.48563: 12506 1/0/0 AAAA 2607:f8b0:4006:813::2004 (60)

Actions #15

Updated by Rob Mosher over 5 years ago

For the above, this was the resolv.conf file:

nameserver 8.8.8.8
Actions #16

Updated by Rob Mosher over 5 years ago

I think scapy is too slow sending these packets to trigger the issue.

Looking at the interval between them, we can see scapy is many orders slower than libc

Scapy, 0.003 seconds
18:53:19.524443 IP 192.168.1.224.2345 > 8.8.8.8.53: 9000+ A? www.google.com. (32)
18:53:19.527611 IP 192.168.1.224.2345 > 8.8.8.8.53: 5000+ AAAA? www.google.com. (32)
libc 0.000007 seconds
18:53:24.764083 IP 192.168.1.224.44368 > 8.8.8.8.53: 42668+ A? www.google.com. (32)
18:53:24.764090 IP 192.168.1.224.44368 > 8.8.8.8.53: 52728+ AAAA? www.google.com. (32)
Actions #17

Updated by Rob Mosher over 5 years ago

This reproduces in C using crafting packets manually

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <unistd.h>

#define T_A 1
#define T_AAAA 28

struct dns_header
{
    unsigned short id;

    unsigned char rd :1;
    unsigned char tc :1;
    unsigned char aa :1;
    unsigned char opcode :4;
    unsigned char qr :1;

    unsigned char rcode :4;
    unsigned char cd :1;
    unsigned char ad :1;
    unsigned char z :1;
    unsigned char ra :1;

    unsigned short q_count;
    unsigned short ans_count;
    unsigned short auth_count;
    unsigned short add_count;
};

struct question
{
    unsigned short qtype;
    unsigned short qclass;
};

typedef struct
{
    unsigned char *name;
    struct question *ques;
} query;

int main( int argc , char *argv[])
{
    unsigned char buf[100],*qname;
    int s;

    struct sockaddr_in dest;

    struct dns_header *dns = NULL;
    struct question *qinfo = NULL;

    memset(buf, 0, sizeof(buf));

    s = socket(AF_INET , SOCK_DGRAM , IPPROTO_UDP);

    dest.sin_family = AF_INET;
    dest.sin_port = htons(53);
    dest.sin_addr.s_addr = inet_addr("8.8.8.8");

    dns = (struct dns_header *)&buf;

    dns->id = (unsigned short) htons(getpid());
    dns->rd = 1;
    dns->q_count = htons(1);

    qname =(unsigned char*)&buf[sizeof(struct dns_header)];
    memcpy(qname, "\003www\006google\003com", 15);

    qinfo =(struct question*)&buf[sizeof(struct dns_header) + (strlen((const char*)qname) + 1)];

    qinfo->qtype = htons( T_A );
    qinfo->qclass = htons(1);

    printf("Sending Packets...");
     sendto(s,(char*)buf,sizeof(struct dns_header) + (strlen((const char*)qname)+1) + sizeof(struct question),0,(struct sockaddr*)&dest,sizeof(dest));
    qinfo->qtype = htons( T_AAAA );
     sendto(s,(char*)buf,sizeof(struct dns_header) + (strlen((const char*)qname)+1) + sizeof(struct question),0,(struct sockaddr*)&dest,sizeof(dest));
    printf("Done\n");

}
Actions #18

Updated by Victor Julien over 5 years ago

Using this C program I can reproduce the issue as well. The bad news however is that I can reproduce it w/o Suricata as the NFQUEUE handler as well. When I use the example nf-queue tool from https://git.netfilter.org/libnetfilter_queue/tree/examples then I see the same issue. So it looks to me this is a Linux kernel and/or libnetfilter_queue issue. Could you test the example program as well to see if you see the same?

Actions #19

Updated by Victor Julien over 5 years ago

I've asked Florian Westphal from the Netfilter project, and it seems this issue has been fixed fairly recently:

"They need to pick up following two fixes:

commit ed07d9a021df6da53456663a76999189badc432a
netfilter: nf_conntrack: resolve clash for matching conntracks
commit 4e35c1cb9460240e983a01745b5f29fe3a4d8e39
netfilter: nf_nat: skip nat clash resolution for same-origin entries

The 1st one fixes the problem when no NAT is in play, the 2nd one
(most recent one) also allows us to handle the NAT case."

He suggests: "Please try with kernel 5.0 or stable >= v4.19.29, those should both work."

Are you able to test one of these?

Actions #20

Updated by Eric Leblond over 5 years ago

What is the iptables/nftables ruleset used to reproduce that ?

Actions #21

Updated by Rob Mosher over 5 years ago

Eric Leblond wrote:

What is the iptables/nftables ruleset used to reproduce that ?

https://redmine.openinfosecfoundation.org/issues/2806#note-4

Actions #22

Updated by Rob Mosher over 5 years ago

Victor Julien wrote:

He suggests: "Please try with kernel 5.0 or stable >= v4.19.29, those should both work."

Are you able to test one of these?

I'll either test one of these or backport the commits to 4.14 and post results in a bit.

Actions #23

Updated by Rob Mosher over 5 years ago

Victor Julien wrote:

I've asked Florian Westphal from the Netfilter project, and it seems this issue has been fixed fairly recently:

"They need to pick up following two fixes:

commit ed07d9a021df6da53456663a76999189badc432a
netfilter: nf_conntrack: resolve clash for matching conntracks
commit 4e35c1cb9460240e983a01745b5f29fe3a4d8e39
netfilter: nf_nat: skip nat clash resolution for same-origin entries

The 1st one fixes the problem when no NAT is in play, the 2nd one
(most recent one) also allows us to handle the NAT case."

He suggests: "Please try with kernel 5.0 or stable >= v4.19.29, those should both work."

Are you able to test one of these?

Updated to 4.19.34, confirmed the issue is resolved. Thanks for helping track this down!

I tried digging through the Suricata code to find out where this was getting dropped... this explains why I couldn't find out where.

Linux rooter 4.19.34 #0 SMP Tue Apr 9 03:41:00 2019 x86_64 GNU/Linux

Actions #24

Updated by Victor Julien over 5 years ago

  • Status changed from New to Closed

Great, thanks for the test case and the confirmation.

Actions

Also available in: Atom PDF