Bug #1174
Closed: segfault in Suricata 2.0
Description
I'm having a segfault occur about once a week with Suricata 2.0. I
think the issue may not be specific to 2.0; we ran 1.4.7 for a
little while and it segfaulted once or twice too. All the core dumps
I've captured seem to point at a buffer overflow in the memcpy call
at stream-tcp-reassemble.c line 3139.
Stack trace:
(gdb) bt
#0 0x0000003968432925 in raise (sig=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003968434105 in abort () at abort.c:92
#2 0x0000003968470837 in __libc_message (do_abort=2,
fmt=0x3968557930 "*** %s ***: %s terminated\n")
at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3 0x0000003968502827 in __fortify_fail (
msg=0x39685578d6 "buffer overflow detected") at fortify_fail.c:32
#4 0x0000003968500710 in __chk_fail () at chk_fail.c:29
#5 0x0000000000511230 in memcpy (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0,
ssn=0x7f75c3ae0050, stream=0x7f75c3ae0058, p=0x33e4230)
at /usr/include/bits/string3.h:52
#6 StreamTcpReassembleAppLayer (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0,
ssn=0x7f75c3ae0050, stream=0x7f75c3ae0058, p=0x33e4230)
at stream-tcp-reassemble.c:3139
#7 0x00000000005115c0 in StreamTcpReassembleHandleSegmentUpdateACK (
tv=0xad3dd80, ra_ctx=0x7f75c0000fb0, ssn=0x7f75c3ae0050,
stream=0x7f75c3ae0058, p=0x33e4230) at stream-tcp-reassemble.c:3545
#8 0x0000000000513773 in StreamTcpReassembleHandleSegment (tv=0xad3dd80,
ra_ctx=0x7f75c0000fb0, ssn=0x7f75c3ae0050, stream=0x7f75c3ae00a0,
p=0x33e4230, pq=<value optimized out>) at stream-tcp-reassemble.c:3573
#9 0x000000000050b09b in HandleEstablishedPacketToClient (tv=0xad3dd80,
p=0x33e4230, stt=0x7f75c00008c0, ssn=0x7f75c3ae0050,
pq=<value optimized out>) at stream-tcp.c:2091
#10 StreamTcpPacketStateEstablished (tv=0xad3dd80, p=0x33e4230,
stt=0x7f75c00008c0, ssn=0x7f75c3ae0050, pq=<value optimized out>)
at stream-tcp.c:2337
#11 0x000000000050e670 in StreamTcpPacket (tv=0xad3dd80, p=0x33e4230,
stt=0x7f75c00008c0, pq=0xad3deb0) at stream-tcp.c:4243
#12 0x000000000050f4d3 in StreamTcp (tv=0xad3dd80, p=0x33e4230,
data=0x7f75c00008c0, pq=<value optimized out>,
postpq=<value optimized out>) at stream-tcp.c:4485
#13 0x0000000000524109 in TmThreadsSlotVarRun (tv=0xad3dd80, p=0x33e4230,
slot=<value optimized out>) at tm-threads.c:557
#14 0x00000000005242e9 in TmThreadsSlotVar (td=0xad3dd80) at
tm-threads.c:814
#15 0x0000003aede079d1 in start_thread (arg=0x7f75cbfff700)
at pthread_create.c:301
#16 0x00000039684e8b6d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Compiled with command:
CFLAGS="-O2 -g" CCFLAGS="-O2 -g" ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --libdir=/usr/lib64 --enable-gccprotect --with-nss-includes=/usr/include/nss3 --with-libnspr-includes=/usr/include/nspr
Suricata Configuration:
  AF_PACKET support: yes
  PF_RING support: no
  NFQueue support: no
  IPFW support: no
  DAG enabled: no
  Napatech enabled: no
  Unix socket enabled: yes
  Detection enabled: yes
  libnss support: yes
  libnspr support: yes
  libjansson support: yes
  Prelude support: no
  PCRE jit: no
  libluajit: no
  libgeoip: no
  Non-bundled htp: no
  Old barnyard2 support: no
  CUDA enabled: no
  Suricatasc install: yes
  Unit tests enabled: no
  Debug output enabled: no
  Debug validation enabled: no
  Profiling enabled: no
  Profiling locks enabled: no
  Coccinelle / spatch: no
  Generic build parameters:
    Installation prefix (--prefix): /usr
    Configuration directory (--sysconfdir): /etc/suricata/
    Log directory (--localstatedir): /var/log/suricata/
    Host: x86_64-unknown-linux-gnu
    GCC binary: gcc
    GCC Protect enabled: yes
    GCC march native enabled: yes
    GCC Profile enabled: no
Suricata run with command:
suricata -c /etc/suricata/suricata.yaml --af-packet=eth2 -D
suricata.yaml minified:
%YAML 1.1
---
host-mode: sniffer-only
default-log-dir: /var/log/suricata/
unix-command:
enabled: no
outputs:
- fast:
enabled: no
filename: fast.log
append: yes
- eve-log:
enabled: no
type: file #file|syslog|unix_dgram|unix_stream
filename: eve.json
types:
- alert
- http:
extended: yes # enable this for extended logging information
- dns
- tls:
extended: yes # enable this for extended logging information
- files:
force-magic: no # force logging magic on all logged files
force-md5: no # force logging of md5 checksums
- ssh
- unified2-alert:
enabled: yes
filename: unified2.alert
limit: 32mb
sensor-id: 0
xff:
enabled: yes
mode: extra-data
header: X-Forwarded-For
- http-log:
enabled: no
filename: http.log
append: yes
- tls-log:
enabled: no # Log TLS connections.
filename: tls.log # File to store TLS logs.
append: yes
certs-log-dir: certs # directory to store the certificates files
- dns-log:
enabled: no
filename: dns.log
append: yes
- pcap-info:
enabled: no
- pcap-log:
enabled: no
filename: log.pcap
limit: 1000mb
max-files: 2000
mode: normal # normal or sguil.
use-stream-depth: no # "yes": ignore packets seen after reaching stream inspection depth; "no": log all packets
- alert-debug:
enabled: no
filename: alert-debug.log
append: yes
- alert-prelude:
enabled: no
profile: suricata
log-packet-content: no
log-packet-header: yes
- stats:
enabled: no
filename: stats.log
interval: 8
- syslog:
enabled: no
facility: local5
- drop:
enabled: no
filename: drop.log
append: yes
- file-store:
enabled: no # set to yes to enable
log-dir: files # directory to store the files
force-magic: no # force logging magic on all stored files
force-md5: no # force logging of md5 checksums
- file-log:
enabled: no
filename: files-json.log
append: yes
force-magic: no # force logging magic on all logged files
force-md5: no # force logging of md5 checksums
magic-file: /usr/share/file/magic
nfq:
af-packet:
- interface: eth2
threads: 8
cluster-id: 99
cluster-type: cluster_flow
defrag: yes
use-mmap: no
checksum-checks: no
- interface: eth1
threads: 1
cluster-id: 98
cluster-type: cluster_flow
defrag: yes
- interface: default
legacy:
uricontent: enabled
detect-engine:
- profile: high
- custom-values:
toclient-src-groups: 15
toclient-dst-groups: 15
toclient-sp-groups: 15
toclient-dp-groups: 20
toserver-src-groups: 15
toserver-dst-groups: 15
toserver-sp-groups: 15
toserver-dp-groups: 40
- sgh-mpm-context: auto
- inspection-recursion-limit: 3000
threading:
set-cpu-affinity: no
cpu-affinity:
- management-cpu-set:
cpu: [ 0 ] # include only these cpus in affinity settings
- receive-cpu-set:
cpu: [ 0 ] # include only these cpus in affinity settings
- decode-cpu-set:
cpu: [ 0, 1 ]
mode: "balanced"
- stream-cpu-set:
cpu: [ "0-1" ]
- detect-cpu-set:
cpu: [ "all" ]
mode: "exclusive" # run detect threads in these cpus
prio:
low: [ 0 ]
medium: [ "1-2" ]
high: [ 3 ]
default: "medium"
- verdict-cpu-set:
cpu: [ 0 ]
prio:
default: "high"
- reject-cpu-set:
cpu: [ 0 ]
prio:
default: "low"
- output-cpu-set:
cpu: [ "all" ]
prio:
default: "medium"
detect-thread-ratio: 1.5
cuda:
mpm:
data-buffer-size-min-limit: 0
data-buffer-size-max-limit: 1500
cudabuffer-buffer-size: 500mb
gpu-transfer-size: 50mb
batching-timeout: 2000
device-id: 0
cuda-streams: 2
mpm-algo: ac
pattern-matcher:
- b2gc:
search-algo: B2gSearchBNDMq
hash-size: low
bf-size: medium
- b2gm:
search-algo: B2gSearchBNDMq
hash-size: low
bf-size: medium
- b2g:
search-algo: B2gSearchBNDMq
hash-size: low
bf-size: medium
- b3g:
search-algo: B3gSearchBNDMq
hash-size: low
bf-size: medium
- wumanber:
hash-size: low
bf-size: medium
defrag:
memcap: 32mb
hash-size: 65536
trackers: 65535 # number of defragmented flows to follow
max-frags: 65535 # number of fragments to keep (higher than trackers)
prealloc: yes
timeout: 60
flow:
memcap: 64mb
hash-size: 65536
prealloc: 10000
emergency-recovery: 30
vlan:
use-for-tracking: true
flow-timeouts:
default:
new: 30
established: 300
closed: 0
emergency-new: 10
emergency-established: 100
emergency-closed: 0
tcp:
new: 60
established: 3600
closed: 120
emergency-new: 10
emergency-established: 300
emergency-closed: 20
udp:
new: 30
established: 300
emergency-new: 10
emergency-established: 100
icmp:
new: 30
established: 300
emergency-new: 10
emergency-established: 100
stream:
memcap: 32mb
checksum-validation: no # reject wrong csums
inline: auto # auto uses inline mode in IPS mode; yes/no sets it statically
reassembly:
memcap: 128mb
depth: 1mb # reassemble 1mb into a stream
toserver-chunk-size: 2560
toclient-chunk-size: 2560
randomize-chunk-size: yes
host:
hash-size: 4096
prealloc: 1000
memcap: 16777216
logging:
default-log-level: notice
default-output-filter:
outputs:
- console:
enabled: yes
- file:
enabled: yes
filename: /var/log/suricata/suricata.log
- syslog:
enabled: no
facility: local5
format: "[%i] <%d> -- "
mpipe:
load-balance: dynamic
iqueue-packets: 2048
inputs:
- interface: xgbe2
- interface: xgbe3
- interface: xgbe4
stack:
size128: 0
size256: 9
size512: 0
size1024: 0
size1664: 7
size4096: 0
size10386: 0
size16384: 0
pfring:
- interface: eth0
threads: 1
cluster-id: 99
cluster-type: cluster_flow
- interface: default
pcap:
- interface: eth0
- interface: default
pcap-file:
checksum-checks: auto
ipfw:
default-rule-path: /etc/suricata/rules
rule-files:
- botcc.portgrouped.rules
- ciarmy.rules
- compromised.rules
- drop.rules
- dshield.rules
- emerging-activex.rules
- emerging-attack_response.rules
- emerging-chat.rules
- emerging-current_events.rules
- emerging-dns.rules
- emerging-dos.rules
- emerging-exploit.rules
- emerging-ftp.rules
- emerging-games.rules
- emerging-imap.rules
- emerging-inappropriate.rules
- emerging-malware.rules
- emerging-misc.rules
- emerging-mobile_malware.rules
- emerging-netbios.rules
- emerging-p2p.rules
- emerging-policy.rules
- emerging-pop3.rules
- emerging-rpc.rules
- emerging-scada.rules
- emerging-scan.rules
- emerging-shellcode.rules
- emerging-smtp.rules
- emerging-snmp.rules
- emerging-sql.rules
- emerging-telnet.rules
- emerging-tftp.rules
- emerging-trojan.rules
- emerging-user_agents.rules
- emerging-voip.rules
- emerging-web_client.rules
- emerging-web_server.rules
- emerging-web_specific_apps.rules
- emerging-worm.rules
- tor.rules
- http-events.rules # available in suricata sources under rules dir
- smtp-events.rules # available in suricata sources under rules dir
classification-file: /etc/suricata/rules/classification.config
reference-config-file: /etc/suricata/rules/reference.config
vars:
address-groups:
HOME_NET:
"[192.168.0.0/16,10.0.0.0/8,172.16.0.0/12,50.114.0.0/16,199.58.198.224/27,199.58.199.0/24,69.27.166.0/26]"
EXTERNAL_NET: "!$HOME_NET"
HTTP_SERVERS: "$HOME_NET"
SMTP_SERVERS: "$HOME_NET"
SQL_SERVERS: "$HOME_NET"
DNS_SERVERS: "$HOME_NET"
TELNET_SERVERS: "$HOME_NET"
AIM_SERVERS: "$EXTERNAL_NET"
DNP3_SERVER: "$HOME_NET"
DNP3_CLIENT: "$HOME_NET"
MODBUS_CLIENT: "$HOME_NET"
MODBUS_SERVER: "$HOME_NET"
ENIP_CLIENT: "$HOME_NET"
ENIP_SERVER: "$HOME_NET"
port-groups:
HTTP_PORTS: "80"
SHELLCODE_PORTS: "!80"
ORACLE_PORTS: 1521
SSH_PORTS: 22
DNP3_PORTS: 20000
action-order:
- pass
- drop
- reject
- alert
host-os-policy:
windows: []
bsd: []
bsd-right: []
old-linux: []
linux: [0.0.0.0/0]
old-solaris: []
solaris: []
hpux10: []
hpux11: []
irix: []
macos: []
vista: []
windows2k3: []
asn1-max-frames: 256
engine-analysis:
rules-fast-pattern: yes
rules: yes
pcre:
match-limit: 3500
match-limit-recursion: 1500
app-layer:
protocols:
tls:
enabled: yes
detection-ports:
toserver: 443
dcerpc:
enabled: yes
ftp:
enabled: yes
ssh:
enabled: yes
smtp:
enabled: yes
imap:
enabled: detection-only
msn:
enabled: detection-only
smb:
enabled: yes
detection-ports:
toserver: 139
dns:
tcp:
enabled: yes
detection-ports:
toserver: 53
udp:
enabled: yes
detection-ports:
toserver: 53
http:
enabled: yes
libhtp:
default-config:
personality: IDS
request-body-limit: 3072
response-body-limit: 3072
request-body-minimal-inspect-size: 32kb
request-body-inspect-window: 4kb
response-body-minimal-inspect-size: 32kb
response-body-inspect-window: 4kb
double-decode-path: no
double-decode-query: no
server-config:
profiling:
rules:
enabled: yes
filename: rule_perf.log
append: yes
sort: avgticks
limit: 100
keywords:
enabled: yes
filename: keyword_perf.log
append: yes
packets:
enabled: yes
filename: packet_stats.log
append: yes
csv:
enabled: no
filename: packet_stats.csv
locks:
enabled: no
filename: lock_stats.log
append: yes
coredump:
max-dump: unlimited
napatech:
hba: -1
use-all-streams: yes
streams: [1, 2, 3]
Let me know if I need to provide any more information or enable any features.
Thanks,
Jason
Updated by Victor Julien over 11 years ago
Could you try compiling with --enable-debug and then running that for a while? It adds some extra checks in this part of the code.
Updated by Jason Borden over 11 years ago
I recompiled and ran it with --enable-debug and set default-log-level: debug in suricata.yaml. After ten minutes of running the log file is already 6GB in size, so I don't think that's going to work. I can change the surrounding SCLogDebug calls to SCLogInfo instead and run at the info log level. That might give the pertinent information without logging so much data.
Updated by Victor Julien over 11 years ago
Oh sorry, I didn't mention that you don't need to change the log level. Keep it at info or notice. Under the hood there is some more aggressive (and costly) checking when debug is compiled in.
Updated by Jason Borden over 11 years ago
OK, good to know. I've got it compiled and running with --enable-debug and I'll report back next time it segfaults.
Updated by Jason Borden over 11 years ago
Had another segfault today with --enable-debug running. Looks about the same as all the others I've had.
(gdb) bt
#0 0x0000003968432925 in raise (sig=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003968434105 in abort () at abort.c:92
#2 0x0000003968470837 in __libc_message (do_abort=2,
fmt=0x3968557930 "*** %s ***: %s terminated\n")
at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3 0x0000003968502827 in __fortify_fail (
msg=0x39685578d6 "buffer overflow detected") at fortify_fail.c:32
#4 0x0000003968500710 in __chk_fail () at chk_fail.c:29
#5 0x000000000059f0b9 in memcpy (tv=0xed1a760, ra_ctx=0x7f0160000fb0,
ssn=0x7f016126c290, stream=0x7f016126c298, p=0x25b9d40)
at /usr/include/bits/string3.h:52
#6 StreamTcpReassembleAppLayer (tv=0xed1a760, ra_ctx=0x7f0160000fb0,
ssn=0x7f016126c290, stream=0x7f016126c298, p=0x25b9d40)
at stream-tcp-reassemble.c:3139
#7 0x00000000005ac52f in StreamTcpReassembleHandleSegmentUpdateACK (
tv=0xed1a760, ra_ctx=0x7f0160000fb0, ssn=0x7f016126c290,
stream=0x7f016126c298, p=0x25b9d40) at stream-tcp-reassemble.c:3545
#8 0x00000000005ac721 in StreamTcpReassembleHandleSegment (tv=0xed1a760,
ra_ctx=0x7f0160000fb0, ssn=0x7f016126c290, stream=0x7f016126c2e0,
p=0x25b9d40, pq=<value optimized out>) at stream-tcp-reassemble.c:3573
#9 0x000000000058db07 in HandleEstablishedPacketToServer (tv=0xed1a760,
p=0x25b9d40, stt=0x7f01600008c0, ssn=0x7f016126c290, pq=0x7f01600008d0)
at stream-tcp.c:1969
#10 StreamTcpPacketStateEstablished (tv=0xed1a760, p=0x25b9d40,
stt=0x7f01600008c0, ssn=0x7f016126c290, pq=0x7f01600008d0)
at stream-tcp.c:2323
#11 0x0000000000593da0 in StreamTcpPacket (tv=0xed1a760, p=0x25b9d40,
stt=0x7f01600008c0, pq=0x29eebc30) at stream-tcp.c:4243
#12 0x00000000005995a9 in StreamTcp (tv=0xed1a760, p=0x25b9d40,
data=0x7f01600008c0, pq=0x29eebc30, postpq=<value optimized out>)
at stream-tcp.c:4485
#13 0x00000000005c0e69 in TmThreadsSlotVarRun (tv=0xed1a760, p=0x25b9d40,
slot=<value optimized out>) at tm-threads.c:557
#14 0x00000000005c10a6 in TmThreadsSlotVar (td=0xed1a760) at tm-threads.c:814
#15 0x0000003aede079d1 in start_thread (arg=0x7f0182ca1700)
at pthread_create.c:301
#16 0x00000039684e8b6d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Let me know where you want to go from here.
Updated by Jason Borden over 11 years ago
I've made some progress on determining the issue here. The problem manifests on line 3129 of stream-tcp-reassemble.c:
if (copy_size > (seg->payload_len - payload_offset)) {
copy_size = (seg->payload_len - payload_offset);
}
When I get a segfault, seg->payload_len is less than payload_offset, so the subtraction produces a negative value. Since copy_size is a uint16_t, that negative value wraps around modulo 65536 to a very large positive one. Then on line 3139 memcpy tries to copy far more data than fits in the 4k data buffer, causing the segfault.
What I'm not sure of is why I occasionally get a seg->payload_len that is less than my payload_offset. I haven't heard of anyone else having this issue, so my thought is that maybe it's a configuration issue. I'll try reverting some of my settings back to defaults to see if I can figure out the cause.
Updated by Victor Julien over 11 years ago
If you get this again, can you post the following gdb output:
Jump to frame 6 (the one that has "#6 StreamTcpReassembleAppLayer (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0,")
(gdb) f 6
(gdb) print *ssn
(gdb) print ra_base_seq
(gdb) print payload_len
(gdb) print payload_offset
(gdb) print copy_size
(gdb) print *seg
Hopefully this will be enough to find the cause.
Please don't change your settings. That would only hide the bug, because that is what it is :)
Updated by Jason Borden over 11 years ago
(gdb) f 6
#6 StreamTcpReassembleAppLayer (tv=0x9f0de70, ra_ctx=0x7f1b50000fb0,
ssn=0x7f1b51c5c860, stream=0x7f1b51c5c868, p=0x27682e0)
at stream-tcp-reassemble.c:3139
3139 memcpy(data + data_len, seg->payload +
(gdb) p *ssn
$1 = {res = 0, state = 4 '\004', queue_len = 0 '\000',
data_first_seen_dir = -13 '\363', flags = 5648, server = {flags = 128,
wscale = 6 '\006', os_policy = 5 '\005', isn = 3552530536,
next_seq = 3552862937, last_ack = 3552851289, next_win = 3552886425,
window = 29312, last_ts = 0, last_pkt_ts = 0,
ra_app_base_seq = 3552854200, ra_raw_base_seq = 3552849832,
seg_list = 0x7f1b513eda50, seg_list_tail = 0x7f1b49d6f410,
sack_head = 0x7f1ae43a6030, sack_tail = 0x7f1ae43a6030}, client = {
flags = 160, wscale = 6 '\006', os_policy = 0 '\000', isn = 1221981603,
next_seq = 1221982478, last_ack = 1221982478, next_win = 1222013390,
window = 30912, last_ts = 0, last_pkt_ts = 1397754340,
ra_app_base_seq = 1221982477, ra_raw_base_seq = 1221982477,
seg_list = 0x0, seg_list_tail = 0x0, sack_head = 0x0, sack_tail = 0x0},
toserver_smsg_head = 0x0, toserver_smsg_tail = 0x0,
toclient_smsg_head = 0x7f1b53ab7420, toclient_smsg_tail = 0x7f1b53ab7420,
queue = 0x0}
(gdb) p ra_base_seq
$2 = 3552858296
(gdb) p payload_len
$3 = 272
(gdb) p payload_offset
$4 = 11376
(gdb) p copy_size
$5 = 64352
(gdb) p *seg
$6 = {payload = 0x7f1b52851180 "e\243k", payload_len = 10192,
pool_size = 65535, seq = 3552846921, next = 0x7f1b49d6f410, prev = 0x0,
flags = 1 '\001'}
Updated by Victor Julien over 11 years ago
It seems the problem actually originates from before this packet. We have a case where stream->ra_app_base_seq > stream->last_ack, which shouldn't be possible.
Are you able to reproduce this on a pcap? Is it possible for you to record some of your traffic and replay it somehow to Suricata (or directly read the pcap file) to see if that reproduces the issue?
Updated by Jason Borden over 11 years ago
At this point I'm about 90% sure the problem is related to having checksum checks turned off. I haven't yet been able to reproduce the issue with them turned on. I'll still try reading from a pcap with checksums off and record some of the traffic.
Updated by Victor Julien over 11 years ago
Thanks. Even if the checksum checks somehow influence it, a crash shouldn't happen in any case.
Updated by Jason Borden over 11 years ago
I've been running with --pcap=eth2 and checksums off for the past couple of weeks and haven't been able to reproduce the bug.