Bug #4109 (closed): mac address logging crash

Added by Jan Hugo Prins over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Target version: 6.0.1
Affected Versions:
Effort:
Difficulty: medium
Label:

Description

context:
I have 3 servers running both Zeek and Suricata using a zbalance_ipc setup.
To make this work I compiled Suricata with pfring support and installed the pfring_zc drivers on the servers.
Zeek has been running like this for more than 2 years now, but Suricata has not been able to stay online for more than a few hours.

Versions:
PFRing_ZC: 7.9.0-3263
Suricata: 6.0.0

suricata --build-info
This is Suricata version 6.0.0 RELEASE
Features: NFQ PCAP_SET_BUFF PF_RING AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LIBJANSSON TLS TLS_GNU MAGIC RUST
SIMD support: none
Atomic intrinsics: 1 2 4 8 byte(s)
64-bits, Little-endian architecture
GCC version 4.8.5 20150623 (Red Hat 4.8.5-39), C version 199901
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.35, linked against LibHTP v0.5.35

Suricata Configuration:
AF_PACKET support: yes
eBPF support: no
XDP support: no
PF_RING support: yes
NFQueue support: yes
NFLOG support: no
IPFW support: no
Netmap support: no
DAG enabled: no
Napatech enabled: no
WinDivert enabled: no

Unix socket enabled:                     yes
Detection enabled: yes
Libmagic support:                        yes
libnss support: yes
libnspr support: yes
libjansson support: yes
hiredis support: yes
hiredis async with libevent: yes
Prelude support: no
PCRE jit: yes
LUA support: yes
libluajit: no
GeoIP2 support: yes
Non-bundled htp: no
Old barnyard2 support:
Hyperscan support: no
Libnet support: yes
liblz4 support: yes
Rust support:                            yes
Rust strict mode: no
Rust compiler path: /usr/bin/rustc
Rust compiler version: rustc 1.47.0
Cargo path: /usr/bin/cargo
Cargo version: cargo 1.47.0
Cargo vendor: yes
Python support:                          yes
Python path: /usr/bin/python2.7
Python distutils yes
Python yaml yes
Install suricatactl: yes
Install suricatasc: yes
Install suricata-update: yes
Profiling enabled:                       no
Profiling locks enabled: no
Plugin support (experimental):           yes

Development settings:
Coccinelle / spatch: no
Unit tests enabled: no
Debug output enabled: no
Debug validation enabled: no

Generic build parameters:
Installation prefix: /usr
Configuration directory: /etc/suricata/
Log directory: /var/log/suricata/

--prefix                                 /usr
--sysconfdir /etc
--localstatedir /var
--datarootdir /usr/share
Host:                                    x86_64-redhat-linux-gnu
Compiler: gcc (exec name) / g++ (real)
GCC Protect enabled: yes
GCC march native enabled: no
GCC Profile enabled: no
Position Independent Executable enabled: yes
CFLAGS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -std=gnu99 -I${srcdir}/../rust/gen -I${srcdir}/../rust/dist
PCAP_CFLAGS
SECCFLAGS -fstack-protector -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security

The error in my journal log:
idsprobe03.ids.be.nl kernel: W#01-zc:0@34145: segfault at 130 ip 0000562e96ebfc08 sp 00007f6d026c1418 error 4 in suricata[562e96c9b000+61c000]

The error in the Suricata systemd journal:
Nov 04 16:06:36 idsprobe03.ids.be.nl systemd[1]: : main process exited, code=killed, status=11/SEGV
Nov 04 16:06:36 idsprobe03.ids.be.nl systemd[1]: Unit entered failed state.
Nov 04 16:06:36 idsprobe03.ids.be.nl systemd[1]: failed.


Files

debug_output_core4695.txt (19.6 KB) Jan Hugo Prins, 11/04/2020 05:55 PM
debug_output_core4697.txt (19.7 KB) Jan Hugo Prins, 11/04/2020 05:55 PM
valgrind.log (10.8 KB) Jan Hugo Prins, 11/05/2020 07:44 PM
weird_packet_bgp01.pcap (570 Bytes) packets with truncated headers, Jan Hugo Prins, 11/07/2020 03:22 AM
ethernet-metadata-packet-context.patch (2.25 KB) Sascha Steinbiss, 11/08/2020 12:03 PM
gdb_dump_core-W#01-zc_0@2-11-993-990-442-1604842672 (20 KB) Jan Hugo Prins, 11/08/2020 03:01 PM
gdb_dump_core-W#01-11-993-990-12096-1604846259 (13.9 KB) Jan Hugo Prins, 11/08/2020 03:01 PM
core-W#01-zc_0@2-11-993-990-442-1604842672.pcap (176 Bytes) Jan Hugo Prins, 11/08/2020 03:01 PM
Actions #1

Updated by Jan Hugo Prins over 3 years ago

Nov 04 17:32:16 idsprobe01.ids.be.nl kernel: W#01-zc:0@34791: segfault at 130 ip 000055efde4e7c08 sp 00007f5d6262b418 error 4 in suricata[55efde2c3000+61c000]
Nov 04 17:48:32 idsprobe01.ids.be.nl kernel: W#01-zc:1@24747: segfault at 130 ip 00005629a7abbc08 sp 00007f9b5b2eb418 error 4 in suricata[5629a7897000+61c000]

I have the coredumps for these crashes, but they are too big to upload here.

Get in touch with me so we can arrange a way for you to get them safely.

Jan Hugo Prins

Actions #2

Updated by Jan Hugo Prins over 3 years ago

[root@idsprobe01 debuginfo_idsprobe01]# gdb /sbin/suricata /var/tmp/core.4695
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/suricata...Reading symbols from /usr/lib/debug/usr/sbin/suricata.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 4747]
[New LWP 4695]
[New LWP 4766]
[New LWP 4763]
[New LWP 4765]
[New LWP 4760]
[New LWP 4764]
[New LWP 4767]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/sbin/suricata -c /etc/suricata/cluster1.yaml --pidfile /var/run/suricata/clust'.
Program terminated with signal 11, Segmentation fault.
#0 0x00005629a7abbc08 in StorageGetById (storage=storage@entry=0x128, type=type@entry=STORAGE_FLOW, id=1) at util-storage.c:224
224 return storage[id];

Actions #4

Updated by Jan Hugo Prins over 3 years ago

Valgrind logfile added with the crash of Suricata in it.

Actions #5

Updated by Victor Julien over 3 years ago

  • Assignee set to Sascha Steinbiss

The crash seems to be here, in the MAC address logging:

    if (cfg->include_ethernet) {
        MacSet *ms = FlowGetStorageById((Flow*) f, MacSetGetFlowStorageID()); // <- HERE
        if (ms != NULL)
            CreateJSONEther(js, p, ms);
    }

Sascha any idea?

Actions #6

Updated by Jan Hugo Prins over 3 years ago

It looks like Suricata is trying to read from address 0x130. Not an address normally used by a userland program I would think.

Actions #7

Updated by Sascha Steinbiss over 3 years ago

Jan Hugo Prins wrote in #note-6:

It looks like Suricata is trying to read from address 0x130. Not an address normally used by a userland program I would think.

Yes, the backtrace seems to suggest that (https://redmine.openinfosecfoundation.org/attachments/2178#L253):

Thread 1 (Thread 0x7f5d6262c700 (LWP 4791)):
#0  0x000055efde4e7c08 in StorageGetById (storage=storage@entry=0x128, type=type@entry=STORAGE_FLOW, id=1) at util-storage.c:224
No locals.
#1  0x000055efde44f513 in FlowGetStorageById (f=f@entry=0x0, id=<optimized out>) at flow-storage.c:41
No locals.

From #1, it looks like the flow f is NULL. FlowGetStorageById() then calls StorageGetById((Storage *)((void *)f + sizeof(Flow)), STORAGE_FLOW, id), which leads to the call in #0 above. Apparently we need a valid flow pointer here.
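
The numbers also line up with the fault address: with f == NULL, the storage base passed down is just sizeof(Flow), the backtrace shows storage=0x128, and element 1 of a pointer-sized array then sits at 0x130, exactly the "segfault at 130" from the kernel log. A tiny standalone sketch of that arithmetic (sizeof(Flow) = 0x128 and a pointer-sized Storage slot are assumptions read off the backtrace values, not taken from the headers):

    #include <stdio.h>

    /* Hypothetical stand-ins, sized to match the backtrace values
       (storage=0x128, type=STORAGE_FLOW, id=1, fault address 0x130). */
    #define SIZEOF_FLOW 0x128UL   /* assumed sizeof(Flow) on this build */
    typedef void *Storage;        /* assumed pointer-sized storage slot */

    int main(void)
    {
        void *f = NULL;  /* the NULL flow seen in frame #1 */
        Storage *storage = (Storage *)((char *)f + SIZEOF_FLOW);
        int id = 1;
        /* &storage[id] == 0x128 + 1 * sizeof(void *) == 0x130 on x86_64;
           dereferencing it is the read that faults in StorageGetById(). */
        printf("StorageGetById would read from %p\n", (void *)&storage[id]);
        return 0;
    }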

Can a flow be NULL in such a context? If so, it was a bad idea to store the MacSet in flow storage...

Anyway, I would also be happy to take a look at the coredumps. You can reach me via satta~at~debian.org.

Actions #8

Updated by Jan Hugo Prins over 3 years ago

I really don't know if the flow f can be NULL.
I would think that every packet belongs to a flow, but I disabled (at least I think) both flow and netflow in the config because I use Zeek and Packetbeat to collect netflow information.

Actions #9

Updated by Sascha Steinbiss over 3 years ago

Maybe a question for Victor to answer...

Actions #10

Updated by Jason Ish over 3 years ago

Jan Hugo Prins wrote in #note-8:

I really don't know if the flow f can be NULL.
I would think that every packet belongs to a flow, but I disabled (at least I think) both flow and netflow in the config because I use Zeek and Packetbeat to collect netflow information.

The flow ID is still logged with alerts, etc. You could try disabling all output and see if it still crashes. Some of the data dumped in the hdr in your gdb output looks odd to me as well, as do the address-out-of-bounds errors, which appear to be in data provided by PF_RING.

I've used PF_RING recently without issue, but not ZC.

Actions #11

Updated by Jan Hugo Prins over 3 years ago

Did some analysis and here are the results for now:
My config is not "the default" config.
- I enabled the 'ethernet' option in the eve section; in the default config this is disabled. Maybe this is something?
- I disabled the 'flow' and the 'netflow' option in the eve section.

I have done several tests over the last couple of hours:
- Disabled eve logging on one of the servers several hours ago. No crashes on that server so far.
- Set the flow option in the eve section on one server; this instance crashed within several minutes.
- Set the netflow option in the eve section on one of the servers; still running after 30 minutes, but that isn't long. Will see tomorrow.
- On the server where I set the flow option, I disabled the 'ethernet' option in the eve section. I did this because the error was about logging MAC addresses. Started it a few minutes ago, so too early to call.

Actions #12

Updated by Jan Hugo Prins over 3 years ago

Could you explain what in the gdb output looks odd to you?
Maybe I'm able to explain it.

Actions #13

Updated by Jan Hugo Prins over 3 years ago

Suricata was stable during last night's test run in the following configuration combinations.

- Without eve logging.
- With "netflow" logging in combination with the "ethernet" option.
- With "flow" logging with the "ethernet" option disabled.

I have a lot of asymmetric routing in my network (3 external BGP routers), so maybe that has something to do with this?

I have now configured "netflow" in combination with the "ethernet" option to see if it is stable for a longer period of time on all probe servers.

Actions #14

Updated by Jan Hugo Prins over 3 years ago

Just had a crash with the combination "netflow" / "ethernet".
Now disabled the "ethernet" option on all servers to validate this.

Actions #15

Updated by Jason Ish over 3 years ago

Jan Hugo Prins wrote in #note-12:

Could you explain what in the gdb output looks odd to you?
Maybe I'm able to explain it.

The ip_ver, for instance, and the ethertype. But I suppose this isn't related to the crash; it's just what is being seen in another thread, and it may be gracefully discarded, or only look off due to ZC. Not sure how PF_RING ZC works, but can you try to get a pcap around the time it's crashing?

Actions #16

Updated by Jan Hugo Prins over 3 years ago

Creating a pcap during the crash will be a bit difficult, but Zeek has been running constantly and should have logged all metadata. I will see what I can find for these crashes.

Jan Hugo

Actions #17

Updated by Jan Hugo Prins over 3 years ago

I have found the packets that created the 2 crashes for which I have the core files and the backtraces above:

core.4695
CYKo4i1amV4MORqzI8

{"ts":1604508512.428319,"uid":"CYKo4i1amV4MORqzI8","id.orig_h":"91.203.165.166","id.orig_p":47011,"id.resp_h":"95.130.232.214","id.resp_p":445,"proto":"tcp","conn_state":"OTH","local_orig":false,"local_resp":true,"missed_bytes":0,"orig_pkts":0,"orig_ip_bytes":0,"resp_pkts":0,"resp_ip_bytes":0,"orig_l2_addr":"70:e4:22:25:60:3e","resp_l2_addr":"00:25:90:e3:d7:6d"} {"ts":1604508512.428319,"uid":"CYKo4i1amV4MORqzI8","id.orig_h":"91.203.165.166","id.orig_p":47011,"id.resp_h":"95.130.232.214","id.resp_p":445,"name":"truncated_header","notice":false,"peer":"worker-2-1"}

core.4697
C6w7IU2xoXeWnSLd0e

{"ts":1604507536.64603,"uid":"C6w7IU2xoXeWnSLd0e","id.orig_h":"95.130.232.190","id.orig_p":3,"id.resp_h":"189.194.58.119","id.resp_p":0,"proto":"icmp","duration":0.0001659393310546875,"orig_bytes":104,"resp_bytes":0,"conn_state":"OTH","local_orig":true,"local_resp":false,"missed_bytes":0,"orig_pkts":2,"orig_ip_bytes":160,"resp_pkts":0,"resp_ip_bytes":0,"orig_l2_addr":"00:25:90:e3:d0:bd","resp_l2_addr":"0c:86:10:ed:c7:c6"}

Actions #18

Updated by Jan Hugo Prins over 3 years ago

I have also found another packet that crashed a Suricata server on one of my other probes around the same time:

{"ts":1604508707.507337,"uid":"C1UOGu2TzEgYOvF2eh","id.orig_h":"91.203.165.166","id.orig_p":47011,"id.resp_h":"95.130.232.204","id.resp_p":445,"proto":"tcp","conn_state":"OTH","local_orig":false,"local_resp":true,"missed_bytes":0,"orig_pkts":0,"orig_ip_bytes":0,"resp_pkts":0,"resp_ip_bytes":0,"orig_l2_addr":"70:e4:22:49:ab:29","resp_l2_addr":"00:25:90:e3:d5:af"} {"ts":1604508707.507337,"uid":"C1UOGu2TzEgYOvF2eh","id.orig_h":"91.203.165.166","id.orig_p":47011,"id.resp_h":"95.130.232.204","id.resp_p":445,"name":"truncated_header","notice":false,"peer":"worker-5-1"}

For this one I don't have a core file available at the moment.

Actions #19

Updated by Jan Hugo Prins over 3 years ago

I have done some more analysis, and what I have found is that in a lot of cases this crash happens when the packet is classified by Zeek as a weird packet with truncated headers.
I noticed that I was receiving some tonight as well, but these didn't crash Suricata, except for one.
I have been able to capture some of them that were actually marked as truncated header packets, but at the moment these packets don't crash my Suricata instances. The pcap file is attached.
I then noticed another crash, but for this one I don't have the pcap file or the core file. At the moment I still receive packets with truncated headers.

Actions #20

Updated by Sascha Steinbiss over 3 years ago

Jan Hugo Prins wrote in #note-19:

I have done some more analysis, and what I have found is that in a lot of cases this crash happens when the packet is classified by Zeek as a weird packet with truncated headers.
I noticed that I was receiving some tonight as well, but these didn't crash Suricata, except for one.
I have been able to capture some of them that were actually marked as truncated header packets, but at the moment these packets don't crash my Suricata instances. The pcap file is attached.

I was able to reproduce the crash on the current git master (compiled with debug flags) with the pcap above and the following rule (to trigger packet based metadata output):

alert ip any any -> any any (msg:"test"; sid:1;)

like this:

$ suricata -vvv -k none -S x.rules -c suricata.yaml -r ~/Downloads/weird_packet_bgp01.pcap -l .
[...]
[24968] 8/11/2020 -- 12:30:54 - (source-pcap-file.c:173) <Info> (ReceivePcapFileLoop) -- Starting file run for /home/satta/Downloads/weird_packet_bgp01.pcap
[24968] 8/11/2020 -- 12:30:54 - (source-pcap-file-helper.c:157) <Info> (PcapFileDispatch) -- pcap file /home/satta/Downloads/weird_packet_bgp01.pcap end of file reached (pcap err code 0)
AddressSanitizerAddressSanitizerAddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
AddressSanitizerAddressSanitizer:DEADLYSIGNAL
=================================================================
:DEADLYSIGNAL
:DEADLYSIGNAL
==24967==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000130 (pc 0x55b2264c5c37 bp 0x7f9d2b1ee3c0 sp 0x7f9d2b1ee3b0 T2)
==24967==The signal is caused by a READ memory access.
==24967==Hint: address points to the zero page.
AddressSanitizer:DEADLYSIGNAL
[24967] 8/11/2020 -- 12:30:54 - (suricata.c:2637) <Notice> (SuricataMainLoop) -- Signal Received.  Stopping engine.
[24977] 8/11/2020 -- 12:30:54 - (flow-manager.c:1032) <Perf> (FlowManager) -- 0 new flows, 0 established flows were timed out, 0 flows in closed state
    #0 0x55b2264c5c36 in StorageGetById /home/satta/tmp/suricata/src/util-storage.c:224
    #1 0x55b2261329df in FlowGetStorageById /home/satta/tmp/suricata/src/flow-storage.c:41
    #2 0x55b2261e4509 in EveAddCommonOptions /home/satta/tmp/suricata/src/output-json.c:451
    #3 0x55b22618eb0d in AlertJson /home/satta/tmp/suricata/src/output-json-alert.c:622
    #4 0x55b226190100 in JsonAlertLogger /home/satta/tmp/suricata/src/output-json-alert.c:767
    #5 0x55b2261da996 in OutputPacketLog /home/satta/tmp/suricata/src/output-packet.c:116
    #6 0x55b2261819e3 in OutputLoggerLog /home/satta/tmp/suricata/src/output.c:882
    #7 0x55b22614231e in FlowWorker /home/satta/tmp/suricata/src/flow-worker.c:545
    #8 0x55b226388e55 in TmThreadsSlotVarRun /home/satta/tmp/suricata/src/tm-threads.c:117
    #9 0x55b22638b2c2 in TmThreadsSlotVar /home/satta/tmp/suricata/src/tm-threads.c:452
    #10 0x7f9d3ff07f26 in start_thread /build/glibc-WZtAaN/glibc-2.30/nptl/pthread_create.c:479
    #11 0x7f9d3ee5c2ee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfd2ee)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/satta/tmp/suricata/src/util-storage.c:224 in StorageGetById
Thread T2 (W#01) created by T0 (Suricata-Main) here:
    #0 0x7f9d40813db0 in __interceptor_pthread_create (/lib/x86_64-linux-gnu/libasan.so.5+0x50db0)
    #1 0x55b22639133b in TmThreadSpawn /home/satta/tmp/suricata/src/tm-threads.c:1721
    #2 0x55b226203eb4 in RunModeFilePcapAutoFp /home/satta/tmp/suricata/src/runmode-pcap-file.c:227
    #3 0x55b22620c19d in RunModeDispatch /home/satta/tmp/suricata/src/runmodes.c:391
    #4 0x55b22637dfd4 in SuricataMain /home/satta/tmp/suricata/src/suricata.c:2801
    #5 0x55b225617c74 in main /home/satta/tmp/suricata/src/main.c:22
    #6 0x7f9d3ed85e0a in __libc_start_main ../csu/libc-start.c:308

==24967==ABORTING

Actions #21

Updated by Sascha Steinbiss over 3 years ago

I think I found the bug. It looks like we're assuming a flow exists to look up storage in, but we must consider that we might be in a packet alert context.

Jan Hugo, would you be able to test a patch? See attachment.
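
(Not the attached patch itself, just a minimal sketch of the kind of guard it implies, applied to the snippet quoted in #note-5: only consult flow storage when a flow is actually present, so a packet-only alert context does not dereference a NULL flow.)

    if (cfg->include_ethernet && f != NULL) {
        /* Only valid when the alert has an associated flow; in a packet-only
           alert context f is NULL and the flow-storage lookup must be skipped. */
        MacSet *ms = FlowGetStorageById((Flow *)f, MacSetGetFlowStorageID());
        if (ms != NULL)
            CreateJSONEther(js, p, ms);
    }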

Actions #22

Updated by Jan Hugo Prins over 3 years ago

I have a pcap file that is able to crash my Suricata instance.
I have added the pcap file and 2 gdb backtraces.
One for the crash that happened when the packet came in, and one for the reproduction reading the pcap file.

Actions #23

Updated by Jan Hugo Prins over 3 years ago

Sascha Steinbiss wrote in #note-21:

I think I found the bug. It looks like we're assuming a flow exists to look up storage in, but we must consider that we might be in a packet alert context.

Jan Hugo, would you be able to test a patch? See attachment.

I will test it and let you know.

Jan Hugo

Actions #24

Updated by Sascha Steinbiss over 3 years ago

Jan Hugo Prins wrote in #note-22:

I have a pcap file that is able to crash my Suricata instance.
I have added the pcap file and 2 gdb backtraces.
One for the crash that happened when the packet came in, and one for the reproduction reading the pcap file.

I can confirm that the pcap crashes the unpatched master, but not the patched version. The patched version also produces correct ethernet metadata:

{
  "timestamp": "2020-11-08T14:37:52.903303+0100",
  "pcap_cnt": 2,
  "event_type": "alert",
  "src_ip": "116.90.231.16",
  "src_port": 0,
  "dest_ip": "95.130.237.126",
  "dest_port": 0,
  "proto": "TCP",
  "ether": {
    "src_mac": "0c:86:10:ed:d7:c6",
    "dest_mac": "00:25:90:e3:d2:e1" 
  },
  "alert": {
    "action": "allowed",
    "gid": 1,
    "signature_id": 1,
    "rev": 0,
    "signature": "test",
    "category": "",
    "severity": 3
  }
}
{
  "timestamp": "2020-11-08T14:36:12.752230+0100",
  "pcap_cnt": 1,
  "event_type": "alert",
  "src_ip": "116.90.231.16",
  "src_port": 0,
  "dest_ip": "185.100.143.115",
  "dest_port": 0,
  "proto": "TCP",
  "ether": {
    "src_mac": "0c:86:10:ed:d7:c6",
    "dest_mac": "00:25:90:e3:d2:e1" 
  },
  "alert": {
    "action": "allowed",
    "gid": 1,
    "signature_id": 1,
    "rev": 0,
    "signature": "test",
    "category": "",
    "severity": 3
  }
}

Actions #25

Updated by Jan Hugo Prins over 3 years ago

I just wanted to confirm the same.
Thanks for the good work.
I hope this patch is in time for the next release.

Jan Hugo Prins

Actions #26

Updated by Sascha Steinbiss over 3 years ago

Jan Hugo Prins wrote in #note-25:

I just wanted to confirm the same.
Thanks for the good work.
I hope this patch is in time for the next release.

Thanks for the detailed information and good debugging support! I will open a PR now.

BTW, there still seems to be something off with the TCP headers in the example pcaps: Suricata can't parse the ports, and Wireshark also reports something malformed there. This is likely not related to this issue, but still interesting?

Cheers
Sascha

Actions #27

Updated by Sascha Steinbiss over 3 years ago

  • Status changed from New to In Review
  • Target version changed from 6.0.0 to TBD
  • Difficulty set to medium

Actions #28

Updated by Sascha Steinbiss over 3 years ago

One more thing: could we possibly use the pcaps you provided as test cases for public integration tests (via https://github.com/OISF/suricata-verify)?

Actions #29

Updated by Jan Hugo Prins over 3 years ago

Hello Sascha,

You can use those packets as far as I'm concerned.
You are right that the packets look corrupted in Wireshark as well; Zeek also identifies them as corrupted.
To me it looks like someone is playing around with Scapy or something else and producing corrupt packets.
The checksum for all these packets is also wrong.

Jan Hugo Prins

Actions #31

Updated by Jan Hugo Prins over 3 years ago

Sascha Steinbiss wrote in #note-26:

Thanks for the detailed information and good debugging support! I will open a PR now.

Thanks, pure self interest ;-)

BTW, there still seems to be something off with the TCP headers in the example pcaps: Suricata can't parse the ports, and Wireshark also reports something malformed there. This is likely not related to this issue, but still interesting?

You are absolutely right that there is something wrong with the TCP headers.
The TCP headers are incomplete and malformed:
- There are only zeros after the incorrect checksum bytes.
- The urgent pointer is all zeros, probably not an issue.
- The rest is marked as padding in Wireshark.

The source and destination ports are in the packet, and Zeek is able to identify them.

I wonder if Suricata first tries to validate the TCP headers and then stops extracting information from them once it discovers that they are corrupted. This is probably a completely different bug that is only visible now that we have fixed the previous one.

Jan Hugo Prins

Actions #32

Updated by Victor Julien over 3 years ago

  • Target version changed from TBD to 6.0.1

Actions #33

Updated by Victor Julien over 3 years ago

One thing that caught my eye in the valgrind.log is

==32594== Thread 3 W#01-zc:0@3:
==32594== Conditional jump or move depends on uninitialised value(s)
==32594==    at 0x2D7E04: PfringProcessPacket (source-pfring.c:255)
==32594==    by 0x2D7E04: ReceivePfringLoop (source-pfring.c:413)
==32594==    by 0x2EEB5D: TmThreadsSlotPktAcqLoop (tm-threads.c:312)
==32594==    by 0x87C9E64: start_thread (pthread_create.c:307)
==32594==    by 0x8EF688C: clone (clone.S:111)

Code:
    if (ptv->vlan_in_ext_header &&
        h->extended_hdr.parsed_pkt.offset.vlan_offset == 0 &&
        h->extended_hdr.parsed_pkt.vlan_id)
    {

May or may not be related, but something that looks suspicious in any case.

Actions #34

Updated by Victor Julien over 3 years ago

  • Subject changed from "Suricata, build with pfring enabled, crashes after some time" to "mac address logging crash"
  • Status changed from In Review to Closed