Support #5366: Displaying Chinese Characters in eve.json - Suricata - Open Information Security Foundation

Added by Genina Po about 3 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee: Jason Ish
Affected Versions: 7.0.0-beta1

Label:

Description

Hi OISF Team,

Is there a way to display Chinese characters in my eve.json?

This question came up as I was creating sigs today. I was looking at content similar to this:

return d.includes("hbWallet") ? "火币钱包"

I generated a pcap for it. To verify that the pcap was generated correctly, I checked that the To Hex form of the content above was correctly reflected in my Wireshark hexdump. Here is the To Hex of the content:

return|20|d|2e|includes|28 22|hbWallet|22 29 20 3f 20 22 e7 81 ab e5 b8 81 e9 92 b1 e5 8c 85 22|
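
A quick way to sanity-check those bytes, assuming they are UTF-8, is to decode the hex run between the quotes. This is only a minimal sketch; the variable names are illustrative.

```python
# Hex bytes that sit between the two |22| quote bytes in the To Hex content.
hex_bytes = "e7 81 ab e5 b8 81 e9 92 b1 e5 8c 85"

raw = bytes.fromhex(hex_bytes)   # 12 raw bytes
text = raw.decode("utf-8")       # assumption: the content is UTF-8 encoded

print(len(raw), text)  # prints: 12 火币钱包
```

Each of the four characters takes three bytes in UTF-8, which accounts for the 12 bytes.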

The generated .pcap should be attached for your testing as well.

As I was testing my sigs, I noticed that eve.json displays dots (...) in place of the Chinese characters:
"http_response_body_printable":"return d.includes(\"hbWallet\") ? \"............\"\n"
and
"payload_printable":"HTTP/1.0 200 OK\r\nServer: SimpleHTTP/0.6 Python/3.8.10\r\nDate: Wed, 18 May 2022 00:10:49 GMT\r\nContent-type: application/javascript\r\nContent-Length: 47\r\nLast-Modified: Tue, 17 May 2022 23:59:19 GMT\r\n\r\nreturn d.includes(\"hbWallet\") ? \"............\"\n"

I have reviewed this past, similar ticket: https://redmine.openinfosecfoundation.org/issues/2647. I did confirm that the following variables are set to "yes" and are not commented out in my suricata.yaml while testing.

payload-printable: yes # enable dumping payload in printable (lossy) format
http-body: yes # Requires metadata; enable dumping of HTTP body in Base64
http-body-printable: yes # Requires metadata; enable dumping of HTTP body in printable format
decode-base64: yes
decode-quoted-printable: yes

Is there anything else you can suggest to help display the Chinese characters?

Files

bad.pcap (2.08 KB)

Genina Po, 05/18/2022 12:37 AM

Updated by Jason Ish about 3 years ago

We don't make any assumptions about the encoding of the data other than that there might be some ASCII characters in there. These buffers are just raw bytes as far as Suricata is concerned. I think to display non-ASCII character sets we'd have to attempt to decode them as UTF-8. If you're lucky, the whole thing will decode as UTF-8 and we could log it as such; however, if it didn't decode as UTF-8, we'd have to attempt to decode chunks of it as UTF-8, which could get expensive. So it's just more consistent to log the ASCII set and the rest as unprintable.
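
To illustrate why the four Chinese characters show up as twelve dots, here is a rough approximation of a byte-wise "printable" conversion. It is not Suricata's actual code: the function name `to_printable` and the exact set of bytes kept (printable ASCII plus tab, CR, and LF) are assumptions made for this sketch.

```python
def to_printable(raw: bytes) -> str:
    """Keep printable ASCII and common whitespace; replace every other byte with '.'.

    Hypothetical helper: an approximation of a lossy printable dump,
    not the implementation Suricata uses.
    """
    keep = {0x09, 0x0A, 0x0D}  # tab, LF, CR (assumed to pass through)
    return "".join(
        chr(b) if 0x20 <= b <= 0x7E or b in keep else "."
        for b in raw
    )


# The response body from the ticket, encoded as UTF-8 (47 bytes, matching Content-Length).
body = 'return d.includes("hbWallet") ? "火币钱包"\n'.encode("utf-8")

print(to_printable(body))  # prints: return d.includes("hbWallet") ? "............"
```

Because the conversion works one byte at a time and never tries to interpret multi-byte sequences, each of the 12 UTF-8 bytes becomes its own dot.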

When using the Base64 logging (default), the data is logged in a lossless format, so presentation tools could attempt to perform the conversion. I think doing this in Suricata on untrusted data could cause issues in terms of performance and inconsistencies, and perhaps even open up an attack vector.
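
As a sketch of what such a presentation-side conversion could look like, the snippet below decodes a Base64 value and then attempts UTF-8, substituting U+FFFD for any bytes that do not decode. The helper name `decode_logged_bytes` is hypothetical, and the sample value is built locally rather than taken from a real eve.json record, since the ticket does not show the Base64 field itself.

```python
import base64


def decode_logged_bytes(b64_value: str) -> str:
    """Decode a Base64-logged buffer and render it as text.

    Hypothetical helper for a presentation tool: assumes UTF-8 and
    replaces undecodable bytes with U+FFFD instead of raising.
    """
    raw = base64.b64decode(b64_value)
    return raw.decode("utf-8", errors="replace")


# Stand-in for a Base64 value as it might appear in a log record (built here for the demo).
original = 'return d.includes("hbWallet") ? "火币钱包"\n'
sample_b64 = base64.b64encode(original.encode("utf-8")).decode("ascii")

print(decode_logged_bytes(sample_b64))
```

Doing the decoding in the viewer keeps the cost and the handling of malformed input outside Suricata, which is the trade-off described above.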

Updated by Jason Ish over 2 years ago

Status changed from New to Closed
Assignee changed from OISF Dev to Jason Ish

Closing. Was told answer was sufficient via Discord.
