Bug #5320

Key collisions in HTTP JSON eve-logs

Added by Gatewatcher Dev Team almost 2 years ago. Updated 10 months ago.

Status:
Closed
Priority:
Normal
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Hi,

During development of issue #2485 (commit bef190f767828f240f1ef9718e72b187faedc5af), the content_range JSON field was added as part of the data exported by JsonHttpLogJSONBasic https://github.com/OISF/suricata/blob/bef190f767828f240f1ef9718e72b187faedc5af/src/output-json-http.c#L204.

In commit 6ba93d905feb1905e38d13ab9335aa3a51f706a4, which converted JSON logging to the JSON builder, the name content_range stuck https://github.com/OISF/suricata/blob/6ba93d905feb1905e38d13ab9335aa3a51f706a4/src/output-json-http.c#L346.

Today, even though JsonHttpLogJSONBasic is no more, we still output the content-range header unconditionally under the name content_range. Unfortunately, this name collides with an entry of http_fields https://github.com/OISF/suricata/blob/master/src/output-json-http.c#L173, which is output by EveHttpLogJSONCustom https://github.com/OISF/suricata/blob/master/src/output-json-http.c#L268

The end result is that the content_range field is output twice in JSON eve-log records of type http. The first content_range value is a dict, generated by EveHttpLogJSONBasic https://github.com/OISF/suricata/blob/master/src/output-json-http.c#L249, and the second is a string, generated by EveHttpLogJSONCustom https://github.com/OISF/suricata/blob/master/src/output-json-http.c#L268 if content_range is present in the configuration under outputs.eve-log.http.custom.

Here is an example of an eve-log record containing the duplicated key.

{"timestamp":"2022-05-02T13:48:57.583006+0000","flow_id":1598560542515831,"in_iface":"mon0","event_type":"http","src_ip":"192.0.2.1","src_port":57118,"dest_ip":"192.0.2.2","dest_port":80,"proto":"TCP","tx_id":0,"ether":{"src_mac":"01:02:03:04:05:06","dest_mac":"FF:02:03:04:05:06"},"community_id":"1:soC8KFwwmPd5FiVB/IAQ4FjZIaI=","http":{"hostname":"someserver.gatewatcher.com","url":"/hello_world","http_user_agent":"curl/7.68.0","http_content_type":"application/octet-stream","content_range":{"raw":"bytes 0-128/14221642647","start":0,"end":128,"size":14221642647},"accept":"*/*","range":"bytes=0-128","connection":"keep-alive","content_length":"129","content_range":"bytes 0-128/14221642647","content_type":"application/octet-stream","date":"Mon, 02 May 2022 13:48:00 GMT","last_modified":"Mon, 02 May 2022 02:47:02 GMT","server":"nginx/9.99.9","http_method":"GET","protocol":"HTTP/1.1","status":206,"length":129},"host":"probe.gatewatcher.com"}

We believe that this JSON parameter pollution (named by analogy with the equivalent issue, HTTP parameter pollution) may lead to confusion for eve-log consumers. Some consumers will keep the first occurrence while others will keep the last one. The data type also differs between the two occurrences (dict vs. string), which may lead to deserialisation issues, depending on the order of evaluation.
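To illustrate the consumer-side ambiguity, here is a minimal Python sketch. It uses a trimmed-down copy of the "http" object from the record above, keeping only the duplicated key. The helper name find_duplicate_keys is ours, for illustration only. Python's standard json module keeps the last occurrence of a repeated key, so the dict value is dropped without any warning; other parsers may behave differently.

```python
import json

# Trimmed-down "http" object from the record above (only the duplicated key).
raw = (
    '{"content_range": {"raw": "bytes 0-128/14221642647", "start": 0,'
    ' "end": 128, "size": 14221642647},'
    ' "content_range": "bytes 0-128/14221642647"}'
)

# The standard json module keeps only the last name/value pair.
last_wins = json.loads(raw)


def find_duplicate_keys(text):
    """Return keys that occur more than once within a single JSON object.

    Hypothetical helper: object_pairs_hook receives every object as a list
    of (key, value) pairs before duplicates are collapsed.
    """
    duplicates = []

    def hook(pairs):
        seen = set()
        for key, _value in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


print(type(last_wins["content_range"]).__name__)
print(find_duplicate_keys(raw))
```

A consumer that wants to reject such records rather than silently pick one value could run a check like find_duplicate_keys before indexing.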

content_range is not the only field that is duplicated. For instance, the content-type is output twice: once under the name content_type as a custom http field https://github.com/OISF/suricata/blob/master/src/output-json-http.c#L174 and another time under the name http_content_type via EveHttpLogJSONBasic https://github.com/OISF/suricata/blob/master/src/output-json-http.c#L247. The problem is less important for content-type though, since the names do not collide; we just have the same value output twice under two separate keys.

We believe a "coherent yet redundant" fix would be to have EveHttpLogJSONBasic output the content-range value under a key named http_content_range, as is done for the content-type. This would, at least, prevent the collision.
However, we believe a better fix would be to remove content_type, content_range and all other duplicated information from the custom header output, thus preventing both key collision and information duplication.
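
The two options can be compared with a small Python model. This is only a sketch of the resulting key layout, not Suricata code: BASIC and CUSTOM are hypothetical stand-ins for what EveHttpLogJSONBasic and EveHttpLogJSONCustom emit, and COVERED_BY_BASIC lists only the two headers named in this report.

```python
# Hypothetical stand-ins for the two loggers' output (values from the record above).
BASIC = {
    "content_range": {"raw": "bytes 0-128/14221642647", "start": 0,
                      "end": 128, "size": 14221642647},
    "http_content_type": "application/octet-stream",
}
CUSTOM = {
    "content_range": "bytes 0-128/14221642647",
    "content_type": "application/octet-stream",
    "server": "nginx/9.99.9",
}

# Assumption: custom field names whose header is already logged by the basic logger.
COVERED_BY_BASIC = {"content_range", "content_type"}


def fix_rename(basic, custom):
    """Option 1: the basic logger emits http_content_range; custom output unchanged."""
    renamed = {
        ("http_content_range" if key == "content_range" else key): value
        for key, value in basic.items()
    }
    return {**renamed, **custom}


def fix_drop(basic, custom):
    """Option 2: the custom logger skips headers the basic logger already covers."""
    kept = {k: v for k, v in custom.items() if k not in COVERED_BY_BASIC}
    return {**basic, **kept}


print(sorted(fix_rename(BASIC, CUSTOM)))
print(sorted(fix_drop(BASIC, CUSTOM)))
```

In this model, option 1 yields five distinct keys with the content-range and content-type values each present twice, while option 2 yields three keys with no repeated information.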

What is your preferred approach?

Thank you.

Florian Maury & Tommy Boiret
Gatewatcher Dev Team


Related issues 1 (1 open, 0 closed)

Related to Suricata - Bug #6173: http: loss of backward compatibility in HTTP logs from v6 to v7 (Status: New, Assignee: OISF Dev)
Actions #1

Updated by Orion Poplawski over 1 year ago

This is causing problems ingesting the Suricata EVE data into Elasticsearch/OpenSearch due to the differing format (object vs string) of the duplicate entries. Please resolve this.

Actions #2

Updated by Orion Poplawski over 1 year ago

I'll also note that the content_range field (and maybe others) is not documented here: https://suricata.readthedocs.io/en/suricata-6.0.5/output/eve/eve-json-format.html#event-type-http

Actions #3

Updated by Philippe Antoine over 1 year ago

  • Status changed from New to In Review
  • Assignee changed from OISF Dev to Philippe Antoine
  • Target version changed from TBD to 7.0.0-rc1
Actions #4

Updated by Victor Julien about 1 year ago

  • Target version changed from 7.0.0-rc1 to 8.0.0-beta1
Actions #5

Updated by Philippe Antoine about 1 year ago

  • Target version changed from 8.0.0-beta1 to 7.0.0-rc2

This is clearly a bug, there is a simple fix, so I propose 7.0 ;-)

Actions #7

Updated by Philippe Antoine 10 months ago

  • Status changed from In Review to Closed
Actions #8

Updated by Victor Julien 9 months ago

  • Related to Bug #6173: http: loss of backward compatibility in HTTP logs from v6 to v7 added