Feature #6206
open
Investigate a more intuitive use of the timestamp field in traffic/metadata events
Description
As proposed by Victor, I would like to foster a discussion about the above topic. If there is already a consolidated opinion about this that I did not see, this ticket could be closed.
In many SOCs, analysts use the Suricata traffic/metadata EVE events to further investigate alerts from different sources (e.g., EDR, server logs, network IDS) using timeline analysis and visualization. For creating these timelines, it is common to use the "timestamp" field as the primary database key.
For alerts, it is quite obvious that the timestamp field corresponds more or less to the packet that has triggered or completed the conditions for the alert. Thus, these events appear on the timeline as close as possible to the potential attack attempt.
In contrast, the timestamp in metadata/traffic events is AFAIK given by the last packet that belongs to the respective protocol event (e.g., the last packet of an HTTP response or of an SMTP transaction). Some analysts find this counterintuitive on the timeline, since it may be much later than the relevant traffic event itself.
A more intuitive usage of the timestamp field in traffic events would be protocol-specific, e.g. the beginning of a transaction. This is not canonical and needs to be discussed.
But one step towards a more intuitive use of the timestamp field could be to use the startts of the respective flow event if there is one. This could (IMHO easily) be implemented in output-json.c.
Sure, by correlation via flow and community IDs it is possible to identify these fields and use them as primary key in the back-end. But
a) some environments are lacking a proper correlation engine for different event source pipelines,
b) if there is one, the correlation with the flow events may introduce a significant delay (e.g., for long-lasting flows), and
c) setting the timestamp in Suricata to a different (already available) value is assumed to be computationally cheaper than correlating.
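To make the back-end correlation mentioned above concrete, here is a minimal Python sketch, not anything Suricata ships. It assumes EVE records are available as JSON lines, that flow records carry `flow_id` and `flow.start`, and that the other events carry the same `flow_id`. The function name `rekey_by_flow_start` and the `orig_timestamp` field are hypothetical and only used for illustration.

```python
import json


def rekey_by_flow_start(eve_lines):
    """Replace 'timestamp' of non-flow events with the start of their flow.

    Sketch only: the flow record is written when the flow ends, so the
    lookup table below is complete only after all flow records have
    arrived. This is the delay described in point b).
    """
    events = [json.loads(line) for line in eve_lines if line.strip()]

    # Map flow_id -> flow start time, taken from the flow records.
    flow_start = {
        ev["flow_id"]: ev["flow"]["start"]
        for ev in events
        if ev.get("event_type") == "flow"
        and "flow_id" in ev
        and "start" in ev.get("flow", {})
    }

    rekeyed = []
    for ev in events:
        if ev.get("event_type") != "flow" and ev.get("flow_id") in flow_start:
            ev = dict(ev)  # copy so the parsed input record is not modified
            ev["orig_timestamp"] = ev["timestamp"]  # hypothetical field name
            ev["timestamp"] = flow_start[ev["flow_id"]]
        rekeyed.append(ev)
    return rekeyed
```

Events whose flow record has not been seen yet keep their original timestamp, so a real pipeline would have to buffer or re-index them later. That extra work is what setting the value directly in Suricata would avoid.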
So I would be interested in your opinions about this.
Updated by Sascha Steinbiss over 1 year ago
One approach that we could discuss would be:
- Each metadata event (i.e. app-layer transaction) could be assigned a start and end timestamp, in a similar way to flow start/end timestamps. These should be part of the protocol-specific JSON metadata sub-object. The start and end timestamps could be collected and assigned in the app-layer parser. It is still not clear where the authoritative timestamps should come from -- but if all else fails one could use the current time when creating or completing an app-layer transaction in the parser.
- That would allow a SIEM to decide on ingest whether to use the start or end timestamp as the canonical timestamp for that event, as that choice is often configurable -- like, for instance, picking the request start timestamp for HTTP txs.
- Sometimes a canonical timestamp is also desired in the EVE-JSON itself. In that case, it might make sense to make the specific timestamp to use configurable per EVE output and event type; example:
    outputs:
      - eve-log:
          ...
          timestamps:
            http: start
            smb: end
            # dns: end # 'end' could still be the default?
          ...
          types:
            ...
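For the ingest-side variant of this idea, a SIEM could pick the canonical timestamp per event type roughly as in the Python sketch below. This is only an illustration of the proposal: the fields `start_timestamp` and `end_timestamp` inside the protocol-specific sub-object do not exist in EVE today and are assumed names, as are the function `canonical_timestamp` and the `choices` mapping, which mirrors the proposed `timestamps` setting.

```python
DEFAULT_CHOICE = "end"  # assumption: 'end' stays the default, as in the proposal


def canonical_timestamp(event, choices, default=DEFAULT_CHOICE):
    """Pick the start or end timestamp of an event according to `choices`.

    `choices` maps an event type to "start" or "end", e.g.
    {"http": "start", "smb": "end"}. The per-transaction fields
    'start_timestamp' / 'end_timestamp' are hypothetical; if they are
    missing, the existing top-level 'timestamp' is used unchanged.
    """
    etype = event.get("event_type")
    sub = event.get(etype) if etype else None
    if not isinstance(sub, dict):
        sub = {}
    which = choices.get(etype, default)
    return sub.get(f"{which}_timestamp", event.get("timestamp"))


# Usage: an HTTP transaction keyed on its request start.
http_event = {
    "timestamp": "2023-07-12T10:00:09.000000+0000",
    "event_type": "http",
    "http": {
        "start_timestamp": "2023-07-12T10:00:01.000000+0000",  # hypothetical
        "end_timestamp": "2023-07-12T10:00:09.000000+0000",    # hypothetical
    },
}
key = canonical_timestamp(http_event, {"http": "start", "smb": "end"})
```

Falling back to the existing `timestamp` keeps the sketch usable on events that lack the new fields, which matches the idea that 'end' could remain the default behaviour.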
Updated by Philippe Antoine about 1 year ago
- Related to Task #6443: Suricon 2023 brainstorm added