Feature #4172
Split eve.json into multiple files based on alert severity
Description
We currently have an overly complicated setup, which we'd love to see Suricata support out of the box. We split our eve.json alerts based on severity. High severity alerts (1-3) get logged to our centralized logging infrastructure. The other alerts are only stored on disk, to help with debugging or manual investigations without overwhelming our logging infrastructure.
We currently use Logstash to read the eve.json file and write out eve_severityX.json files, which then get handled appropriately. Another nice benefit is that we can define per-file log rotation and retention policies. Due to their low volume, the severity 1 and 2 files can be kept almost indefinitely, while 4 and 5 are rotated and deleted much more aggressively.
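For illustration, a minimal sketch of that kind of Logstash pipeline (paths and filenames here are placeholders, not our exact config):

    input {
      file {
        path => "/var/log/suricata/eve.json"   # placeholder path
        codec => "json"
      }
    }

    output {
      if [event_type] == "alert" {
        file {
          # one file per severity, e.g. eve_severity1.json .. eve_severity5.json
          path => "/var/log/suricata/eve_severity%{[alert][severity]}.json"
          codec => "json_lines"
        }
      }
    }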
Logstash does work for this, but it falls behind if there's a sudden burst of alerts, and we're doubling the disk I/O unnecessarily. There are other ways to achieve this, but they all add complexity and fragility to our setup.
We tied severity to "how much do we care about this alert," so this setup makes a lot of sense for us, but I'm open to other options if they help us split alerts in some fashion.
Updated by Andreas Herz almost 4 years ago
- Tracker changed from Bug to Support
What traffic rate are we talking about and how much data overall?
Updated by Vlad Grigorescu almost 4 years ago
To be clear, I'm interested in adding this feature, but would rather not do the work if this is not seen as desirable and the PR would be rejected. Alternatively, if this would require a significant overhaul of the way logs are written today, it might not be worth it. I'm a core Zeek developer, but my Suricata dev work has been fairly limited. :-)
We have 48 production NSM systems deployed today, with more on the way. Generally, they're sitting on 10G links, some of which will be upgraded to 40G in the near future. While we have enough Splunk license capacity (2 TB/day) to ingest everything, alerts with severity 4+ are of very limited use and only serve to decrease the signal-to-noise ratio of the data in Splunk (as well as make searches slower). We do periodically review those, but only for targeted investigations or to decide which signatures to promote or disable.
Based on how we set severities, each level has at least one order of magnitude more alerts than the one before it. During normal operations, severity 4+ accounts for over 90% of alerts. However, we will periodically see legitimate traffic cause a storm of decode errors or other severity 4+ alerts. Over the weekend, some of our systems saw the eve.json log grow ~500x faster than usual, due to request_uri_not_seen in some very active and long-lived connections. Suricata and the disk I/O could cope with this, but our log processing pipeline could not. After crashing, it would need to re-read the huge file from the beginning, so the instability continued.
The main benefit to the proposed change is enabling more flexible workflows out of the box. Once the log files are split up, you can choose what you send where. Or, you can get some protection against informational alert spikes by having custom log rotation policies per-severity. Even if you don't post-process the logs, you can get some benefits -- searching for an attack in severity 1-3 logs is 10x faster and will yield better results than searching all your logs.
Updated by Victor Julien almost 4 years ago
- Related to Feature #821: conditional logging: output steering added
Updated by Jason Ish almost 4 years ago
So we already support multiple eve outputs where you can separate by event type. I wonder if we could add a very simple filter on the alert type...
    severity: "= 1"
    severity: "> 1"
Or something.
Then you could specify multiple outputs just with the alert event type, and filter on the severity. I don't think this would require too much work.
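As a rough sketch of that idea, a hypothetical pair of eve outputs in suricata.yaml could look like the following; the severity option shown here does not exist today and is only illustrative:

    outputs:
      - eve-log:
          enabled: yes
          filetype: regular
          filename: eve_high.json
          types:
            - alert:
                # hypothetical filter, not an existing option
                severity: "<= 3"
      - eve-log:
          enabled: yes
          filetype: regular
          filename: eve_low.json
          types:
            - alert:
                severity: "> 3"

Each file could then get its own rotation and shipping policy, which covers the workflow described in the original report.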
Updated by Flo Sfe almost 3 years ago
@Vlad Grigorescu
Have you tried Filebeat? I am ingesting a few (~6) million alerts/day without any issue into my new Elastic SIEM.
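For reference, a minimal Filebeat sketch along those lines, shipping only severity 1-3 alerts and dropping the rest (paths and the output section are placeholders, assuming the classic log input with JSON decoding):

    filebeat.inputs:
      - type: log
        paths:
          - /var/log/suricata/eve.json   # assumed default eve.json location
        json.keys_under_root: true
        json.add_error_key: true

    processors:
      # keep only alert events with severity 1-3; drop everything else
      - drop_event:
          when:
            or:
              - not:
                  equals:
                    event_type: alert
              - range:
                  alert.severity:
                    gte: 4

    output.elasticsearch:
      hosts: ["https://localhost:9200"]   # placeholder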
Updated by Jason Ish almost 3 years ago
The update brought this feature to my attention again. I think for most users, alerts are actually much lower in volume than all the other event types. I'm going to guess that it's typically a problem most users won't have, so I wonder if the better fix is different log processing tools external to Suricata?
I think the plugin API for 6 and master is sufficient for developing complex outputs. It gets a stream (or multiple streams) of JSON events that could then be parsed, processed, filtered, and redirected as needed. Trying to bring too much log handling directly into Suricata would probably overcomplicate things, other than simple filters like the one I mention in https://redmine.openinfosecfoundation.org/issues/4172#note-4.
Updated by Andreas Herz almost 3 years ago
- Tracker changed from Support to Feature
Updated by Philippe Antoine 5 months ago
- Assignee set to OISF Dev
- Target version set to TBD