Feature #2303

closed

file-store enhancements (aka file-store v2): deduplication; hash-based naming; json metadata and cleanup tooling

Added by Jason Ish almost 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Effort:
Difficulty:
Label:

Description

At SuriCon 2017, enhancements to file-store were discussed that are better implemented as a file-store v2 than as additional options on the current file store.

Deduplication

Log only one instance of each file. This can be done with a hash-based naming scheme using SHA-256: once Suricata has determined that the file is to be closed, the file will be renamed to the SHA-256 of its content.

A directory scheme based on the hash will be used: 256 directories, named 00 to ff, with each file placed in the directory that matches the first 2 characters of its SHA-256.

When renaming a file to its SHA-256, if it already exists it will simply be "touched" and the working copy deleted.
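The rename-or-touch step described above could be sketched roughly as follows. This is only an illustration in Python, not Suricata's implementation; the names `sha256_of`, `finalize`, `working_copy` and `store_root` are hypothetical, and the sketch assumes the working copy lives on the same filesystem as the store so that a rename is possible.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def finalize(working_copy: Path, store_root: Path) -> Path:
    """Move a closed working copy to <store_root>/<first 2 hex chars>/<sha256>.

    Hypothetical helper: if the target already exists, it is "touched"
    and the working copy is deleted instead of being stored again.
    """
    sha = sha256_of(working_copy)
    target_dir = store_root / sha[:2]
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / sha
    if target.exists():
        # Duplicate content: refresh the timestamps, drop the working copy.
        os.utime(target, None)
        working_copy.unlink()
    else:
        # First occurrence: rename the working copy into place.
        os.replace(working_copy, target)
    return target
```

Touching the existing file keeps its modification time current, so an age-based cleanup would treat a file that keeps reappearing as recent.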

Deduplication will not occur for the metadata files, as it's still useful to track each occurrence of a file. Perhaps the metadata files could follow a naming scheme like SHA256.<timestamp>.json?

This also removes the need for a waldo file.

Metadata as JSON

The metadata should be logged as JSON in a similar format to that of the fileinfo record.
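As a rough sketch of per-occurrence JSON metadata, the snippet below writes one file per sighting using the SHA256.<timestamp>.json naming floated above. The function name `write_metadata`, the choice to place metadata in the same 00 to ff directory as the stored file, and the record keys in the usage example are assumptions for illustration, not the actual fileinfo schema.

```python
import json
import time
from pathlib import Path


def write_metadata(store_root: Path, sha256: str, record: dict, timestamp=None) -> Path:
    """Write one JSON metadata file per occurrence of a stored file.

    Hypothetical layout: <store_root>/<first 2 hex chars>/<sha256>.<timestamp>.json
    """
    ts = time.time() if timestamp is None else timestamp
    target_dir = store_root / sha256[:2]
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{sha256}.{int(ts)}.json"
    with open(path, "w") as f:
        json.dump(record, f)
    return path


# Example call; the keys are illustrative only:
# write_metadata(Path("filestore"), sha, {"filename": "/index.html", "size": 1024})
```

With a timestamp in whole seconds, two occurrences of the same file within one second would map to the same name, so a real scheme would likely need a finer timestamp or an extra suffix.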

Cleanup Tool

Introduce a core supported tool for clean up of extracted files that could be run interactively or via cron. Options should exist to delete files older than some duration, or to enforce a certain size on disk. Python could be used here as we have existing tooling in Python.
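The two cleanup options mentioned (age limit and size limit) might look something like the sketch below. It is an assumption-laden illustration rather than a proposed tool: the function names are hypothetical, file age is taken from the modification time, and the size limit is enforced by deleting the oldest files first.

```python
import os
import time
from pathlib import Path


def stored_files(root: Path) -> list:
    """Return (mtime, size, path) for every regular file under root."""
    entries = []
    for path in root.rglob("*"):
        if path.is_file():
            st = path.stat()
            entries.append((st.st_mtime, st.st_size, path))
    return entries


def remove_older_than(root: Path, max_age_seconds: float, now=None) -> list:
    """Delete files whose mtime is more than max_age_seconds in the past."""
    now = time.time() if now is None else now
    removed = []
    for mtime, _size, path in stored_files(root):
        if now - mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed


def enforce_max_size(root: Path, max_bytes: int) -> list:
    """Delete the oldest files until the total size is at most max_bytes."""
    entries = sorted(stored_files(root), key=lambda e: e[0])  # oldest first
    total = sum(size for _mtime, size, _path in entries)
    removed = []
    for _mtime, size, path in entries:
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed.append(path)
    return removed
```

Because a duplicate is "touched" rather than rewritten, using the modification time here means files that are still being seen on the wire are the last to be removed.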

Related Tickets
https://redmine.openinfosecfoundation.org/issues/1201
https://redmine.openinfosecfoundation.org/issues/1949


Subtasks 2 (0 open, 2 closed)

Feature #1201: file-store metadata in JSON format (Closed, Jason Ish, 06/08/2014)
Feature #1949: only write unique files (Closed, Jason Ish, 11/10/2016)

Related issues

Related to Task #2309: SuriCon 2017 brainstorm (New, Victor Julien)
Related to Feature #1948: allow filestore name configuration options (Closed, Jason Ish, 11/10/2016)
Related to Documentation #2286: doc: document best practices around handling file extraction (Closed, Jason Ish)

#1

Updated by Andreas Herz almost 4 years ago

  • Target version set to TBD

#2

Updated by Victor Julien almost 4 years ago

  • Related to Task #2309: SuriCon 2017 brainstorm added

#3

Updated by Victor Julien almost 4 years ago

  • Related to Feature #1948: allow filestore name configuration options added

#4

Updated by Victor Julien almost 4 years ago

  • Related to Documentation #2286: doc: document best practices around handling file extraction added

#5

Updated by Victor Julien over 3 years ago

  • Target version changed from TBD to 4.1beta1

#6

Updated by Victor Julien over 3 years ago

  • Status changed from Assigned to Closed