Feature #2303: file-store enhancements (aka file-store v2): deduplication, hash-based naming, JSON metadata, and cleanup tooling
Only write unique files
The current behavior of the file-store is to extract every file. It could be useful to keep state and write a given file only once (perhaps once per Suricata run). For example, if 15 users download a popular PE file, we end up with 15 copies of the same file on disk. This is somewhat related to https://redmine.openinfosecfoundation.org/issues/1948: using the file's hash as its filename would avoid the wasted disk space, but not the time Suricata spends writing files to disk.
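The hash-as-filename idea can be sketched as follows. This is an illustrative Python sketch, not Suricata's actual implementation: the `store_file` helper and the assumption that the whole file content is available as one in-memory buffer are both simplifications (Suricata streams files, which is exactly the complication discussed below).

```python
import hashlib
import os

def store_file(data: bytes, store_dir: str) -> str:
    """Store extracted file content under its SHA-256 hex digest.

    If a file with the same hash already exists, skip the write,
    so identical downloads are deduplicated automatically.
    (Hypothetical helper for illustration only.)
    """
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # atomic rename into final place
    return path
```

A nice side effect of hash-based names is that the hash in `fileinfo` metadata can be matched directly to the stored file without a separate lookup.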
Updated by Victor Julien almost 5 years ago
The file store already starts writing files while they are still being transferred. I'm not sure how we can reliably detect a duplicate before we've seen the whole file; by that point we've already started writing it to disk, except perhaps for tiny files.