Support #1609: 3.0RC1 file extraction - Suricata - Open Information Security Foundation

Support #1609

closed

3.0RC1 file extraction

Added by hao chen over 9 years ago. Updated about 9 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Affected Versions:

Label:

Description

With suricata-3.0RC1 file extraction, when we download a file using the Chrome browser, the target ("aim") file is split into several file.x and file.x.meta pairs.
We suspect one of the following:
(1) suricata-3.0RC1 does not support file extraction for a file that is downloaded with multiple threads, so the aim file is split into several file.x and file.x.meta pairs. How do we merge these split files?

(2) suricata-3.0RC1 does not support file extraction for a file that is downloaded with breakpoint resume (a resumed download), so the aim file is split into several file.x and file.x.meta pairs. How do we merge these split files?

Thank you so much for your generous help to a beginner!

Files

test1.pcap (1.29 MB): the aim PDF file was split into two parts. hao chen, 11/30/2015 09:26 PM
test.pcap (7.85 MB): the aim PDF file was split into three parts. hao chen, 11/30/2015 09:26 PM

Updated by Anoop Saldanha over 9 years ago

Hao,

Not sure what the aim file is (is that the PDF file?), but I do see data
in the pcaps for (2). For HTTP, it can be (1) or (2); it doesn't matter
which, since it ends up being pretty much the same in the end: the file
is downloaded in chunks anyway (for (1) I presume you meant multiple
flows for multiple threads).

Your file consumer has to parse the files and rearrange them.
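
As a rough illustration of what such a consumer could look like, here is a minimal Python sketch. It assumes the `KEY: value` layout of the file.N.meta dump shown later in this thread, and that all parts of one download share the same HTTP HOST and HTTP URI. The function names `parse_meta` and `group_parts` are made up for this example and are not part of Suricata. Note that the meta format shown in this ticket carries no byte-range information, so the sketch only groups the parts and orders them by file number; whether that order matches the byte order of the original file is an assumption that has to be checked per capture.

```python
import re
from collections import defaultdict
from pathlib import Path


def parse_meta(text):
    """Parse the 'KEY: value' lines of a file.N.meta into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as URLs stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def group_parts(log_dir):
    """Group stored files by (HTTP HOST, HTTP URI), ordered by file number.

    Assumption: parts of the same download share host and URI. The file
    number is only a proxy for ordering; it is not a byte offset.
    """
    groups = defaultdict(list)
    for meta_path in Path(log_dir).glob("file.*.meta"):
        match = re.fullmatch(r"file\.(\d+)\.meta", meta_path.name)
        if not match:
            continue
        fields = parse_meta(meta_path.read_text(errors="replace"))
        key = (fields.get("HTTP HOST", ""), fields.get("HTTP URI", ""))
        # file.N.meta -> file.N (the extracted data file next to it)
        groups[key].append((int(match.group(1)), meta_path.with_suffix("")))
    return {key: [path for _, path in sorted(parts)]
            for key, parts in groups.items()}
```

Calling `group_parts("files")` on the filestore directory returns a dict that maps each (host, URI) pair to the list of data files that belong to it. Reassembling the original file from those parts still requires knowing which byte range each part covers, for example from the HTTP Range and Content-Range headers in the pcap.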

Updated by Samiux A over 9 years ago

I think I have run into a similar problem on 3.0rc1 (GitHub version). When using a browser (no matter which one) to download a large file (from about 100 MB to over 1 GB), the download times out and eventually fails. However, when using wget, even if it hits a timeout, it retries and the download completes.

I am running Suricata in af_packet mode with md5 and filestore enabled. I think the problem may be in libhtp. This problem also happened in 2.1dev without md5 and filestore. I have tested this many times on different networks, with the same result.

Updated by Peter Manev over 9 years ago

Subject changed from For suricata-3.0RC1 file extraction, when we download file by using the chrome browser, the aim file was truncated into several file.x and file.x.meta. Maybe multi-thread download or breakpoint resume? to 3.0RC1 file extraction
Priority changed from High to Normal

hao chen -
When testing file extraction you should always test with "wget" or similar; otherwise the browser cache comes into play and can affect the extraction.

Regarding the target PDF file that you are trying to extract:
What is the name and MD5 sum of the file that you are trying to extract? (So I can run the relevant test.)

However, I was able to extract the PDF file:

root@LTS-64-1:~/Tests/bug-1609/log # cat files/file.2.meta
TIME:              11/29/2015-10:07:44.695509
PCAP PKT NUM:      1583
SRC IP:            211.90.29.21
DST IP:            192.168.1.113
PROTO:             6
SRC PORT:          80
DST PORT:          59770
APP PROTO:         http
HTTP URI:          /download/B/B/6/BB69622C-AB5D-4D5F-9A12-B81B952C1169/CloudDesignPatternsBook-PDF.pdf
HTTP HOST:         download.microsoft.com
HTTP REFERER:      https://www.baidu.com/link?url=A6ZGZ7goq0nschg-sRhg74tPyaLPk3vXr3SaBCJzXU_dFiwagk29ni0iWuaJZBXxDHevRmXP1yRvbjJKy0L7mF8xijGPNJpoc7akxvEYqX29jX2RqJH-yGvh14mNXedeazlm4nFYKRKNBNartn1kl3L2tzCgu1fQRkEXhtonzwy&wd=&eqid=8bc4cae00000749200000002565ac05a
HTTP USER AGENT:   Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36
FILENAME:          /download/B/B/6/BB69622C-AB5D-4D5F-9A12-B81B952C1169/CloudDesignPatternsBook-PDF.pdf
MAGIC:             data
STATE:             CLOSED
MD5:               6919ce4308f4f769898f71dfbe1e2ddf
SIZE:              6046167
root@LTS-64-1:~/Tests/bug-1609/log #

The file utility on Linux does not recognize the file as a PDF either, so most likely there is some problem with the completeness of the pcap and/or the traffic:

root@LTS-64-1:~/Tests/bug-1609/log # file files/file.2
files/file.2: data
root@LTS-64-1:~/Tests/bug-1609/log #
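
The same check can be scripted. The sketch below is only an illustration (the function name `inspect_extracted` is made up here): it tests whether an extracted file begins with the `%PDF-` signature that PDF files start with, and computes its MD5 and size so they can be compared with the MD5 and SIZE fields of the .meta file or with the checksum of a reference copy fetched via wget. A missing signature does not prove which part of the capture is incomplete; it only suggests that the stored data does not begin at the start of the PDF.

```python
import hashlib


def inspect_extracted(path, expected_md5=None):
    """Report PDF signature, MD5 and size for an extracted file."""
    md5 = hashlib.md5()
    size = 0
    with open(path, "rb") as fh:
        # PDF files begin with the ASCII signature "%PDF-".
        header = fh.read(5)
        md5.update(header)
        size += len(header)
        # Hash the rest in 1 MiB chunks to keep memory use low.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            md5.update(chunk)
            size += len(chunk)
    digest = md5.hexdigest()
    return {
        "starts_with_pdf_header": header == b"%PDF-",
        "md5": digest,
        "md5_matches": (None if expected_md5 is None
                        else digest == expected_md5.lower()),
        "size": size,
    }
```

For the file above this would be called as `inspect_extracted("files/file.2", "6919ce4308f4f769898f71dfbe1e2ddf")`, using the MD5 value from file.2.meta.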

Updated by Victor Julien over 9 years ago

Tracker changed from Bug to Support
Assignee deleted (Victor Julien)

Updated by Victor Julien about 9 years ago

Status changed from New to Closed
