Support #1609

closed

3.0RC1 file extraction

Added by hao chen almost 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee: -
Affected Versions:
Label:

Description

With suricata-3.0RC1 file extraction, when we download a file using the Chrome browser, the target file is split into several file.x and file.x.meta files.
We suspect one of the following:
(1) suricata-3.0RC1 does not support file extraction for a file that is downloaded with multiple threads, so the target file is split into several file.x and file.x.meta files. How do we merge these split files?

(2) suricata-3.0RC1 does not support file extraction for a file that is downloaded with breakpoint resume (a resumed download), so the target file is split into several file.x and file.x.meta files. How do we merge these split files?

Thank you so much for your generous help to a beginner!


Files

test1.pcap (1.29 MB): the target PDF file was split into two parts. hao chen, 11/30/2015 09:26 PM
test.pcap (7.85 MB): the target PDF file was split into three parts. hao chen, 11/30/2015 09:26 PM
Actions #1

Updated by Anoop Saldanha almost 6 years ago

Hao,

Not sure what the aim file is (is that the PDF file?), but I do see data
in the pcaps for (2). For HTTP it can be (1) or (2); it doesn't
matter, since it ends up being pretty much the same in the end: the
file is downloaded in chunks anyway (for (1) I presume you
meant multiple flows for multi-threads).

Your file consumer has to parse the files and rearrange them.
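
A minimal sketch of the first half of such a consumer, in Python: it parses each file.N.meta and groups the pieces that belong to the same download. It assumes the "KEY: value" meta layout posted later in this ticket and the file.N / file.N.meta naming; the function names parse_meta and group_by_uri are made up for illustration. Note that the meta output in this ticket carries no byte-offset field, so the sketch only groups and lists pieces by file number. Putting the chunks in the right byte order would need extra information (for example the ranges from the HTTP exchange in the pcap), which is not covered here.

```python
from collections import defaultdict
from pathlib import Path


def parse_meta(text):
    """Parse the 'KEY: value' lines of one file.N.meta into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def group_by_uri(store_dir):
    """Map (HTTP HOST, HTTP URI) to a list of (file number, meta dict).

    Pieces are sorted by file number, which is NOT guaranteed to be
    the byte order of the chunks (assumption made for listing only).
    """
    groups = defaultdict(list)
    for meta_path in Path(store_dir).glob("file.*.meta"):
        number = int(meta_path.name.split(".")[1])
        meta = parse_meta(meta_path.read_text(errors="replace"))
        key = (meta.get("HTTP HOST", ""), meta.get("HTTP URI", ""))
        groups[key].append((number, meta))
    for pieces in groups.values():
        pieces.sort(key=lambda item: item[0])
    return dict(groups)
```

Calling group_by_uri("files") on the filestore directory would then show which file.N pieces share a host and URI, together with the SIZE and STATE reported for each piece.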

Actions #2

Updated by Samiux A almost 6 years ago

I think I have encountered a similar problem on 3.0rc1 (GitHub version). When using a browser (no matter which one) to download a large file (from about 100 MB to over 1 GB), the download times out and eventually fails. However, when using wget, even if it encounters a timeout, it retries and the download completes.

I am running Suricata in af_packet mode with md5 and filestore enabled. I think the problem may be in libhtp. The problem also happened in 2.1dev without md5 and filestore. I have tested this many times on different networks, with the same result.

Actions #3

Updated by Peter Manev almost 6 years ago

  • Subject changed from For suricata-3.0RC1 file extraction, when we download file by using the chrome browser, the aim file was truncated into several file.x and file.x.meta. Maybe multi-thread download or breakpoint resume? to 3.0RC1 file extraction
  • Priority changed from High to Normal

hao chen -
When testing file extraction, you should always test with "wget" or similar; otherwise the browser cache comes into play and can affect the extraction.

With regard to the target PDF file that you are trying to extract:
What are the name and MD5 sum of the file that you are trying to extract? (So that I can run the relevant test.)

However, I was able to extract the PDF file:

root@LTS-64-1:~/Tests/bug-1609/log # cat files/file.2.meta
TIME:              11/29/2015-10:07:44.695509
PCAP PKT NUM:      1583
SRC IP:            211.90.29.21
DST IP:            192.168.1.113
PROTO:             6
SRC PORT:          80
DST PORT:          59770
APP PROTO:         http
HTTP URI:          /download/B/B/6/BB69622C-AB5D-4D5F-9A12-B81B952C1169/CloudDesignPatternsBook-PDF.pdf
HTTP HOST:         download.microsoft.com
HTTP REFERER:      https://www.baidu.com/link?url=A6ZGZ7goq0nschg-sRhg74tPyaLPk3vXr3SaBCJzXU_dFiwagk29ni0iWuaJZBXxDHevRmXP1yRvbjJKy0L7mF8xijGPNJpoc7akxvEYqX29jX2RqJH-yGvh14mNXedeazlm4nFYKRKNBNartn1kl3L2tzCgu1fQRkEXhtonzwy&wd=&eqid=8bc4cae00000749200000002565ac05a
HTTP USER AGENT:   Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36
FILENAME:          /download/B/B/6/BB69622C-AB5D-4D5F-9A12-B81B952C1169/CloudDesignPatternsBook-PDF.pdf
MAGIC:             data
STATE:             CLOSED
MD5:               6919ce4308f4f769898f71dfbe1e2ddf
SIZE:              6046167
root@LTS-64-1:~/Tests/bug-1609/log #

The file utility on Linux does not recognize the file as a PDF either, so most likely there is some problem with the completeness of the pcap and/or the traffic:

root@LTS-64-1:~/Tests/bug-1609/log # file files/file.2
files/file.2: data
root@LTS-64-1:~/Tests/bug-1609/log #
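
The same two checks can be scripted. The following is a rough Python sketch, not part of Suricata: md5_of recomputes the MD5 of an extracted file so it can be compared with the MD5 line in the matching .meta file and with the sum of the original download, and looks_like_pdf checks whether the file starts with the "%PDF-" signature. Both function names are made up for illustration, and the signature test is a simplification of what the file utility does.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_pdf(path):
    """Return True if the file starts with the '%PDF-' signature."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"
```

For example, md5_of("files/file.2") can be compared with the MD5 value in files/file.2.meta, and looks_like_pdf("files/file.2") would be expected to return False for a file that the file utility reports as plain data.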

Actions #4

Updated by Victor Julien over 5 years ago

  • Tracker changed from Bug to Support
  • Assignee deleted (Victor Julien)
Actions #5

Updated by Victor Julien over 5 years ago

  • Status changed from New to Closed