3.0RC1 file extraction
With suricata-3.0RC1 file extraction, when we download a file using the Chrome browser, the target file ends up split into several file.x and file.x.meta pieces.
(1) suricata-3.0RC1 does not support file extraction for files downloaded over multiple parallel connections (multi-threaded download), so the target file is split into several file.x and file.x.meta pieces. How do we merge these split files?
(2) suricata-3.0RC1 does not support file extraction for resumed downloads (byte-range requests after an interruption), so the target file is split into several file.x and file.x.meta pieces. How do we merge these split files?
Thank you so much for your generous help to a beginner!
Updated by Anoop Saldanha almost 6 years ago
Not sure what the target file is (is it the PDF file?), but I do see data
in the pcaps for (2). For HTTP it can be (1) or (2), which doesn't
matter, since it ends up being pretty much the same in the end: the
file is downloaded in chunks anyway (for (1) I presume you
meant multiple flows from multi-threaded downloading).
Your files consumer has to parse the files and rearrange them.
Updated by Samiux A almost 6 years ago
I think I have encountered a similar problem on 3.0rc1 (GitHub version). When using a browser (no matter which browser) to download a large file (from about 100 MB to over 1 GB), the download times out and eventually fails. However, when using wget to download, even if it hits a timeout it retries, and the download completes.
I am running Suricata in af_packet mode with md5 and filestore enabled. I think the problem may be in libhtp. This problem also happened on 2.1dev without md5 and filestore. I have tested this many times on different networks with the same result.
Updated by Peter Manev almost 6 years ago
- Subject changed from For suricata-3.0RC1 file extraction, when we download file by using the chrome browser, the aim file was truncated into several file.x and file.x.meta. Maybe multi-thread download or breakpoint resume? to 3.0RC1 file extraction
- Priority changed from High to Normal
hao chen -
When testing file extraction you should always test with "wget" or similar - otherwise the browser cache comes into play and can affect the extraction.
As regards the target PDF file that you are trying to extract:
What are the name and MD5 sum of the file you are trying to extract? (So I can run the relevant test.)
However, I was able to extract the PDF file -
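A sketch of that test procedure (filenames and URL are placeholders): fetch the file with wget so the browser cache is out of the picture, then take its MD5 to compare with the MD5 field Suricata records in the corresponding file.N.meta:

```shell
# Real run would be:
#   wget -O target.pdf 'http://example.com/target.pdf'
# Simulated here with local content so the checksum step is shown end to end:
printf 'abc' > target.pdf
# Digest to compare with the MD5 field in files/file.N.meta:
md5sum target.pdf | cut -d' ' -f1
```

If the digests differ, the extracted pieces are incomplete or out of order.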
root@LTS-64-1:~/Tests/bug-1609/log # cat files/file.2.meta
TIME:            11/29/2015-10:07:44.695509
PCAP PKT NUM:    1583
SRC IP:          188.8.131.52
DST IP:          192.168.1.113
PROTO:           6
SRC PORT:        80
DST PORT:        59770
APP PROTO:       http
HTTP URI:        /download/B/B/6/BB69622C-AB5D-4D5F-9A12-B81B952C1169/CloudDesignPatternsBook-PDF.pdf
HTTP HOST:       download.microsoft.com
HTTP REFERER:    https://www.baidu.com/link?url=A6ZGZ7goq0nschg-sRhg74tPyaLPk3vXr3SaBCJzXU_dFiwagk29ni0iWuaJZBXxDHevRmXP1yRvbjJKy0L7mF8xijGPNJpoc7akxvEYqX29jX2RqJH-yGvh14mNXedeazlm4nFYKRKNBNartn1kl3L2tzCgu1fQRkEXhtonzwy&wd=&eqid=8bc4cae00000749200000002565ac05a
HTTP USER AGENT: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36
FILENAME:        /download/B/B/6/BB69622C-AB5D-4D5F-9A12-B81B952C1169/CloudDesignPatternsBook-PDF.pdf
MAGIC:           data
STATE:           CLOSED
MD5:             6919ce4308f4f769898f71dfbe1e2ddf
SIZE:            6046167
root@LTS-64-1:~/Tests/bug-1609/log #
The file utility on Linux does not recognize the file as a PDF either - so most likely there is some problem with the completeness of the pcap and/or the traffic:
root@LTS-64-1:~/Tests/bug-1609/log # file files/file.2
files/file.2: data
root@LTS-64-1:~/Tests/bug-1609/log #