Bug #7754
http.host and http.host.raw contain the same Host header value twice, with a delimiter
Description
I ran into some strange behaviour when drafting a rule to detect RFC non-compliant characters within the HTTP Host header field: the value of the header appears to be entered into the buffer twice, delimited by ', '. Is this intended behaviour?
This rule was firing many times (500k+ hits) across the Emerging Threats QA session today which baffled all of us:
alert http any any -> [$HOME_NET,$HTTP_SERVERS] any (msg:"ET HUNTING Non-RFC Compliant HTTP Host Header Observed"; flow:established,to_server; http.host; pcre:"/^.*?[^A-Za-z0-9\\.\[\]:%]/"; classtype:bad-unknown; sid:54000000; rev:1;)
When investigating the PCAPs associated with the alerts, nothing stood out to us. We looked into the hexdumps of those HTTP requests and we verified that the PCAPs were not corrupt in some strange way.
From there we dumped the buffer directly from Suricata-7.0.3. I can't share the actual domains or PCAPs that we observed in this ticket, but the content structure of the buffer (hexdump) is the same as below.
http.host;
00000000 74 65 73 74 2e 67 6f 6f 67 6c 65 2e 63 6f 6d 2c |test.google.com,|
00000010 20 74 65 73 74 2e 67 6f 6f 67 6c 65 2e 63 6f 6d | test.google.com|
I thought this was an error so I created the following rule:
alert http any any -> [$HOME_NET,$HTTP_SERVERS] any (msg:"ET HUNTING Non-RFC Compliant HTTP Host Header Observed"; flow:established,to_server; http.host; content:"|2c 20|"; classtype:bad-unknown; sid:54000000; rev:1;)
And sure enough, writing a rule to specifically detect the delimiter observed in the dumped buffer gave us alerts.
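To illustrate why the first rule fires on a buffer shaped like the hexdump above, here is a minimal Python sketch. It only approximates the rule: it applies the same character class with Python's `re` module rather than Suricata's PCRE engine, and the names `dumped` and `HOST_CLASS` are made up for this example. The bytes are copied from the hexdump.

```python
import re

# Bytes copied from the http.host hexdump above.
dumped = bytes.fromhex(
    "74 65 73 74 2e 67 6f 6f 67 6c 65 2e 63 6f 6d 2c "
    "20 74 65 73 74 2e 67 6f 6f 67 6c 65 2e 63 6f 6d"
)

# Same character class as the pcre in the first rule (an approximation:
# Python's re engine, not Suricata's PCRE).
HOST_CLASS = re.compile(rb"^.*?[^A-Za-z0-9\\.\[\]:%]")

# The doubled buffer contains ',' and ' ', neither of which is in the
# allowed class, so the pattern matches.
print(HOST_CLASS.search(dumped) is not None)

# A single copy of the host value contains only allowed characters.
print(HOST_CLASS.search(b"test.google.com") is not None)

# The second rule's content:"|2c 20|" corresponds to these two bytes.
print(b"\x2c\x20" in dumped)
```

Under these assumptions, the single host value does not trip the character class, while the doubled value does solely because of the ', ' delimiter, which is consistent with the alerts coming from the buffer contents rather than from the traffic itself.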