Feature #6424: HTTP/2 - http.host behavior when both :authority pseudo header and host header are present - Suricata - Open Information Security Foundation

Feature #6424

open

HTTP/2 - http.host behavior when both :authority pseudo header and host header are present

Added by Brandon Murphy almost 2 years ago. Updated 4 months ago.

Status: Feedback
Priority: Normal
Assignee: OISF Dev
Target version: 9.0.0-beta1

Effort:

Difficulty:

Label:

Description

Consider the following rules and attached pcap.

The attached pcap contains the following HTTP/2 HEADERS frame, which carries both an :authority pseudo header and a host header.

Stream: HEADERS, Stream ID: 1, Length 53, GET /
    Length: 53
    Type: HEADERS (1)
    Flags: 0x05, End Headers, End Stream
    0... .... .... .... .... .... .... .... = Reserved: 0x0
    .000 0000 0000 0000 0000 0000 0000 0001 = Stream Identifier: 1
    [Pad Length: 0]
    Header Block Fragment: 828441882f91d35d055c87a78790518496c5b0ed5387497ca589d34d1f7a8fdc41d7e94c…
    [Header Length: 224]
    [Header Count: 9]
    Header: :method: GET
    Header: :path: /
    Header: :authority: example.com
    Header: :scheme: https
    Header: accept-encoding: gzip, deflate
    Header: accept-language: fr-FR
    Header: accept: text/html
    Header: user-agent: Scapy HTTP/2 Module
    Header: host: foo.com

Rules

alert http any any -> any any (msg:"test - authority and host in header names"; http.header_names; content:"|0d 0a 3a|authority|0d 0a|"; content:"|0d 0a|host|0d 0a|"; nocase; sid:1;)
alert http any any -> any any (msg:"test - example.com"; http.host; content:"example.com"; sid:2;)
alert http any any -> any any (msg:"test - foo.com"; http.host; content:"foo.com"; sid:3;)

Only sid:1 and sid:2 fire, indicating that only the :authority pseudo header is being used to form the buffer for http.host.

RFC 9110 (HTTP Semantics) - https://www.rfc-editor.org/rfc/rfc9110#name-host-and-authority states:

A user agent MUST generate a Host header field in a request unless it sends that information as an ":authority" pseudo-header field.

RFC 9113 (HTTP/2) states

The recipient of an HTTP/2 request MUST NOT use the Host header field to determine the target URI if ":authority" is present.

Clients MUST NOT generate a request with a Host header field that differs from the ":authority" pseudo-header field. A server SHOULD treat a request as malformed if it contains a Host header field that identifies an entity that differs from the entity in the ":authority" pseudo-header field

An intermediary that forwards a request over HTTP/2 MAY retain any Host header field.

These indicate that a request containing both an :authority pseudo header and a host header is perfectly valid. A common case would be where an intermediary host forwards an HTTP/1 request over HTTP/2.

Proposed Solution:

Method 1.
Support multi-buffer matching for http.host. This would allow the :authority and host header to both be parsed into http.host and allow backwards compatibility with existing rules.

Method 2.
Treat this occurrence the same as when multiple occurrences of the same HTTP header are found within http.header. Currently, those values are joined with a comma and a space, as documented for http.header:
https://docs.suricata.io/en/latest/rules/http-keywords.html#http-header-and-http-header-raw
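
To make the difference between the current behavior and the two proposed methods concrete, here is a minimal Python sketch. It is only a model, not Suricata code: the function names are hypothetical, it covers only the case where both headers are present, and the order of the values in Method 2 is an assumption.

```python
from typing import List


def buffers_current(authority: str, host: str) -> List[str]:
    """Behavior reported in this ticket: only :authority feeds http.host."""
    return [authority]


def buffers_method1(authority: str, host: str) -> List[str]:
    """Method 1 (proposed): multi-buffer, one buffer per value."""
    return [authority, host]


def buffers_method2(authority: str, host: str) -> List[str]:
    """Method 2 (proposed): one buffer, values joined with ', '.

    The ordering (authority first) is an assumption for illustration.
    """
    return [", ".join([authority, host])]


def content_matches(buffers: List[str], needle: str) -> bool:
    """Model of a plain content match: true if any buffer contains needle."""
    return any(needle in buf for buf in buffers)


# Values taken from the HEADERS frame in the description.
authority, host = "example.com", "foo.com"

for name, build in [("current", buffers_current),
                    ("method 1", buffers_method1),
                    ("method 2", buffers_method2)]:
    bufs = build(authority, host)
    print(name, bufs,
          "sid:2", content_matches(bufs, "example.com"),
          "sid:3", content_matches(bufs, "foo.com"))
```

Under this model, sid:3 only fires with Method 1 or Method 2. With Method 2, a rule that anchors its match with startswith or endswith would be evaluated against the joined value rather than against either hostname on its own.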

Note: An additional feature request for an event to be fired when the `:authority` and `Host` header do not match will be created.

Files

authority_and_host_2.pcap (1.12 KB)

Brandon Murphy, 10/27/2023 06:45 PM

Subtasks 1 (0 open — 1 closed)

Related issues 2 (1 open — 1 closed)

Updated by Victor Julien over 1 year ago

Status changed from New to Feedback
Assignee changed from OISF Dev to Brandon Murphy
Target version changed from TBD to 8.0.0-beta1

Brandon, you mentioned looking into this further, do I remember that correctly?

Updated by Brandon Murphy over 1 year ago

My intention was to research how often both host and :authority occur, to help guide the priority of implementing a solution. If that research is not important, then I'd prefer not to spend the time doing it. The time commitment to research it is unknown to me, as I don't even know how I'd go about collecting that data.

Let me know if that would be a requirement to move forward, if so, I'll start working on it.

Though in my interpretation of the RFC, it's possible for both host and :authority to be present. As such, IMO regardless of how often it happens a mature HTTP/2 implementation should have a solution for when it does occur.

Updated by Victor Julien over 1 year ago

For reference: in HTTP1, we have this logic in libhtp

                // The host information appears in the URI and in the headers. The
                // HTTP RFC states that we should ignore the header copy.

                // Check for different hostnames.
                if (bstr_cmp_nocase(hostname, tx->request_hostname) != 0) {                    
                    tx->flags |= HTP_HOST_AMBIGUOUS;
                }

                // Check for different ports.
                if (((tx->request_port_number != -1)&&(port != -1))&&(tx->request_port_number != port)) {
                    tx->flags |= HTP_HOST_AMBIGUOUS;
                }

So it uses the host from the URI and sets an event http.host_header_ambiguous.
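
For readers who do not want to parse the C excerpt, the same check can be modeled in a few lines of Python. This is a sketch of the logic quoted above, not libhtp code: the function and parameter names are hypothetical, and `None` stands in for the `-1` "port unknown" value.

```python
from typing import Optional


def host_is_ambiguous(uri_host: str, uri_port: Optional[int],
                      header_host: str, header_port: Optional[int]) -> bool:
    """Return True when the URI and the Host header disagree.

    Mirrors the two checks in the excerpt: a case-insensitive hostname
    comparison, then a port comparison that only applies when both
    ports are known.
    """
    if uri_host.lower() != header_host.lower():
        return True
    if uri_port is not None and header_port is not None and uri_port != header_port:
        return True
    return False


print(host_is_ambiguous("example.com", None, "EXAMPLE.com", None))
print(host_is_ambiguous("example.com", None, "foo.com", None))
print(host_is_ambiguous("example.com", 8080, "example.com", 80))
```

In the excerpt the hostname from the URI is kept and the mismatch is only flagged, which is the behavior an equivalent HTTP/2 check on :authority versus host would presumably mirror.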

Updated by Victor Julien over 1 year ago

Subtask #6624 added

Updated by Victor Julien over 1 year ago

I like the multi-buffer idea. Will give it some more thought. It may even make sense to extend that to the similar HTTP/1.1 issue.

Updated by Philippe Antoine over 1 year ago

I do not get the use case for the multi-buffer approach if there is an event to detect whether the values are different... (doing that now)
The RFC says the HTTP2 server must use the authority field.
We can still use the keyword http.request_header with `Host: value` if needed.

Updated by Philippe Antoine over 1 year ago

Related to Feature #6425: HTTP/2 - new app-layer-event when `:authority` and `host` headers do not match added

Updated by Brandon Murphy over 1 year ago

The big difference is that http.host supporting multi-buffer allows existing rules to function without change if :authority pseudo header and host header are present.

Updated by Philippe Antoine over 1 year ago

The big difference is that http.host supporting multi-buffer allows existing rules to function without change if :authority pseudo header and host header are present.

Indeed. But I fail to see which rule and scenario will be relevant here.
Either we have normal traffic, and :authority is the right "host", or we have something malicious like SSRF attempt, and we will want a rule with both buffers...

What am I missing ?

#10

Updated by Brandon Murphy over 1 year ago

There are about 4k rules that use http.host and ~690 that use the Host|3a header literal within a buffer.

The problem that I'm faced with is: Which of these ~4600 rules will need to be updated to address a different method of detecting the value of the HTTP Host header besides http.host if the authority header is also present? I can't answer this question without a significant time investment, tracking down each use case and making sure the signature will still function as desired when HTTP/2 traffic is inspected.

The other problem is that without making a change, we are mandating a different method of Host header value content inspection, but only when the :authority pseudo header is also present. No rule writer will remember this funky use case of having to use:

http.request_header; header_lowercase; content:"host|3a 20|"; content:"foo.com"; nocase; within:7; endswith

From my experience, this is what will happen instead.
They will try to use http.host, see that it doesn't work, and think "It's literally the host header! Why isn't it in the http.host buffer?" Then they will have to dig through the source code, PRs, and Redmine. Finally they will find this ticket, and when they do, they'll think to themselves "Well, that's a strange edge case, let me detect the HTTP host header a different way", and they will have wasted several hours.

I suppose if the documents are updated to reflect this one use case, at least they'll feel dumb when they finally figure it all out after those several hours.

Instead, a multi-buffer implementation will expose both the Host header value and the :authority pseudo header value through the http.host keyword. This spares me from having to investigate 4600 rules and potentially rewrite them, and allows rule writers to continue using a common-sense method of inspecting the Host header value regardless of which other headers exist. It does all of this without changing the existing behavior of the HTTP/2 overloading of the :authority pseudo header to the http.host buffer.

#11

Updated by Philippe Antoine over 1 year ago

Thanks for the long explanation.

But this multi-buffer may bring another problem by adding false positives.

Disclaimer : I tend to see FPs as worse than FNs. What is your view here ?

Which of these ~4600 rules will need to be updated to address a different method of detecting the value of the HTTP Host header besides http.host if the authority header is also present?

These rules were meant to match on either HTTP1 only, or HTTP whatever version (I guess there is no HTTP2 only here).
And these rules were meant to match either the logical concept (so host for HTTP1 and authority for HTTP2), or specifically the host header.

If Suricata does the multi-buffer thing, the rules meant to match the logical concept will cause FPs on HTTP2 traffic where the host header matches and the authority does not.
And the way to avoid the FP would be to duplicate the rule into separate HTTP1 and HTTP2 versions. Whereas the current implementation lets you express all cases with one rule only (working for both HTTP1 and HTTP2).
Or the multibuf implementation may come with a http.authority_or_host keyword that has the current behavior to avoid rule duplication.

This spares me from having to investigate 4600 rules and potentially rewrite them, and allows rule writers to continue using a common-sense method of inspecting the Host header value regardless of which other headers exist. It does all of this without changing the existing behavior of the HTTP/2 overloading of the :authority pseudo header to the http.host buffer.

Instead of investigating, we are making a bet.
The bet is that http.host will match more often, so we can only get fewer FNs and more FPs.
Without investigating, I bet that the current behavior and the multi-buffer one behave the same on the vast majority of the cases.
What do you think of this analysis ? Are you willing to bet ?

The other problem is that without making a change, we are mandating a different method of Host header value content inspection, but only when the :authority pseudo header is also present. No rule writer will remember this funky use case of having to use:

Not sure I fully understand this sentence sorry... Could you rephrase it ?

http.request_header; header_lowercase; content:"host|3a 20|"; content:"foo.com"; nocase; within:7; endswith

What do you mean to match here ?

By the way, would you rewrite all the rules using Host|3a to use http.host instead ?

#12

Updated by Brandon Murphy over 1 year ago

Philippe Antoine wrote in #note-11:

Disclaimer : I tend to see FPs as worse than FNs. What is your view here ?

I think it depends on the defender's resources and priorities.

FNs cause a false sense of security. Believing that X is being detected/prevented when it actually is not is a very poor position for network defenders to be in. FPs can be tuned, but you might not find out about FNs until the ransomware has been deployed.

FPs cause wasted time, alert fatigue and generally devalue the trust in IDS signatures. Defenders can easily overlook TPs in the mass of FPs. Depending on the volume of FPs, this could result in either short term or long term negative effects on the team as a whole.

I'm not in a position to decide what is more important. I am in a position to, at least try, to write the most accurate, performant and meaningful network detection rules possible to protect all users of the internet.

Which of these ~4600 rules will need to be updated to address a different method of detecting the value of the HTTP Host header besides http.host if the authority header is also present?

These rules were meant to match on either HTTP1 only, or HTTP whatever version (I guess there is no HTTP2 only here).

The vast majority of rules using http.host are designed to match an IOC or a part of a domain. In these cases, a rule writer isn't likely to care what HTTP version it is or if it's in the :authority pseudo header or the Host header. This generally works because of the "overloading" of :authority to the http.host buffer. But when they are both present only :authority works unless the rule is rewritten.

Example:
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"ET INFO DYNAMIC_DNS HTTP Request to a *.ddns.name Domain"; flow:established,to_server; http.host; content:".ddns.name"; endswith; classtype:bad-unknown; sid:2018221; rev:6; metadata:created_at 2011_12_15, updated_at 2020_04_28;)

And these rules were meant to match either the logical concept (so host for HTTP1 and authority for HTTP2), or specifically the host header.

The vast majority of these rules were written when there was only one option: the Host header within HTTP/1. The concept of an authority header hadn't even been thought of.
Now there are multiple options: Host header for HTTP/1, authority for HTTP/2, and Host header for HTTP/2. In the vast majority of cases, I don't think I'll care which one the content is in, because if it's in either the authority or the Host header it's likely involved in the request somehow, even when both are present.

Whereas the current implementation lets you express all cases with one rule only (working for both HTTP1 and HTTP2).

except the use case in which I don't care which header the content is in but they are both present.

come with a http.authority_or_host keyword that has the current behavior to avoid rule duplication.

That keyword name is confusing to match the current behavior. I might suggest "http.authority_or_host_when_authority_is_not_present" to better communicate the current behavior.

This spares me from having to investigate 4600 rules and potentially rewrite them, and allows rule writers to continue using a common-sense method of inspecting the Host header value regardless of which other headers exist. It does all of this without changing the existing behavior of the HTTP/2 overloading of the :authority pseudo header to the http.host buffer.

Instead of investigating, we are making a bet.
The bet is that http.host will match more often, so we can only get fewer FNs and more FPs.
Without investigating, I bet that the current behavior and the multi-buffer one behave the same on the vast majority of the cases.

They don't technically behave the same at all. The current solution overwrites a buffer when both headers are present. The multibuffer option creates two buffers when both headers are present.

I agree that, without investigating, the majority of benign HTTP/2 requests without any intermediary hosts and without any downgrading along the request path will not contain both headers.

I found it interesting that the ambiguity around :authority and host header is actually considered an "HTTP/2 Exploit Primitive" https://portswigger.net/research/http2#primitives.

Detecting deliberate ambiguity created by an attacker is a different use case than the vast majority of signatures that currently use http.host. Detecting deliberate ambiguity would very likely involve app-layer-event:http2.authority_host_mismatch; created from https://redmine.openinfosecfoundation.org/issues/6425 (thank you for that work)

The other problem is that without making a change, we are mandating a different method of Host header value content inspection, but only when the :authority pseudo header is also present. No rule writer will remember this funky use case of having to use:

Not sure I fully understand this sentence sorry... Could you rephrase it ?

The point is that, without a multi-buffer solution, rule writers will have to
1) identify the use case (authority pseudo header and host header are both present but different)
2) understand that this is a special use case with Suricata
3) know that in this case the http.host buffer is not actually the HTTP Host header.

We are forcing rule writers to understand and remember this strange use case. But they won't.

http.request_header; header_lowercase; content:"host|3a 20|"; content:"foo.com"; nocase; within:7; endswith

What do you mean to match here ?

As you suggested here:

Philippe Antoine wrote in #note-6:
We can still use the keyword http.request_header with `Host: value` if needed

This is an example rule of inspecting the value of the host header when the authority header is also present.
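
As a rough illustration of what that keyword chain ends up testing, here is a small Python model. It rests on assumptions rather than on Suricata's code: each request header is taken to be its own `name: value` buffer, `header_lowercase` is modeled as lowercasing only the header name, and the helper names are hypothetical.

```python
from typing import List, Tuple


def header_lowercase(buf: str) -> str:
    """Lowercase the header name only; the value is left untouched."""
    idx = buf.find(":", 1)  # start at 1 to skip the leading ':' of pseudo headers
    if idx == -1:
        return buf
    return buf[:idx].lower() + buf[idx:]


def rule_matches(headers: List[Tuple[str, str]], needle: str = "foo.com") -> bool:
    """Model of: content:"host|3a 20|"; content:"foo.com"; nocase; within:7; endswith

    Because the second content is 7 bytes long and must sit within 7 bytes
    of the first match, it has to follow "host: " immediately, and
    endswith pins it to the end of the buffer.
    """
    for name, value in headers:
        buf = header_lowercase(f"{name}: {value}")
        head, tail = buf[:-len(needle)], buf[-len(needle):]
        if head.endswith("host: ") and tail.lower() == needle:
            return True
    return False


# Headers from the HEADERS frame in the description.
request = [
    (":method", "GET"), (":path", "/"), (":authority", "example.com"),
    (":scheme", "https"), ("accept-encoding", "gzip, deflate"),
    ("accept-language", "fr-FR"), ("accept", "text/html"),
    ("user-agent", "Scapy HTTP/2 Module"), ("host", "foo.com"),
]
print(rule_matches(request))
```

Note that in this model the first content is not anchored with startswith, so a header such as `x-forwarded-host: foo.com` would also satisfy it.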

By the way, would you rewrite all the rules using Host|3a to use http.host instead ?

Depends. But those ~680 are good candidates for review to determine why they aren't using http.host and determine if they should be rewritten.

I also want to correct something in this statement

Philippe Antoine wrote in #note-6:
The RFC says the HTTP2 server must use the authority field

https://www.rfc-editor.org/rfc/rfc9113

This is not correct. In fact, the RFC leaves the :authority pseudo header out of the required pseudo headers:

All HTTP/2 requests MUST include exactly one valid value for the ":method", ":scheme", and ":path" pseudo-header fields, unless they are CONNECT requests (Section 8.5)

and indicates a case where it must not be generated (emphasis added)

An intermediary that forwards a request over HTTP/2 MUST construct an ":authority" pseudo-header field using the authority information from the control data of the original request, unless the original request's target URI does not contain authority information (in which case it MUST NOT generate ":authority"). Note that the Host header field is not the sole source of this information; see Section 7.2 of [HTTP].

#13

Updated by Philippe Antoine over 1 year ago

Thanks again Brandon.

I think it boils down to one thing I do not see

Do you have a pcap of such a case : both authority and host headers present, different values, and the host is the value used by the server ?
Maybe one with intermediary host ? (I do not grasp that well enough)
In the pcap of https://redmine.openinfosecfoundation.org/issues/6425 do you want http.host to match on foo.com ? Because example.com is the server responding, right ?
I had (I assume wrongly) understood from your previous comments and the RFCs that this does not exist...

Also, would you have a http2 pcap without authority pseudo-header, but a Host header that should be taken into account ? (that would be currently missing multibuf or single buf)

#14