Feature #8019
mpm: support `endswith` condition to the fast pattern in hyperscan
Description
Currently, the hyperscan integration in Suricata cannot apply the endswith condition to the fast pattern.
I tested with the following rule:
```
drop dns any any -> any any (msg:"block 1001721"; dns.query; content:".com"; endswith; fast_pattern; pcre:"/^([^.]+\.)*(abc\.com|def\.com|ghi\.com)$/"; sid:1001721; rev:1;)
```
The fast pattern only checks if dns.query contains ".com" rather than checking if it ends with ".com".
Additional info and steps to reproduce:
1. `multi_tld.pcap` was generated by sending a DNS and an HTTP request to every domain in the `unique_tld_domains.txt` file.
2. We ran Suricata rule profiling using `multi_tld.pcap` and `sample.rules`.
`rule_perf.log` shows that the rule was checked 56 times, which equals the number of domains that contain `.com` as a substring (`grep "\.com" pcaps/scripts/unique_tld_domains.txt | wc -l`). I would expect only 1 check, since only 1 domain ends with `.com`.
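The gap between the two counts can be reproduced outside Suricata. The Python sketch below is only an illustration: the domain list is made up (it is not the contents of `unique_tld_domains.txt`), and it models the prefilter as a plain substring test rather than calling any Suricata or Hyperscan code.

```python
# Hypothetical sample; the real unique_tld_domains.txt is not reproduced here.
domains = [
    "example.com",
    "example.com.au",
    "shop.community",
    "example.org",
    "news.com.br",
]

PATTERN = ".com"

# Model of the current prefilter behaviour: plain substring containment.
substring_hits = [d for d in domains if PATTERN in d]

# What the rule's `endswith` keyword actually asks for.
endswith_hits = [d for d in domains if d.endswith(PATTERN)]

print(len(substring_hits), "candidates by substring,",
      len(endswith_hits), "by endswith")
```

Every entry in `substring_hits` that is not in `endswith_hits` corresponds to a rule check that the full rule evaluation later rejects.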
Files
Updated by Victor Julien 9 days ago
- Subject changed from Include `endswith` condition to the fast pattern in hyperscan integration to mpm: support `endswith` condition to the fast pattern in hyperscan
- Status changed from New to Feedback
I don't know if we can meaningfully address this. I experimented with this years ago, but the hyperscan API doesn't provide a way to do this. So my approach was to rewrite the patterns from `pattern` to `pattern$`. This was technically functional, but there was a steep performance drop. I checked this with one of the hyperscan devs at the time and they confirmed that this is not a well supported pattern.
Hyperscan calls the match callback just once per pattern, so we can't iterate the matches ourselves until we find the endswith one either, like we do in our ac implementations.
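As a rough model of that difference (plain Python, not Suricata or Hyperscan code; both function names are made up for this sketch), compare a matcher that reports only the first occurrence of a pattern with one that lets the caller walk every occurrence until one ends at the end of the buffer:

```python
def first_match_end(buf: bytes, pat: bytes):
    """Model of a single-match callback: only the first occurrence is reported."""
    pos = buf.find(pat)
    return None if pos < 0 else pos + len(pat)


def any_match_at_end(buf: bytes, pat: bytes) -> bool:
    """Model of iterating all occurrences until one ends exactly at the buffer end."""
    pos = buf.find(pat)
    while pos >= 0:
        if pos + len(pat) == len(buf):
            return True
        pos = buf.find(pat, pos + 1)
    return False


buf = b"a.com.example.com"
pat = b".com"

first_end = first_match_end(buf, pat)
single_ok = first_end == len(buf)    # first occurrence ends mid-buffer
iter_ok = any_match_at_end(buf, pat)  # a later occurrence ends the buffer

print(first_end, single_ok, iter_ok)
```

With only the first occurrence available, a buffer that contains the pattern in the middle and again at the end cannot be told apart from one that contains it only in the middle, which is why the endswith check cannot be folded into the single callback.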
I do still have the patch so I can share it but I think that is a dead end.
My experiments were in 2021, so it's possible something changed on the hyperscan side since then, but given the state of the open source hyperscan, I highly doubt it.
If avoiding the needless checks is important, ac can be used. It's less performant overall, but perhaps we can add support for specifying an mpm-algo per buffer (so dns.query). If that is of interest we'll track that in a separate ticket.
Updated by Victor Julien 9 days ago
I'm attaching the WIP patch I did at the time. It shouldn't be hard to get it going with a current git tree.