Feature #6802

open

Support Domain rollup using existing dataset library

Added by Francois Methot 2 months ago. Updated 3 days ago.

Status: New
Priority: Normal
Assignee:
Target version:
Effort: medium
Difficulty: medium
Label:

Description

Support domain rollup using specialized Matcher leveraging dataset code.

The matcher would walk the inspection buffer backward and, at each . (dot), query the dataset for the presence of the subdomain read so far.
ex:
api.google.com on the inspection buffer:

iterate the string backward, and stop at the first dot:
com -> check the dataset
keep going
google.com -> check the dataset
api.google.com -> check the dataset
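
The walk above can be sketched as follows. This is only an illustration of the idea in Python, not Suricata code: a plain set stands in for the dataset, and the names rollup_suffixes and domain_rollup_match are hypothetical.

```python
def rollup_suffixes(name: str):
    """Yield the dot-delimited suffixes of name, shortest first."""
    # Walk the string backward; every dot closes off one candidate suffix.
    for i in range(len(name) - 1, -1, -1):
        if name[i] == ".":
            yield name[i + 1:]
    # The full name is the last candidate.
    yield name


def domain_rollup_match(name: str, dataset: set) -> bool:
    """Return True if any suffix of name is present in the dataset stand-in."""
    return any(suffix in dataset for suffix in rollup_suffixes(name))


if __name__ == "__main__":
    for suffix in rollup_suffixes("api.google.com"):
        print(suffix)
```

Because only whole labels are checked, a dataset entry of google.com would match api.google.com but not notgoogle.com.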

It would introduce a new signature keyword:
dns.query; domain-rollup <dataset-name>;

The matcher would automatically perform the equivalent of dataset:isset internally by calling the DatasetLookup function directly.

An optimization that could be explored is to support a new dataset type: domain.
In this case the domain would be hashed in reverse order when it is added to the dataset:
if we add google.com to the dataset, it would be stored as the hash of moc.elgoog.
When we navigate the inspection buffer in reverse, the matcher would compute the hash as it moves along the byte array.
Upon reaching a . (dot), the hash is ready to be checked; there is no need to rehash the string.
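
A rough sketch of that incremental hashing, in Python for illustration only. It assumes 32-bit FNV-1a as the hash, which is not necessarily what Suricata datasets use, and a set of integers stands in for the stored hashes. The function names are hypothetical.

```python
FNV_OFFSET = 0x811C9DC5  # 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193   # 32-bit FNV-1a prime


def fnv1a_step(h: int, byte: int) -> int:
    """Fold one byte into a running 32-bit FNV-1a hash."""
    return ((h ^ byte) * FNV_PRIME) & 0xFFFFFFFF


def reversed_hash(domain: bytes) -> int:
    """Hash the domain read back to front, e.g. google.com as moc.elgoog."""
    h = FNV_OFFSET
    for b in reversed(domain):
        h = fnv1a_step(h, b)
    return h


def rollup_match_hashed(buf: bytes, stored: set) -> bool:
    """Walk buf backward, checking the running hash at every dot."""
    h = FNV_OFFSET
    for i in range(len(buf) - 1, -1, -1):
        # At a dot, h already covers the suffix to its right: no rehash needed.
        if buf[i] == ord(".") and h in stored:
            return True
        h = fnv1a_step(h, buf[i])
    # Start of buffer reached: h now covers the whole name.
    return h in stored
```

Comparing hashes alone can produce false positives on a collision, so a real implementation would presumably still confirm the stored string on a hit.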


Related issues 2 (2 open, 0 closed)

Related to Suricata - Feature #5639: Allow dataset to match on extracted domain (In Review, Eric Leblond)
Related to Suricata - Feature #5681: datasets: add more transform layers to match on domains (New, OISF Dev)
Actions #1

Updated by Jason Ish 10 days ago

  • Related to Feature #5639: Allow dataset to match on extracted domain added
  • Related to Feature #5681: datasets: add more transform layers to match on domains added
Actions #2

Updated by Francois Methot 10 days ago

Support domain rollup matching against existing dataset functionality (endswith-like behavior)

Option 1 - Domain rollup using specialized Matcher

The matcher would walk the inspection buffer backward and, at each . (dot), query the dataset for the presence of the subdomain read so far.
ex:
api.google.com on the inspection buffer:

iterate the string backward, and stop at the first dot:
com -> check the dataset
keep going
google.com -> check the dataset
api.google.com -> check the dataset

It would introduce a new signature keyword:
dns.query; domain-rollup <dataset-name>;

The matcher would automatically perform the equivalent of dataset:isset internally by calling the DatasetLookup function directly.

Option 2 - Add support to new "domain" Dataset type to enable subdomain matching

Config ex:

datasets:
  domain-block:
    type: domain
    state: domain-block.lst

Signature implementation

dataset:set -> adds the domain to the associated dataset
dataset:isset -> returns true if the domain matcher (as described in option 1) finds any subdomain in the associated dataset
dataset:isnotset -> returns true if dataset:isset returns false
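
The three operations could behave roughly like the Python sketch below. The class DomainDataset is hypothetical and only mirrors the semantics listed above. It is not the Suricata dataset API, and it ignores details such as case folding and state-file persistence.

```python
class DomainDataset:
    """Toy stand-in for a dataset of type 'domain' (hypothetical)."""

    def __init__(self):
        self._domains = set()

    def set(self, domain: str) -> None:
        """dataset:set, add the domain to the dataset."""
        self._domains.add(domain)

    def isset(self, name: str) -> bool:
        """dataset:isset, true if any dot-delimited suffix of name is stored."""
        pos = len(name)
        while True:
            # Find the next dot to the left of the previous one.
            pos = name.rfind(".", 0, pos)
            if name[pos + 1:] in self._domains:
                return True
            if pos == -1:
                # No dot left: the full name was just checked.
                return False

    def isnotset(self, name: str) -> bool:
        """dataset:isnotset, the negation of isset."""
        return not self.isset(name)
```

With test2.test3.com stored, isset would be true for a.test2.test3.com and for test2.test3.com itself, while test3.com would only satisfy isnotset.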

In this case the domain could be hashed in reverse order when it is added to the dataset:
if we add google.com to the dataset, it would be stored as the hash of moc.elgoog.
When we navigate the inspection buffer in reverse, the matcher would compute the hash as it moves along the byte array.
Upon reaching a . (dot), the hash is ready to be checked; there is no need to rehash the string.

Actions #3

Updated by Eric Leblond 6 days ago

Hello François, what is your final goal? Is the domain keyword as implemented in https://github.com/OISF/suricata/pull/8155 enough for your needs? We still need a discussion on the crate to use, but the concept looked OK to the OISF team.

Actions #4

Updated by Francois Methot 3 days ago · Edited

Eric Leblond wrote in #note-3:

Hello François, what is your final goal? Is the domain keyword as implemented in https://github.com/OISF/suricata/pull/8155 enough for your needs? We still need a discussion on the crate to use, but the concept looked OK to the OISF team.

Our end goal is to use datasets to store domains of any subdomain depth and to match on them as subdomains.

So we could have IOCs like:
- test1.com
- test2.test3.com
- test4.test5.test6.com
- any.number.of.subdomains.test7.test8.test9.com

These would be added to a dataset, allowing a match on any dns.query/http.host that ends with one of these subdomain sequences.
We are keen on using datasets because domains can be added/removed very quickly without reloading rules.

The only drawback of this algorithm is that a long domain from the wire like "any.number.of.subdomains.test7.test8.test9.com" will trigger multiple dataset checks (8 checks in this case).
But our tests showed that dataset hash check performance is great.
