<p><strong>Suricata - Feature #3317: rules: use rust for tokenizing rules</strong><br>
<a href="https://redmine.openinfosecfoundation.org/issues/3317">https://redmine.openinfosecfoundation.org/issues/3317</a> (Open Information Security Foundation)</p>
<p><strong>Victor Julien</strong> (victor@inliniac.net) wrote on 2019-11-05 12:07:30 UTC:</p>
<ul><li><strong>Related to</strong> <i><a class="issue tracker-5 status-1 priority-5 priority-high3" href="/issues/3195">Task #3195</a>: tracking: rustify all input</i> added</li></ul>
<p><strong>Victor Julien</strong> wrote on 2019-11-05 12:11:18 UTC:</p>
<ul><li><strong>Subject</strong> changed from <i>rules: use rule for tokenizing rules</i> to <i>rules: use rust for tokenizing rules</i></li></ul>
<p><strong>Victor Julien</strong> wrote on 2019-11-05 12:40:43 UTC:</p>
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-1 priority-4 priority-default child" href="/issues/1926">Bug #1926</a>: rule parsing: wrong content checked for fast_pattern (snort compatibility)</i> added</li></ul>
<p><strong>Victor Julien</strong> wrote on 2019-11-06 11:31:11 UTC:</p>
<p>It would be nice if this could replace much of our pcre use in parsing. The pcre use is both for tokenizing and input validation. The tokenizing works well, but the input validation less so. It's hard to produce clear errors that are better than "the regex said no".</p>
<p>If the rust based code does all the tokenizing, we'd need more input value validation to make up for it.</p>
<p><strong>Victor Julien</strong> wrote on 2019-11-06 11:31:57 UTC:</p>
<p>(by Jason Ish. Moved from <a class="issue tracker-5 status-1 priority-5 priority-high3" title="Task: tracking: rustify all input (New)" href="https://redmine.openinfosecfoundation.org/issues/3195">#3195</a>)</p>
<p>Victor Julien wrote:</p>
<blockquote>
<p>It would be nice if this could replace much of our pcre use in parsing. The pcre use is both for tokenizing and input validation. The tokenizing works well, but the input validation less so. It's hard to produce clear errors that are better than "the regex said no".</p>
<p>If the rust based code does all the tokenizing, we'd need more input value validation to make up for it.</p>
</blockquote>
<p>The way I see it, the top level parser will give you a tuple of (keyword, value), but it will not have done any validation of that value to make sure it's correct for that keyword. It will be up to the handler for that keyword to parse it, like it is now. So it would get rid of pcre in this outer tokenizer, but the parsing of the values would still be done on a keyword-by-keyword basis.</p>
<p><strong>Victor Julien</strong> wrote on 2019-11-06 11:35:03 UTC:</p>
<p>I think this highest level of tokenizing has already been done without pcre for some time.</p>
<p>I'm not sure I see much value in a rust crate that would just do the highest level of tokenizing. I was thinking more of a rule parser that could be the single source of 'truth' for Suricata rule parsing and validation. This means it would have to be much more aware of the individual keywords and their syntax. Maybe this isn't feasible.</p>
<p><strong>Jason Ish</strong> (jason.ish@oisf.net) wrote on 2019-11-06 14:06:46 UTC:</p>
<p>It's a mix of effort and re-usability, I think.</p>
<p>A tokenizer would satisfy the requirements of preprocessing rules, such as Suricata-Update or basic enable/disable handling.</p>
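<p>A minimal sketch of what such a tokenizer could look like in Rust. This is hypothetical illustration code, not Suricata's actual parser: it splits a rule's option section into (keyword, optional value) pairs, treats semicolons inside double quotes as content rather than separators, and deliberately does no value validation, leaving that to per-keyword handlers.</p>

```rust
/// Hypothetical sketch, not Suricata's actual tokenizer: split the option
/// section of a rule (the part between parentheses) into (keyword, value)
/// pairs. Options are separated by ';', a value follows the first ':', and
/// semicolons inside double quotes are kept as content, not separators.
/// Escaped quotes (\") are not handled in this simplified version.
fn tokenize_options(input: &str) -> Vec<(String, Option<String>)> {
    let mut tokens = Vec::new();
    let mut in_quotes = false;
    let mut current = String::new();
    for c in input.chars() {
        match c {
            '"' => {
                in_quotes = !in_quotes;
                current.push(c);
            }
            // A ';' outside quotes terminates one option.
            ';' if !in_quotes => {
                let opt = current.trim();
                if !opt.is_empty() {
                    match opt.split_once(':') {
                        // "keyword:value" form, e.g. sid:1
                        Some((k, v)) => tokens
                            .push((k.trim().to_string(), Some(v.trim().to_string()))),
                        // bare keyword form, e.g. nocase
                        None => tokens.push((opt.to_string(), None)),
                    }
                }
                current.clear();
            }
            _ => current.push(c),
        }
    }
    tokens
}
```

<p>Note that no value validation happens here: a handler for <code>sid</code> would still have to parse the value into an integer and report its own, keyword-specific error.</p>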
<p>You would typically tokenize, then pass the tokens off to a parser. We could implement that as a standalone module, but it is a lot more work, as it has to understand every keyword. You'd want to parse each value into some struct that could then be used by that keyword's implementation, so it doesn't need to reparse the value again. We could have a generic one for keywords that are unknown to the parser, which would allow us to implement keywords over time.</p>
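<p>The per-keyword step could be sketched like this (all names hypothetical, not Suricata's API): known keywords parse their value into a typed variant with validation, while a generic variant carries the raw value for keywords the parser doesn't understand yet.</p>

```rust
/// Hypothetical sketch of per-keyword value parsing (names are illustrative,
/// not Suricata's API). Known keywords get typed, validated values; unknown
/// keywords fall back to a generic variant carrying the raw text, so keyword
/// support can be added over time without rejecting rules.
#[derive(Debug, PartialEq)]
enum KeywordValue {
    Sid(u64),
    Msg(String),
    /// Fallback for keywords this parser doesn't understand yet.
    Generic { keyword: String, value: Option<String> },
}

fn parse_keyword(keyword: &str, value: Option<&str>) -> Result<KeywordValue, String> {
    match keyword {
        "sid" => {
            let v = value.ok_or("sid requires a value")?;
            // Validation lives here, so the error can name the keyword
            // instead of "the regex said no".
            let sid = v
                .parse::<u64>()
                .map_err(|_| format!("sid: not a valid number: {:?}", v))?;
            Ok(KeywordValue::Sid(sid))
        }
        "msg" => {
            let v = value.ok_or("msg requires a value")?;
            Ok(KeywordValue::Msg(v.trim_matches('"').to_string()))
        }
        // Unknown keyword: keep the raw value so later stages (or a newer
        // parser version) can still work with it.
        _ => Ok(KeywordValue::Generic {
            keyword: keyword.to_string(),
            value: value.map(str::to_string),
        }),
    }
}
```

<p>The generic fallback is what would let tools like Suricata-Update keep working on rules that use keywords the parser hasn't implemented yet.</p>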
<p>But I see it as a tokenizer, which would then feed into a parser.</p>
<p><strong>Victor Julien</strong> wrote on 2019-11-06 18:54:05 UTC:</p>
<p>Ok, I guess I see little to no value in just a simple high level tokenizer in Rust. The current code is fast and simple, so we wouldn't gain much.</p>
<p><strong>Victor Julien</strong> wrote on 2020-11-02 08:01:27 UTC:</p>
<ul><li><strong>Related to</strong> <i><a class="issue tracker-5 status-1 priority-4 priority-default parent" href="/issues/4095">Task #4095</a>: tracking: unify rule keyword value parsing</i> added</li></ul>
<p><strong>Victor Julien</strong> wrote on 2021-11-26 13:21:13 UTC:</p>
<ul><li><strong>Parent task</strong> set to <i>#4855</i></li></ul>