Feature #2488: HTML Parsing / Buffers - Suricata - Open Information Security Foundation

Feature #2488

open

HTML Parsing / Buffers

Added by Jason Williams almost 8 years ago. Updated over 6 years ago.

Status:

New

Priority:

Normal

Assignee:

OISF Dev

Target version:

TBD

Effort:

high

Difficulty:

high

Label:

Description

We write a lot of signatures against the contents of HTML in file_data. It would be awesome to have some parsing/buffering here, so that a rule does not have to inspect the whole file_data buffer. Alternatively, perhaps this could be some kind of transform?

A quick, off-the-top-of-my-head HTML example:

<html>
    <head>
        <title>Meerkat HQ</title>
        <!--Meerkat HQ cloned by z001ie -->
        <link rel="stylesheet" href="./z001ie_files/css/meerkats.css">
        <link rel="shortcut icon" href="./z001ie_files/images/favicon.gif" type="image/gif"/>
        <script src="./z001ie_files/jquery_003_002.html"></script>
        <script>
            function IsEmpty() {
                var x = document.forms["login"]["user"].value;
                var y = document.forms["login"]["pass"].value;
                if (x == "") {
                    document.getElementById("ErrorBox").style.display = "block";
                    document.getElementById("ErrorUser").style.display = "block";
                    return false;
                }
            }
        </script>
    </head>
    <body>
        <form id="signon" name="login" action="login.php" method="post" autocomplete="off" onsubmit="return IsEmpty();">
            <input type="text" id="userid" placeholder="Username" class="required" name="user" value="" autocomplete="off">
            <input type="password" placeholder="Password" class="required" id="passwd" name="pass" value="" autocomplete="off">
            <input type="submit" class="signin" value="Sign On" onclick="return IsEmpty();">                                
        </form>
    </body>
</html>

I think the following buffers could be very useful for detection, to avoid inspecting all of file_data (much like having to inspect all of http_header):

html_title

literal:<title>Meerkat HQ</title>
rule: html_title; content:"Meerkat HQ"; nocase;
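
To make the intended buffer contents concrete, here is a minimal sketch of what an html_title buffer might hold. It uses Python's standard html.parser purely as a stand-in; the TitleExtractor class and html_title function are hypothetical names for illustration, not anything Suricata provides.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text inside the first <title> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True


def html_title(file_data):
    """Hypothetical helper: return what an html_title buffer could contain."""
    parser = TitleExtractor()
    parser.feed(file_data)
    parser.close()
    return parser.title


sample = "<html><head><title>Meerkat HQ</title></head><body></body></html>"
buf = html_title(sample)
# Emulates: html_title; content:"Meerkat HQ"; nocase;
print("meerkat hq" in buf.lower())
```

The rule would then only have to search these few bytes rather than the full response body.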

html_comment

literal comment: <!--Meerkat HQ cloned by z001ie -->
rule: html_comment; content:"cloned by z001ie"; nocase;
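
As a rough illustration of what an html_comment buffer could carry, the sketch below collects every HTML comment body with Python's html.parser. CommentExtractor and html_comment are hypothetical names, and treating each comment as its own buffer instance is an assumption, not part of the request.

```python
from html.parser import HTMLParser


class CommentExtractor(HTMLParser):
    """Collects the body of every <!-- ... --> comment (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", delimiters excluded.
        self.comments.append(data)


def html_comment(file_data):
    """Hypothetical helper: one candidate buffer per comment."""
    parser = CommentExtractor()
    parser.feed(file_data)
    parser.close()
    return parser.comments


sample = "<head><!--Meerkat HQ cloned by z001ie --><title>Meerkat HQ</title></head>"
comments = html_comment(sample)
# Emulates: html_comment; content:"cloned by z001ie"; nocase;
print(any("cloned by z001ie" in c.lower() for c in comments))
```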

html_resources

literal resources: (there are a few)

<link rel="stylesheet" href="./z001ie_files/css/meerkats.css">
<link rel="shortcut icon" href="./z001ie_files/images/favicon.gif" type="image/gif"/>
<script src="./z001ie_files/jquery_003_002.html"></script>

rule: html_resources; content:"/z001ie"; nocase;
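
One possible reading of the resources buffer is "every URL the page pulls in". The sketch below gathers href and src values under that assumption. The RESOURCE_ATTRS mapping (which tags and attributes count as resources) and the function name are hypothetical choices made for this example.

```python
from html.parser import HTMLParser

# Assumption: which tag/attribute pairs count as a "resource".
RESOURCE_ATTRS = {"link": "href", "script": "src", "img": "src"}


class ResourceExtractor(HTMLParser):
    """Collects URLs referenced by link/script/img tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        wanted = RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if wanted is not None and name == wanted and value:
                self.resources.append(value)


def html_resources(file_data):
    """Hypothetical helper: one candidate buffer per referenced URL."""
    parser = ResourceExtractor()
    parser.feed(file_data)
    parser.close()
    return parser.resources


sample = '''<head>
<link rel="stylesheet" href="./z001ie_files/css/meerkats.css">
<link rel="shortcut icon" href="./z001ie_files/images/favicon.gif" type="image/gif"/>
<script src="./z001ie_files/jquery_003_002.html"></script>
</head>'''
resources = html_resources(sample)
# Emulates: content:"/z001ie"; nocase; against each extracted URL.
print(all("/z001ie" in r.lower() for r in resources))
```

Whether the buffer should hold only the URLs, as here, or the complete tags is a design decision the request leaves open.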

html_javascript

literal javascript:

function IsEmpty() {
    var x = document.forms["login"]["user"].value;
    var y = document.forms["login"]["pass"].value;
    if (x == "") {
        document.getElementById("ErrorBox").style.display = "block";
        document.getElementById("ErrorUser").style.display = "block";
        return false;
    }
}

rule: html_javascript; strip_whitespace; content:"varx=document.forms[|22|login|22|][|22|user|22|]";
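
The content string in this rule only matches once whitespace has been removed, which the sketch below tries to show. It extracts inline script bodies with Python's html.parser and then applies a rough approximation of a whitespace-stripping transform. Skipping scripts that have a src attribute is an assumption, and all names here are hypothetical.

```python
from html.parser import HTMLParser


class InlineScriptExtractor(HTMLParser):
    """Collects bodies of <script> elements without a src attribute."""

    def __init__(self):
        super().__init__()
        self._in_inline = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and "src" not in dict(attrs):
            self._in_inline = True
            self.scripts.append("")

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so concatenate them.
        if self._in_inline:
            self.scripts[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline = False


def strip_whitespace(buf):
    """Approximation of a whitespace-stripping transform: drop all whitespace."""
    return "".join(buf.split())


sample = '''<script src="./z001ie_files/jquery_003_002.html"></script>
<script>
    function IsEmpty() {
        var x = document.forms["login"]["user"].value;
    }
</script>'''
parser = InlineScriptExtractor()
parser.feed(sample)
parser.close()
normalized = strip_whitespace(parser.scripts[0])
# Emulates: content:"varx=document.forms[|22|login|22|][|22|user|22|]";
print('varx=document.forms["login"]["user"]' in normalized)
```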

html_form

literal form:
<form id="signon" name="login" action="login.php" method="post" autocomplete="off" onsubmit="return IsEmpty();">
    <input type="text" id="userid" placeholder="Username" class="required" name="user" value="" autocomplete="off">
    <input type="password" placeholder="Password" class="required" id="passwd" name="pass" value="" autocomplete="off">
    <input type="submit" class="signin" value="Sign On" onclick="return IsEmpty();">
</form>
rule: html_form; content:".php"; content:"method=|22|post|22|"; nocase; content:"onsubmit=|22|return IsEmpty()|3b|"; nocase; content:"user"; nocase; content:"pass"; nocase; distance:0;
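
Because this rule matches on raw attribute text such as method="post", the form buffer would presumably need to keep the original markup. The sketch below assumes that and slices each form out with a regular expression, which is a shortcut rather than real HTML parsing. It then emulates the rule's content checks, with the final "pass" required after "user" to mimic distance:0. FORM_RE and html_form are hypothetical names.

```python
import re

# Non-greedy match from an opening <form ...> tag to its closing </form>.
FORM_RE = re.compile(r"<form\b.*?</form\s*>", re.IGNORECASE | re.DOTALL)


def html_form(file_data):
    """Hypothetical helper: raw markup of each form, one buffer per form."""
    return FORM_RE.findall(file_data)


sample = '''<body>
<form id="signon" name="login" action="login.php" method="post" onsubmit="return IsEmpty();">
    <input type="text" id="userid" name="user" value="">
    <input type="password" id="passwd" name="pass" value="">
</form>
</body>'''

buf = html_form(sample)[0]
lowered = buf.lower()  # used for the contents that carry nocase
user_at = lowered.find("user")
matched = (
    ".php" in buf                                    # content:".php";
    and 'method="post"' in lowered                   # nocase
    and 'onsubmit="return isempty();' in lowered     # nocase, |3b| is ";"
    and user_at != -1                                # content:"user"; nocase;
    and lowered.find("pass", user_at + len("user")) != -1  # distance:0
)
print(matched)
```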

Or maybe as a transform?

file_data; extract_html_title; content:"Meerkat HQ";
file_data; extract_html_comment; content:"cloned by z001ie"; nocase;
file_data; extract_html_resources; content:"/z001ie"; nocase;
file_data; extract_html_javascript; strip_whitespace; content:"varx=document.forms[|22|login|22|][|22|user|22|]";
file_data; extract_html_form; content:".php"; content:"method=|22|post|22|"; nocase; content:"onsubmit=|22|return IsEmpty()|3b|"; nocase; content:"user"; nocase; content:"pass"; nocase; distance:0;
