https://redmine.openinfosecfoundation.org/https://redmine.openinfosecfoundation.org/favicon.ico?17011170022012-10-16T12:41:43ZOpen Information Security FoundationSuricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=23742012-10-16T12:41:43ZIgnacio Sanchezsanchezmartin.ji@gmail.com
<ul><li><strong>Assignee</strong> set to <i>Ignacio Sanchez</i></li><li><strong>Target version</strong> set to <i>TBD</i></li><li><strong>Start date</strong> changed from <i>10/15/2012</i> to <i>10/29/2012</i></li></ul><p>OK. It will involve changing the meaning of some of the current format strings such as %u (which in suricata means URL including query string, but in apache mod_log it means remote user), and adding the missing ones such as %C.</p>
<p>Suricata: <a class="external" href="https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Custom_http_logging">https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Custom_http_logging</a><br />Apache mod_log_config module: <a class="external" href="http://httpd.apache.org/docs/2.2/mod/mod_log_config.html">http://httpd.apache.org/docs/2.2/mod/mod_log_config.html</a></p>
<p>I will take the implementation of this feature request.</p>
<p>Peter: Could you please attach a pcap file together with the several apache mod_log_config expected outputs, so that I can use it for my tests?</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=23772012-10-17T01:34:51ZPeter Manevpetermanev@gmail.com
<ul></ul><p>Hi,<br />First of all - thank you for taking the initiative.</p>
<p>The previously attached tar contains such an output - is that what you were asking for, Ignacio?</p>
<p>To test/create this I just visited the default <a class="external" href="http://127.0.0.1">http://127.0.0.1</a> after an Apache install.</p>
<p>thank you</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=23832012-10-18T15:10:02ZIgnacio Sanchezsanchezmartin.ji@gmail.com
<ul></ul><p>Hi,</p>
<p>Yes, but there is no pcap file.</p>
<p>With a pcap file the testing process is easier for me. I run suricata against the pcap, and then I diff the http.log output with the one you have provided. The pcap file could be generated by running "tcpdump -s0 -w tests.pcap -i lo -n"</p>
<p>Each test would be made of (1) pcap file (2) customformat (3) expected apache log output, where the customformat is the actual apache log format string which generated the expected log output.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=24032012-10-25T09:06:03ZErik Cerik.j.clark@nasa.gov
<ul></ul><p>We could try to pull this, but I am not sure how we could clean the data up so that it didn't include SBU.... Give me a few, I will get you the config output we would use and what it would look like.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=25102012-11-19T13:56:56ZCharles Smutzcharles.smutz@lmco.com
<ul></ul><p>I'm putting this here because this is the most recent thread on custom HTTP logging. These are also relatively minor things.</p>
<p>The log customization capability is awesome.</p>
<p>The original implementation didn't support cookie parsing. I'd love to see cookie parsing, ex "%{Foobar}C" work.</p>
<p>I'd also like to propose an extension that would allow for the specification of a maximum length for a given custom format string. For example, if you wanted to limit the Referer header to the first 100 characters you do the following: "%[100]{Referer}i" which would be the same as "%{Referer}i" but would be truncated to 100 characters if the Referer is longer than that.<br />This would be helpful for people who want to write data to logs that is usually relatively small but can be very large in some cases (Referer is a good example). While disk size/speed might be an issue for some users, this is more likely to be used to remove clutter from logs to make them more easily readable by humans or to deal with limitations in logs sizes for machine consumers such as environments that use syslog.</p>
<p>Note that this is an extension that is not currently in the apache custom log syntax but also doesn't make the format string incompatible with apache style format specifications. If this additional syntax is not used, there is no adverse impact.</p>
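<p>A minimal sketch of the proposed length limit, assuming a hypothetical helper <code>write_truncated</code> (it is not a Suricata function) and treating a limit of 0 as "no [N] given". It only illustrates the intended semantics of "%[100]{Referer}i": the value is copied unchanged unless it is longer than the limit.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Suricata: copy at most max_len bytes
 * of value into out and NUL-terminate. max_len == 0 means "no limit",
 * i.e. the directive was written without a [N] prefix.
 * Returns the number of bytes written, excluding the terminator. */
size_t write_truncated(char *out, size_t out_size,
                       const char *value, size_t max_len)
{
    if (out == NULL || out_size == 0)
        return 0;

    size_t len = (value != NULL) ? strlen(value) : 0;

    /* Apply the admin-configured limit first. */
    if (max_len != 0 && len > max_len)
        len = max_len;

    /* Never write past the output buffer. */
    if (len > out_size - 1)
        len = out_size - 1;

    if (len > 0)
        memcpy(out, value, len);
    out[len] = '\0';
    return len;
}
```

<p>With this shape, a format without the bracket prefix behaves exactly as before, which matches the point that the extension has no adverse impact when it is not used.</p>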
<p>Both of these seem like relatively small things, but could be very useful.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=25172012-11-22T10:20:09ZVictor Julienvictor@inliniac.net
<ul></ul><p>Charles Smutz wrote:</p>
<blockquote>
<p>The original implementation didn't support cookie parsing. I'd love to see cookie parsing, ex "%{Foobar}C" work.</p>
</blockquote>
<p>Not sure I get this notation. What would this do?</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=25332012-11-26T11:50:19ZCharles Smutzcharles.smutz@lmco.com
<ul></ul><p>Victor Julien wrote:</p>
<blockquote>
<p>Charles Smutz wrote:</p>
<blockquote>
<p>The original implementation didn't support cookie parsing. I'd love to see cookie parsing, ex "%{Foobar}C" work.</p>
</blockquote>
<p>Not sure I get this notation. What would this do?</p>
</blockquote>
<p>Given you have a request header as follows (from <a class="external" href="http://en.wikipedia.org/wiki/HTTP_cookie">http://en.wikipedia.org/wiki/HTTP_cookie</a>):</p>
<p>GET /spec.html HTTP/1.1<br />Host: <a class="external" href="http://www.example.org">www.example.org</a><br />Cookie: name=value; name2=value2<br />Accept: */*</p>
<p>%{Cookie}i should print the whole request cookie value:</p>
<p>name=value; name2=value2</p>
<p>But %{Foobar}C should print the value of the individual cookie "Foobar".</p>
<p>Ex.</p>
<p>%{name}C should print the value of the "name" cookie:</p>
<p>value</p>
<p>%{name2}C should print the value of the "name2" cookie:</p>
<p>value2</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=25472012-11-27T11:38:47ZCharles Smutzcharles.smutz@lmco.com
<ul></ul><p>All, please correct me if this is not the right place to post these.</p>
<p>I've got two minor nits I'd like to point out.</p>
<p>I'm not sure if this is the same as or related to Eric's issue (<a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: literal \t (x09) in mod_log_config (Closed)" href="https://redmine.openinfosecfoundation.org/issues/600">#600</a>), but I don't think we should be doing URI filtering on the literal values in the custom log format (possibly also %t/LOG_HTTP_CF_TIMESTAMP/strftime).<br />I agree with filtering all special characters in data coming from the HTTP traffic, but there should be no reason to do this filtering on IDS admin controlled/predictable data. <br />For example, if I put a "\t" in the custom format string, I want the log to be printed with a tab character, not the escaped literal "\x09". Again, I think this escaping should be removed only for admin controlled values; the fact that this escaping occurs on raw HTTP data is very useful. Lack of this sort of escaping would make the text logs unreliable and susceptible to all sorts of issues.<br />I don't see a huge difference between escaping white space the way Apache does (ex. "\t") and the way Suricata currently does (ex. "\x09").</p>
<p>To be clear, if the tab character is in the sysadmin defined literals, this should be printed as a tab character in the log. If this is data taken from the network data such as urls, header values, etc, it should be escaped as "\t" or "\x09"--either one being fine for me.</p>
<p>I mention tab here because some may want to make tab delimited logs, which due to the escaping of tabs in network data, would be a reliable delimiter.</p>
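<p>The distinction drawn above can be sketched as two separate write paths. Both helpers below are hypothetical stand-ins written for illustration, not Suricata's own buffer functions: <code>append_escaped</code> is meant for bytes taken from the network and escapes anything outside printable ASCII as "\xNN", while <code>append_literal</code> writes admin-supplied format text unchanged.</p>

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append bytes that came from network traffic,
 * escaping every byte outside printable ASCII as \xNN.
 * off is the current write offset; the new offset is returned. */
size_t append_escaped(char *out, size_t out_size, size_t off,
                      const unsigned char *data, size_t data_len)
{
    for (size_t i = 0; i < data_len; i++) {
        unsigned char c = data[i];
        if (c >= 0x20 && c < 0x7f) {
            if (off + 1 >= out_size)      /* need room for char + NUL */
                break;
            out[off++] = (char)c;
        } else {
            if (off + 4 >= out_size)      /* need room for \xNN + NUL */
                break;
            snprintf(out + off, out_size - off, "\\x%02X", c);
            off += 4;
        }
    }
    if (off < out_size)
        out[off] = '\0';
    return off;
}

/* Hypothetical helper: append admin-controlled literal text as-is,
 * so a tab in the format string stays a real tab in the log. */
size_t append_literal(char *out, size_t out_size, size_t off,
                      const char *lit)
{
    if (off >= out_size)
        return off;
    size_t len = strlen(lit);
    if (len > out_size - 1 - off)
        len = out_size - 1 - off;
    memcpy(out + off, lit, len);
    off += len;
    out[off] = '\0';
    return off;
}
```

<p>Because tabs inside network data never reach the log unescaped on the first path, a literal tab written through the second path remains a reliable field delimiter.</p>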
<p>This is the change I made to disable the URI escaping:</p>
<pre>
 switch (httplog_ctx->cf_nodes[i]->type){
     case LOG_HTTP_CF_LITERAL:
     /* LITERAL */
-        PrintRawUriBuf((char *)aft->buffer->buffer, &aft->buffer->offset,
-                aft->buffer->size, (uint8_t *)httplog_ctx->cf_nodes[i]->data,
-                strlen(httplog_ctx->cf_nodes[i]->data));
+        MemBufferWriteString(aft->buffer, httplog_ctx->cf_nodes[i]->data);
         break;
</pre>
<p>I think it would be reasonable to make a similar change to %t/LOG_HTTP_CF_TIMESTAMP/strftime but can't see it being an issue in practice.</p>
<p>My second nit is with the status code (%s/LOG_HTTP_CF_RESPONSE_STATUS) when it is an HTTP redirect a la HTTP 301/302. The current implementation which puts the Location header in the response code is likely to break a lot of things, including any post processing of the logs. This functionality is already available through the "%{Location}o" format string for those who want such, so there is no need for this functionality to be built into the status code. I recommend this additional functionality be removed. It is redundant at best and makes logs unparseable for many uses.</p>
<p>Ex. Remove the whole block of code that begins as follows:</p>
<pre>
/* Redirect? */
if (tx->response_headers != NULL &&
    tx->response_status_number > 300 &&
    tx->response_status_number < 303)
{
</pre>
<p>If you'd like me to provide patches for these changes, I'd be happy to.</p>
<p>Again, I consider these minor nits.</p>
<p>The custom log functionality is most useful. I'm not bent on this functionality needing to mirror apache's syntax exactly (the differences in format strings, lack of modifiers, etc are fine by me) but users will benefit from being able to define any log format of their choice and should be able to replicate web server formats to the degree possible given innate differences between the two. I also don't think Suricata should limit the custom log functionality to only what is found in apache. I've already proposed a (backwards compatible) extension for limiting length of data. I could see other data available in the future that would be useful to put in a custom log format including alerts, payload hashes, etc that would certainly be a superset of apache's custom log syntax.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=27852013-02-13T15:33:32ZIgnacio Sanchezsanchezmartin.ji@gmail.com
<ul></ul><p>Ok. I have sent a pull request with the following changes:</p>
<p>Added support for %{cookiename}C<br />Added support for the definition of a maximum length, e.g. %[50]{user-agent}i<br />Some small bugfixes<br />Added the modifications suggested by Charles Smutz</p>
<p><a class="external" href="https://github.com/inliniac/suricata/pull/282">https://github.com/inliniac/suricata/pull/282</a></p>
<p>Any feedback will be welcomed.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=27862013-02-13T15:49:43ZVincent Fang
<ul></ul><p>I was wondering if the %{}C feature was ever added or if this feature request is on hold?</p>
<p>Also I'm not sure if I should make a new feature request or add on to this one, but truncation would be nice too: [40] to specify the max number of characters the http.log should display for a field before truncating, in case there's too much info and the admin thinks it's ok to cut some information out. For example, for the HTTP URL</p>
<p>[40]%u</p>
<p>would only show the first 40 characters of the URL.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=27872013-02-13T16:35:57ZIgnacio Sanchezsanchezmartin.ji@gmail.com
<ul></ul><p>Yes, this is precisely what has been added in the above-mentioned PR (in addition to some bug fixes and Charles' modifications).</p>
<p>You should now be able to use [40]%u and the %{}C feature. Please let me know your results if you test it.</p>
<p>Vincent Fang wrote:</p>
<blockquote>
<p>I was wondering if the %{}C feature was ever added or if this feature request is on hold?</p>
<p>Also I'm not sure if I should make a new feature request or add on to this one, but truncation would be nice too: [40] to specify the max number of characters the http.log should display for a field before truncating, in case there's too much info and the admin thinks it's ok to cut some information out. For example, for the HTTP URL</p>
<p>[40]%u</p>
<p>would only show the first 40 characters of the URL.</p>
</blockquote> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=27912013-02-15T15:17:09ZVincent Fang
<ul></ul><p>I must be doing something wrong. Ignacio can you tell me which repository I should be cloning from</p>
<p><a class="external" href="https://github.com/inliniac/suricata.git">https://github.com/inliniac/suricata.git</a></p>
<p>or</p>
<p><a class="external" href="https://github.com/owlsec/suricata.git">https://github.com/owlsec/suricata.git</a></p>
<p>I did a clone from inliniac and neither the cookie nor the [] truncation works, or I must be getting the syntax wrong. I tried %[10]u and [10]%u and neither worked.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=27922013-02-15T18:23:09ZVincent Fang
<ul></ul><p>Ok I went to Owlsec's github suricata repository</p>
<p>git clone <a class="external" href="https://github.com/owlsec">https://github.com/owlsec</a> .</p>
<p>and I made sure to change branches to</p>
<p>git checkout customhttplog</p>
<p>and with the new</p>
<p>customformat: "%a:%p -> %A:%P %[10]u %{bdfpc}C"</p>
<p>All uris are truncated to 10 characters long and cookies that have the name bdfpc have their values show up so looks good so far.</p>
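<p>For reference, the per-cookie lookup that a directive such as %{bdfpc}C performs could be sketched as follows. This is an illustration under assumptions, not the code from the pull request: <code>cookie_value</code> is a hypothetical name, the header is assumed to be a NUL-terminated "name=value; name2=value2" string, and cookie names are compared case-sensitively.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: locate cookie `name` inside a Cookie request
 * header value such as "name=value; name2=value2".
 * On success returns a pointer to the value inside `header` and stores
 * its length in *value_len; returns NULL if the cookie is absent. */
const char *cookie_value(const char *header, const char *name,
                         size_t *value_len)
{
    size_t name_len = strlen(name);
    const char *p = header;

    while (*p != '\0') {
        /* Skip the separator and any whitespace before the next pair. */
        while (*p == ' ' || *p == ';')
            p++;

        /* The pair ends at the next ';' or at the end of the header. */
        const char *end = strchr(p, ';');
        if (end == NULL)
            end = p + strlen(p);

        /* Match only the exact name followed directly by '='. */
        if ((size_t)(end - p) > name_len &&
            strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            *value_len = (size_t)(end - (p + name_len + 1));
            return p + name_len + 1;
        }
        p = end;
    }
    return NULL;
}
```

<p>Requiring the '=' immediately after the name keeps "name" from matching the "name2" pair, which is the case exercised by the example header earlier in this thread.</p>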
<p>I'm also wondering if it's possible for you to add a small additional change of \t for tabs in the http.log. If I put in \t or the actual tab in the customformat:<br />the result is that I only get one whitespace character in place.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=27932013-02-15T21:09:08ZVincent Fang
<ul></ul><p>I apologize for the spam of updates, but I tested the customformat as follows</p>
<p>customformat: "%a:%p\t->\t%A:%P\t%[10]u\t%{bdfpc}C"</p>
<p>and the \t creates tabs in the http.log, so far everything works as expected.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=28102013-02-19T06:43:34ZPeter Manevpetermanev@gmail.com
<ul></ul><p>yes !</p>
<p>works - now I can get it to be parsed by "goaccess" and other apache log tools - pretty cool ....<br />however ....</p>
<p>1.<br />the problem is ...that if a value does not exist...ex:<br />\"%r\" <br />in the apache log ..access.log - apache substitutes whatever does not exist with "-" (dash , no quotes) ... in http.log if the value does not exist the printing function substitutes it with nothing ...</p>
<p>So in other words:<br />if there is no \"%r\" ... in the http request/log line - apache2 does the following<br />127.0.0.1 - - [19/Feb/2013:10:42:23 +0100] "GET / HTTP/1.1" 304 210 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0" <br />for<br />"%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" </p>
<pre><code>in Suricata the following gets printed<br />127.0.0.1 [19/Feb/2013:10:42:23 +0100] "GET / HTTP/1.1" 304 210 "" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0" <br />for<br />"%h %l %u [%t] \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" <br />in yaml</code></pre>
<p>we skip(print nothing) it if it does not exist...</p>
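<p>The Apache-style behaviour described in this point can be sketched with a small fallback helper. Both function names are hypothetical, and the sketch assumes that a missing field is represented as a NULL pointer; an empty but present value is left as it is.</p>

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: substitute "-" for a field that is absent
 * (NULL), so the column count of the log line stays constant. */
const char *field_or_dash(const char *value)
{
    return (value != NULL) ? value : "-";
}

/* Hypothetical example: format the quoted Referer and User-Agent
 * columns the way the Apache line above shows them. */
int format_referer_ua(char *out, size_t out_size,
                      const char *referer, const char *user_agent)
{
    return snprintf(out, out_size, "\"%s\" \"%s\"",
                    field_or_dash(referer), field_or_dash(user_agent));
}
```

<p>A request without a Referer header would then be logged with "-" in that column instead of an empty pair of quotes, which is what parsers written for Apache access logs tend to expect.</p>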
<p>2. <br />When we print %t - time format<br />apache prints it like :<br />[19/Feb/2013:10:42:23 +0100] by default with the []</p>
<p>to mimic that behavior in yaml I use [%t] .... may be not such a big deal ...</p>
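<p>For comparison, the bracketed timestamp shown above corresponds to a <code>strftime</code> format along these lines. This is only a sketch: <code>format_apache_time</code> is a hypothetical name, it assumes the default "C" locale for the month abbreviation, and it says nothing about how Suricata's %t is implemented.</p>

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: render a broken-down time in the bracketed
 * style used by Apache access logs, e.g. [19/Feb/2013:10:42:23 +0100].
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t format_apache_time(char *out, size_t out_size, const struct tm *tm)
{
    return strftime(out, out_size, "[%d/%b/%Y:%H:%M:%S %z]", tm);
}
```

<p>Writing the brackets in the format itself is equivalent to the [%t] workaround in yaml; the only difference is where the brackets are supplied.</p>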
<p>3. The %>s - is not working properly (if we are to make use of apache style log format)</p>
<p>I think in order to be made "fully" apache compatible (if custom logging is used that is) - we should follow those. <br />Just because there are a number of apache log parser tools freely available already... my suggestion.</p>
<p>Maybe we could have :<br /><pre>
custom: yes/no/apache
</pre><br />in yaml in order to make an option available for an apache log "compatibility" ?</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=29962013-05-15T13:58:07ZIgnacio Sanchezsanchezmartin.ji@gmail.com
<ul></ul><p>I have submitted a new pull request with the following changes:</p>
<p>Cookie is parsed now using uint8_t pointers (following Victor Julien PR comments) …<br />Changed buffer size to a power of 2 (8192) and cookie value extraction function to static (following Victor Julien PR comments)<br />Added %b for request size (Vincent Fang patch)<br />Writing "-" if an unknown % directive is used (Vincent Fang patch)<br />Fixed bug in cookie parser<br />Fixed format string issue logging literal values</p>
<p><a class="external" href="https://github.com/inliniac/suricata/pull/360">https://github.com/inliniac/suricata/pull/360</a></p>
<p>Peter: once the PR is accepted I can start looking into your 2nd and 3rd points (the 1st one should be ok now).</p>
<p>Any feedback is welcomed.</p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=30052013-05-24T15:40:20ZIgnacio Sanchezsanchezmartin.ji@gmail.com
<ul><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>New pull request adding syntax error handling.</p>
<p><a class="external" href="https://github.com/inliniac/suricata/pull/377">https://github.com/inliniac/suricata/pull/377</a></p> Suricata - Feature #602: availability for http.log output - identical to apache log formathttps://redmine.openinfosecfoundation.org/issues/602?journal_id=33482013-09-03T05:24:38ZVictor Julienvictor@inliniac.net
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Closed</i></li><li><strong>Target version</strong> changed from <i>TBD</i> to <i>2.0beta2</i></li></ul><p>Merged into master, thanks all!</p>