Bug #3483
closed
SIP: Input not parsed when header values contain trailing spaces
Description
I recently took some time to get my head around writing parsers in Rust and started by looking at the SIP parser as an example.
I noticed that some events are not parsed if there is whitespace between the last non-whitespace character and the final CRLF of a header line. For example, the PCAP from the Wireshark wiki https://wiki.wireshark.org/SampleCaptures?action=AttachFile&do=get&target=SIP_DTMF2.cap contains traffic in which the User-Agent string ends with two spaces (see below).
0000  00 0b cd 12 a6 72 08 00 6f 82 a7 b7 08 00 45 00  ..Í.¦r..o.§·..E.
0010  02 46 00 79 00 00 40 11 24 06 c0 a8 69 6e c0 a8  .F.y..@.$.À¨inÀ¨
0020  69 69 13 c4 13 c4 02 32 c0 85 52 45 47 49 53 54  ii.Ä.Ä.2À.REGIST
0030  45 52 20 73 69 70 3a 31 39 32 2e 31 36 38 2e 31  ER sip:192.168.1
0040  30 35 2e 31 30 35 20 53 49 50 2f 32 2e 30 0d 0a  05.105 SIP/2.0..
0050  56 69 61 3a 20 53 49 50 2f 32 2e 30 2f 55 44 50  Via: SIP/2.0/UDP
0060  20 31 39 32 2e 31 36 38 2e 31 30 35 2e 31 31 30   192.168.105.110
0070  3a 35 30 36 30 3b 62 72 61 6e 63 68 3d 7a 39 68  :5060;branch=z9h
0080  47 34 62 4b 32 30 37 33 37 0d 0a 52 6f 75 74 65  G4bK20737..Route
0090  3a 20 3c 73 69 70 3a 31 39 32 2e 31 36 38 2e 31  : <sip:192.168.1
00a0  30 35 2e 31 30 35 3a 35 30 36 30 3b 6c 72 3e 0d  05.105:5060;lr>.
00b0  0a 46 72 6f 6d 3a 20 32 35 30 33 20 3c 73 69 70  .From: 2503 <sip
00c0  3a 32 35 30 33 40 31 39 32 2e 31 36 38 2e 31 30  :2503@192.168.10
00d0  35 2e 31 30 35 3e 3b 74 61 67 3d 31 31 31 31 33  5.105>;tag=11113
00e0  0d 0a 54 6f 3a 20 32 35 30 33 20 3c 73 69 70 3a  ..To: 2503 <sip:
00f0  32 35 30 33 40 31 39 32 2e 31 36 38 2e 31 30 35  2503@192.168.105
0100  2e 31 30 35 3e 0d 0a 43 61 6c 6c 2d 49 44 3a 20  .105>..Call-ID: 
0110  33 30 37 30 40 31 39 32 2e 31 36 38 2e 31 30 35  3070@192.168.105
0120  2e 31 30 35 0d 0a 43 53 65 71 3a 20 31 20 52 45  .105..CSeq: 1 RE
0130  47 49 53 54 45 52 0d 0a 43 6f 6e 74 61 63 74 3a  GISTER..Contact:
0140  20 22 32 35 30 33 22 20 3c 73 69 70 3a 32 35 30   "2503" <sip:250
0150  33 40 31 39 32 2e 31 36 38 2e 31 30 35 2e 31 31  3@192.168.105.11
0160  30 3a 35 30 36 30 3b 74 72 61 6e 73 70 6f 72 74  0:5060;transport
0170  3d 75 64 70 3e 0d 0a 45 78 70 69 72 65 73 3a 20  =udp>..Expires: 
0180  33 36 30 30 0d 0a 4d 61 78 2d 46 6f 72 77 61 72  3600..Max-Forwar
0190  64 73 3a 20 37 30 0d 0a 53 75 70 70 6f 72 74 65  ds: 70..Supporte
01a0  64 3a 20 72 65 70 6c 61 63 65 73 0d 0a 55 73 65  d: replaces..Use
01b0  72 2d 41 67 65 6e 74 3a 20 53 49 50 20 49 50 2d  r-Agent: SIP IP-
01c0  44 45 43 54 20 67 61 74 65 77 61 79 2c 20 4e 45  DECT gateway, NE
01d0  43 20 50 68 69 6c 69 70 73 20 55 6e 69 66 69 65  C Philips Unifie
01e0  64 20 53 6f 6c 75 74 69 6f 6e 73 20 20 0d 0a 41  d Solutions  ..A
01f0  6c 6c 6f 77 3a 20 49 4e 56 49 54 45 2c 20 41 43  llow: INVITE, AC
0200  4b 2c 20 43 41 4e 43 45 4c 2c 20 42 59 45 2c 20  K, CANCEL, BYE, 
0210  52 45 46 45 52 2c 20 4f 50 54 49 4f 4e 53 2c 20  REFER, OPTIONS, 
0220  49 4e 46 4f 0d 0a 41 63 63 65 70 74 3a 20 61 70  INFO..Accept: ap
0230  70 6c 69 63 61 74 69 6f 6e 2f 73 64 70 0d 0a 43  plication/sdp..C
0240  6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 30  ontent-Length: 0
0250  0d 0a 0d 0a                                      ....
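To check quickly which header lines in a message are affected, something like the following could be used. This is only a rough sketch in plain Rust and not code from the Suricata SIP parser; the function name `headers_with_trailing_ws` is made up for illustration, and it assumes the message is valid UTF-8 with CRLF line endings.

```rust
/// Hypothetical helper (not part of Suricata): return the names of header
/// lines whose value ends in a space or tab right before the CRLF.
fn headers_with_trailing_ws(msg: &str) -> Vec<String> {
    msg.split("\r\n")
        // Skip the request/status line.
        .skip(1)
        // The first empty line marks the end of the header section.
        .take_while(|line| !line.is_empty())
        // Keep only lines that end in trailing whitespace.
        .filter(|line| line.ends_with(' ') || line.ends_with('\t'))
        // The header name is everything before the first colon.
        .filter_map(|line| line.split(':').next())
        .map(|name| name.trim().to_string())
        .collect()
}
```

For the REGISTER request in the dump above, this would flag only the User-Agent header.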
When this PCAP is processed with the SIP parser enabled, no SIP events are logged (Suricata 5.0.3 from the master-5.0.x branch):
$ cat eve.json | jq -c 'select(.event_type == "sip")' | wc -l
0
I prepared a patch to process this header correctly, which makes the PCAP parseable and results in sensible logs:
$ cat eve.json | jq -c 'select(.event_type == "sip")' | wc -l
23
At the moment my patch does not preserve the trailing spaces in the parsed field. However, the patch would be even simpler if they were kept in the parsed header fields. I'm not sure what the correct (or expected) way of handling this is. (I'm inclined towards the latter).
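To make the two options concrete, here is a minimal sketch in plain Rust. It is not the actual patch and does not use the parser's real functions; `parse_header`, `WS` and the `trim_trailing` flag are hypothetical names chosen for illustration, and the input is assumed to be a single header line with the CRLF already removed.

```rust
/// Characters treated as header whitespace in this sketch.
const WS: &[char] = &[' ', '\t'];

/// Hypothetical helper (not the Suricata implementation): split one header
/// line, without its CRLF, into name and value. If `trim_trailing` is true,
/// trailing spaces and tabs are dropped from the value; otherwise they are
/// kept as part of the value.
fn parse_header(line: &str, trim_trailing: bool) -> Option<(&str, &str)> {
    // A header line needs a colon separating name and value.
    let (name, rest) = line.split_once(':')?;
    // Whitespace between the colon and the value is never part of the value.
    let value = rest.trim_start_matches(WS);
    let value = if trim_trailing {
        value.trim_end_matches(WS)
    } else {
        value
    };
    Some((name.trim_end_matches(WS), value))
}
```

With the User-Agent line from the capture, `parse_header(line, true)` yields a value ending in "Solutions", while `parse_header(line, false)` keeps the two trailing spaces. Either way the line is accepted instead of causing the whole message to be dropped.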
Any ideas or comments? Would be happy to put in a PR (I'd make sure to include a test as well).
Cheers
Sascha