Optimization #1223

closed

Might be faster to use memmem() for short length content instead of Boyer Moore

Added by Ken Steele almost 10 years ago. Updated almost 7 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-

Description

Boyer-Moore works best with longer needles, but it is currently used for content matches of every length. For short needles it might be faster to use glibc's memmem() function for case-sensitive matches, given that memmem() can be implemented with SIMD operations that process 8 or 16 bytes at a time.

Some investigation would be needed to determine whether it is faster, and for which needle lengths.
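One way such an investigation could start is a small timing harness like the sketch below. It is only an illustration, not code from the project: `search_fn`, `search_memmem`, `search_horspool` and `time_search` are hypothetical names, and Boyer-Moore-Horspool (bad-character table only) is used as a simplified stand-in for the Boyer-Moore implementation actually in use. It assumes a glibc-style platform where memmem() is available.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* memmem() is a GNU extension */
#endif
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Common signature so both searches can be timed by the same harness. */
typedef const void *(*search_fn)(const void *hay, size_t hlen,
                                 const void *needle, size_t nlen);

/* Thin wrapper around glibc's memmem(). */
static const void *search_memmem(const void *hay, size_t hlen,
                                 const void *needle, size_t nlen)
{
    return memmem(hay, hlen, needle, nlen);
}

/* Boyer-Moore-Horspool: a simplified stand-in for full Boyer-Moore. */
static const void *search_horspool(const void *hay, size_t hlen,
                                   const void *needle, size_t nlen)
{
    const unsigned char *h = hay;
    const unsigned char *n = needle;
    size_t shift[UCHAR_MAX + 1];
    size_t i;

    if (nlen == 0)
        return hay;
    if (nlen > hlen)
        return NULL;

    /* Default shift is the full needle length. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = nlen;
    /* Bytes that occur in the needle (except its last byte) shift less. */
    for (i = 0; i + 1 < nlen; i++)
        shift[n[i]] = nlen - 1 - i;

    /* Advance by the shift of the byte aligned with the needle's end. */
    for (i = 0; i + nlen <= hlen; i += shift[h[i + nlen - 1]]) {
        if (memcmp(h + i, n, nlen) == 0)
            return h + i;
    }
    return NULL;
}

/* Run fn `iters` times and return the elapsed CPU time in seconds. */
static double time_search(search_fn fn, const void *hay, size_t hlen,
                          const void *needle, size_t nlen, unsigned iters)
{
    const void *volatile sink = NULL; /* keeps the calls from being elided */
    clock_t start;
    unsigned k;

    assert(fn != NULL);
    start = clock();
    for (k = 0; k < iters; k++)
        sink = fn(hay, hlen, needle, nlen);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_search()` with both functions over a range of needle lengths (say 1 to 64 bytes) on representative traffic would give a first indication of where, if anywhere, the crossover lies. Real measurements would need realistic haystacks and many iterations, since clock() has coarse resolution.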

glibc doesn't provide a memcasemem(), but a SIMD version could also be written to cover the nocase matches.
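To make the intended semantics of such a function concrete, here is a minimal scalar sketch. It is not SIMD and not an existing glibc or project API: the name `memcasemem_sketch` is hypothetical, and case folding is assumed to be ASCII-style via tolower() in the default "C" locale. A SIMD version would have to return the same results while comparing several bytes per step.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical case-insensitive counterpart to memmem().
 * Returns a pointer to the first match, or NULL if there is none.
 * Scalar reference only; folding relies on tolower() in the "C" locale.
 */
static const void *memcasemem_sketch(const void *hay, size_t hlen,
                                     const void *needle, size_t nlen)
{
    const unsigned char *h = hay;
    const unsigned char *n = needle;
    size_t i, j;

    assert(hay != NULL && needle != NULL);

    if (nlen == 0)
        return hay; /* mirror memmem(): an empty needle matches at the start */
    if (nlen > hlen)
        return NULL;

    /* Try every start offset where the needle still fits. */
    for (i = 0; i + nlen <= hlen; i++) {
        j = 0;
        while (j < nlen && tolower(h[i + j]) == tolower(n[j]))
            j++;
        if (j == nlen)
            return h + i;
    }
    return NULL;
}
```

A scalar reference like this is also handy as an oracle when testing a vectorized implementation against random inputs.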

Actions #1

Updated by Andreas Herz over 8 years ago

  • Assignee set to OISF Dev
  • Target version set to TBD
Actions #2

Updated by Victor Julien almost 7 years ago

  • Status changed from New to Rejected
  • Assignee deleted (OISF Dev)
  • Target version deleted (TBD)

We now have the Hyperscan SPM (single pattern matcher) to improve the inspection.

The memmem() function is a GNU extension that seems to be unavailable on MinGW.

So closing this.
