Bug #512


Bug #540: Multiple issues in defrag code

Fix logic of call to DefragTimeoutTracker()

Added by Eric Leblond over 11 years ago. Updated over 11 years ago.

Target version:
Affected Versions:


There is a logic problem with the way the DefragTimeoutTracker() function is called.
The only case were this function is called in inside DefragGetTracker()

DefragGetTracker(ThreadVars *tv, DecodeThreadVars *dtv, DefragContext *dc,
    DefragTracker *lookup_key, Packet *p)
  tracker = HashListTableLookup(dc->frag_table, lookup_key, sizeof(*lookup_key));
    if (tracker == NULL) {
        tracker = PoolGet(dc->tracker_pool);
        if (tracker == NULL) {
            /* Timeout trackers and try again. */
            DefragTimeoutTracker(tv, dtv, dc, p);

When the pool is empty, then we call DefragTimeoutTracker() to do some cleanup. This means we call an intensive function when we are short in ressource.
But the worst is not there, in case of real outage, the function has no locking/concurrency tracking and thus all packet threads call in loop the function to try to find some place.

It would be sane to run the function at a regular interval to avoid: increase in size and problem in case of empty pool.

Actions #1

Updated by Victor Julien over 11 years ago

Where is the "locking/concurrency tracking" missing?

Actions #2

Updated by Eric Leblond over 11 years ago

Victor Julien wrote:

Where is the "locking/concurrency tracking" missing?

In the code shows in the ticket: threads want to get a tracker, they wait before the lock. If one fail getting a tracker and if there is no timeout, then all other threads will fail to and call DefragTimeoutTracker.
What we could do is to add a last_timeouted on DefragContext which is a param of DefragTimeoutTracker. It could be set to current time inside DefragTimeoutTracker and by adding a test at start of DefragTimeoutTracker we can leave the function if previous cleaning was made less than, say, a second.

Actions #3

Updated by Victor Julien over 11 years ago

Ah ok, I didn't understand the description. Thought you said a lock was missing.

Actions #4

Updated by Victor Julien over 11 years ago

  • Status changed from New to Assigned
  • Assignee set to Victor Julien
  • Target version changed from 1.4beta1 to 1.4beta2
  • Parent task set to #540
Actions #5

Updated by Victor Julien over 11 years ago

  • Status changed from Assigned to Closed

Addressed by same patch as in #540.


Also available in: Atom PDF