Bug #512
closedBug #540: Multiple issues in defrag code
Fix logic of call to DefragTimeoutTracker()
Description
There is a logic problem with the way the DefragTimeoutTracker() function is called.
The only case were this function is called in inside DefragGetTracker()
DefragGetTracker(ThreadVars *tv, DecodeThreadVars *dtv, DefragContext *dc, DefragTracker *lookup_key, Packet *p) { ... tracker = HashListTableLookup(dc->frag_table, lookup_key, sizeof(*lookup_key)); if (tracker == NULL) { SCMutexLock(&dc->tracker_pool_lock); tracker = PoolGet(dc->tracker_pool); if (tracker == NULL) { /* Timeout trackers and try again. */ DefragTimeoutTracker(tv, dtv, dc, p);
When the pool is empty, then we call DefragTimeoutTracker() to do some cleanup. This means we call an intensive function when we are short in ressource.
But the worst is not there, in case of real outage, the function has no locking/concurrency tracking and thus all packet threads call in loop the function to try to find some place.
It would be sane to run the function at a regular interval to avoid: increase in size and problem in case of empty pool.
Updated by Victor Julien over 12 years ago
Where is the "locking/concurrency tracking" missing?
Updated by Eric Leblond over 12 years ago
Victor Julien wrote:
Where is the "locking/concurrency tracking" missing?
In the code shows in the ticket: threads want to get a tracker, they wait before the lock. If one fail getting a tracker and if there is no timeout, then all other threads will fail to and call DefragTimeoutTracker.
What we could do is to add a last_timeouted on DefragContext which is a param of DefragTimeoutTracker. It could be set to current time inside DefragTimeoutTracker and by adding a test at start of DefragTimeoutTracker we can leave the function if previous cleaning was made less than, say, a second.
Updated by Victor Julien over 12 years ago
Ah ok, I didn't understand the description. Thought you said a lock was missing.
Updated by Victor Julien over 12 years ago
- Status changed from New to Assigned
- Assignee set to Victor Julien
- Target version changed from 1.4beta1 to 1.4beta2
- Parent task set to #540
Updated by Victor Julien over 12 years ago
- Status changed from Assigned to Closed
Addressed by same patch as in #540.