Task #3768

open

research: investigate branch prediction vs likely/unlikely macros

Added by Victor Julien almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
Effort:
Difficulty:
Label:

Description

Noting something I saw while profiling on an older Xeon (E5-2697 v2). Under pktgen load with only UDP packets, perf reports lots of branch-misses in this code:

    p->udph = (UDPHdr *)pkt;

    if (unlikely(len < UDP_GET_LEN(p))) {
        ENGINE_SET_INVALID_EVENT(p, UDP_PKT_TOO_SMALL);
        return -1;
    }

    if (unlikely(len != UDP_GET_LEN(p))) {
        ENGINE_SET_INVALID_EVENT(p, UDP_HLEN_INVALID);
        return -1;
    }

    SET_UDP_SRC_PORT(p,&p->sp);
    SET_UDP_DST_PORT(p,&p->dp);

However, when I rewrite it to look like this:

    p->udph = (UDPHdr *)pkt;

    if (likely(len >= UDP_GET_LEN(p))) {
        if (likely(len == UDP_GET_LEN(p))) {
            SET_UDP_SRC_PORT(p,&p->sp);
            SET_UDP_DST_PORT(p,&p->dp);

            p->payload = (uint8_t *)pkt + UDP_HEADER_LEN;
            p->payload_len = len - UDP_HEADER_LEN;

            p->proto = IPPROTO_UDP;

            return 0;
        } else {
            ENGINE_SET_INVALID_EVENT(p, UDP_HLEN_INVALID);
            return -1;
        }
    } else {
        ENGINE_SET_INVALID_EVENT(p, UDP_PKT_TOO_SMALL);
        return -1;
    }

the branch misses are gone (or reduced to the point that they don't show up in perf).

My assumption has always been that the likely/unlikely annotations would allow the compiler (and/or CPU?) to optimize this to have the same result, but that seems to be untrue.
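
For reference, a minimal sketch of the two shapes reduced to just the length checks. The `likely`/`unlikely` definitions below are the common `__builtin_expect` wrappers used with GCC and Clang and are assumed here, not copied from the project. The function names and the distinct error codes are hypothetical stand-ins for the `ENGINE_SET_INVALID_EVENT` calls. The sketch only shows that both forms return the same result for every input, so any difference seen in perf would have to come from how the compiler lays out the branches, not from the logic. It has not been measured.

```c
#include <assert.h>
#include <stddef.h>

/* Common GCC/Clang definitions; assumed here, not taken from the project. */
#define likely(expr)   __builtin_expect(!!(expr), 1)
#define unlikely(expr) __builtin_expect(!!(expr), 0)

/* Hypothetical result codes standing in for the engine events. */
enum { DECODE_OK = 0, ERR_TOO_SMALL = -1, ERR_LEN_MISMATCH = -2 };

/* Guard-clause form: rare error paths are tested first and return early. */
int check_len_guard(size_t len, size_t hdr_len)
{
    if (unlikely(len < hdr_len))
        return ERR_TOO_SMALL;
    if (unlikely(len != hdr_len))
        return ERR_LEN_MISMATCH;
    return DECODE_OK;
}

/* Nested form: the common path is the taken side of each test. */
int check_len_nested(size_t len, size_t hdr_len)
{
    if (likely(len >= hdr_len)) {
        if (likely(len == hdr_len))
            return DECODE_OK;
        else
            return ERR_LEN_MISMATCH;
    } else {
        return ERR_TOO_SMALL;
    }
}

/* Both forms must agree for a given input; only the generated layout may differ. */
void check_forms_agree(size_t len, size_t hdr_len)
{
    assert(check_len_guard(len, hdr_len) == check_len_nested(len, hdr_len));
}
```

Since the two functions are equivalent, comparing their disassembly (for example with `objdump -d`) at the optimization level actually in use would be one way to see whether the hints change the branch layout at all.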

Compiled with:
CC=gcc-8 CFLAGS="-ggdb" ./configure --prefix=/usr/ --sysconfdir=/etc/ --localstatedir=/var/ --disable-shared
