Alignment check in RPC code

Hi,

it seems that the XDR extractor, which is used by the RPC code checks
the alignment of a memory buffer before extracting data.

uint32 extract_XDR_uint32(const u_char*& buf, int& len)
    {
    /* ... check that buf has enough len */

    uint32 bits32 = XDR_aligned(buf) ? *(uint32*) buf :
            ((buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3]);

    buf += 4;
    len -= 4;

    return ntohl(bits32);
    }

inline int XDR_aligned(const u_char* buf)
    {
    return (((unsigned long) buf) & 0x3) == 0;
    }

However, on my Mac OS, buf it not 4-byte aligned. If this is the case
than the 32-bit value is byte-swapped (which is a bug, which makes the
testsuite fail on MAC OS X, which is why I discovered this code).

Mac OS X is fine with doing un-aligned reads. More general, with current
architectures and OSes, does one still need to check for 4- or 8-byte
alignments? If so, should this be done higher up in the Analyzer tree?
But my thought is that we can just remove the alignment check here.

cu
Gregor

However, on my Mac OS, buf it not 4-byte aligned. If this is the case
than the 32-bit value is byte-swapped (which is a bug, which makes the
testsuite fail on MAC OS X, which is why I discovered this code).

Weird, as I run the testsuite pretty much exclusively under MacOS X.
Perhaps there's something different in our environments such that it's
aligned in mine but not in yours.

Mac OS X is fine with doing un-aligned reads. More general, with current
architectures and OSes, does one still need to check for 4- or 8-byte
alignments?

I don't think it's prudent to throw out this check. There's a difference
between "predominant" architectures and OSes vs. "current". Your statement
is really about the former, and will potentially limit portability (in a
nasty way, i.e., a crash on a maybe-not-commonly-executed code path).

If so, should this be done higher up in the Analyzer tree?
But my thought is that we can just remove the alignment check here.

I think significantly better is to use:

  uint32 bits32 = XDR_aligned(buf) ? ntohl(*(uint32*) buf) :
    ((buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3]);

and remove the later call to ntohl().

    Vern

Gregor wrote:

Mac OS X is fine with doing un-aligned reads.

This surprises me. Isn't that a CPU property rather than a OS property?

Vern wrote:

I don't think it's prudent to throw out this check.

Ack.

  uint32 bits32 = XDR_aligned(buf) ? ntohl(*(uint32*) buf) :
    ((buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3]);

I'm wondering how much benefit we get by doing this if-else clause
vs. just always doing the safe version that doesn't care about
alignment? Is it worth it?

Robin

> uint32 bits32 = XDR_aligned(buf) ? ntohl(*(uint32*) buf) :
> ((buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3]);

I'm wondering how much benefit we get by doing this if-else clause
vs. just always doing the safe version that doesn't care about
alignment? Is it worth it?

Good point. Chucking the optimized version and just going with the second
line is the way to go.

    Vern

Mac OS X is fine with doing un-aligned reads.

This surprises me. Isn't that a CPU property rather than a OS property?

IIRC that depends. The CPU could generate an alignment fault on
unaligned load/store and the OS can then recover. But x86 have always
been able to handle unaligned memory access.
(I was actually just saying that on my system unaligned reads worked,
not that Mac OS X did the recovery).

  uint32 bits32 = XDR_aligned(buf) ? ntohl(*(uint32*) buf) :
    ((buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3]);

I'm wondering how much benefit we get by doing this if-else clause
vs. just always doing the safe version that doesn't care about
alignment? Is it worth it?

I'm also wondering alignment checks should be done higher up in the
analyzer tree? E.g., to guarantee to protocol analyzers that the buffer
in DeliverStream() and DeliverPacket() calls are 4- or 8-byte aligned
(e.g., by copying the data into an aligned buffer). OTOH, if we use
binpac++ in the future, than binpacc++ can do alignment checks.

BTW, It seems that binpac doesn't do alignment-safe data extraction. Is
it worth filing a ticket for that?

cu
Gregor

I'm also wondering alignment checks should be done higher up in the
analyzer tree? E.g., to guarantee to protocol analyzers that the buffer
in DeliverStream() and DeliverPacket() calls are 4- or 8-byte aligned

I don't see how this can work in general, because higher up the layout of
data in the buffer isn't known. Plus this is a potentially expensive cost
(copying the payload in bulk) that might not be needed at all, depending
on what the analyzer does with the payload.

    Vern