Ruoming Pang wrote:
>(b) it did the hash of an 8-byte
>struct composed of the prefix length and the input, instead of the
>prefix of the input (4 bytes).
I think this is not correct. For example, prefix 10.0.0.0/8 is
different from 10.0.0.0/9, and should be hashed differently; otherwise
If you add the prefix length, then you have a "fixed-prefix-length
anonymizer." By this, I mean a separate anonymizer for each /X class.
It's equivalent to having one hash key for /4 addresses and another
for /5 addresses. This scheme only preserves prefixes when the /X
length is the same. For example, the anonymized forms of 10.0.0.0/8
and 10.0.0.0/9 will generally not share a prefix. Is this what the
function is supposed to do?
With this scheme, you may also have collisions. It may be the case
that 0.0.0.0/24 hashes to 4.4.4.0/24, and that 255.255.0.0/16 hashes
to 4.4.0.0/16. The hashed addresses share a 16-bit prefix, while
the input addresses share no prefix.
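
To make the arithmetic in that example checkable, here is a small
Python sketch that counts how many leading bits two prefixes share.
The helper name shared_prefix_bits is made up for illustration and is
not part of the anonymizer code under discussion.

```python
import ipaddress

def shared_prefix_bits(a: str, b: str) -> int:
    """Number of leading bits two IPv4 prefixes have in common.

    Hypothetical helper for illustration only. The count is capped at
    the shorter of the two prefix lengths.
    """
    na = ipaddress.ip_network(a)
    nb = ipaddress.ip_network(b)
    limit = min(na.prefixlen, nb.prefixlen)
    # XOR leaves a 1 at the first bit position where the addresses differ.
    diff = int(na.network_address) ^ int(nb.network_address)
    leading_equal = 32 - diff.bit_length()
    return min(limit, leading_equal)

# The inputs differ in the very first bit.
print(shared_prefix_bits("0.0.0.0/24", "255.255.0.0/16"))
# The hypothetical outputs agree on all 16 bits of the shorter prefix.
print(shared_prefix_bits("4.4.4.0/24", "4.4.0.0/16"))
```

The two addresses 4.4.4.0 and 4.4.0.0 first differ in the third
octet, so the /16 is entirely contained in the shared bits.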
BTW, if this is the case, for the code to work, you cannot change
prefix.len during the "for" loop in AnonymizeIPAddr_PrefixMD5::anonymize().
the 9th and 10th most significant bits will always be flipped the same
way. (Or, try to anonymize 128.0.0.0 with 1000 different keys, and see
how many distinct results one can get.)
That's an artifact of the padding being 0...0. The f_0 function is
a constant dependent on the hash key, so x'_0 is random. x'_1 to
x'_{31} are all the same, as PAD(x_0) = PAD(x_0 x_1) = ... =
PAD(x_0 ... x_{31}).
Therefore, for 128.0.0.0 you can get only four results: 0.0.0.0,
128.0.0.0, 127.255.255.255, and 255.255.255.255.
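
The "1000 different keys" experiment can be sketched in Python as
follows. This is not the AnonymizeIPAddr_PrefixMD5 code; it is a
minimal bit-by-bit prefix-preserving scheme written under my own
assumptions: the flip bit is the top bit of MD5(key + data), the keys
are the made-up strings "key-0" .. "key-999", and include_len is a
hypothetical switch that prepends the prefix length to the hashed
data.

```python
import hashlib
import ipaddress

def flip_bit(key: bytes, addr: int, i: int, include_len: bool) -> int:
    """Pseudorandom flip for bit i, derived from the first i bits of addr.

    The prefix is padded with zeros to 32 bits. If include_len is set,
    the prefix length i is hashed as well (assumed layout: 4-byte
    length followed by the 4-byte padded prefix).
    """
    mask = (0xFFFFFFFF << (32 - i)) & 0xFFFFFFFF
    data = (addr & mask).to_bytes(4, "big")
    if include_len:
        data = i.to_bytes(4, "big") + data
    return hashlib.md5(key + data).digest()[0] >> 7

def anonymize(key: bytes, addr: int, include_len: bool = False) -> int:
    """Flip each bit of addr according to the hash of the bits before it."""
    out = 0
    for i in range(32):
        bit = (addr >> (31 - i)) & 1
        out = (out << 1) | (bit ^ flip_bit(key, addr, i, include_len))
    return out

def distinct_results(ip: str, n_keys: int, include_len: bool) -> set:
    """Anonymize one address under n_keys keys and collect the outputs."""
    addr = int(ipaddress.IPv4Address(ip))
    return {
        str(ipaddress.IPv4Address(anonymize(b"key-%d" % k, addr, include_len)))
        for k in range(n_keys)
    }

zero_padded = distinct_results("128.0.0.0", 1000, include_len=False)
with_length = distinct_results("128.0.0.0", 1000, include_len=True)
print(sorted(zero_padded))   # never more than the four addresses above
print(len(with_length))      # many more once the length is hashed too
```

With zero padding alone, the hash inputs for bits 1 through 31 of
128.0.0.0 are identical, so those bits are flipped together. Hashing
the length along with the prefix makes each of those inputs distinct.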
-Chema