BaseN encoding speed improvement
See original GitHub issueBy using this libdivide4j I was able to double the speed of your BaseN encoder for non powers of 2.
public class FastDivNRemainderEncoder extends BaseNEncoder {
private final int radix;
private final int length;
private final char padding;
private static final int UUID_INTS = 4;
private static final long HALF_LONG_MASK = 0x00000000ffffffffL;
private final FastDivision.Magic magic;
public FastDivNRemainderEncoder(BaseN base) {
super(base);
radix = base.getRadix();
length = base.getLength();
padding = base.getPadding();
magic = FastDivision.magicUnsigned((long) radix);
}
@Override
public String apply(@SuppressWarnings("null") UUID uuid) {
// unsigned 128 bit number
int[] number = new int[UUID_INTS];
number[0] = (int) (uuid.getMostSignificantBits() >>> 32);
number[1] = (int) (uuid.getMostSignificantBits() & HALF_LONG_MASK);
number[2] = (int) (uuid.getLeastSignificantBits() >>> 32);
number[3] = (int) (uuid.getLeastSignificantBits() & HALF_LONG_MASK);
char[] buffer = new char[length];
int b = length; // buffer index
// fill in the buffer backwards using remainder operation
while (!isZero(number)) {
final int[] quotient = new int[UUID_INTS]; // division output
final int remainder = remainder(number, quotient);
buffer[--b] = alphabet.get(remainder);
number = quotient;
}
// add padding to the leading
while (b > 0) {
buffer[--b] = padding;
}
return new String(buffer);
}
protected int remainder(int[] number, int[] quotient /* division output */) {
long temporary = 0;
long remainder = 0;
for (int i = 0; i < UUID_INTS; i++) {
temporary = (remainder << 32) | (number[i] & HALF_LONG_MASK);
// quotient[i] = (int) (temporary / divisor);
long q = FastDivision.divideUnsignedFast(temporary, magic);
// remainder = temporary % divisor;
long r = temporary - q * magic.divider;
quotient[i] = (int) q;
remainder = (int) r;
}
return (int) remainder;
}
private boolean isZero(int[] number) {
return number[0] == 0 && number[1] == 0 && number[2] == 0 && number[3] == 0;
}
}
I don’t really fully understand how the algorithm works but it speeds up divmod-ing quite a bit on non power of 2s. I also don’t understand why you modulus (%
) after you already done the division. Maybe the JIT is smart enough. Doing multi and then subtracting (like I did here) might speed it up as well.
I also have an in house encoder that I call Fast57 (but it really could be any BaseN) that divmods the MSB long and the LSB long separately using the fast division algorithm and then concatenates them. That approach is about twice as fast as even the improved BaseN encoder from the above however it can’t do arbitrary lengths of bytes since it relies on longs.
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Making a copy of
libdivide4j
is also adding a dependency. I intend to keep this lib self-contained and efficient at the same time, although I know it’s an unattainable ideal as nothing is really independent. Alternatively, I could just copy part oflibdivide4j
, but I’m not confident in adding part of something I can’t fully understand.libdivide
is really magic for me.However, if a developer needs to speed up performance with
libdivide4j
, this can be done by injecting a custom division function. It can be considered an option. I didn’t notice any significant performance difference when pluggingFastDivision
wrapped in aCustomDivider
. I think the JIT compiler does a pretty good job of optimizing it for us. Please take a look at this benchmark I just did:I really appreciate the interest and advice other developers give this library. But I respectfully refuse to add
libdivide4j
as a dependency.I respect that and appreciate it. I wish more libraries did that. I was only pointing it out as you might have thought it was a ton of code but its only one class. If en/decoding was the primary goal of this project I would make a bigger deal about it but its not.
EDIT: To be clear I agree with your decision 👍