BaseN encoding speed improvement

Using libdivide4j, I was able to double the speed of your BaseN encoder for radixes that are not powers of 2.

public class FastDivNRemainderEncoder extends BaseNEncoder {

    private final int radix;
    private final int length;
    private final char padding;

    private static final int UUID_INTS = 4;
    private static final long HALF_LONG_MASK = 0x00000000ffffffffL;
    
    private final FastDivision.Magic magic;

    public FastDivNRemainderEncoder(BaseN base) {
        super(base);
        radix = base.getRadix();
        length = base.getLength();
        padding = base.getPadding();
        magic = FastDivision.magicUnsigned((long) radix);
    }

    @Override
    public String apply(@SuppressWarnings("null") UUID uuid) {

        // unsigned 128 bit number
        int[] number = new int[UUID_INTS];
        number[0] = (int) (uuid.getMostSignificantBits() >>> 32);
        number[1] = (int) (uuid.getMostSignificantBits() & HALF_LONG_MASK);
        number[2] = (int) (uuid.getLeastSignificantBits() >>> 32);
        number[3] = (int) (uuid.getLeastSignificantBits() & HALF_LONG_MASK);

        char[] buffer = new char[length];
        int b = length; // buffer index

        // fill in the buffer backwards using remainder operation
        while (!isZero(number)) {
            final int[] quotient = new int[UUID_INTS]; // division output
            final int remainder = remainder(number, quotient);
            buffer[--b] = alphabet.get(remainder);
            number = quotient;
        }

        // pad the remaining leading positions
        while (b > 0) {
            buffer[--b] = padding;
        }

        return new String(buffer);
    }

    protected int remainder(int[] number, int[] quotient /* division output */) {

        long temporary = 0;
        long remainder = 0;

        for (int i = 0; i < UUID_INTS; i++) {
            temporary = (remainder << 32) | (number[i] & HALF_LONG_MASK);
            // quotient[i] = (int) (temporary / divisor);
            long q = FastDivision.divideUnsignedFast(temporary, magic);
            // remainder = temporary % divisor;
            long r = temporary - q * magic.divider;
            
            quotient[i] = (int) q;
            remainder = (int) r;
        }

        return (int) remainder;
    }

    private boolean isZero(int[] number) {
        return number[0] == 0 && number[1] == 0 && number[2] == 0 && number[3] == 0;
    }
}

I don’t fully understand how the algorithm works, but it speeds up divmod quite a bit for radixes that are not powers of 2. I also don’t understand why you take the modulus (%) after you have already done the division. Maybe the JIT is smart enough to combine the two. Multiplying and then subtracting (as I did here) might speed it up as well.
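To make the multiply-then-subtract idea concrete, here is a minimal sketch that uses only JDK arithmetic. The class name `DivMod` and its method are hypothetical and exist only for this illustration; it does not use libdivide4j and makes no claim about what the JIT does with a separate `%`.

```java
final class DivMod {

    /**
     * Returns {quotient, remainder} of an unsigned 64-bit division.
     * Only one division is performed; the remainder is derived from it.
     */
    static long[] divMod(long dividend, long divisor) {
        long q = Long.divideUnsigned(dividend, divisor);
        // dividend - q * divisor wraps modulo 2^64, so the result is also
        // correct when the dividend is a "negative" (large unsigned) long
        long r = dividend - q * divisor;
        return new long[] { q, r };
    }

    public static void main(String[] args) {
        long[] qr = divMod(1_000_003L, 58L);
        System.out.println("quotient=" + qr[0] + " remainder=" + qr[1]);
    }
}
```

The same subtraction works whether the quotient comes from a plain division or from a precomputed-magic division such as the `FastDivision.divideUnsignedFast` call in the encoder above.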

I also have an in-house encoder that I call Fast57 (though it could really be any BaseN) that divmods the MSB long and the LSB long separately using the fast division algorithm and then concatenates the results. That approach is about twice as fast as even the improved BaseN encoder above; however, it can’t handle arbitrary byte lengths since it relies on longs.
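The Fast57 code itself is not shown in the issue, so the following is only a rough sketch of the "encode each long separately, then concatenate" idea. The class name `TwoLongEncoder`, the 57-character alphabet, and the fixed width of 11 digits per long are all assumptions made for this example, and plain `Long.divideUnsigned` stands in for the fast division.

```java
import java.util.UUID;

final class TwoLongEncoder {

    // Hypothetical 57-character alphabet, chosen only for this sketch
    private static final char[] ALPHABET =
            "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz".toCharArray();
    private static final int RADIX = ALPHABET.length; // 57
    private static final int DIGITS_PER_LONG = 11;    // 57^11 > 2^64

    /** Encodes both halves of the UUID independently and concatenates them. */
    static String encode(UUID uuid) {
        char[] buffer = new char[2 * DIGITS_PER_LONG];
        fill(buffer, 0, uuid.getMostSignificantBits());
        fill(buffer, DIGITS_PER_LONG, uuid.getLeastSignificantBits());
        return new String(buffer);
    }

    // Writes exactly DIGITS_PER_LONG digits of the unsigned value,
    // most significant digit first, padding with the zero digit
    private static void fill(char[] buffer, int offset, long value) {
        for (int i = offset + DIGITS_PER_LONG - 1; i >= offset; i--) {
            long q = Long.divideUnsigned(value, RADIX);
            buffer[i] = ALPHABET[(int) (value - q * RADIX)];
            value = q;
        }
    }
}
```

Because each half is encoded on its own, the output is not the base-57 representation of the 128-bit number as a whole, so it is not interchangeable with the output of the multi-limb encoder above. Working on two 64-bit values avoids the limb-by-limb long division, which is also why this shape does not generalize to arbitrary byte lengths.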

Top GitHub Comments

fabiolimace commented, Mar 14, 2022

Making a copy of libdivide4j would also be adding a dependency. I intend to keep this lib self-contained and efficient at the same time, although I know it’s an unattainable ideal, as nothing is really independent. Alternatively, I could copy just part of libdivide4j, but I’m not confident about adding part of something I can’t fully understand. libdivide is really magic to me.

However, if a developer needs to speed up performance with libdivide4j, this can be done by injecting a custom division function. It can be considered an option. I didn’t notice any significant performance difference when plugging FastDivision wrapped in a CustomDivider. I think the JIT compiler does a pretty good job of optimizing it for us. Please take a look at this benchmark I just did:
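As an illustration of what injecting a division function can look like, here is a self-contained sketch. The `InjectableEncoder` class and its `Divider` interface are hypothetical stand-ins written for this example; they are not the library's actual `CustomDivider` API, whose exact signature is not shown in this thread.

```java
final class InjectableEncoder {

    /** Hypothetical hook: returns {quotient, remainder} for an unsigned dividend. */
    @FunctionalInterface
    interface Divider {
        long[] divide(long dividend);
    }

    private final char[] alphabet;
    private final Divider divider;

    InjectableEncoder(String alphabet, Divider divider) {
        this.alphabet = alphabet.toCharArray();
        this.divider = divider;
    }

    /** Default divider built on plain JDK arithmetic. */
    static Divider plain(long radix) {
        return x -> {
            long q = Long.divideUnsigned(x, radix);
            return new long[] { q, x - q * radix };
        };
    }

    /** Encodes an unsigned long without padding, most significant digit first. */
    String encode(long value) {
        StringBuilder sb = new StringBuilder();
        do {
            long[] qr = divider.divide(value);
            sb.append(alphabet[(int) qr[1]]);
            value = qr[0];
        } while (value != 0);
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        InjectableEncoder decimal = new InjectableEncoder("0123456789", plain(10));
        System.out.println(decimal.encode(255L));
        // A caller can swap in any other strategy, e.g. shifts for radix 16
        InjectableEncoder hex = new InjectableEncoder("0123456789abcdef",
                x -> new long[] { x >>> 4, x & 15 });
        System.out.println(hex.encode(255L));
    }
}
```

With this shape, a caller who wants libdivide4j could supply a divider backed by it without the encoder itself depending on that library.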

Benchmark                      Mode  Cnt     Score     Error   Units
InjectedWithCustomDivider     thrpt    5  3156,578 ±  23,137  ops/ms
IncludedAsDependency          thrpt    5  3225,171 ±  99,144  ops/ms  +2%

I really appreciate the interest and advice other developers give this library. But I respectfully refuse to add libdivide4j as a dependency.

agentgt commented, Mar 15, 2022

I respect that and appreciate it. I wish more libraries did that. I was only pointing it out because you might have thought it was a ton of code, but it’s only one class. If en/decoding were the primary goal of this project I would make a bigger deal about it, but it’s not.

EDIT: To be clear I agree with your decision 👍