In an earlier blog post I described my take on some of the key ideas behind the disruptor pattern.
In this one I'll focus on a tiny detail, one of many scattered around the disruptor code base, as an example of how low-level you get when writing low-latency code.
And here it is:
To calculate the index in the ring buffer's backing array (an int, obviously) from the sequence (a long), the ring buffer performs a mod operation using the bitwise AND operator. This works because the buffer size is a power of two and the mask is the size minus one.
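A minimal sketch of why the AND trick works, assuming a non-negative sequence and a power-of-two buffer size. The class and method names (MaskIndex, indexByMask, indexByMod) are made up for illustration and are not taken from the disruptor code:

```java
public final class MaskIndex
{
    // Hypothetical helper: index via bitwise AND; size must be a power of two.
    public static int indexByMask(final long sequence, final int size)
    {
        final int mask = size - 1; // e.g. size 8 -> mask 0b111
        return (int) (sequence & mask);
    }

    // Reference implementation: index via the remainder operator.
    public static int indexByMod(final long sequence, final int size)
    {
        return (int) (sequence % size);
    }

    public static void main(String[] args)
    {
        final int size = 8;
        for (long sequence = 0; sequence < 20; sequence++)
        {
            if (indexByMask(sequence, size) != indexByMod(sequence, size))
            {
                throw new AssertionError("mismatch at " + sequence);
            }
        }
        System.out.println("mask and mod agree for sequences 0..19");
    }
}
```

With a size that is not a power of two, say 6, the mask would be 5 (binary 101) and the indices 2 and 3 could never be produced, which is why the size has to be a power of two.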
So the code looks something like:
    entries[(int) (sequence & mask)]
Or is it?
Actually, if you look inside RingBuffer.java, it looks like this:
    entries[(int)sequence & mask]
OK, so we dropped the parentheses, which means we cast the long to an int before doing the bitwise operation. In other words, it is a long-to-int cast followed by a bitwise operation on ints, instead of a bitwise operation on longs (with the int mask widened to a long) followed by a cast to int. Is that really worth reading this far, I hear you ask…
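Before answering, a quick check that the two forms really are interchangeable. AND works bit by bit, so the low 32 bits of the result depend only on the low 32 bits of the operands, and the narrowing cast keeps exactly those bits. The sketch below uses a made-up class name (CastOrder) and a handful of sample sequences:

```java
public final class CastOrder
{
    // AND on longs (the int mask is widened to long), then narrow the result.
    public static int maskThenCast(final long sequence, final int mask)
    {
        return (int) (sequence & mask);
    }

    // The cast binds tighter than &, so this narrows first, then ANDs two ints.
    public static int castThenMask(final long sequence, final int mask)
    {
        return (int) sequence & mask;
    }

    public static void main(String[] args)
    {
        final int mask = 1023; // buffer size 1024
        final long[] samples = {0L, 17L, 1023L, 1024L, (1L << 31) + 5, (1L << 40) + 7, Long.MAX_VALUE};
        for (final long sequence : samples)
        {
            if (maskThenCast(sequence, mask) != castThenMask(sequence, mask))
            {
                throw new AssertionError("mismatch at " + sequence);
            }
        }
        System.out.println("both forms give the same index");
    }
}
```

So the choice between the two is purely about what the JVM has to do, not about which slot gets picked.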
Well, again, this is just a tiny detail, one of many, and certainly not the big idea behind the disruptor. BUT look at the results of this small test I wrote.
In the test below I run both options Integer.MAX_VALUE times.
I repeat the test 10 times so that hidden optimizations or loading costs that might skew a single run stand out.
Read the test and try to guess how much worse the "runLong" option will be (results are below the code).
public final class Test
{
    private final int mask = 3;
    private final long value = 17L;

    public static void main(String[] args)
    {
        int ti = 0;
        int tl = 0;
        Test test = new Test();
        // Repeat the whole measurement 10 times.
        for (int i = 0, size = 10; i < size; i++)
        {
            final long t1 = System.currentTimeMillis();
            for (int j = 0, size2 = Integer.MAX_VALUE; j < size2; j++)
            {
                test.runInt(test.value);
            }
            final long t2 = System.currentTimeMillis();
            for (int j = 0, size2 = Integer.MAX_VALUE; j < size2; j++)
            {
                test.runLong(test.value);
            }
            final long t3 = System.currentTimeMillis();
            tl += (t3 - t2);
            ti += (t2 - t1);
            System.out.println("time int=" + ti + " - time long=" + tl);
            ti = tl = 0;
        }
    }

    // Bitwise AND on longs (mask is widened to long), then cast the result to int.
    private int runLong(final long l)
    {
        return (int) (l & mask);
    }

    // Cast to int first, then bitwise AND on ints.
    private int runInt(final long l)
    {
        return (int) l & mask;
    }
}
And here are the results, run on my laptop (a not-so-fast 64-bit Linux machine), with times in milliseconds:
time int=5 - time long=7458
time int=9 - time long=6474
time int=0 - time long=6462
time int=0 - time long=7293
time int=0 - time long=6367
time int=0 - time long=6395
time int=0 - time long=6318
time int=0 - time long=6363
time int=0 - time long=6353
time int=0 - time long=6323
Remember, each of these millisecond totals covers 2 to the power of 31 minus one repetitions of the operation (Integer.MAX_VALUE), but it is still a huge difference, and in low-latency code like the disruptor it all adds up.
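One thing worth checking before drawing conclusions: a time of 0 ms for roughly two billion iterations suggests the JIT may have removed the int loop altogether, since neither loop uses the value it computes. The sketch below is one way to probe that, not a rigorous benchmark. The class and method names (MaskBench, sumInt, sumLong) are made up for illustration, and the default iteration count is an assumption chosen to keep the run short (pass Integer.MAX_VALUE as an argument to match the test above). It feeds a changing input to each expression and prints the accumulated result so the work cannot simply be discarded. A harness such as JMH would be the more rigorous choice.

```java
public final class MaskBench
{
    // Sums (int) (l & mask) over n consecutive inputs so the result is observable.
    public static long sumLong(final long start, final int mask, final int n)
    {
        long sum = 0;
        for (int j = 0; j < n; j++)
        {
            sum += (int) ((start + j) & mask);
        }
        return sum;
    }

    // Same loop with the cast applied before the AND.
    public static long sumInt(final long start, final int mask, final int n)
    {
        long sum = 0;
        for (int j = 0; j < n; j++)
        {
            sum += (int) (start + j) & mask;
        }
        return sum;
    }

    public static void main(String[] args)
    {
        final int mask = 3;
        // Assumed default; the original test uses Integer.MAX_VALUE.
        final int n = args.length > 0 ? Integer.parseInt(args[0]) : 100_000_000;
        for (int i = 0; i < 5; i++)
        {
            final long t1 = System.nanoTime();
            final long si = sumInt(17L, mask, n);
            final long t2 = System.nanoTime();
            final long sl = sumLong(17L, mask, n);
            final long t3 = System.nanoTime();
            System.out.println("time int=" + (t2 - t1) / 1_000_000
                + " - time long=" + (t3 - t2) / 1_000_000
                + " (sums " + si + ", " + sl + ")");
        }
    }
}
```

If the two columns come out close to each other with this version, the gap above is more likely an artifact of how the original loops were compiled than a real cost of the long AND.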