TL;DR: Yesterday I mentioned on Twitter that I’d found a bad performance problem when writing to a large ByteArrayOutputStream in Java. After some digging, it appears there’s a bug in JDK6 that doesn’t affect correctness, but does cause performance to nosedive once a ByteArrayOutputStream gets large. This post explains why.
Two of Impala’s server processes have both C++ and Java components (for reasons both historic and pragmatic). We often need to pass data structures from C++ to Java and vice versa, and mapping the C++ representation onto a Java one via JNI is too painful to contemplate. So instead we take advantage of the fact that Thrift is very good at generating equivalent data structures in different languages, and make every parameter to methods on the JNI boundary a serialised Thrift structure. That is, it’s a byte array that Thrift on both sides knows how to convert into a Thrift structure. So we pass byte arrays back and forth, and use Thrift to convert them to language-readable data structures. This works pretty well. (To see exactly how, start by reading frontend.cc and JniFrontend.java). We pay an extra copy or two, plus the CPU overhead of the serialisation, but the benefits in terms of usability and maintainability of the interface vastly outweigh some pretty nominal performance hits.
If the performance hit isn’t nominal, however, we have a problem. And this is what we observed earlier this week: one of the JNI methods was trying to pass a huge data structure back from Java to C++. Doing so was taking a long time - on the order of minutes. What was particularly of interest was that the performance dropped off a cliff: a data structure half the size was happily serialising in about 500ms. So we have a non-linear relationship between the size of the input and the cost of serialising it. We can’t really absorb that cost, so we had to understand the problem.
So how did we get there? Thrift’s Java serialisation implementation works by having a TSerializer object, which contains a ByteArrayOutputStream. Serialisation happens by calling write() on a Thrift structure, with the TSerializer’s ByteArrayOutputStream as an argument. The Thrift structure then walks its members and writes an object header and then the serialised data for each field in turn. The result is lots of small write() calls to the ByteArrayOutputStream.
The first thing was to connect a profiler (YourKit, but honestly taking repeated stack traces with SIGQUIT would have worked). During the long serialisation period, almost all the time was spent inside java.util.Arrays.copyOf, in a method that writes bytes to a ByteArrayOutputStream. Progress was being made - the item being written to the ByteArrayOutputStream was changing - but each field was taking an unreasonably long time to write.
ByteArrayOutputStream is not necessarily initialised with any estimate of the ultimate size of the byte array it wraps, so it needs a mechanism to resize when more space is required. The source for ByteArrayOutputStream.write(byte[], int, int) in JDK6 shows the (very standard) strategy it uses.
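For reference, here is a minimal self-contained sketch reconstructing that method (the wrapper class name and its initial capacity are mine; buf and count mirror the real stream’s internal fields):

```java
import java.util.Arrays;

// A sketch reconstructing the JDK6 implementation of
// java.io.ByteArrayOutputStream.write(byte[], int, int). The wrapper
// class is mine; buf and count mirror the real stream's fields.
class JDK6Baos {
    byte[] buf = new byte[32];  // the wrapped byte array
    int count = 0;              // number of valid bytes in buf

    public synchronized void write(byte b[], int off, int len) {
        // Parameter validation.
        if ((off < 0) || (off > b.length) || (len < 0) ||
            ((off + len) > b.length) || ((off + len) < 0)) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return;
        }
        // newcount is the size of the stream after this write completes.
        int newcount = count + len;
        if (newcount > buf.length) {
            // Grow to the larger of double the current capacity and the
            // size this write needs, copying the existing contents over.
            buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
        }
        System.arraycopy(b, off, buf, count, len);
        count = newcount;
    }
}
```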
The parameter validation at the start of the method can be ignored from here on. The interesting part is the computation of newcount, the size of the array after the write completes: if it is larger than the current capacity, buf.length, the buffer has to grow before the write can proceed.

That growth is done by Arrays.copyOf(), which creates a new array containing all the bytes from the original array, but with a larger size. The size of the new array is the maximum of twice the current length (buf.length << 1) and the requested size of the array after the write completes (so that a large write that more than doubles the current size of the array can be accommodated). Performing this copy is expensive, but since the size of the array should grow exponentially, frequent copies are hopefully unlikely. C++'s std::vector does the same thing.

After the buffer has grown, the method copies in the argument with System.arraycopy() and updates count, the tracked number of bytes in the array.
My working hypothesis was that copyOf() was being called on every write() (since that matched up with what the profiler was telling us). The source code tells us the only way that can happen is if newcount is always larger than buf.length. This leads to two possibilities: either newcount is getting large quickly, or buf.length is getting large slowly. The former seems unlikely - Thrift serialisation works by writing many small byte arrays - so to support my hypothesis, buf.length had to be growing slowly, such that the copyOf() branch was being taken much more frequently than we expected.
A session with JDB (a terrible, terrible debugger) confirmed this. During the slow serialisation period, the size of the array increased on every write only by the amount required to contain the write in progress. On every write of say 2 bytes, the array size would increase by exactly those 2 bytes and a copy would be taken. The array itself was about 1GB in size, so the copy was really expensive.
This leads us to the bug. The size of the array is determined by Math.max(buf.length << 1, newcount). Ordinarily, buf.length << 1 returns double buf.length, which would always be much larger than newcount for a 2-byte write. Why was it not?
The problem is that for all integers larger than Integer.MAX_VALUE / 2, shifting left by one place causes overflow, setting the sign bit. The result is a _negative_ integer, which is always less than newcount. So for all byte arrays of 1073741824 bytes (i.e. one GB) or larger, any write causes the array to resize, and only to exactly the size required.
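The wrap-around is easy to demonstrate in isolation (a small standalone demo, not code from the original investigation):

```java
// Demonstrates the overflow in the JDK6 growth computation: once
// buf.length reaches 2^30, doubling it via << 1 wraps to a negative
// number, so Math.max always picks newcount instead.
public class ShiftOverflow {
    public static void main(String[] args) {
        int bufLength = 1 << 30;       // 1073741824 bytes, i.e. one GB
        int newcount = bufLength + 2;  // buffer size needed after a 2-byte write
        System.out.println(bufLength << 1);                     // -2147483648: overflowed
        System.out.println(Math.max(bufLength << 1, newcount)); // 1073741826: grows by just 2 bytes
    }
}
```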
You could argue that this is by design for the following reason: the maximum size of any array in Java is Integer.MAX_VALUE (minus a few bytes for preamble). Any array larger than Integer.MAX_VALUE / 2 bytes would exceed that limit when doubled. However, ByteArrayOutputStream.write() could handle this case by setting the new length to Integer.MAX_VALUE whenever buf.length > Integer.MAX_VALUE / 2, giving the array the maximum chance to grow with few copies.
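A sketch of what that saturating growth computation could look like (hypothetical code, not the actual JDK implementation):

```java
// Hypothetical fix for the JDK6 growth computation: if doubling
// bufLength would overflow past Integer.MAX_VALUE, grow straight to
// Integer.MAX_VALUE instead of falling back to the exact size needed.
class SaturatingGrowth {
    static int newCapacity(int bufLength, int newcount) {
        int doubled = (bufLength > Integer.MAX_VALUE / 2)
                ? Integer.MAX_VALUE   // doubling would overflow; saturate at the limit
                : bufLength << 1;
        return Math.max(doubled, newcount);
    }
}
```

With this, a 1GB buffer grows once to the maximum array size and then never needs copying again, instead of copying on every subsequent write.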
The true fix is for us to cut down the size of the object we want to marshal, or to come up with some less expensive way of serialising it (we could use a different TSerializer implementation, for example). Still, it’s an unfortunate degradation in a fairly commonly used class, even if there are other, better ways of achieving the same thing.
In fact, JDK7 ‘fixed’ the issue by correctly dealing with the overflow, but if the resulting doubled array length would be larger than Integer.MAX_VALUE, an exception can be thrown instead. You can check by running this code on both JDK6 and JDK7:
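(A sketch of such a check - the exact snippet may differ. Note that crossing the 1GB threshold needs a heap of several GB, e.g. -Xmx4g.)

```java
import java.io.ByteArrayOutputStream;

// Writes ~1.2GB into a ByteArrayOutputStream in 64KB chunks. On JDK6,
// throughput collapses once the internal buffer passes 1GB, because
// every subsequent write triggers a full array copy. On JDK7 the
// overflow is handled, but the grow-to-maximum allocation can itself
// fail with OutOfMemoryError if the heap cannot accommodate it.
public class BaosGrowthDemo {
    // Fills a stream with totalBytes of zeroes, 64KB at a time.
    static ByteArrayOutputStream fill(long totalBytes) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] chunk = new byte[64 * 1024];
        for (long written = 0; written < totalBytes; written += chunk.length) {
            baos.write(chunk, 0, chunk.length);
        }
        return baos;
    }

    public static void main(String[] args) {
        long target = 1200L * 1024 * 1024; // ~1.2GB, past the 1GB threshold
        long start = System.nanoTime();
        ByteArrayOutputStream baos = fill(target);
        System.out.printf("wrote %d bytes in %.1fs%n",
                baos.size(), (System.nanoTime() - start) / 1e9);
    }
}
```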