Richard Searle's Blog

Thoughts about software

Java 7u6 String performance regression

Posted by eggsearle on September 13, 2012

I recently updated to Java 7u6 and found a more than tenfold degradation in the performance of some Scala parser combinator code. Some searching turned up a similar experience and a bug report.

Comparing the 7u5 and 7u6 code for String shows a significant change in the implementation.

The older code used a char[] value together with an offset and a count. The substring/subSequence operations created a new instance that shared the value and changed only the indices. The new implementation has only the char[] value, so every such operation must create a new copy!
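To make the difference concrete, here is a minimal sketch. It is not the JDK source: `SharedView` and `copyingSubstring` are hypothetical names that only mimic the two layouts described above. The counter tallies how many chars get copied when some code repeatedly asks for "the rest of the input" one character at a time, which is roughly the access pattern I would assume for a parser.

```java
import java.util.Arrays;

// Illustrative only: hypothetical classes, not the JDK implementation.
class SubstringCostSketch {

    // Mimics the older layout: a shared char[] plus an offset and a count.
    static final class SharedView {
        final char[] value;
        final int offset;
        final int count;

        SharedView(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        // No characters are copied; only the indices change.
        SharedView substring(int begin) {
            return new SharedView(value, offset + begin, count - begin);
        }
    }

    // Mimics the newer layout: every substring gets its own array.
    static char[] copyingSubstring(char[] value, int begin) {
        return Arrays.copyOfRange(value, begin, value.length);
    }

    // Drops one leading char at a time and counts the chars copied along the way.
    static long charsCopiedByCopying(int n) {
        char[] rest = new char[n];
        long copied = 0;
        while (rest.length > 0) {
            rest = copyingSubstring(rest, 1);
            copied += rest.length;
        }
        return copied;
    }

    public static void main(String[] args) {
        SharedView whole = new SharedView("hello world".toCharArray(), 0, 11);
        SharedView tail = whole.substring(6);
        System.out.println(tail.value == whole.value);  // true: same array, nothing copied
        System.out.println(charsCopiedByCopying(1000)); // 499500, i.e. n(n-1)/2 for n = 1000
    }
}
```

With sharing, the same loop copies nothing, so the cost of the copying variant grows quadratically with the input length while the shared variant stays linear.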

This change is not referenced in the release notes, and the comments within the code do not provide any explanation.

The Scala parser fortunately uses CharSequence, allowing the creation of a decorator over the String that restores the original behavior. 



6 Responses to “Java 7u6 String performance regression”

  1. Sanjay said

    Hi, I’m the author of the ‘experience’ and ‘bug report’ you quoted, and am still looking 😦 for a simple workaround. Would you mind explaining the last line of your post in a little more detail please? My tests (see code below) seem to indicate that parsers created with Scala combinators can’t avoid this problem. The program prints out the time it takes to parse 10,000 numbers off the provided string, and the outputs are hugely different under 7u5 and 7u6.

    Thank you for any suggestions,

    package one
    import util.parsing.combinator.{RegexParsers, Parsers}
    import util.parsing.input.CharArrayReader
    object myParsers extends RegexParsers {
      val nbr = regex("\\d+".r)
      val nbrs = rep1(nbr)
      val stringBuffer = new StringBuilder
      def main(args: Array[String]) {
        // Build the input: the numbers 1 to 10000, separated by spaces
        for (i <- 1 to 10000)
          stringBuffer.append(i.toString).append(" ")
        val parserInput = stringBuffer.toString
        // Time a single parse of the whole input, in milliseconds
        val t0 = System.currentTimeMillis
        val result = parseAll(nbrs, parserInput)
        val dt = System.currentTimeMillis - t0
        println("%s (%d): %s%n".format(result.successful, result.get.size, dt))
      }
    }

    • eggsearle said

      Use an implementation of CharSequence whose subsequence operation does not create a copy.

      For an initial test, I used javax.swing.text.Segment.
      That is not appropriate for production code, hence the Decorator.

      The Decorator code is not immediately to hand.
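For anyone who wants to repeat that initial test, a minimal sketch of the Segment approach might look like the following. The `wrap` helper and the sample text are made up for illustration. Segment exposes its `array`, `offset` and `count` as public mutable fields, which is one reason to treat this as an experiment only.

```java
import javax.swing.text.Segment;

// Illustrative only: wraps text in a Segment so that subSequence shares the array.
class SegmentSketch {

    // Hypothetical helper: copies the text once into a char[] and wraps it.
    static Segment wrap(String text) {
        char[] chars = text.toCharArray();
        return new Segment(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        Segment whole = wrap("123 456 789");
        CharSequence rest = whole.subSequence(4, whole.length());
        System.out.println(rest);                                  // 456 789
        System.out.println(((Segment) rest).array == whole.array); // true: no copy was made
    }
}
```

The wrapped value can then be passed wherever a CharSequence is accepted.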

    • eggsearle said

      public class StringDecorator implements CharSequence {

        private final String contents;
        private final int offset;
        private final int length;

        public StringDecorator(String contents) {
          this.contents = contents;
          this.offset = 0;
          this.length = contents.length();
        }

        // Used by subSequence to create a view onto the same String
        private StringDecorator(String contents, int offset, int length) {
          this.contents = contents;
          this.offset = offset;
          this.length = length;
        }

        public int length() {
          return length;
        }

        public char charAt(int index) {
          return contents.charAt(index + offset);
        }

        // Shares the underlying String; only the indices change
        public CharSequence subSequence(int start, int end) {
          return new StringDecorator(contents, offset + start, end - start);
        }

        // The only operation that copies characters
        public String toString() {
          return contents.substring(offset, offset + length);
        }
      }


    • eggsearle said

      Use the StringDecorator as follows

      val parserInput = new StringDecorator(stringBuffer.toString)

      Times in milliseconds:

      7u5 – 47
      7u6 – 422
      7u6 with decorator – 47

  2. […] The saving grace comes from the observation by Richard Searle that “The Scala parser fortunately uses CharSequence”. (See his posting Java 7u6 String performance regression.) […]
