Richard Searle's Blog

Thoughts about software

scalaz contrib and Futures

Posted by eggsearle on July 24, 2013

The canonical combined Future example is

def c(i: Int): Future[Int] = ...
val f1 = c(5)
val f2 = c(2)
val r: Future[Int] = for {
  i1 <- f1
  i2 <- f2
} yield i1 + i2

Creating the Futures within the for-comprehension would be much tidier, but then they would execute sequentially, defeating the entire purpose.
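The difference in ordering can be made visible with a small sketch that uses only the standard library. The names (`FutureOrdering`, `run`, `first`, `second`) are hypothetical and not taken from the post; the first Future is a manually completed Promise so that the result does not depend on timing.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureOrdering {
  // Returns (did the second computation start while the first was pending?, sum).
  def run(createInsideFor: Boolean): (Boolean, Int) = {
    val first = Promise[Int]()                 // stands in for c(5), completed by hand
    val started = new AtomicBoolean(false)
    def second(): Future[Int] = Future { started.set(true); 2 } // stands in for c(2)

    val sum: Future[Int] =
      if (createInsideFor)
        // second() is only invoked once first.future has produced a value
        for { i1 <- first.future; i2 <- second() } yield i1 + i2
      else {
        val f2 = second()                      // starts immediately
        Await.ready(f2, 5.seconds)             // it finishes without waiting for `first`
        for { i1 <- first.future; i2 <- f2 } yield i1 + i2
      }

    val startedEarly = started.get             // sampled while `first` is still pending
    first.success(5)
    (startedEarly, Await.result(sum, 5.seconds))
  }
}
```

Under these assumptions, `FutureOrdering.run(createInsideFor = true)` reports that the second computation had not started while the first was pending, whereas `run(createInsideFor = false)` reports that it had; both yield the same sum.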

The scalaz contrib project provides support for Scala 2.10 futures, allowing more expressive coding for some tasks.

This applicative functor expression is equivalent to the above code.

val r: Future[Int] = (c(5) |@| c(2))(_ + _)



Posted in Scala, scalaz

Defaulting type with Scala implicits

Posted by eggsearle on July 13, 2013

This Stack Overflow answer shows how a default type can be specified, using some interesting side effects of the Scala implicit resolution rules.

Scala 2.10.1 allows the code to be further simplified:

class DefaultsTo[A, B]
object DefaultsTo {
  implicit def default[B] = new DefaultsTo[B, B]
  implicit def overrideDefault[A, B] = new DefaultsTo[A, B]
}

A type written as T DefaultsTo Node looks rather strange, but it is simply the infix form of DefaultsTo[T, Node].
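A minimal sketch of how such a default might be used, assuming Scala 2.10.1 or a later 2.x compiler. `DefaultsToDemo`, `Node` and `chosen` are hypothetical names introduced only for illustration; the `ClassTag` is there solely to make the selected type observable.

```scala
import scala.reflect.ClassTag

object DefaultsToDemo {
  class DefaultsTo[A, B]
  object DefaultsTo {
    // Preferred when A is not fixed by the caller: forces A to equal B.
    implicit def default[B]: DefaultsTo[B, B] = new DefaultsTo[B, B]
    // Used when the caller names A explicitly.
    implicit def overrideDefault[A, B]: DefaultsTo[A, B] = new DefaultsTo[A, B]
  }

  class Node // hypothetical default type

  // `T DefaultsTo Node` is the infix spelling of DefaultsTo[T, Node].
  def chosen[T](implicit e: T DefaultsTo Node, ct: ClassTag[T]): Class[_] =
    ct.runtimeClass
}
```

With these definitions, `DefaultsToDemo.chosen` is expected to resolve `T` to `Node`, while `DefaultsToDemo.chosen[String]` keeps the explicitly requested `String`.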


Posted in Scala

Akka IO using byte[]

Posted by eggsearle on February 2, 2013

Used the code from and to send data serialized using ProtoBufs

These examples are all string based, primarily to make the code easy to test with curl.

The string representation of the length can be replaced with its 4-byte binary representation.
The payload is still a byte[], derived from a String for convenience.


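case s: String => handle.foreach {
  h =>
    val bytes = s.getBytes("US-ASCII")
    val bb = ByteBuffer.allocate(4)
    bb.putInt(bytes.length) // 4-byte big-endian length prefix
    bb.flip()               // make the prefix readable before wrapping it
    h write ByteString(bb)
    h write ByteString(bytes)
}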



def readMessage: IO.Iteratee[String] =
  for {
    lengthBytes <- take(4)                  // fixed-size length prefix
    len = lengthBytes.asByteBuffer.getInt() // big-endian, matching the writer
    bytes <- take(len)
  } yield bytes.decodeString("US-ASCII")
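The framing itself can be tried without Akka. The following is a minimal sketch using only `java.nio`; `Framing`, `encode` and `decode` are hypothetical helper names, not part of the post's code.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.US_ASCII

object Framing {
  // Prefix the payload with its length as a 4-byte big-endian integer.
  def encode(s: String): Array[Byte] = {
    val payload = s.getBytes(US_ASCII)
    ByteBuffer.allocate(4 + payload.length)
      .putInt(payload.length)
      .put(payload)
      .array()
  }

  // Read the 4-byte length, then exactly that many payload bytes.
  def decode(frame: Array[Byte]): String = {
    val bb = ByteBuffer.wrap(frame)
    val len = bb.getInt()
    val payload = new Array[Byte](len)
    bb.get(payload)
    new String(payload, US_ASCII)
  }
}
```

A round trip such as `Framing.decode(Framing.encode("hello"))` should return the original string, and the first four bytes of the frame carry the payload length.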



Posted in Uncategorized

LMAX Disruptor and h.264 processing

Posted by eggsearle on January 13, 2013

A recent project required the decryption of an MPEG TS multicast stream.

The first attempt simply looped, reading from a MulticastChannel, decrypting with a Cipher and writing via a DatagramChannel.
The decryption turned out to have negligible cost, which was surprising.

Unfortunately, this simple implementation dropped too many packets. The resultant video was unwatchable, being riddled with artifacts.

Some buffering was obviously required.

The standard j.u.c classes did not resolve the problem. Packets were still lost, perhaps due to the ongoing garbage collection.

The LMAX Disruptor provides a ring buffer containing preallocated elements, eliminating GC.
Its design provides very low and consistent latency, orders of magnitude better than ArrayBlockingQueue.

A simple two-thread design turned out to suffice:

  1. Blocking read of a packet into a ByteBuffer in the ring buffer.
  2. Decrypt the ByteBuffer in-place and transmit from the ring buffer.

This approach generates zero garbage.
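The idea of preallocated slots shared by one reading thread and one transmitting thread can be sketched with the standard library alone. This is not the Disruptor API; `PacketRing`, `offer` and `poll` are hypothetical names, and the sketch omits the wait strategies and cache-line padding that give the Disruptor its latency characteristics.

```scala
import java.nio.ByteBuffer
import java.util.concurrent.atomic.AtomicLong

// Single-producer/single-consumer ring of preallocated buffers.
final class PacketRing(slots: Int, packetSize: Int) {
  require(slots > 0 && (slots & (slots - 1)) == 0, "slots must be a power of two")

  private val mask = slots - 1
  // All buffers are allocated once, so steady-state operation creates no garbage.
  private val buffers = Array.fill(slots)(ByteBuffer.allocate(packetSize))
  private val published = new AtomicLong(0) // slots filled by the reading thread
  private val consumed = new AtomicLong(0)  // slots released by the transmitting thread

  /** Producer side: fill the next free slot in place; false when the ring is full. */
  def offer(fill: ByteBuffer => Unit): Boolean = {
    val seq = published.get
    if (seq - consumed.get >= slots) false
    else {
      val buf = buffers((seq & mask).toInt)
      buf.clear()
      fill(buf)
      buf.flip()
      published.set(seq + 1) // volatile write makes the slot visible to the consumer
      true
    }
  }

  /** Consumer side: process the oldest filled slot in place; false when empty. */
  def poll(use: ByteBuffer => Unit): Boolean = {
    val seq = consumed.get
    if (seq >= published.get) false
    else {
      use(buffers((seq & mask).toInt))
      consumed.set(seq + 1) // hands the slot back to the producer
      true
    }
  }
}
```

In the design described above, the reading thread would call `offer` with the channel read and the second thread would call `poll` to decrypt and transmit; each counter is only ever written by one thread, which is what makes the simple get/set pairs sufficient here.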

Any other design caused packet loss. That includes separating the decryption into a third thread.

The scheduling of the reading thread appears to be the primary factor in avoiding packet loss. The system thus works best when there are more free cores than active threads. Further improvement would require manipulation of the scheduler, e.g. by pinning the thread to a specific core.

The result was very pleasing, requiring fewer than 100 lines of source code.
A quad-core 3.5 GHz Xeon suffices to handle four 1080p 30 Hz signals encrypted using AES-128.


Posted in Uncategorized

Play2 websocket performance

Posted by eggsearle on December 9, 2012

The realtime UI design was recreated using websockets and Play2.

Latency is ~ 1.6 ms and CPU load ~85%.

The latency is 10x better than the alternatives, as would be expected.
The CPU load did not increase 10x, which is helpful.

Posted in Uncategorized

Server Sent Events and Akka IO performance

Posted by eggsearle on November 11, 2012

The experiment was extended with an Akka IO implementation, using as the basis.

Latency is 16 ms and CPU load is 12%.
The former is a little worse than the alternatives and the load falls between them.
The IO implementation is merely a sample, so there is likely some performance improvement to be had.


Posted in Akka, Scala

Akka HTTP Server example

Posted by eggsearle on November 10, 2012

I was unable to find the source code anywhere except scattered through the documentation page.

Posted in Uncategorized

Play 2 Server Sent Events and Ajax performance

Posted by eggsearle on October 20, 2012

Consider a “dumb client” Single Page App where the domain logic resides on the server.
The UI events are delivered via Ajax, with the server responses flowing over SSE to update the UI.
This minimizes the size of the JavaScript code while (hopefully) retaining responsive interaction.

The key questions are thus latency and server load.

A simple experiment wires an Ajax call to the SSE event handler, where the call delivers a new data element over the SSE connection, forming an infinite loop.
That provides a simple measurement of the total request/response chain.

Implementations based on Play 2 and on Jetty Continuations were both tested, using Google Chrome on Fedora Core 17, running on an AMD Phenom II X6 1100T.

Implementation   Latency (ms)   CPU Load (%)   JRE Load (%)
Play             16             52             7
Jetty            12             5              1

CPU Load is reported by top and JRE Load by JConsole.

Play uses Iteratees and NIO, which are both designed for concurrency rather than raw throughput.
A 10x difference in CPU load is a surprisingly high cost to pay for the concurrency and expressiveness of the Play API.

The proposed use case would have fewer than a score of clients, which would be feasible even with a vanilla blocking Servlet implementation.

Posted in play, Scala

Java 7u6 String performance regression

Posted by eggsearle on September 13, 2012

I recently updated to Java 7u6 and found a more than 10x degradation in the performance of some Scala parser combinator code. Some searching turned up a similar experience and a bug report.

Comparing the 7u5 and 7u6 code indicates a significant change in the implementation.

The older code used a char[] value, with indices. The substring/subsequence operations created a new instance that shared the value, and only changed the indices. The new implementation only has the char[] value, requiring a new copy to be created for every operation!

This change is not referenced in the release notes and the comments within the code do not provide any explanation.  

The Scala parser fortunately uses CharSequence, allowing the creation of a decorator over the String that restores the original behavior. 
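A minimal sketch of such a decorator, assuming the goal is only to avoid copying on `subSequence`. `StringView` is a hypothetical name and this is not the author's actual implementation.

```scala
// A view over a String that shares the original characters and only
// tracks offsets, similar in spirit to the pre-7u6 String layout.
final class StringView(source: String, start: Int, end: Int) extends CharSequence {
  require(0 <= start && start <= end && end <= source.length, "invalid range")

  def this(source: String) = this(source, 0, source.length)

  override def length(): Int = end - start

  override def charAt(index: Int): Char = {
    if (index < 0 || index >= length())
      throw new IndexOutOfBoundsException(index.toString)
    source.charAt(start + index)
  }

  // No characters are copied here; only the offsets change.
  override def subSequence(from: Int, until: Int): CharSequence = {
    if (from < 0 || until > length() || from > until)
      throw new IndexOutOfBoundsException(s"$from..$until")
    new StringView(source, start + from, start + until)
  }

  // Copies once, and only when a real String is requested.
  override def toString: String = source.substring(start, end)
}
```

Code that accepts a `CharSequence` can be handed `new StringView(input)` in place of the raw String, so that repeated `subSequence` calls stay cheap regardless of how `String.substring` is implemented.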


Posted in Uncategorized

Unusual java httpserver race condition

Posted by eggsearle on August 29, 2012

The JDK httpserver provides a simple implementation that can easily be applied to unit tests.

The following example would generate a response containing responseBytes.
It might be expected that the referencing code would only see the response when exchange.close() is called.
In reality, the response can be seen as soon as the write executes!

That does not matter in a real web server since the client cannot be aware of the internal processing of the server.
Unit tests intrinsically couple the client and server together so as to progress through the test steps and validate the state of both parties.

This race condition caused unit tests to randomly fail, which is always hard to diagnose.

private static class MockHandler implements HttpHandler{
   public void handle(HttpExchange exchange) throws IOException {
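One way a test could avoid depending on this ordering is to wait on an explicit signal that the handler has finished, rather than on the arrival of the response. The sketch below is an assumption about how that might look, not the approach taken in the original tests; `LatchedHandlerDemo`, `LatchedHandler` and `run` are hypothetical names.

```scala
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
import java.net.{InetSocketAddress, URL}
import java.util.concurrent.{CountDownLatch, TimeUnit}

object LatchedHandlerDemo {
  val responseBytes: Array[Byte] = "ok".getBytes("US-ASCII")

  // Handler that signals only after close() has returned.
  final class LatchedHandler(done: CountDownLatch) extends HttpHandler {
    override def handle(exchange: HttpExchange): Unit = {
      exchange.sendResponseHeaders(200, responseBytes.length.toLong)
      exchange.getResponseBody.write(responseBytes) // the client may see the body from here on
      exchange.close()
      done.countDown() // server-side processing is finished
    }
  }

  // Returns the response body and whether the handler completed within the timeout.
  def run(): (String, Boolean) = {
    val done = new CountDownLatch(1)
    val server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0)
    server.createContext("/", new LatchedHandler(done))
    server.start()
    try {
      val in = new URL(s"http://127.0.0.1:${server.getAddress.getPort}/").openStream()
      val body =
        try scala.io.Source.fromInputStream(in, "US-ASCII").mkString
        finally in.close()
      // Wait for the handler itself before asserting on any server-side state.
      (body, done.await(5, TimeUnit.SECONDS))
    } finally server.stop(0)
  }
}
```

The test thread treats the latch, not the received response, as the point at which the server's state may be validated.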

Posted in Uncategorized