Tuesday, September 9, 2014

Type Classes, Implicit Parameters and Instance Equality

In Scala the concept of implicit parameters can be used to, among other things, emulate Haskell type classes. One of the key differences between the two approaches is that the Haskell compiler statically guarantees that there is always at most one instance of a type class for a specific type, while in Scala there can be any number of instances, selected according to some quite complex scope-searching rules. This guarantee is sometimes brought up as an advantage of Haskell type classes: you can, for example, easily write a union function for two sets with the same element type and be certain that both sets use the same ordering instance. In Scala you would typically store the ordering instance in the set type, but since two sets can refer to different instances that have the same static type, the compiler can't statically check whether it's safe to call union on the sets. One option is of course to add a runtime equality check of the ordering instances, but it would obviously be better to have a static check. In this post I'll describe a solution that gets the same static type safety in Scala as in Haskell.

Implicit Instance Construction

The first thing to notice is that there are two basic ways to create an implicit instance in Scala, either as an immutable value or as a function which can take other implicit instances as arguments, for example:

  // Very incomplete Ord type class as example
  case class Ord[T](v: String)

  implicit val intOrd = Ord[Int]("Int")
  
  implicit def listOrd[T](implicit ord: Ord[T]) = Ord[List[T]]("List[" + ord.v + "]")

In this example when searching for an implicit Ord[Int] the Scala compiler will simply use the intOrd instance directly. However, when searching for an implicit of type Ord[List[Int]] things get a bit more complicated. In this case the Scala compiler figures out from the static types that it can create an instance by calling listOrd(intOrd), so it generates this call in the compiled code where the instance is requested. Further recursive calls can be generated as needed, so when searching for an implicit Ord[List[List[Int]]] it will generate the call listOrd(listOrd(intOrd)), and so on.
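
To make the generated call chain concrete, here is a minimal sketch (using implicitly from the standard library) of what the compiler effectively inserts; the val names are only for illustration:

  // With intOrd and listOrd in scope, these two lines are essentially equivalent:
  val byTheCompiler = implicitly[Ord[List[List[Int]]]]  // compiler inserts listOrd(listOrd(intOrd))
  val byHand        = listOrd(listOrd(intOrd))          // the same call chain written out manually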

Note that these function calls are performed every time an implicit instance is needed at runtime; there is no memoization or caching. So, while for Ord[Int] the same instance is always used (as it's a constant value), multiple Ord[List[Int]] instances will be created at runtime.
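
A small sketch that demonstrates this, again assuming the definitions above are in scope:

  // The constant value is reused, but each derived instance is a fresh object:
  implicitly[Ord[Int]] eq intOrd                            // true
  implicitly[Ord[List[Int]]] eq implicitly[Ord[List[Int]]]  // false: listOrd(intOrd) is called twice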

Furthermore, implicit values and functions of the exact same types (but possibly with different implementations as in the example below) can be defined in other modules (or objects as they are called in Scala):

  object o1 {
    implicit val intOrd = Ord[Int]("o1.Int")

    implicit def listOrd[T](implicit ord: Ord[T]) = Ord[List[T]]("o1.List[" + ord.v + "]")
  }

  object o2 {
    implicit val intOrd = Ord[Int]("o2.Int")

    implicit def listOrd[T](implicit ord: Ord[T]) = Ord[List[T]]("o2.List[" + ord.v + "]")
  }

Considering all this it might seem complicated to statically check that two implicit instances are behaviorally equal and interchangeable. The key insight is that as long as no side effects are performed inside the implicit functions (and it's my strong recommendation not to perform any!), two separate instances created through the same implicit call chain generated by the Scala compiler will behave identically (except for instance identity checks, of course). So, to statically check that two instances are equal (but not necessarily identical), all we need to do is track the implicit call chain in the type of the implicit instance. Let's get to it...


Phantom Types to the Rescue

Let's start by adding a phantom type, P, to the Ord type:

  case class Ord[P, T](v: String)

This type is only used for static equality checks, e.g. if a: Ord[A, B] and b: Ord[A, B] then a and b are equal and can be used interchangeably (note that with the previous definition, Ord[T], this was not the case). Note that the P type can also be written as a type member in Scala, but doing that gave me problems with the type inference so I won't explore that road in this article.

Now we can easily define a unique type for the implicit values in each module object:

  object o1 {
    implicit val intOrd = Ord[this.type, Int]("o1.Int")
  }

  object o2 {
    implicit val intOrd = Ord[this.type, Int]("o2.Int")
  }

The this.type expression used for the P type parameter evaluates to the singleton type of the module object containing the expression (i.e. o1.type and o2.type respectively in this case). This means that o1.intOrd and o2.intOrd no longer have the same static type (Ord[o1.type, Int] vs Ord[o2.type, Int]), which is exactly what we wanted. Note that the use of this.type only works as long as there is just one implicit instance for a given type T (in this case Int) in the module object. This is usually not a problem, and otherwise this.type can be replaced with an arbitrary unique type defined in the module object.
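
A quick sketch of the effect (the val names are only for illustration):

  val a: Ord[o1.type, Int] = o1.intOrd
  // val b: Ord[o1.type, Int] = o2.intOrd  // does not compile: found Ord[o2.type, Int]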

Things get a bit trickier for implicit functions which have implicit parameters. Here we must construct a phantom type that contains both a unique module type and the phantom types of the implicit arguments. We can use a tuple type to accomplish this:

  object o1 {
    implicit val intOrd = Ord[this.type, Int]("o1.Int")

    implicit def ordList[P, T](implicit ord: Ord[P, T]) = Ord[(this.type, P), List[T]]("o1.List[" + ord.v + "]")
  }

  object o2 {
    implicit val intOrd = Ord[this.type, Int]("o2.Int")

    implicit def ordList[P, T](implicit ord: Ord[P, T]) = Ord[(this.type, P), List[T]]("o2.List[" + ord.v + "]")
  }

In this example the instance o1.ordList(o1.intOrd) would have type Ord[(o1.type, o1.type), List[Int]], o1.ordList(o1.ordList(o1.intOrd)) would have type Ord[(o1.type, (o1.type, o1.type)), List[List[Int]]] and so on. A combination of implicits from both modules o1.ordList(o2.intOrd) would have type Ord[(o1.type, o2.type), List[Int]].
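
To see the phantom types in action, here is a small hedged sketch with explicit type ascriptions (these should compile given the definitions above):

  val a: Ord[(o1.type, o1.type), List[Int]] = o1.ordList(o1.intOrd)
  val b: Ord[(o1.type, o2.type), List[Int]] = o1.ordList(o2.intOrd)
  // val c: Ord[(o1.type, o1.type), List[Int]] = o1.ordList(o2.intOrd)  // does not compile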

Set Union Implementation

So, now that we have our phantom types we can quite easily write the type safe set type and the union function. However, there are two possible implementations, either we can get the Ord instance from an implicit parameter to each set function (like union):

  object set1 {
    class Set[P, T]

    def union[P, T](s1: Set[P, T], s2: Set[P, T])(implicit ord: Ord[P, T]) = new Set[P, T]

    // Dummy constructor
    def set[P, T](v: T)(implicit ord: Ord[P, T]) = new Set[P, T]
  }

or we can store it inside the Set object:

  object set2 {
    class Set[P, T](val ord: Ord[P, T])

    def union[P, T](s1: Set[P, T], s2: Set[P, T]) = new Set[P, T](s1.ord)

    // Dummy constructor
    def set[P, T](v: T)(implicit ord: Ord[P, T]) = new Set[P, T](ord)
  }

Either way we can't call union on sets with different P types. The first set implementation saves some memory and works quite well as the Scala compiler is pretty smart about where to look for implicit arguments based on the function's type arguments.

A Small Test

Here's a small test case to verify that we get a type error when trying to take the union of two sets with different Ord instances:

  import set1._

  object so1 {
    import o1._
    def listListIntSet() = set(List(List(1)))
  }

  object so2 {
    import o2._
    def listListIntSet() = set(List(List(1)))
  }

  val a1 = so1.listListIntSet()
  val b1 = so1.listListIntSet()
  val a2 = so2.listListIntSet()
  val b2 = so2.listListIntSet()
  union(a1, b1)
  union(a2, b2)
  // Compiler error: union(a1, a2)

Final Words

In this article I've shown that by using Scala's powerful type and module system it's possible to get the same guarantees for instance equality in Scala as you get in Haskell, but with the extra flexibility of being able to create multiple type class instances. IMHO implicit parameters are a strictly more powerful solution to ad hoc polymorphism than type classes, and they can also be used for more purposes. But feel free to prove me wrong. :-)

Interestingly enough, applying a similar technique of adding a phantom type parameter to, for example, the Ord a type class in Haskell should make it possible to create multiple Ord instances for the same type a and still get the same guarantees for, say, the set union function. Maybe this idea has been explored already?

Saturday, May 25, 2013

Sets and the Type of 1

I've been pondering the subject of types and values for some time and decided to materialize my thoughts in a blog post. To keep things simple and easily readable I will reason about types from a set theory point of view, not using traditional type theory.

What is a Type?

Let's start by investigating what a type is. According to Wikipedia a type system can be defined as "a tractable syntactic framework for classifying phrases according to the kinds of values they compute". So, the type of a phrase (whatever that is) is simply the set of the values that can be computed from this phrase. In this post I will use the terms type and set interchangeably as I consider them to be the same thing (mathematically this might not be correct).

The Type of 1

So, what is the type of the value 1? In Scala (and Java, C/C++, C# etc.) the type of 1 is a 32-bit signed integer. This is strange as the value 1 is a member of many sets (in fact an infinite number of them), so why is this set chosen? Probably because of historical reasons and because it's a convenient type to do calculations with on most computers, but mathematically this type doesn't have any special meaning.

In Haskell the type of 1 is (Num t) => t. Confusing indeed: the type is defined in terms of a typeclass. This set includes a lot of values, including user defined ones. Really, Haskell must resort to typeclasses to define the type of 1?

I would argue that there are only two reasonable types for the value 1: the set containing all values and the set containing only the value 1. The set containing all values is really not useful in static type checking, so that leaves the set whose only element is the value 1, {1} in set notation (also called a singleton set). The same reasoning applies to all values. I find it remarkable that pretty much all commonly used programming languages get this simple type wrong. How can one expect a type system to be useful if it can't even infer the correct type of the most basic expressions?

The Type of a Function

A (total) function is a relation that maps each element of one set, let's call it the input set, to exactly one element of another (or the same) set, let's call it the output set. This relation can be modelled as a set of pairs a -> b where a is a member of the input set and b is a member of the output set. The additional restriction on the relation set is that for each a in the input set there must be exactly one pair a -> * in the set (where * represents any member of the output set).

So, let's apply this model to a simple function in Scala:

val foo = (b : Boolean) => if (b) 1 else "a"

What is the only reasonable type of foo? Well, the input set of foo is Boolean which is equal to {true, false}. The output set of foo is {1, "a"}. If the input value is true, foo will output the value 1 and if the input value is false, foo will output value "a". So we can write the type of foo as {true -> 1, false -> "a"}. In Scala the type of foo is Boolean => Any, which is a pretty useless type as Any is the set of all values (i.e. the type doesn't tell us anything about the value). In Haskell the corresponding definition won't even type check. Seriously, these are state of the art programming languages, is this the best we can do?

Subtyping

Well, what if we have:

val addOne : Int => Int
val x : {1} = 1
addOne(x)   // Type error?

No, this should not be a type error. The first thing to notice is that the set Int contains the value 1 and thus {1} is a subset of Int. This means that it's perfectly safe to apply the function addOne on the input set {1} as we're guaranteed that the relation set addOne contains a mapping for each element in the input set. In type theory one would say that {1} is a subtype of Int.
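
As a hedged aside, later Scala versions have moved a bit in this direction: Scala 2.13 added literal singleton types, which make the example above expressible directly (the addOne definition here is just an assumed placeholder):

def addOne(i: Int): Int = i + 1

val x: 1 = 1   // the literal type 1, i.e. the set {1}
addOne(x)      // compiles, because the type 1 is a subtype of Int
// val y: 1 = 2   // does not compile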


Final Words

I've only skimmed the surface of how to apply the concept of sets and values to programming languages. The concept can be extended to ADT's, pattern matching, type classes etc. I think there is a lot to gain from thinking about types this way because there is something fundamentally broken in the way types are implemented in most programming languages. Do you agree?

Monday, March 11, 2013

A More Efficient Option

Updated: 2013-03-12

Scala's Option type is a big improvement in type safety over Java's null checking and NullPointerExceptions. Unfortunately, wrapping a value in Some carries a quite big overhead: creation of one additional object containing a reference to the value. It would be more efficient if we could represent Some as the unboxed pointer value and encode None as the null value, while still preserving the type safety we get with the Option type (for example Kotlin uses this approach in its built-in support for nullable types). Well, we can do exactly this with value classes, introduced in Scala 2.10:

final case class Option[+T](value: Any = null) extends AnyVal with Product with Serializable {
  private def unsafeGet = value.asInstanceOf[T]
  def isEmpty: Boolean = value == null
  ...
}

The reason that the class parameter is of type Any and not T is that it's not allowed to create a value of type Nothing. We still want to be able to create a None value of type Option[Nothing] though so we delay the unsafe cast to T (using the unsafeGet method) until the value is actually required and we've checked that it's actually there using the isEmpty method.
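
To make the idea more concrete, here is a minimal, hedged sketch of how the rest of the type could follow the same isEmpty/unsafeGet pattern (the actual code on GitHub differs in its details; the class is renamed here to avoid clashing with scala.Option):

final case class UnboxedOption[+T](value: Any = null) extends AnyVal {
  private def unsafeGet = value.asInstanceOf[T]
  def isEmpty: Boolean = value == null
  def isDefined: Boolean = !isEmpty
  def get: T =
    if (isEmpty) throw new NoSuchElementException("UnboxedOption.get") else unsafeGet
  def getOrElse[U >: T](default: => U): U =
    if (isEmpty) default else unsafeGet
  def map[U](f: T => U): UnboxedOption[U] =
    if (isEmpty) UnboxedOption[U]() else UnboxedOption[U](f(unsafeGet))
}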

The code is on GitHub if someone wants to try it out. It's pretty much a drop-in replacement for the standard Scala Option type.

Memory Usage Comparisons

Below is a comparison in memory usage between the scala.Option type and the unboxed AnyVal option type when allocating 1M objects each containing one optional value:

scala.Some: 33 MB
scala.None: 19 MB
scala.Some : Any: 34 MB
scala.None : Any: 19 MB
AnyVal Some: 19 MB
AnyVal None: 19 MB
AnyVal Some : Any: 34 MB
AnyVal None : Any: 34 MB

As one would expect, memory usage for Some is almost halved and for None the memory usage is unchanged. The downside of using the AnyVal option is that it uses more memory when a None is upcasted to a super type because then the compiler will box the reference. I would assume this is a quite rare case though.

Tuesday, May 8, 2012

My Take on Haskell vs Scala

I've used both Haskell and Scala for some time now. They are both excellent and beautifully designed functional programming languages and I thought it would be interesting to put together a little comparison of the two, and what parts I like and dislike in each one. To be honest I've spent much more time developing in Scala than Haskell, so if any Haskellers feel unfairly treated please write a comment and I will correct the text.

This is a highly subjective post, so it doesn't contain many references to sources. It helps if the reader is somewhat familiar with both languages.

With that said, let's start the comparison.

Syntax

When it comes to syntax Haskell wins hands down. The combination of using whitespace for function and type constructor application and currying makes the code extremely clean, terse and easy to read. In comparison, Scala code is full of distracting parentheses, curly braces, brackets, commas, keywords etc. I find that I've started using Haskell syntax when writing code on a whiteboard to explain something to a colleague, which must be an indication of its simplicity.

Type Inference

The type inference in Haskell just works, and does so reliably. The type inference in Scala is clearly worse, but not too bad considering the complexity of the type system. Most of the time it works quite well, but if you write slightly complicated code you will run into cases where it will fail. Scala also requires type annotations in function/method declarations, while Haskell doesn't. In general I think it's a good idea to put type annotations in your public API anyway, but for prototyping/experimentation it's very handy to not be required to write them.

Subtyping

By being Java compatible, Scala obviously has subtyping, while Haskell has not. Personally I'm on the fence about whether subtyping is a good idea or not. On one hand it's quite natural and handy to be able to specify that an interface is a subtype of some other interface, but on the other hand subtyping complicates the type system and type inference greatly. I'm not sure the benefits outweigh the added complexity, and I don't think I'm the only one in doubt.

Modules vs Objects

The Haskell module system is very primitive, it's barely enough to get by with. In contrast Scala's object/module system is very powerful allowing objects to implement interfaces, mixin traits, bind abstract type members etc. This enables new and powerful ways to structure code, for example the cake pattern. Of course objects can also be passed around as values in the code. Objects as modules just feels natural IMHO.

Typeclasses vs Implicit Parameters

Typeclasses in Haskell and implicit parameters in Scala are used to solve basically the same problems, and in many ways they are very similar. However, I prefer Scala's solution as it gives the developer local, scoped control over which instances are available to the compiler. Scala also allows explicit instance argument passing at the call site which is sometimes useful. The idea that there is only one global type class instance for a given type feels too restricted, it also fits very badly with dynamic loading which I discuss in a later section. I also don't like that type class instances aren't values/objects/records in Haskell.
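
A small sketch of that second point, using the standard Ordering type class as an example (nothing here is specific to my own code):

def sort[T](xs: List[T])(implicit ord: Ordering[T]): List[T] = xs.sorted(ord)

sort(List(3, 1, 2))                        // uses the implicit Ordering[Int] in scope
sort(List(3, 1, 2))(Ordering[Int].reverse) // an explicit instance passed at the call site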

Scala has more advanced selection rules to determine which instance is considered most specific. This is useful in many cases. GHC doesn't allow overlapping instances unless a compiler flag is given, and even then the rules for choosing the most specific one are very restricted.

Lazy vs Strict Evaluation

Haskell has lazy evaluation by default, Scala has strict evaluation by default. I'm not a fan of lazy evaluation by default, perhaps because I've spent much time programming in languages like assembly and C/C++ which are very close to the machine. I feel that to be able to write efficient software you must have a good knowledge of how the execution machine works, and for me lazy evaluation doesn't map well to that execution model. I'm simply not able to easily grasp the CPU and memory utilization of code which uses lazy evaluation. I understand the advantages of lazy evaluation, but being able to get predictable, consistent performance is too important a part of software development to be overlooked. I'm leaning more towards totality checking, as seen in newer languages like Idris, combined with advanced optimizations, to get many of the benefits of lazy evaluation.

Type Safety

In Haskell side effects are controlled and checked by the compiler, and all functions are pure. In Scala any function/method can have hidden side effects. In addition, for Java compatibility the dreaded null value can be used for any user defined type (although it's discouraged), often resulting in NullPointerException. Also exceptions are used quite often in Java libraries and they are not checked by the Scala compiler possibly resulting in unhandled error conditions. Haskell is definitely a safer language to program in, but assuming some developer discipline Scala can be quite safe too.

Development Environment

The development environment for Scala is, while not on par with Java's, quite nice, with good Eclipse and IntelliJ plugins supporting code completion, browsing, instant error highlighting, API help and debugging. Haskell doesn't have anything that comes close to this (and no, Leksah is not there :-) ). And I don't buy the argument that you don't need a debugger for Haskell programs; especially for imperative code a good debugger is an invaluable tool. The developer experience in any language is much improved by a good IDE, and Scala is way ahead here.
Update: The EclipseFP Eclipse plugin for Haskell looks promising.

Runtime System

I like the JVM, it's a very nice environment to run applications in. Besides the state of the art code optimizer and garbage collectors, there are lots of tools available for monitoring, profiling, tuning, load balancing for the JVM that just works out of the box with Scala. Being able to easily load and unload code dynamically is also useful, for example in distributed computing.

Haskell has a much more static runtime system. I don't have much experience with runtime tools for Haskell, but my understanding is that it doesn't have the same range of tools as the JVM.

Performance

The compilers for Haskell (GHC) and Scala (scalac) are very different. GHC performs quite complex optimizations on the code during compilation, while scalac mainly just outputs Java bytecode (with some minor optimizations) which is then dynamically compiled and optimized by Hotspot at runtime. Unfortunately things like value types and proper tailcall elimination, which are supported by GHC, are currently not implemented in the JVM. Hopefully this will be rectified in the future.

One thing Haskell and Scala have in common is that they both use garbage collection. While this is convenient in many cases and sometimes even necessary, it can often be a source of unnecessary overhead and application latency. It also gives the developer very little control over the memory usage of an application. GHC supports annotation of types to control boxing, and Hotspot does a limited form of escape analysis to eliminate heap allocation in some cases. However there is much room for improvement in both environments when it comes to eliminating heap allocations. Much can be learned from languages like Rust, BitC, ATS and even C++. For some applications it's critical to be able to have full control over memory management, and it's very nice to have that in a high level language.

Final Words

Haskell and Scala are both very powerful and practical programming languages, but with different strengths and weaknesses. Neither language is perfect and I think there is room for improvements in both. I will definitely continue to use both of them, at least until something better shows up. Some new interesting ones on my radar are Rust, ATS and Idris.

Thursday, May 13, 2010

Scala Stream Fusion and Specialization, Part 2

Note: The full source code for this post can be found here.

Scala specialization is maturing, and in Scala 2.8 RC2 some parts of the standard library have been specialized (the function and tuple classes). So, it's time to update the stream fusion tests I wrote about in a previous blog post with proper specialization.

It should be quite straightforward to fully specialize the filter and map iterators used in my previous post. This turns out not to be the case, as the current specialization implementation contains a quite severe limitation:

A method is only specialized for a type parameter annotated as specialized if the method type signature contains the type parameter in a "naked" position (or an array of the "naked" type parameter).

See this ticket for more information. Here are some examples of how this limitation affects method specialization:

trait A[@specialized T] {
  def m1(t : T) : Boolean                 // Will be specialized for T
  def m2(f : T => Boolean)                // Will NOT be specialized for T
  def m3[@specialized U](fn : T => U) : T // Will be specialized for T, but NOT for U
  def m4()                                // Will NOT be specialized
}

It's easy to see how this will cause some problems when implementing specialized iterators and higher order functions.

The iterator trait from my previous post has to be modified slightly as the next method would not get specialized due to the limitation mentioned above:

trait SIterator[@specialized T] {
  def hasNext : Boolean
  def next() : T
}

This interface is similar to the Java and Scala iterator interfaces.

The implementations of array and map iterators are straight forward, but the filter iterator is quite tricky:

final class FilterIterator[@specialized T](iter : SIterator[T], pred : T => Boolean) extends SIterator[T] {
  private var hasElem = false
  private var elem : T = findNext()

  def hasNext = hasElem

  def next() = {
    val r = elem
    findNext()
    r
  }

  def findNext() : T = {
    while (iter.hasNext) {
      elem = iter.next()

      if (pred(elem)) {
        hasElem = true
        return elem
      }
    }

    hasElem = false
    elem
  }
}

Normally findNext wouldn't return anything, but that would cause it to not be specialized, so the return type must be set to T.

The fold method is awkward to use as a dummy parameter has to be added so that all type parameters appear in "naked" positions:

def fold[@specialized T, @specialized U](iter : SIterator[T], fn : (U, T) => U, v : U, dummy : T) = {
  var r = v

  while (iter.hasNext) {
    r = fn(r, iter.next())
  }

  r
}

This leaves us with the following code for the map-filter-sum calculation:

def mapFilterSum(a : Array[Int]) = {
  val ai = new ArrayIterator(a, 0, a.length)
  val s = new FilterIterator[Int](new MapIterator[Int, Int](ai, _ * 3 + 7), _ % 10 == 0)
  fold[Int, Int](s, _ + _, 0, 0)
}
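
For reference, the ArrayIterator and MapIterator used above are straightforward; hedged sketches might look roughly like this (the full source linked at the top of the post is the authoritative version):

final class ArrayIterator[@specialized T](a : Array[T], private var index : Int, endIndex : Int) extends SIterator[T] {
  def hasNext = index < endIndex
  def next() = { val v = a(index); index += 1; v }
}

final class MapIterator[@specialized T, @specialized U](iter : SIterator[T], fn : T => U) extends SIterator[U] {
  def hasNext = iter.hasNext
  def next() = fn(iter.next())
}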

Now to the benchmarks. Remember that we want to compare the performance to this while loop:

def mapFilterSumLoop(a : Array[Int]) = {
  var i = 0
  var r = 0

  while (i < a.length) {
    val v = a(i) * 3 + 7

    if ((v % 10) == 0)
      r += v

    i += 1
  }

  r
}

Compiling with -optimize and running with -server gives the following benchmark times:

Java version: 1.6.0_20
Loop: (4936,-423600172)
Specialized iterators: (5935,-423600172)

So, 4936 microseconds for the while loop compared to 5935 microseconds for the specialized iterators, about 20% performance penalty. This relatively small difference in running times indicates that all boxing/unboxing has been eliminated. Nice indeed.

It would be nice to clean up the syntax a bit for the filter and map operations. This turns out to be harder than expected, again due to the limitation in the specialization implementation. Dummy parameters have to be added to the filter and map methods:

trait SIterator[@specialized T] {
  def hasNext : Boolean
  def next() : T
  def filter(pred : T => Boolean, dummy : T) = new FilterIterator[T](this, pred)
  def map[@specialized U](fn : T => U, dummy : T, dummy2 : U) = new MapIterator[T, U](this, fn)
}

Even with the dummy parameters the map method is not specialized correctly which leads to boxing and inferior performance.

In conclusion I would say that specialization still needs some work before it can be useful in a collection library. It's a very powerful feature and hopefully it can be improved to reach its full potential in future Scala releases.

Saturday, March 6, 2010

Scala Stream Fusion and Specialization

Updated: Fixed code links and added view and stream benchmarks.

Inspired by the successful results of Haskell stream fusion (see Evolving Faster Haskell Programs (now with LLVM!) for some impressive optimizations) I was wondering whether a similar concept is applicable to Scala collections. It turns out that with a combination of iterators and specialization it's possible to achieve similar optimizations in Scala.

The goal of stream fusion is essentially to optimize code like this:

def scalaLibrarySum(a : Array[Int]) = a.map(i => i * 3 + 7).filter(i => (i % 10) == 0).foldLeft(0)(_ + _)

into code like this:

def mapFilterSumLoop(a : Array[Int]) = {
  var i = 0
  var r = 0

  while (i < a.length) {
    val v = a(i) * 3 + 7

    if ((v % 10) == 0)
      r += v

    i += 1
  }

  r
}

If you run the scalaLibrarySum method in Scala it will create two intermediate arrays with the results of the map and filter operations. This is totally unnecessary for this calculation: since the functions passed to filter and map are side effect free, the function applications can be performed lazily, just before the result is needed in the fold operation. This is basically how the mapFilterSumLoop method works.

Besides creating intermediate arrays, boxing of primitive values must be avoided if we want to have any chance of competitive performance (the Haskell libraries contain specialized instances to avoid boxing). Fortunately Scala supports specialization of type parameters in version 2.8, which enables us to avoid boxing while still writing generic code. Unfortunately this feature seems to be quite buggy at the moment; just by playing around with a simple example I encountered two bugs (tickets #3148 and #3149). So, the code below contains some specialization done by hand. Hopefully these bugs will be fixed so that the code can be fully generalized.

The biggest difference compared to stream fusion in Haskell is that I use impure iterators in the Scala code. This is not as nice as the pure stream code used in Haskell, but it's a fact that Hotspot isn't nearly as good at optimizing pure functional code as GHC. Hotspot works best if fed imperative style loops.

Here are the definitions of the specialized functions and iterators I use in the benchmark below:

// Specialized Function1
trait Fn1[@specialized I, @specialized O] {
  def apply(a : I) : O
}

// Specialized Function2
trait Fn2[@specialized I1, @specialized I2, @specialized O] {
  def apply(a1 : I1, a2 : I2) : O
}

// Specialized iterator
trait SIterator[@specialized T] {
  def hasMore : Boolean
  def current : T
  def next()
}

In addition to this I've defined array, filter and map iterators. Unfortunately these are not generic due to the problems with the specialization feature:

class IntArrayIterator(a : Array[Int], var index : Int, endIndex : Int) extends SIterator[Int] {
  def next() = index += 1
  def current = a(index)
  def hasMore = index < endIndex
}

// Optimally this would be: class FilterIterator[@specialized T](iter : SIterator[T], pred : Fn1[T, Boolean]) extends SIterator[T]
class FilterIterator(iter : SIterator[Int], pred : Fn1[Int, Boolean]) extends SIterator[Int] {
  def hasMore = iter.hasMore

  def next() = {
    iter.next()
    findNext()
  }

  def findNext() = {
    while (iter.hasMore && !pred(iter.current))
      iter.next()
  }

  def current = iter.current

  findNext()
}

// Optimally this would be: class MapIterator[@specialized T, @specialized U](iter : SIterator[T], fn : Fn1[T, U]) extends SIterator[U]
class MapIterator(iter : SIterator[Int], fn : Fn1[Int, Int]) extends SIterator[Int] {
  def next() = iter.next()
  def current = fn(iter.current)
  def hasMore = iter.hasMore
}

The fold function is straightforward and generic:

def fold[@specialized T, @specialized U](iter : SIterator[T], fn : Fn2[U, T, U], v : U) = {
  var r = v

  while (iter.hasMore) {
    r = fn(r, iter.current)
    iter.next()
  }

  r
}

The map-filter-sum function can now be written using iterators:

def mapFilterSum(a : Array[Int]) = {
  val filter = new Fn1[Int, Boolean] { def apply(a : Int) = (a % 10) == 0 }
  val map = new Fn1[Int, Int] { def apply(a : Int) = a * 3 + 7 }
  val s = new FilterIterator(new MapIterator(new IntArrayIterator(a, 0, a.length), map), filter)
  fold(s, new Fn2[Int, Int, Int] { def apply(a1 : Int, a2 : Int) = a1 + a2 }, 0)
}

The full iterator code can be found here. Compile the code using the latest Scala 2.8 build with the -Yspecialize flag. The optimize flag doesn't seem to have much effect on the performance.

I've benchmarked six different implementations of the map-filter-sum calculation (the view and stream variants are sketched after the list):

  • The while loop shown above

  • The while loop split up into map, filter and fold functions with intermediate array results passed between them

  • The version using specialized iterators

  • The Scala library implementation shown above

  • Same as Scala library function but with a view instead

  • Same as Scala library function but with a stream instead
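
The view and stream variants from the list above are simply the library version with a view or stream inserted; hedged sketches:

def scalaViewSum(a : Array[Int]) =
  a.view.map(i => i * 3 + 7).filter(i => (i % 10) == 0).foldLeft(0)(_ + _)

def scalaStreamSum(a : Array[Int]) =
  a.toStream.map(i => i * 3 + 7).filter(i => (i % 10) == 0).foldLeft(0)(_ + _)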


The benchmark is performed by taking the minimum execution time of 200 runs of each of the functions on an array of 1 million integers. Running the application with latest OpenJDK 7 (Hotspot version "build 17.0-b10") and the flags "-server -XX:CompileThreshold=100" I get the following results:

Loop: (4990,-423600172)
Loop with intermediate arrays: (6690,-423600172)
Specialized iterators: (5367,-423600172)
Scala array: (46444,-423600172)
Scala view: (39625,-423600172)
Scala stream: (63210,-423600172)

The first result value is the minimum execution time in microseconds, the second value is the result of the calculation. As you can see, the method using specialized iterators is almost as fast as the single while loop. Hotspot has inlined all the iterator code, not bad! Using intermediate arrays is about 25% slower than specialized iterators. Using the Scala library is about 7-9 times slower! Clearly this bad result is a consequence of boxing taking place. Of the library variants, the view is the fastest as it also avoids creating intermediate arrays.

The conclusion from this simple experiment is that it's certainly possible to write collection libraries with a nice high level interface and at the same time have excellent performance. When Scala specialization support is improved, hopefully this power will be available to all Scala programmers.

The full benchmark code can be found here.

Saturday, September 12, 2009

Type Lists and Heterogeneously Typed Arrays

In previous blog posts (HList in Scala and HList in Scala Revisited (or Scala Metaprogramming Works!)) I described a way to encode heterogeneously typed lists, HList, in Scala. This data type can be used as a generic replacement for the TupleX types found in the Scala library. In this post I will generalize the concept of type lists and introduce a new data type, HArray.

Note: the MetaScala code linked to in this post must be compiled using a recent Scala 2.8.0 build. I've given up on making it work with 2.7.x.

Type Lists

In HList the types of the elements and the element values are intertwined in the same type. It doesn't have to be this way though: the types of the elements can be tracked separately from the actual element values. To do this we create a purely abstract type (it has no instances) which models a list of types, let's call it TList. Here's a simple implementation:

// Abstract base type
sealed trait TList

// The empty type list
final class TNil extends TList

// A pair of a type and a type list
final class TCons[H, T <: TList] extends TList {
  type Head = H
  type Tail = T
}

// For infix notation
type ::[H, T <: TList] = TCons[H, T]

which would allow us to build type lists using infix notation, for example:

type SomeTypes = Int :: Boolean :: String :: TNil

This is only the basic definition of TList; in practice you want to add lots of functionality to this type, like append, length, indexing, reverse etc. The full TList code is available in MetaScala. TList can be used for implementing HList, but also heterogeneously typed arrays, which I describe below.
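
As an illustration of that extra functionality, here is a hedged sketch of how an Append type member (used by the HArray code below) could be encoded; MetaScala's actual definitions differ in detail:

sealed trait TList {
  type Append[L <: TList] <: TList
}

final class TNil extends TList {
  type Append[L <: TList] = L
}

final class TCons[H, T <: TList] extends TList {
  type Head = H
  type Tail = T
  type Append[L <: TList] = TCons[H, T#Append[L]]
}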

Heterogeneously Typed Arrays

With a heterogeneously typed array I mean an array where each element can have a unique type which is tracked at compile time (the length of the array is also tracked at compile time). This ensures that the element values are handled in a type safe way. The values are stored in an Array[Any] and are cast to the correct type when extracted from the array (remember that Java generics are implemented using casts as well so this is no different from using the TupleX classes, or a Java ArrayList). The element types are tracked using a TList.

Here's part of the implementation which shows the prepend, append and nth methods:

final class HArray[L <: TList](private val elems : Array[Any]) extends HSeq[L] {
  ...

  def ::[T](v : T) = {
    val a = new Array[Any](elems.length + 1)
    a(0) = v
    Array.copy(elems, 0, a, 1, elems.length)
    new HArray[T :: L](a)
  }

  def :::[L2 <: TList](l : HArray[L2]) = {
    val a = new Array[Any](elems.length + l.elems.length)
    Array.copy(l.elems, 0, a, 0, l.elems.length)
    Array.copy(elems, 0, a, l.elems.length, elems.length)
    new HArray[L2#Append[L]](a)
  }

  def apply[N <: Nat](implicit nth : INth[N]) : L#Nth[N] = elems(nth.index).asInstanceOf[L#Nth[N]]

  ...
}

Notice that the method implementations use simple array operations (array copy and indexing) so the performance should be comparable to using the TupleX classes in the Scala library (no benchmarks performed yet though). The full HArray implementation can be found here.

The HArray interface is similar to HList and virtually the same operations are supported. There are some usage examples in MetaScala:

// Create a HArray of an Int, Boolean and a pair
val a1 = 10 :: true :: (10.1, "Hello") :: HArrayNil

// Extract the second element, note that the element type
// information is preserved and we can safely perform a
// boolean and operation
val b = a1(_1) && false

// Create another HArray using alternative syntax (faster)
val a2 = HArray(1.1, "string", false)

// Replace the second element in the array, it used to
// be a String, but now it's an Int
val a3 = a2.removeNth(_1).insert(_1, 14)

// Type information preserved, we can use an Int operation
// on the element
val i = a3(_1) / 2

// Append a2 to a1
val a4 = a1 ::: a2

// Statically check that the length of a4 is 6
type T = Equal[_6, a4.Size]


Other TList Applications

There are other applications for the TList type, for example it can be used for modeling type unions like this:

case class OneOf[TS <: TList](value : Any)

which would be a lot easier to use than Either when you have more than two choices. A problem with this solution is that it wouldn't play well with Scala's pattern matcher; you would, for example, need to add unnecessary default cases. It should be possible to create type matching functions to alleviate this problem though. More work is needed in this area and it might be material for a future blog post.
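
A small hedged illustration of the intent (note that this simple version doesn't yet constrain the value to actually be one of the listed types, which is part of the extra work mentioned above):

type IntBoolString = Int :: Boolean :: String :: TNil

val a = OneOf[IntBoolString](42)
val b = OneOf[IntBoolString]("forty-two")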

If you can think of other applications for type lists and sets I would be interested in hearing them.