Saturday, September 12, 2009

Type Lists and Heterogeneously Typed Arrays

In previous blog posts (HList in Scala and HList in Scala Revisited (or Scala Metaprogramming Works!)) I described a way to encode heterogeneously typed lists, HList, in Scala. This data type can be used as a generic replacement for the TupleX types found in the Scala library. In this post I will generalize the concept of type lists and introduce a new data type, HArray.

Note: the MetaScala code linked to in this post must be compiled using a recent Scala 2.8.0 build. I've given up on making it work with 2.7.x.

Type Lists

In HList the types of the elements and the element values are intertwined in the same type. It doesn't have to be this way, though: the types of the elements can be tracked separately from the actual element values. To do this we create a purely abstract type (it has no instances) which models a list of types; let's call it TList. Here's a simple implementation:

// Abstract base type
sealed trait TList

// The empty type list
final class TNil extends TList

// A pair of a type and a type list
final class TCons[H, T <: TList] extends TList {
  type Head = H
  type Tail = T
}

// For infix notation
type ::[H, T <: TList] = TCons[H, T]

which would allow us to build type lists using infix notation, for example:

type SomeTypes = Int :: Boolean :: String :: TNil

This is only the basic definition of TList. In practice you want to add lots of functionality to this type, such as append, length, indexing and reverse. The full TList code is available in MetaScala. TList can be used for implementing HList, but also for heterogeneously typed arrays, which I describe below.
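To give a feel for what such functionality can look like, here is a minimal sketch of a length operation on type lists. Note that this is not the MetaScala encoding (which uses type members on TList); it is an alternative based on implicit evidence, and the names Length, lengthOf and TListLengthSketch are made up for this illustration.

```scala
object TListLengthSketch {
  sealed trait TList
  final class TNil extends TList
  final class TCons[H, T <: TList] extends TList
  type ::[H, T <: TList] = TCons[H, T]

  // Hypothetical evidence type: an instance of Length[L] carries the length of L
  final class Length[L <: TList](val value: Int)

  object Length {
    // Base case: the empty type list has length 0
    implicit val nilLength: Length[TNil] = new Length[TNil](0)

    // Inductive case: one more than the length of the tail
    implicit def consLength[H, T <: TList](implicit tail: Length[T]): Length[TCons[H, T]] =
      new Length[TCons[H, T]](tail.value + 1)
  }

  // Looks up the length of a type list; no value of type L is ever created
  def lengthOf[L <: TList](implicit len: Length[L]): Int = len.value

  type SomeTypes = Int :: Boolean :: String :: TNil
  val someTypesLength: Int = lengthOf[SomeTypes]
}
```

The compiler builds the Length instance by walking the type list one TCons at a time, so the recursion happens during implicit resolution rather than in user code.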

Heterogeneously Typed Arrays

By a heterogeneously typed array I mean an array where each element can have a unique type which is tracked at compile time (the length of the array is also tracked at compile time). This ensures that the element values are handled in a type-safe way. The values are stored in an Array[Any] and are cast to the correct type when extracted from the array (remember that Java generics are implemented using casts as well, so this is no different from using the TupleX classes, or a Java ArrayList). The element types are tracked using a TList.

Here's part of the implementation which shows the prepend, append and nth methods:

final class HArray[L <: TList](private val elems : Array[Any]) extends HSeq[L] {
  ...

  // Prepend a single element; its type is added to the front of L
  def ::[T](v : T) = {
    val a = new Array[Any](elems.length + 1)
    a(0) = v
    Array.copy(elems, 0, a, 1, elems.length)
    new HArray[T :: L](a)
  }

  // Prepend all elements of l; the resulting type list is L2 followed by L
  def :::[L2 <: TList](l : HArray[L2]) = {
    val a = new Array[Any](elems.length + l.elems.length)
    Array.copy(l.elems, 0, a, 0, l.elems.length)
    Array.copy(elems, 0, a, l.elems.length, elems.length)
    new HArray[L2#Append[L]](a)
  }

  // Type-safe indexing: N is a type-level natural number
  def apply[N <: Nat](implicit nth : INth[N]) : L#Nth[N] = elems(nth.index).asInstanceOf[L#Nth[N]]

  ...
}

Notice that the method implementations use simple array operations (array copy and indexing) so the performance should be comparable to using the TupleX classes in the Scala library (no benchmarks performed yet though). The full HArray implementation can be found here.
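Since the snippet above depends on HSeq, Nat and INth, which are not shown here, the following is a stripped-down, self-contained sketch of the same idea. It only supports prepend and head, and it is an assumption of mine rather than the MetaScala code: the head method and the HArraySketch object are made up for this illustration, and head uses the standard library's <:< evidence instead of type-level indexing.

```scala
object HArraySketch {
  sealed trait TList
  final class TNil extends TList
  final class TCons[H, T <: TList] extends TList
  type ::[H, T <: TList] = TCons[H, T]

  final class HArray[L <: TList](private val elems: Array[Any]) {
    def length: Int = elems.length

    // Prepend: copy into a fresh array and record the new element type in front of L
    def ::[T](v: T): HArray[T :: L] = {
      val a = new Array[Any](elems.length + 1)
      a(0) = v
      Array.copy(elems, 0, a, 1, elems.length)
      new HArray[T :: L](a)
    }

    // Hypothetical accessor: only compiles when L is a non-empty type list,
    // in which case the cast to the head type H is safe
    def head[H, T <: TList](implicit ev: L <:< TCons[H, T]): H =
      elems(0).asInstanceOf[H]
  }

  val HArrayNil: HArray[TNil] = new HArray[TNil](new Array[Any](0))

  val a1 = 10 :: true :: "Hello" :: HArrayNil

  // The static type of the first element is Int, so Int operations are allowed
  val first = a1.head
  val doubled: Int = first * 2
}
```

Calling head on HArrayNil is rejected at compile time because no evidence that TNil is a TCons can be found.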

The HArray interface is similar to HList and virtually the same operations are supported. There are some usage examples in MetaScala:

// Create an HArray of an Int, Boolean and a pair
val a1 = 10 :: true :: (10.1, "Hello") :: HArrayNil

// Extract the second element, note that the element type
// information is preserved and we can safely perform a
// boolean and operation
val b = a1(_1) && false

// Create another HArray using alternative syntax (faster)
val a2 = HArray(1.1, "string", false)

// Replace the second element in the array, it used to
// be a String, but now it's an Int
val a3 = a2.removeNth(_1).insert(_1, 14)

// Type information preserved, we can use an Int operation
// on the element
val i = a3(_1) / 2

// Append a2 to a1
val a4 = a1 ::: a2

// Statically check that the length of a4 is 6
type T = Equal[_6, a4.Size]


Other TList Applications

There are other applications for the TList type, for example it can be used for modeling type unions like this:

case class OneOf[TS <: TList](value : Any)

which would be a lot easier to use than Either when you have more than two choices. A problem with this solution is that it wouldn't play well with Scala's pattern matcher; you would need to add unnecessary default cases, for example. It should be possible to create type matching functions, though, to alleviate this problem. More work is needed in this area and it might be material for a future blog post.
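As a rough sketch of how OneOf could at least be constructed safely, the code below only lets a value into a OneOf[TS] when its type occurs in TS. The Contains evidence, the inject function and the OneOfSketch object are hypothetical names introduced for this illustration. They are not part of MetaScala, and the sketch does not address the pattern matching problem.

```scala
object OneOfSketch {
  sealed trait TList
  final class TNil extends TList
  final class TCons[H, T <: TList] extends TList
  type ::[H, T <: TList] = TCons[H, T]

  case class OneOf[TS <: TList](value: Any)

  // Hypothetical evidence that the type A occurs in the type list L
  final class Contains[L <: TList, A]

  trait LowPriorityContains {
    // A occurs somewhere in the tail of the list
    implicit def inTail[H, T <: TList, A](implicit ev: Contains[T, A]): Contains[TCons[H, T], A] =
      new Contains[TCons[H, T], A]
  }

  object Contains extends LowPriorityContains {
    // A is the head of the list (preferred over inTail to avoid ambiguity)
    implicit def atHead[A, T <: TList]: Contains[TCons[A, T], A] =
      new Contains[TCons[A, T], A]
  }

  // Hypothetical constructor: compiles only when A is one of the types in TS
  def inject[TS <: TList, A](a: A)(implicit ev: Contains[TS, A]): OneOf[TS] =
    OneOf[TS](a)

  type IntOrStringOrBoolean = Int :: String :: Boolean :: TNil

  val fromString = inject[IntOrStringOrBoolean, String]("hello")
  val fromBoolean = inject[IntOrStringOrBoolean, Boolean](true)
  // inject[IntOrStringOrBoolean, Double](1.5) would be rejected by the compiler,
  // because no Contains evidence can be found for Double
}
```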

If you can think of other applications for type lists and sets I would be interested in hearing them.