Tuesday, May 8, 2012

My Take on Haskell vs Scala

I've used both Haskell and Scala for some time now. They are both excellent and beautifully designed functional programming languages and I thought it would be interesting to put together a little comparison of the two, and what parts I like and dislike in each one. To be honest I've spent much more time developing in Scala than Haskell, so if any Haskellers feel unfairly treated please write a comment and I will correct the text.

This is a highly subjective post, so it doesn't contain many references to sources. It helps if the reader is somewhat familiar with both languages.

With that said, let's start the comparison.

Syntax

When it comes to syntax Haskell wins hands down. The combination of using white space for function and type constructor application, and currying makes the code extremely clean, terse and easy to read. In comparison Scala code is full of distracting parenthesis, curly bracers, brackets, commas, keywords etc. I find that I've started using Haskell syntax when writing code on a white board to explain something to a colleague, which must be an indication of it's simplicity.

Type Inference

The type inference in Haskell just works, and does so reliably. The type inference in Scala is clearly worse, but not too bad considering the complexity of the type system. Most of the time it works quite well, but if you write slightly complicated code you will run into cases where it will fail. Scala also requires type annotations in function/method declarations, while Haskell doesn't. In general I think it's a good idea to put type annotations in your public API anyway, but for prototyping/experimentation it's very handy to not be required to write them.

Subtyping

By being Java compatible, Scala obviously has subtyping, while Haskell has not. Personally I'm on the fence whether subtyping is a good idea or not. On one hand it's quite natural and handy to be able specify that an interface is a subtype of some other interface, but on the other hand subtyping complicates type system and type inference greatly. I'm not sure the benefits outweighs the added complexity, and I don't think I'm the only one in doubt.

Modules vs Objects

The Haskell module system is very primitive, it's barely enough to get by with. In contrast Scala's object/module system is very powerful allowing objects to implement interfaces, mixin traits, bind abstract type members etc. This enables new and powerful ways to structure code, for example the cake pattern. Of course objects can also be passed around as values in the code. Objects as modules just feels natural IMHO.

Typeclasses vs Implicit Parameters

Typeclasses in Haskell and implicit parameters in Scala are used to solve basically the same problems, and in many ways they are very similar. However, I prefer Scala's solution as it gives the developer local, scoped control over which instances are available to the compiler. Scala also allows explicit instance argument passing at the call site which is sometimes useful. The idea that there is only one global type class instance for a given type feels too restricted, it also fits very badly with dynamic loading which I discuss in a later section. I also don't like that type class instances aren't values/objects/records in Haskell.

Scala has more advanced selection rules to determine which instance is considered most specific. This is useful in many cases. GHC doesn't allow overlapping instances unless a compiler flag is given, and even then the rules for choosing the most specific one are very restricted.

Lazy vs Strict Evaluation

Haskell has lazy evaluation by default, Scala has strict evaluation by default. I'm not a fan of lazy evaluation by default, perhaps because I've spent much time programming languages like assembly and C/C++ which are very close to the machine. I feel that to able to write efficient software you must have a good knowledge of how the execution machine works and for me lazy evaluation doesn't map well to that execution model. I'm simply not able to easily grasp the CPU and memory utilization of code which uses lazy evaluation. I understand the advantages of lazy evaluation, but being able to get predicable, consistent performance is a too important part of software development to be overlooked. I'm leaning more towards totality checking, as seen in newer languages like Idris, combined with advanced optimizations to get many of the benefits of lazy evaluation.

Type Safety

In Haskell side effects are controlled and checked by the compiler, and all functions are pure. In Scala any function/method can have hidden side effects. In addition, for Java compatibility the dreaded null value can be used for any user defined type (although it's discouraged), often resulting in NullPointerException. Also exceptions are used quite often in Java libraries and they are not checked by the Scala compiler possibly resulting in unhandled error conditions. Haskell is definitely a safer language to program in, but assuming some developer discipline Scala can be quite safe too.

Development Environment

The development environment for Scala is, while not on par with the Java's, quite nice with good Eclipse and IntelliJ plugins supporting code completion, browsing, instant error highlighting, API help and debugging. Haskell doesn't have anything that comes close to this (and no, Leksah is not there :-) ). And I don't buy the argument that you don't need a debugger for Haskell programs, especially for imperative code a good debugger is an invaluable tool. The developer experience in any language is much improved by a good IDE, and Scala is way ahead here.
Update: The EclipseFP Eclipse plugin for Haskell looks promising.

Runtime System

I like the JVM, it's a very nice environment to run applications in. Besides the state of the art code optimizer and garbage collectors, there are lots of tools available for monitoring, profiling, tuning, load balancing for the JVM that just works out of the box with Scala. Being able to easily load and unload code dynamically is also useful, for example in distributed computing.

Haskell has a much more static runtime system. I don't have much experience with runtime tools for Haskell, but my understanding it doesn't have the same amount of tools as the JVM.

Performance

The compilers for Haskell (GHC) and Scala (scalac) are very different. GHC performs quite complex optimizations on the code during compilation, while scalac mainly just outputs Java bytecode (with some minor optimizations) which is then dynamically compiled and optimized by Hotspot at runtime. Unfortunately things like value types and proper tailcall elimination, which are supported by GHC, are currently not implemented in the JVM. Hopefully this will be rectified in the future.

One thing Haskell and Scala have in common is that they both use garbage collection. While this is convenient in many cases and sometimes even necessary, it can often be a source of unnecessary overhead and application latency. It also gives the developer very little control over the memory usage of an application. GHC supports annotation of types to control boxing, and Hotspot does a limited form of escape analysis to eliminate heap allocation in some cases. However there is much room for improvement in both environments when it comes to eliminating heap allocations. Much can be learned from languages like Rust, BitCATS and even C++. For some applications it's critical to be able to have full control over memory management, and it's very nice to have that in a high level language.

Final Words

Haskell and Scala are both very powerful and practical programming languages, but with different strengths and weaknesses. Neither language is perfect and I think there is room for improvements in both. I will definitely continue to use both of them, at least until something better shows up. Some new interesting ones on my radar are Rust, ATS and Idris.