The Curious Failure of Functional Programming for Parallel Applications
by Malte Skarupke
Long title, but it is interesting how functional programming has mostly failed to gain traction in parallel and concurrent applications. People expect functional programming to become more popular as we do more and more parallel programming, but that isn't happening. As an example, this blog post made the rounds recently, and it contains this quote:
However, the key to multicore programming is controlling mutation. This is why functional languages have been steadily gaining ground in concurrency and parallelism.
Which I would disagree with. Not the controlling mutation part, but the steadily gaining ground part. Let’s look at a few examples:
I work in video games, and the most parallel thing we do is programming shaders. And there isn't a single popular functional shader programming language. (In fact I'm not sure there are any at all, but I'd bet that someone has written one as a hobby project.) Same thing for general purpose GPU compute languages.
When looking at areas other than gaming, the most common parallel programming is probably web server programming and the big data software that came out of it. But looking for example at Google App Engine's supported languages, you find Java, Python, PHP and Go. Amazon AWS has SDKs for several more languages, but again none of them are functional. (You can of course run anything you want on Amazon EC2, but their SDKs for the provided services are a good indicator of which languages people have used in the past.) Since everyone supports Java, you could run Scala or Clojure, and I'm sure people do, but it is curious that the most parallel application in this field, Hadoop, is written in Java.
So why doesn’t functional programming take off? In general functional programming doesn’t take off because it’s weird. But functional programming fans have often claimed that people would overcome that weirdness in order to get the easier parallelism that functional programming offers. So why didn’t that come true?
It turns out that if you really want to scale, you have to design for parallelism from the ground up. And if you have done that, it doesn't really matter which language you use. Shader programming would not benefit from being functional because you simply don't have to think about controlling mutation. Same thing for service oriented architectures and map reduce.
Yes, functional programming makes concurrency easier because some problems simply can't happen, but it turns out that Amdahl's law is the bigger problem. And you're going to run into that even in functional languages. It doesn't matter how well you control your mutable state: if your design requires that step A has to happen before you can do step B, you can't parallelize that. Once you have a design that pushes Amdahl's law back far enough that you can ignore it, it doesn't matter much which language you choose to implement that design.
I guess the way that I think about it is that multicore programming is more natural if you avoid shared state, which is not necessarily the same thing as avoiding mutation. Purely functional languages avoid mutation, but if you have a lazy evaluation thunk that is shared across multiple cores then you have shared state under the hood. Shaders avoid shared state, but allow mutation.
I notice this when I program in Java or Python: I get annoyed very quickly because I can't reason about what happens to an object. As soon as I pass it to someone, they can do whatever they want with it. Will they keep it alive? Will they modify it? I don't know unless I look at their code. In C++ I can reason about my objects much more easily: I have my object, and if you want it you need to make a copy, and I don't care what you do with your copy. (This is true even for heap-allocated objects, because if I might delete the object, you are limited in what you can do with it unless you copy it.)
This is especially important for threaded programming. In C++ I can pass a vector from thread A to thread B and I know that nobody can see the vector but thread B. Not even thread A. (or anyone else who has seen the vector in the past) Then if thread B returns the vector at the end of the computation thread A can see it again. And I can make sure that all of this is correct at compile time.
So there are two ways to do concurrency: share immutable objects, or don't share at all and copy or move instead. People who say that functional programming helps with concurrent programming think that sharing with immutable objects is the solution. I think having both is kinda neat.
While you can find all kinds of spectacular claims about functional programming, I think one thing that is true is that using a functional style makes it easier to program shared-memory multiprocessors without going insane in the pit of data races, deadlocks and atomicity violations.
GPUs and distributed clusters (and FPGAs for that matter) are sufficiently different from the conventional processor model that it makes little sense to program any of them with conventional multithreading technology. It doesn’t matter whether the base language in question is imperative or functional. For GPUs you need completely different languages (CUDA, OpenCL, …) and for distributed computing you need fancy frameworks of one sort or another. Again I think in the latter context functional vs imperative doesn’t matter a great deal. You’re not directly sharing data/memory between workers in either case.
On shared-memory multiprocessor machines (e.g. what we have in our fancy phones now) there is efficiency to be gained by smart use of the memory sharing hardware. This is where the functional style (especially persistent data structures) has a chance to make an impact. Of course, one does not need to use a functional language to program in a functional style.
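For instance, a sketch of functional style in plain C++ (the function is hypothetical): take inputs by const reference and return a fresh value, so multiple threads can apply it to shared read-only data without races:

```cpp
#include <algorithm>
#include <vector>

// Functional style in an imperative language: the input is read-only,
// a new value is returned, and nothing the caller can see is mutated.
// Any number of threads can call this on the same input concurrently.
std::vector<int> doubled(const std::vector<int> & input)
{
    std::vector<int> result(input.size());
    std::transform(input.begin(), input.end(), result.begin(),
                   [](int x) { return x * 2; });
    return result;
}
```

The compiler and language are imperative; the discipline is functional.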