Threads Suck

concurrency erlang eventmachine queueing

Thu Aug 13 15:13:20 -0700 2009

Ricky Ho gives an excellent description of two competing models for concurrency: threads vs. sequential message passing.

There are basically two models for communication between concurrent executions. One is based on a “Shared Memory” model which one thread of execution write the information into a shared place where other threads will read from. Java’s thread model is based on such a “shared memory” semantics. The typical problem of this model is that concurrent update requires very sophisticated protection scheme, otherwise uncoordinated access can result in inconsistent data.

Unfortunately, this protection scheme is very hard to analyze once there are multiple threads start to interact in combinatorial explosion number of different ways. Hard to debug deadlock problem are frequently pop up. To reduce the complexity, using a coarse grain locking model is usually recommended but this may reduce the concurrency.

Erlang has picked the other model based on “message passing”. In this model, any information that needs to be shared will be “copied” into a message and send to other executions. In this model, each thread of execution has its state “completely local” (not viewable by other thread of executions). Their local state is updated when they learn what is going on in other threads by receiving their messages. This model mirrors how people in real life interact with each other.
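To make the message-passing half of that description concrete, here is a minimal sketch in Python rather than Erlang. It assumes the standard library’s queue.Queue as a stand-in for an Erlang mailbox and uses Python threads purely as execution contexts. The names counter_worker and run_counter and the None shutdown sentinel are made up for illustration. The point is that the running total lives only inside the worker, so no lock is needed around it.

```python
import queue
import threading


def counter_worker(inbox, results):
    """Owns `total` as purely local state; messages are the only way to change it."""
    total = 0
    while True:
        message = inbox.get()      # block until someone sends us something
        if message is None:        # hypothetical shutdown sentinel
            break
        total += message           # no lock: nobody else can see `total`
    results.put(total)             # report the final state as a message too


def run_counter(senders=4, per_sender=1000):
    """Start one worker and several senders; return the worker's final total."""
    inbox = queue.Queue()
    results = queue.Queue()
    worker = threading.Thread(target=counter_worker, args=(inbox, results))
    worker.start()

    def send_increments():
        for _ in range(per_sender):
            inbox.put(1)           # senders share the mailbox, never the state

    producers = [threading.Thread(target=send_increments) for _ in range(senders)]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()

    inbox.put(None)                # every increment is queued; ask the worker to stop
    worker.join()
    return results.get()


if __name__ == "__main__":
    print(run_counter())
```

The shared-memory version of the same program would have every sender update one counter variable directly, and that is where the locking scheme from the quote becomes necessary.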

I am increasingly of the mind that threads suck; and the sooner we all wean ourselves off of this dead-end model for concurrency, the better.

If you’re not using Erlang, other options for concurrency include:

  • An async model like EventMachine (without using defer), Twisted, or fibers. You have parallel execution paths but they explicitly yield back to the scheduler when they need to block on an external event (disk, network, waiting for another process). And yes, this is extremely similar to the non-preemptive multitasking system that operating systems like Mac OS 9 and Windows 98 used. For some reason that I haven’t quite put my finger on yet, this feels like a very clean way to do things inside an individual program, even though it’s lame in an OS kernel.
  • Run multiple operating system processes, and communicate via a message bus like RabbitMQ, SQS, or Beanstalk. Note here that if you run a Mongrel or Thin cluster, you’re already using something fairly close to this pattern. Take the same idea, throw in a message bus and extend it to non-web processes, and you’ve got a share-nothing, message-passing system for concurrency. Multiple OS processes communicating via messages doesn’t give you all the benefits of processes running inside Erlang’s runtime, but it does give you the big one: horizontal scalability.
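The “explicitly yield back to the scheduler” idea from the first option can be sketched with Python’s asyncio, which I’m using here only as a rough analogue of EventMachine or Twisted. The names worker and run_demo are hypothetical, and asyncio.sleep(0) stands in for a real wait on disk or network. Each path runs uninterrupted until it reaches an await, so the only places where another path can get in are the ones you wrote down yourself.

```python
import asyncio


async def worker(name, steps, log):
    """Do `steps` units of work, handing control back after each one."""
    for i in range(steps):
        log.append(f"{name}{i}")   # runs without interruption up to the next await
        await asyncio.sleep(0)     # explicit yield back to the event loop


def run_demo():
    """Run two workers on one thread and return the order their steps ran in."""
    async def main():
        log = []                   # plain list, no lock: only one path runs at a time
        await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
        return log

    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

The two workers interleave even though there is only one thread, and a worker that never awaits would starve the other one, which is the same weakness that made this model a poor fit for an OS kernel.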

P.S. Here’s a pretty well-written counterargument: Why Events are a Bad Idea.