Sunday, November 05, 2006

Concurrency in a web platform with distributed computing

By now it is clear to me that whatever platform turns out to be the key to a Web application's success, it must be based on a client-server architecture and have the look and feel of the so-called Web 2.0 applications. That is, I want to use a Web platform to implement the application GUI, just as if it were a desktop application.

As for the concurrency model, I would like multiple clients to access the same application simultaneously. Of course, failures on individual nodes (either clients or servers) must also be considered. Lastly, I want to be able to run computations on both the client and the server side.

Leaving the Web architecture aside for a moment, I can find three approaches to concurrency:

  1. Shared space: By using a memory space shared by all interested nodes, one can build applications out of (complex) contraptions of locks and semaphores. It's tedious and error-prone, and it is generally considered something to avoid in medium to large applications.
  2. Message passing: This is the Actors approach, in which everything is an object that is operated on solely by exchanging messages with it. Most of the time this is simplified to a model where only processes (or nodes, or threads) send messages to each other, as in Erlang.
  3. Declarative data flow: This is the model used by Termite, and I've described it extensively in the previous post. In short, each process can send messages, data, and execution flow to another process.
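
To make the second approach concrete, here is a minimal sketch of message passing, written in Python only for brevity. It is not how Erlang or Termite implement it: the names `counter_actor` and `run_demo` and the tuple-shaped messages are my own assumptions for illustration. The point is that the running total belongs to a single thread and is changed only in response to messages, so no lock is needed around it.

```python
import queue
import threading


def counter_actor(mailbox: queue.Queue, replies: queue.Queue) -> None:
    """Own a private total and change it only in response to messages."""
    total = 0
    while True:
        message = mailbox.get()  # blocks until a message arrives
        kind = message[0]
        if kind == "add":
            total += message[1]
        elif kind == "get":
            replies.put(total)  # answer by sending a message back
        elif kind == "stop":
            break


def run_demo() -> int:
    """Send a few messages to the actor and return the total it reports."""
    mailbox: queue.Queue = queue.Queue()
    replies: queue.Queue = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox, replies))
    actor.start()

    # Hypothetical message format: ("add", amount), ("get",), ("stop",).
    for amount in (1, 2, 3):
        mailbox.put(("add", amount))
    mailbox.put(("get",))
    result = replies.get()

    mailbox.put(("stop",))
    actor.join()
    return result
```

Calling `run_demo()` returns the sum of the amounts sent. Because the mailbox is a FIFO queue, the "get" message is handled only after the three "add" messages that preceded it.
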
Of these three, the third seems to solve the most issues:

  • It doesn't need the coding complexity that shared data management requires;
  • It does not demand separate programming of parallel processes;
  • Declarative DSLs save many lines of code, with abstractions based on replicating continuations between nodes.
With the Web architecture in mind, we must consider at least the following questions:

  • How are computations going to be transferred to a given client or back to the server?
  • How will a computation be transferred at any point in the application's execution?
  • Given current bandwidth limitations, how are computations going to be serialized? As processes, continuations, closures? What is the minimal relevant data set that needs to be passed along the wire?
  • The client must have an engine (in ECMAScript, I suppose) to process some part of the application, either the user interface or the domain application logic. This engine must be prepared to receive this sort of computation from other nodes (as well as to send some back to them).
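
One possible answer to the serialization question can be sketched as follows. This is only an illustration under my own assumptions, not a design I have settled on: every node's engine is assumed to share a registry of named operations (`OPERATIONS` here), so a computation in progress can be reduced to its current value plus the names of the steps still pending. Only that small record crosses the wire, as JSON. The sketch is in Python for brevity; a browser engine would do the same in ECMAScript.

```python
import json

# Hypothetical registry of operations that every node's engine knows by name.
OPERATIONS = {
    "double": lambda x: x * 2,
    "increment": lambda x: x + 1,
    "square": lambda x: x * x,
}


def run_steps(task: dict, max_steps: int) -> dict:
    """Run up to max_steps pending operations and return what is left."""
    value = task["value"]
    pending = list(task["pending"])
    for _ in range(min(max_steps, len(pending))):
        name = pending.pop(0)
        value = OPERATIONS[name](value)
    return {"value": value, "pending": pending}


def to_wire(task: dict) -> str:
    """Serialize the remaining computation: a value and a list of step names."""
    return json.dumps(task)


def from_wire(text: str) -> dict:
    """Rebuild the remaining computation on the receiving node."""
    return json.loads(text)


def demo() -> dict:
    """Start a computation on one 'node' and finish it on another."""
    task = {"value": 3, "pending": ["double", "increment", "square"]}
    on_client = run_steps(task, max_steps=1)  # the client runs the first step
    wire_text = to_wire(on_client)            # this string is all that is sent
    return run_steps(from_wire(wire_text), max_steps=2)  # the server finishes
```

The trade-off in this sketch is that no code travels, only names and data, so both sides must already agree on the registry. Shipping real closures or continuations would lift that restriction at the price of a larger and more delicate payload.
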
These questions are good guides for the next stage of my research. I'll make sure to post the answers I (hopefully) find.