Tuesday, November 28, 2006

On thread persistence and migration

Having contacted the Hop authors, I got a good feeling on their progresses, and I'm looking forward for the next (beta?) version. Also, they suggested me a PhD thesis, "A Portable Mechanism for Thread Persistence and Migration", whose details can be found in Wei Tao's Ph.D. Research Page.

Her dissertation is a good explanation on how a Java program may be adapted into having support for thread suspension and resume. She based her work on a complete code re-writing (both at source and byte-code level), following a (theoretically correct) set of transformations. In short, a program must emulate the creation of a continuation on each code sentence, using an exception mechanism to capture the state of the running program. This exception usage is handy, because after catching an exception the code may throw it again, or store (persistently) the execution state. Obviously there's the problem of non serializable resources having to be saved and restored, and the local, temporary variables. Tao, cleverly leaves this issue to the application creator, with two resolution strategies:

  • Creating a wrapper class to the resource, that knows how to close and recreate it (e.g., close a file port and re-opening it a few days later);
  • Setting the an existing object field to the resource placed in a local variable;
It's relevant to note that Tao relies on the use of the transient modifier, that makes such a field null when the object is serialized - having to be re-created on deserialization.

The process is heavily focused on the Java Virtual Machine, and the transformation is not proved to make such sense in other programming languages/environments. Nevertheless, the byte-code transformation strategy is well documented and several performance considerations are made throughout the dissertation. I, however, believe that this approach may be simplified - particularly the exception-based technique used to save/restore execution state - when the application is written in some other languages, like Scheme, that natively support first-class continuations.

Her motivation to this technique is well founded, focusing both portability (she may employ this transformations on several applications, and even with other programming languages, as long as their core is closely related to Java's) and code mobility (inevitably the best reason to save thread execution state and to be able to restore/resume it later, although "Moving long running applications from desktops to laptops and back also has clear appeal").

One interesting point she states from the beginning of the report is that making a thread persistently saved is quite similar to having it migrated to another place, the difference being that the store is the run-time environment of another process/machine.

If you're interested in execution-flow migration, thread state, and similar issues, I advise you to read this report, or at least to skim through it.

Monday, November 20, 2006

Mosquito Lisp

I stumbled upon Mosquito Lisp, and thought for a minute that it could be the adaptation to Termite's concepts into a more web-oriented strategy. But I was wrong, as it is only more focused on security measures, and leaves the main attraction in Termite behind - not only it does not allow the exchange of continuations, as it doesn't even consider them in the language (as with macros, now that we're at it!). This can make the dialect extremely simple to grasp and employ on real world cases, but it fails to make academic experiments with abstractions and other language tricks possible (at least throughout several machines!). Here's the link to the introduction:

Ephemeral Security - Introduction to Mosquito Lisp

Saturday, November 18, 2006

Application interaction models

I have assembled the interaction types I want to make possible in the near future. They assume a similar architecture to the one verified in current Web Applications, that is, a client-server architecture, a standard (mainstream) browser running on the client and an application container of your choice running on the server. But, unlike today's regular usage of this architecture, I want to allow some other application interaction models:

  • Client-Server code mobility, where the server may detect it is unable to compute every requested operation, and sends some of the (possibly still running) computations to the client for it to finish them. As I depict in the preceding diagram, after the client knows the code to the computation he needs to finish, it can gain the ability to perform it without accessing the server (unless there is some server-side information that needs to be fetched, obviously).

  • Disconnected Execution. This sort of interaction is interesting as the client is able to work while being offline (or at least, disconnected from the server). The only requirement is that all of the functionality (perhaps organized in functional modules) and the related domain data must be fetched from the server: As one can see, this may pose other challenges, such as domain data access control (e.g., I may not be able to have full control to some objects that were being used by the server to produce the interface, and thus I cannot have that (whole) object on the client side).
  • Server replication, to grant reliability in the provided application. While there are several approaches to do server replication, all of them must allow messages that arrive to a front-of-the-line server to be replicated to all others (or at least the effects of processing the received message). This can be done by the exchange of continuations (see the Termite approach, in my previous posts).
  • Inter-client interactions are the most uncommon model, when compared to existing uses of Web Applications. In the "several interactions" arrows can be two main types of communication. There can be only data exchange, like in instant messaging, white-boards or on-line games. But there may also be execution flow exchange. You can have two clients making parallel computations, in a sort of collaborative processing. You can also have ubiquitous execution, like having a smart-phone "pick up" the application execution from the desktop computer's browser and carrying on with it.
While this list may not be complete, it is my starting point. Off course I'll add more examples or complete/modify the existing ones if I see it necessary.

Have a nice weekend!

Wednesday, November 15, 2006

Mobile computations example

In an attempt to use a new Web diagramming tool (Gliffy), I just put into a simple time-line an example of what I want to make possible with my project:

Basically it is a client that starts by asking the server for a specific code execution. the server, however, opts to stop the execution midway, and transfers its work in progress to the client. Now a couple of noteworthy behaviors can be verified on the client:

  1. The client finishes the remainder of the computation, not contacting the server anymore in the process;
  2. The client now knows how to make that kind of computations, so it will never bug the server again for that.
I'll welcome any comment on this idea, specially constructive criticism :)

Tuesday, November 14, 2006

How distributed mobile computing resembles a JIT compiler

The Just In Time (JIT) compiler used in the Java Virtual Machines (JVM) is the tool responsible to compile Java bytecode into native platform code, thus granting the application improved execution speed on certain operations.

How can this be comparable to distributed mobile computing? Like I stated in previous posts, I intend to be able to migrate code that is currently being executed into other machines, and resume it's execution there. Let's analyze what this means from the client application's point of view:

  1. The client is viewing a page that accesses a procedure that is currently being executed on the server. Thus it calls it with a call (possibly an asynchronous one) to the server, and sets a callback function to receive the server answer;
  2. The server opts to send that computation to the client, so it does so (let's assume for now that it uses a Termite-like approach to do so);
  3. The client has now the task of storing the received computation is some JavaScript callable form, so that it can call it directly instead of using the RPC and callback he used before.
To do this, he changes the way he calls a given functionality. Which is pretty much the same thing the JIT compiler does. The JVM stores a couple of Virtual Tables (V-Tables) with the addresses of every functions. In the first, it stores pointers to the source code, while the second stores pointers to byte-compiled functions. The first times the code is called, the JVM uses the first V-Table, but calls the JIT to make the compilation and fill in the address in the second V-Table. This way, subsequent calls to the functions will be directed to the byte-compiled file, directly, in a transparent way for the application code.

This can be used as an inspiration for code mobility: Both the client and the server may store tables for each procedure that may be subject to migration, associated with the local address or the particular remote call mechanism, along with a flag that states which method is used currently. The problem still is what information is relevant for each function execution. This single table would suffice if we wanted static code migration. But if we want to migrate running code, we have to consider every function that can be used within the running code.

This topic is still fresh, and I will have to do some experimentations before I can have a more conclusive opinion. But I have a good feeling about this mechanism when associated with a Termite-like code migration technique!

Monday, November 13, 2006

Inria's Hop

I finally got to read both technical papers on Hop (found in Hop Home page), and I can now make a reasonable comparison between their approach and Termite's.

First let's describe Hop's in a few sentences. Being aimed at Web applications development, Hop is mainly a programming language that separates the application logic from the interface definition, while maintaining a single coding artifact for the application. The language is built upon Bigloo Scheme, so that servers may import several useful libraries. The core features of Hop are:

  • server services - defined just like a regular function, with the macro 'define-service', can do basically anything on the server.
  • interface - created with macros that resemble the HTML tags, interfaces are a declaratively constructed HTML tree. They can contain scripts with Scheme code of two different natures:

    • Client-side code, translated automatically from to JavaScript;
    • Server-side code, mixed within client-side scripts but prefixed with a special character. This code is translated to the appropriate asynchronous server calls, and every exchanged object is automatically mangled/de-mangled (JSON can be used to read server responses on the client).
One of the things that pops out when you see a Hop application is it's elegance, as everything is written in a single file using the same programming language, despite addressing the two important application "strata" (server logic and user interface). The continuous bidirectional communication (and data exchange) between the client and servers is accomplished via asynchronous remote function calls to the server and by setting events on the client that are triggered upon a server's reply - enabling both push and pull communication models.

Another interesting fact is the execution model Hop uses. It is based on several engines running for a single application: One for the UI processing, and other(s) to the application logic. The URL start-point loads the UI engine, and henceforth it is responsible for calling other engines as other functionality is needed.

But Hop isn't perfect. Let's take a closer look at the service model. As it is, Hop creates an URL for each server call needed in the UI strata. Each URL is stored in a server-side table, that is never cleansed - even temporary URL's used for anonymous services are kept indefinitely. This could be resolved (as the authors briefly point out) by exchanging closures (both the server function identifier and all the necessary variables in the environment) with the client.

By now, we can understand another weakness of Hop. Hop uses SCM2JS, which translates Scheme code into JavaScript and allows both languages to work seamlessly within the same memory space. However, as technically close as the languages may be, the translation fails in some key aspects, such as the Scheme support for continuations. The authors state that this could be done whether by the use of trampolines or by the translation into Continuation Passing Style (as specified by Henry Baker's paper), but would always create performance penalties. However, with support for continuations, Hop could easily get all Termite's mobile code functionality, like replication or execution migration between processes.

In my opinion, even if Hop can do such code migration with continuations, it's still weaker than Termite, as it's goals are (currently) too low. They want to build simple Web applications based on interactive operations. They wouldn't be able to build a peer-to-peer application with a Web Interface using code migration between clients, as the concept of client is mangled in the Scheme code as being the UI strata. Termite's approach, on the other hand, clearly considers processes, and that seems to make sense to me, as both clients and servers can currently play the same roles, with bidirectional client-server communication.

Before I wrap-up this Hop analysis, it's relevant to point out other weaknesses of the SCM2JS:

  • exact Scheme numbers are currently mapped to floating-point JavaScript numbers;
  • Errors in Scheme are not necessarily errors in JavaScript, as the translation leaves erroneous Scheme code in executable JavaScript code (like the concatenation of two objects, (+ "a" 2) is translated to "a"+2, which yields "a2" and not an error);
  • features like call/cc and eval constructs are not translatable, yet.

I hope Hop evolves into a great tool. Also, Bigloo Scheme, as well as Hop, need to be executable on Windows boxes, to gain the deserved popularity!

Wednesday, November 08, 2006

Disconnected browsing

One of my current activities' main goals is to produce the necessary infrastructure for a web application to be used offline. While the term disconnected browsing is becoming obsolete - Web applications now offer users a decent graphical (web) interface, not a bunch of HTML documents interlinked in a browseable manner - I'll stick with this denomination for the time being, until a better one comes arround. My research for now has been centered on the first Web architecture, the browseable type. Even so, up until 1997 there's not much public research on this matter, and the material I've found isn't what I'm looking for. In Web Intelligent Query - Disconnected Web Browsing using Cooperative Techniques - Kavasseri, Keating, Wittman, Joshi, Weerawarana (ResearchIndex) - it's clear that even then the Web model wasn't appropriate for disconnected browsing. They identified that, in order to browse the Internet, one would have to:
  • Know a set of known starting points;
  • Have a constant network connection;
  • Filter lots of useless information;
  • Put up with bandwidth shortage and frequent link faillures (mainly with mobile devices, but also by elected network disruption, e.g. for battery power saving).
The authors state right from the beggining that the proposed solution targets static Web information. I know, this makes this solution useless for today's Web model, but I wanted to read on, just to check if there was some valuable idea. But they propose an architecture based on an "intelligent" proxy, one that stands between the browser and the Web, and is responsible for making both IR (Information Retrieval), based on users requests and IG (Information Gathering), based on guessed pro-actively retrieval of information (Cooperative Information Gathering: A Distributed Problem Solving Approach - Oates, Prasad, Lesser (ResearchIndex)). They have some interesting related work, but the one that caught my eye was a paper on usage patterns detection, for production of browsing related tasks automations (Integrating Bottom-Up and Top-Down Analysis For Intelligent Hypertext - James, Margaret, Recker (ResearchIndex)). The concept of disconnected browsing for them demands a very strict flow, from the user (client) standpoint:
  1. The client requests a set of information;
  2. The proxy makes its magic, fetches data from the web - it is permanently online! - and processes the request;
  3. The client asks for the fetched results;
  4. The client processes the information;
  5. The client sends feedback to the proxy.
During this steps, the authors consider that being offline during steps 2 and 4 grants the client the designation of a disconnected browser... I intend to shed some light of the concept of disconnected execution of Web applications. This has lots of non-trivial obstacles, like
  • the existence of dynamic information;
  • the concurrent access to mutable, persistent server-side information;
  • the nature of the graphical interface being no longer a single (nor static) HTML page.
The usage model is also diffente, as the client must pre-fetch some portion of application and data, and then must be able to go offline and play arround with it. After he comes up online, the application must somehow synchronize his changed information with the server's - which, by the way, may envolve complex conflict resolution and/or merge rules for specific information types. From this paper, however, there are some good contributions to my project:
  • Disconnected information becomes much smaller if we know a priori what to download. So it would be a good idea to detect what the user commonly does, his usage patterns!
  • It may be usefull to have a separate agent responsible to fetch the application code/data to be downloaded to the client. As this can be computationally heavy task, having the web server do it may cause serious performance issues.
Right now I'll keep looking for other disconnected Web usage papers and projects - I'd like to see something recent, at least not older than 5 years! I know that when I reach Coda's documentation, I'll be too far from the model I want, but eventually I'll have to get down to it, as it has some good, well documented merging techniques.

Sunday, November 05, 2006

Concurrency in a web platform with distributed computing

Right now, it's clear whatever platform is to be the key for a Web application success, it must be based on a client-server architecture, something with the look and feel of the so-called Web 2.0 applications. That is, I want to use a Web platform to implement the aplication GUI, just as if it was a desktop application.

Wrt. the concurrency model, I would like to have multiple clients accessing simultaneously to the same application. Off course faillures on certain nodes (either clients or servers) must also be considered. Lastly, I want to be able to make computations on both the client and the server side.

Without having a Web architecture, I can find three approaches to the concurrency aspect:

  1. Shared space: By using a memory space shared by all interested nodes, one can make applications by building (complex) contraptions with locks and semaphores. It's tedious, error-prone, and is generaly considered to be avoided in medium-large applications.
  2. Message passing: Using the Actors approach in which everything is an object that is operated solely by exchanging messages with it. Most of the times this is simplified to the model where only processes (or nodes, or threads) send messages to each other (like Erlang)

  3. Declarative data flow: this is the model used by Termite, and I've described it extensively in the previous post. In short each process can send messages, data, and execution-flow to another process.
From these three, the third seems to solve more issues:

  • Doesn't need the coding complexity needed in the shared data management;
  • Does not demand separate programming of paralel processes;
  • Declarative DSL's save lots of code lines, with abstractions based on continuations replication between nodes
With the Web architecture in mind, we must consider at least the following questions:

  • How are computations going to be transferred to a given client or back to the server?
  • How will the computation be transferred at any point of the application execution?

  • Given the bandwidth current limitations, how are computations going to be serializable? In processes, continuations, closures? What is the relevant minimal data set that need to be passed along the wire?
  • It is a fact that the client must have an engine (in ECMA Script, I suppose) to process some part of the application - either the user interface or the domain application logic. This engine must be prepared to receive this sort of computations from other nodes (as well as to send some back to them).
This questions are good guides to the next stage of my research. I'll make sure to post the answers I (hopefully) find.

Termite - the light at the end of the tunnel?

Last month I started looking at Termite, a Scheme-like language to produce concurrent and parallel programs. It's now time to make the relevant comments, based on the paper (read it here) and directed to my PhD purpose - Webapps Interfaces development in the Web 2.0 new world.

The first thing worth making clear is the concept of distributed computations. These are concurrent programs (lightweight processes), potentially physically isolated. The data transfer between each process is made via message passing. This is based on the Actors approach, and can also be verified in Erlang. This model carries the burden of the need of functionality on the programs code. In other words, mutation cannot be done in Termite. Fortunatelly, it is possible to emulate the same behaviour by representing the "mutable" state in a process, and using a message dispatch mechanism (and representing an object in a separate process).

Processes are isolated, even if they are on the same physical node. This makes the concurrent model quite simple, as there is no shared memory, no mutexes to handle and, additionally, there is no need of a distributed garbage collector!

The message sending is done using a single mailbox per process, with assynchronous sending (the send and pray model!) and synchronous receiving. Also, Termite assumes that connections cannot be trusted as being reliable, thus assuming the possibility of failure. Since there is no message delivery guarantees, there also are no correct order of reception guarantees. Termite leaves the burden of dealing with this to the application code, by using timeouts mechanisms. Luckilly, Termite has a way of propagating exceptions between two related processes, using Erlang-like bi-directional links, but also using a one-way-only link between two processes. Finally, Termite supports tags in each sent messsage, thus being able to mimic a multiple-channel model in a single-channel architecture. This tag system can also be used in the timeout message detection system!

One interesting aspect is the worarround to make every object serializable. Non-serializable objects are proxied by using a process to represent them, so e.g., each port can be serialized using only a PID! This means we can send a message to the screen, or receive a message from an input device, which gives us a great sense of liberty on our programs!

When comparing to Erlang, aside from the Lisp-like syntax and the one-way links between processes, Termite also provides support for first class serializable continuations.

The related work also indicates that Kali uses a lazy closure code transmition between processes, along with a cache system, that could help improve the code!

After reading this, there are a couple of questions to ask about Termite: What does it take to build a web framework using this kind of comunicating processes, and what's missing to be able to fulfill my PhD requirements.

The first question is answered in the Termite paper, by Dynamite. Unfortunatelly, I wasn't able to find any proper documentarion/web page/binaries/other material related to it.

So let's talk about what Termite can do for me, and what it can't do. If we want to use the comunicating processes model, we can envision a client being a process that sends messages to a given node on the server. This node can be connected to others, making server replication a trivial task to do (the Termite paper illustrates this behaviour). I would like to have reliability on the message sending, so this would have to be added as a layer to the Termite framework, using some sort of timeout detection. I would also like to know that the delivery order is respected in every message, so that should also be granted at the message-sending protocol level, as well. Let's say that this would be enough, for now (and leave "minor" details like scalability and speed behind!). Since we can propagate messages between Termite processes, the trivial way to achieve this would be to integrate a Termite engine within a JavaScript client-engine. I can think of a couple of ways to do this. The first is simply to implement the Termite (and all of the Gambit-C dependencies) in Javascript. The other is to analyse the messages that need to be sent through processes, and produce a platform independent protocol (at least, to enable a JavaScript client to talk to a Termite one :) )

I know this is still a bit hard to achieve, and I should look into other approaches before trying this. But just imagine being able for a server to start by making all of the application computation. Then, as the server machine decides it is fed-up, or if it is simply bloated, then it spawns another server in a neighbour node, and transfers the computation to it. Also imagine a more complex load-balancing mechanism being implemented between the servers. It's all very interesting (I guess this is what should be being done in the Dynamite project!), but I must confess the fun part is between clients and servers. The server finds out, after a while, that the client is able to do some processing, and it transfers a continuation with the job to be done on the client-side. Since each process can update the running code during its execution, this would be easily achieved. Now the sky is the limit, since not only we can have code execution being swaped from the client to and from the server, but we also can envision collaboration between clients, without any kind of server interaction! If this isn't enough, lets remember that the communication is reliable. Oh, and that by now we already have a fault-tolerant (replicated) server system! We could even imagine a client being able to detach itself from its server, going offline, keep on working and then synchronizing with any server when it gets back online. The implications on these kind of scenarious are a bit different than the message passing model, as we need to consider the domain structure of the application, access control, the ammount of data the client needs to own in its machine to operate, among other important issues.

Sorry for the long post. I guess If you got this far, you might as well have some opinion on this matter, so please, leave a comment!

Ta ta...