Wednesday, November 08, 2006

Disconnected browsing

One of my current activities' main goals is to produce the necessary infrastructure for a Web application to be used offline. While the term disconnected browsing is becoming obsolete - Web applications now offer users a decent graphical (web) interface, not just a bunch of HTML documents interlinked in a browseable manner - I'll stick with this denomination for the time being, until a better one comes around. My research so far has been centered on the first Web architecture, the browseable type. Even so, up until 1997 there isn't much public research on this matter, and the material I've found isn't quite what I'm looking for. In Web Intelligent Query - Disconnected Web Browsing using Cooperative Techniques - Kavasseri, Keating, Wittman, Joshi, Weerawarana (ResearchIndex) - it's clear that even then the Web model wasn't appropriate for disconnected browsing. They identified that, in order to browse the Internet, one would have to:
  • Know a set of starting points;
  • Have a constant network connection;
  • Filter lots of useless information;
  • Put up with bandwidth shortage and frequent link failures (mainly with mobile devices, but also from deliberate network disruption, e.g. for battery power saving).
The authors state right from the beginning that the proposed solution targets static Web information. I know, this makes the solution useless for today's Web model, but I wanted to read on, just to check if there was some valuable idea. They propose an architecture based on an "intelligent" proxy, one that stands between the browser and the Web, and is responsible for performing both IR (Information Retrieval), based on user requests, and IG (Information Gathering), based on proactive retrieval of information it guesses the user will need (Cooperative Information Gathering: A Distributed Problem Solving Approach - Oates, Prasad, Lesser (ResearchIndex)). They have some interesting related work, but the one that caught my eye was a paper on usage pattern detection, aimed at automating browsing-related tasks (Integrating Bottom-Up and Top-Down Analysis For Intelligent Hypertext - James, Margaret, Recker (ResearchIndex)). The concept of disconnected browsing for them demands a very strict flow, from the user (client) standpoint:
  1. The client requests a set of information;
  2. The proxy works its magic, fetches data from the Web - it is permanently online! - and processes the request;
  3. The client asks for the fetched results;
  4. The client processes the information;
  5. The client sends feedback to the proxy.
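The five-step flow above could be sketched roughly as follows. This is just my own illustration of the paper's model - all the names (Proxy, Client, handle_request, and so on) are hypothetical, and the "fetch" is a stand-in for a real HTTP request:

```python
# Hypothetical sketch of the five-step client/proxy flow.
# Names and structure are illustrative, not taken from the paper.

class Proxy:
    """Stays permanently online and fetches on the client's behalf (step 2)."""
    def __init__(self):
        self.cache = {}

    def handle_request(self, urls):
        # Step 2: fetch from the Web while the client may be offline.
        for url in urls:
            self.cache[url] = f"<content of {url}>"  # stand-in for a real fetch

    def results(self, urls):
        # Step 3: hand the gathered pages back to the client.
        return {url: self.cache[url] for url in urls if url in self.cache}

    def feedback(self, ratings):
        # Step 5: relevance feedback to guide future gathering.
        self.last_feedback = ratings


class Client:
    def browse(self, proxy, urls):
        proxy.handle_request(urls)                       # step 1: request information
        pages = proxy.results(urls)                      # step 3: ask for the results
        ratings = {u: len(c) for u, c in pages.items()}  # step 4: process offline
        proxy.feedback(ratings)                          # step 5: send feedback
        return pages


proxy = Proxy()
pages = Client().browse(proxy, ["http://example.org/a", "http://example.org/b"])
```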
During these steps, the authors consider that being offline during steps 2 and 4 grants the client the designation of a disconnected browser... I intend to shed some light on the concept of disconnected execution of Web applications. This has lots of non-trivial obstacles, like
  • the existence of dynamic information;
  • the concurrent access to mutable, persistent server-side information;
  • the nature of the graphical interface being no longer a single (nor static) HTML page.
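To make the second obstacle concrete, here is a minimal sketch of one common way to detect concurrent access to mutable server-side state: the server keeps a version number per record and rejects a disconnected client's write when someone else has written in the meantime. The scheme and all names are my own illustration, not from any of the cited papers:

```python
# Minimal optimistic-concurrency sketch: a stale write from a client
# that was offline is detected via per-record version numbers.

class ConflictError(Exception):
    pass

class Server:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))

    def write(self, key, base_version, value):
        current, _ = self.read(key)
        if current != base_version:
            # Someone else changed the record while this client was offline.
            raise ConflictError(key)
        self.data[key] = (current + 1, value)

server = Server()
server.write("doc", 0, "first draft")

# Client A goes offline holding version 1; client B writes meanwhile:
version_a, _ = server.read("doc")
server.write("doc", version_a, "B's edit")      # B wins, version becomes 2

conflict = False
try:
    server.write("doc", version_a, "A's edit")  # A's stale write is detected
except ConflictError:
    conflict = True
```

Detecting the conflict is the easy part; what to do next (reject, merge, ask the user) is exactly the kind of type-specific resolution rule I mention below.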
The usage model is also different, as the client must pre-fetch some portion of the application and its data, and then must be able to go offline and play around with it. Once he comes back online, the application must somehow synchronize his changed information with the server's - which, by the way, may involve complex conflict resolution and/or merge rules for specific information types. From this paper, however, there are some good contributions to my project:
  • The disconnected information set becomes much smaller if we know a priori what to download. So it would be a good idea to detect what the user commonly does - his usage patterns!
  • It may be useful to have a separate agent responsible for fetching the application code/data to be downloaded to the client. As this can be a computationally heavy task, having the web server do it may cause serious performance issues.
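A toy sketch of the usage-pattern idea: count how often each page follows another in the browsing history, and prefetch the most likely successors of the current page. The counting scheme here is my own illustration, not the algorithm from the cited paper:

```python
# Toy usage-pattern detector: learns "page A is usually followed by
# page B" from past sessions and suggests pages worth prefetching.

from collections import Counter, defaultdict

class PatternPrefetcher:
    def __init__(self):
        self.successors = defaultdict(Counter)

    def record(self, history):
        # Count each consecutive (page, next page) pair in a session.
        for a, b in zip(history, history[1:]):
            self.successors[a][b] += 1

    def prefetch_candidates(self, current_page, n=2):
        # The n pages most often visited right after current_page.
        return [p for p, _ in self.successors[current_page].most_common(n)]

p = PatternPrefetcher()
p.record(["home", "news", "sports", "home", "news", "mail"])
p.record(["home", "news", "sports"])
candidates = p.prefetch_candidates("news")  # → ["sports", "mail"]
```

In the architecture above, this counting would live in the separate agent, keeping the heavy lifting off the web server.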
Right now I'll keep looking for other papers and projects on disconnected Web usage - I'd like to see something recent, at least no older than five years! I know that when I reach Coda's documentation I'll be far from the model I want, but eventually I'll have to get down to it, as it has some good, well-documented merging techniques.