Guilhermesilveira's Blog

as random as it gets

Hypermedia and dynamic contracts: let my bandwidth rest!

with 10 comments

“Break it” to scale!

Many systems serve web pages that work like user “custom pages”: users configure what they want to see, and each piece is aggregated from a different source into a single page.

In some cases, those are widgets from frameworks such as Wicket and GWT that can be added to my custom page; in other cases you have aggregating portals.

An example of this kind of application (even though it is not configurable) is a retail website containing four sections on its home page: the top 10, my orders, random items, and weird items.

In this case, all the information comes from the same source, but each part has a different probable validity if it is going to be cached. If the page is served as one big chunk of information, it will always be stale due to the random items section. “My orders” is stale only when I place a new order and, in the same way, the top 10 is only stale if an item is bought and surpasses the number of times the 10th item has been bought so far.

One of the main issues with pages that aggregate information from one or many sources with different expiry expectations is that cached versions in proxies and clients become stale faster than they should for some elements: once one of the providing sources publishes new information or is updated, the entire representation becomes stale.

Martin Fowler once described a widely used approach that allows those pages to be partially cached within local proxies and clients, thus sharing requested representations between multiple users.

The approach

Given the coffee scenario, one would create different json representations:

And finally an aggregating page:

<html>
<a class="lazy_load" href="http://restbucks.com/top_sellers">Top sellers</a>
<a class="lazy_load" href="http://restbucks.com/my_orders">My orders</a>
<a class="lazy_load" href="http://restbucks.com/random_items">Random items</a>
<a class="lazy_load" href="http://restbucks.com/weird_items">Weird items</a>

And then, for each lazy_load link, we create a div with its content:

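<script>
$('.lazy_load').each(function() {
  // each() passes (index, element) to the callback, so wrap `this` instead
  var link = $(this);
  var uri = link.attr('href');
  var div = $('<div></div>').load(uri); // cache hits!
  link.after(div);
});
</script>
</html>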

This allows our proxies to cache each component of our page separately from the page itself: whenever part of a page becomes stale in a proxy, only that part needs to be updated.

In a web where most data can be cached and does not become stale so fast, this technique should usually lessen the amount of data being transferred between client and server.

All one needs to do is properly use the HTTP headers for caching.

Remember that if your client supports parallel requests to the server, keep-alive connections, or both, the results might be even better.
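As a rough sketch of what "properly using the headers" could look like for the four fragments of the retail example, the snippet below maps each fragment to a `Cache-Control` value. The paths mirror the links in the aggregating page, but the directives and max-age values are my own assumptions for illustration, and `cacheHeadersFor` is a hypothetical helper, not part of any framework.

```javascript
// Hypothetical cache policies for the four fragments of the aggregating page.
// The directives and max-age values are illustrative assumptions.
const CACHE_POLICIES = new Map([
  ['/top_sellers', 'public, max-age=300'],   // shared by all users, changes rarely
  ['/my_orders', 'private, no-cache'],       // per user: kept only by the user's own cache, revalidated before reuse
  ['/random_items', 'no-store'],             // different on every request, never stored
  ['/weird_items', 'public, max-age=3600'],  // assumed to change slowly
]);

// Returns the response headers a server could attach to a fragment.
// Unknown paths fall back to "no-store" so nothing is cached by accident.
function cacheHeadersFor(path) {
  return { 'Cache-Control': CACHE_POLICIES.get(path) || 'no-store' };
}

console.log(cacheHeadersFor('/top_sellers'));
console.log(cacheHeadersFor('/random_items'));
```

With something like this in place, only the random items fragment has to travel from the origin on every page view; the other three can be answered by a browser cache or a shared proxy.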

Distributed systems? Linked resources?

Roy Fielding mentions that in the data view in REST systems, “small or medium-grain messages are used for control semantics, but the bulk of application work is accomplished via large-grain messages containing a complete resource representation.”

Pretty much in the same way as with the human web, a distributed system using the web as its infrastructure will gain the same cache benefits as long as it implements correct caching policies through HTTP headers (and correct HTTP verbs).

When your server provides a resource representation that links to a series of other related resources, the client and any proxies along the way can cache each of those resources on its own.

This approach results, again, in changes applied to one resource not affecting cached representations of other resources. A stale representation will not affect those accessing other resources within the same context.

Sometimes the decision whether to trade latency for scalability might depend on how you think your clients will use your resources: in the human web mentioned above, the developers knew exactly how their clients would access it.

In distributed systems using REST, guessing how resources will be used can be dangerous, as it tightly couples you to that behaviour, while published resources can and will be used in unforeseen ways.
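To make the point concrete, here is a toy simulation of a cache keyed by URI. `ToyCache` is a made-up stand-in for a proxy or browser cache (it ignores headers and expiry entirely), and the URIs are the ones from the retail example; the only thing it shows is that invalidating one linked resource leaves the others untouched.

```javascript
// A toy cache keyed by URI, standing in for a proxy or browser cache.
// It ignores headers and expiry; entries live until explicitly invalidated.
class ToyCache {
  constructor(fetchFromOrigin) {
    this.fetchFromOrigin = fetchFromOrigin; // function(uri) -> representation
    this.entries = new Map();
    this.originHits = 0;
  }

  get(uri) {
    if (!this.entries.has(uri)) {
      this.originHits += 1; // cache miss: go to the origin server
      this.entries.set(uri, this.fetchFromOrigin(uri));
    }
    return this.entries.get(uri);
  }

  invalidate(uri) {
    this.entries.delete(uri); // only this entry becomes stale
  }
}

const cache = new ToyCache((uri) => `representation of ${uri}`);
const fragments = ['/top_sellers', '/my_orders', '/weird_items'];

fragments.forEach((uri) => cache.get(uri)); // 3 origin hits to warm the cache
cache.invalidate('/my_orders');             // the user placed a new order
fragments.forEach((uri) => cache.get(uri)); // only /my_orders is fetched again

console.log(cache.originHits); // 4
```

Had the three sections been served as one composite representation, the new order would have forced the whole thing to be fetched again.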

Roy’s dissertation seems to apply here to balance things: “a protocol that requires multiple interactions per user action, in order to do things like negotiate feature capabilities prior to sending a content response, will be perceptively slower than a protocol that sends whatever is most likely to be optimal first and then provides a list of alternatives for the client to retrieve if the first response is unsatisfactory”.

Giving the information that serves most cases is fine, and providing links to further resource details allows you to balance latency against scalability (through caching) as you wish.
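One possible shape for such a representation, sketched below: inline the fields most clients are expected to need and link to everything else. The field names, link relations and URIs are all hypothetical; nothing here comes from a real order API.

```javascript
// Builds an order representation that inlines the fields most clients need
// and links to the rest. Field names, rels and URIs are hypothetical.
function orderRepresentation(order) {
  return {
    id: order.id,
    status: order.status,
    total: order.total,
    links: [
      { rel: 'items', href: `/orders/${order.id}/items` },
      { rel: 'payment', href: `/orders/${order.id}/payment` },
    ],
  };
}

// Looks up the URI for a link relation, or undefined if the server
// did not offer that relation in this representation.
function linkFor(representation, rel) {
  const link = representation.links.find((candidate) => candidate.rel === rel);
  return link ? link.href : undefined;
}

const representation = orderRepresentation({ id: 42, status: 'paid', total: 12.5 });
console.log(linkFor(representation, 'items'));
```

A client that only needs the status stops at the first response; one that needs the items pays for a second request, which can itself be served from a cache.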

Dynamic contracts

This is only possible because we have signed dynamic contracts with our clients. They expect us to follow a formal format definition (defined in XHTML) and processes. How our processes are presented within our representations is the dynamic part of the contract.

While the fixed part can be validated with schema validators, the dynamic part (the process), which is guided by our server, needs to be validated by testing the behaviour of our applications: asserting that hypermedia-guided transitions are reflected in our application state.

Nowadays

On the other hand, many contemporary systems use the POST verb and receive a response containing many representations at once, or use the GET verb without any cache-related headers, and thus do not profit from the web infrastructure at all. This could be changed with one (or both) of the following:

  • use the GET verb with cache headers
  • use hypermedia and microformats to describe relations between resources

Using them might yield results similar to hypermedia + GET + cache headers in the human web, and some styles might already support this, although it is not a constraint.

Note that in this case hypermedia is not driving the application state, but helping with scalability issues.
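For the "GET with cache headers" option, a minimal sketch of a conditional GET could look like the following. It is deliberately simplified: `respond` and `etagFor` are hypothetical helpers, the ETag is a plain non-cryptographic hash of the body, the max-age value is an assumption, and `If-None-Match` is compared as a single exact value (real requests may carry a list of tags or `*`).

```javascript
// Derives a quoted ETag from the body using a simple djb2-style hash.
// Good enough for a sketch; a real server would pick a stronger validator.
function etagFor(body) {
  let hash = 5381;
  for (let i = 0; i < body.length; i++) {
    hash = ((hash * 33) ^ body.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return `"${hash.toString(16)}"`;
}

// Answers a GET: 304 without a body when the client's copy is still current,
// 200 with the full representation otherwise.
// requestHeaders is assumed to use lower-case header names.
function respond(requestHeaders, body) {
  const etag = etagFor(body);
  const headers = { 'ETag': etag, 'Cache-Control': 'public, max-age=60' };
  if (requestHeaders['if-none-match'] === etag) {
    return { status: 304, headers, body: '' };
  }
  return { status: 200, headers, body };
}

const first = respond({}, 'top 10 items');
const revalidation = respond({ 'if-none-match': first.headers['ETag'] }, 'top 10 items');
console.log(first.status, revalidation.status); // 200 304
```

The 304 still costs a round trip, but it carries no body, which is where the bandwidth goes.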

Progressive enhancement

Martin notes that this is a kind of progressive enhancement: although its definition is related to accessibility, its bandwidth benefits are similar to those of the approach described above.

Any other systems that use hyperlinks to “break” representations and scale?

Written by guilhermesilveira

December 10, 2009 at 9:15 am

10 Responses

  1. This might cause some issues around SEO, although the effect could be considered a positive one depending on your strategy.

    I’m big on avoiding ‘composite’ resources that derive their state from other resources – I made a post recently on rest-discuss which touches on similar ground:

    http://tech.groups.yahoo.com/group/rest-discuss/message/14092

    So I wouldn’t describe this as ‘breaking’ representations, or resources – on the contrary, I think it’s better to describe the current trend for ‘compositing’ as a ‘break’ from constraints. I put it down to a lack of decent hypertext formats. Unfortunately this kind of feature isn’t deemed as important as stuff like video tags, so it’s unlikely we’ll see it addressed any time soon 🙂

    Also – I reckon it’s still ‘ok’ to describe this as hypermedia driving application state.

    Mike

    December 10, 2009 at 1:20 pm

  2. Maybe I’m missing something, but how is this different than what browsers do with the html img tag. The representation is loaded and rendered and then the images are loaded. If it works for imgs, I don’t see why it would not work for any other media type.

    Darrel Miller

    December 10, 2009 at 1:44 pm

    • One requires code on demand – the other doesn’t.

      It would be the same if you could do something like this:

      Mike

      December 10, 2009 at 2:07 pm

      • <div src="some/fragment" />

        Mike

        December 10, 2009 at 2:08 pm

      • ‘a la’ iframe?

        guilhermesilveira

        December 10, 2009 at 2:11 pm

      • Similar – in HTML5 I think it’s called a ‘seamless iframe’.

        Mike

        December 10, 2009 at 2:19 pm

  3. Thanks for the reply Mike.

    You are right about the SEO, it can be either positive or negative. A search engine will finally understand that the “top 10 items” is a resource independent of the “weird items”, and therefore one will not influence the results of the other.

    At the same time, the entry page (human web) won’t be indexed correctly… maybe a different – server side gracefully degraded – page result for a googlebot agent? Suggestions? (this choice loses visibility, as you mentioned on the rest-discuss list)

    The big question – as you mentioned in the 3rd paragraph – is whether there is a big resource which was broken into many pieces or many resources aggregated into one… it seems that in the cases I mentioned it is the second option.

    I understood the point on the rest-discuss list: in that case it loses visibility, since updates in the sub-resources will affect the original one.

    But in this approach there is no such loss, since the composite resource (the entry page) is a set of hypermedia links – your last paragraph. What do you think?

    regards

    guilhermesilveira

    December 10, 2009 at 1:48 pm

    • Yeah, I agree – we’re both addressing the issue of increased visibility by avoiding composite resources and opting for hyperlinks instead

      Mike

      December 10, 2009 at 2:12 pm

      • Great… understood, thanks.

        guilhermesilveira

        December 10, 2009 at 3:30 pm

  4. Hello Darrel,

    Nice comparison…
    Search engines do understand that information around the hypermedia link to an image is related to the linked resource and use it while evaluating your search. Or was your comment not about SEO?

    guilhermesilveira

    December 10, 2009 at 2:00 pm

