Guilhermesilveira's Blog

as random as it gets


Hypermedia and dynamic contracts: let my bandwidth rest!


“Break it” to scale!

Many systems contain web pages that work much like user “custom pages”: users configure what they want to see, and every piece is aggregated from a different source into one single page.

In some cases, these are built with widget-based frameworks such as Wicket and GWT, whose widgets can be added to my custom page; in other cases, they are aggregating portals.

An example of this kind of application (even though it's not configurable) is a retail website containing four sections on its home page: the top 10, my orders, random items, and weird items.

In this case, all the information comes from the same source, but each part would stay valid for a different length of time if cached. If the page is served as one big chunk of information, it will always be stale because of the random items section. “My orders” becomes stale only when I place a new order and, in the same way, the top 10 becomes stale only when an item is bought and its purchase count surpasses that of the 10th item.

One of the main issues with this type of page, which aggregates information from one or many sources with different expiry expectations, is that cached versions in proxies and clients become stale faster than they should for some elements: once one of the sources publishes new information or is updated, the entire representation becomes stale.
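To make the problem concrete, here is a minimal sketch in Java. The section names follow the retail example, but the lifetimes in seconds are made-up values chosen only for illustration. A page served as one chunk can only be considered fresh while every part of it is fresh, so its lifetime is the smallest lifetime among its parts.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class AggregateFreshness {

    // Hypothetical freshness lifetimes, in seconds, for the four home page sections.
    static Map<String, Integer> exampleSections() {
        Map<String, Integer> maxAge = new LinkedHashMap<>();
        maxAge.put("top_sellers", 600);
        maxAge.put("my_orders", 300);
        maxAge.put("random_items", 0);    // changes on every request
        maxAge.put("weird_items", 3600);
        return maxAge;
    }

    // The whole page is fresh only while all of its parts are fresh,
    // so its lifetime is the minimum of the parts' lifetimes.
    static int pageMaxAge(Map<String, Integer> sectionMaxAge) {
        return Collections.min(sectionMaxAge.values());
    }
}
```

With these assumed values, `AggregateFreshness.pageMaxAge(AggregateFreshness.exampleSections())` is 0: the random items section alone makes the single big page uncacheable.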

Martin Fowler once described a widespread approach that allows those pages to be partially cached in local proxies and clients, thus sharing requested representations between multiple users.

The approach

Given the coffee scenario, one would create different json representations:

And finally an aggregating page:

<html>
<a class="lazy_load" href="http://restbucks.com/top_sellers">Top sellers</a>
<a class="lazy_load" href="http://restbucks.com/my_orders">My orders</a>
<a class="lazy_load" href="http://restbucks.com/random_items">Random items</a>
<a class="lazy_load" href="http://restbucks.com/weird_items">Weird items</a>

And then, for each lazy_load link, we create a div with its content:

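<script>
$('.lazy_load').each(function() {
  // jQuery's each() passes (index, element); the element is also bound to `this`
  var link = $(this);
  var uri = link.attr('href');
  var div = $('<div></div>').load(uri); // cache hits!
  link.after(div);
});
</script>
</html>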

This allows proxies to cache each component of our page separately from the page itself: whenever one piece of content becomes stale in a proxy, only that part of the page needs to be updated.

In a web where most data can be cached and does not become stale so fast, this technique should usually reduce the amount of data transferred between client and server.

All one needs to do is use the HTTP caching headers properly.
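As a rough sketch of what “properly” could mean here, the Java snippet below picks a Cache-Control header value for each component. The paths mirror the lazy_load links above, but the class name, the policies, and the max-age values are assumptions made for illustration, not something the original application defines.

```java
final class ComponentCachePolicy {

    // Hypothetical policies: one Cache-Control value per component path.
    static String cacheControlFor(String path) {
        switch (path) {
            case "/top_sellers":
                return "public, max-age=600";   // same for every user, proxies may store it
            case "/my_orders":
                return "private, max-age=300";  // user specific, only the user's own cache may store it
            case "/random_items":
                return "no-store";              // different on every request
            case "/weird_items":
                return "public, max-age=3600";
            default:
                return "no-cache";              // unknown resource: always revalidate
        }
    }
}
```

A server would send the returned value in the Cache-Control response header of each component, so that shared proxies keep the public parts while the random items are always fetched again.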

Remember that if your client supports parallel requests to the server and/or keep-alive connections, the results might be even better.

Distributed systems? Linked resources?

Roy Fielding mentions that in the data view in REST systems, “small or medium-grain messages are used for control semantics, but the bulk of application work is accomplished via large-grain messages containing a complete resource representation.”

Pretty much in the same way as with the human web, a distributed system using the web as its infrastructure will gain the same cache benefits as long as it implements correct caching policies through HTTP headers (and correct HTTP verbs).

When your server provides a resource representation that links to a series of other related resources, the client and the proxies along the way are allowed to cache each of those resources on its own.

This approach results, again, in changes applied to one resource not affecting cached representations of other resources. A stale representation will not affect those accessing other resources within the same context.
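The idea can be sketched with a toy cache keyed by URI. This is only an illustration written in Java with hypothetical names; a real proxy or client cache also deals with expiry, validation, and headers.

```java
import java.util.HashMap;
import java.util.Map;

final class RepresentationCache {
    private final Map<String, String> entries = new HashMap<>();

    void store(String uri, String representation) {
        entries.put(uri, representation);
    }

    // Returns the cached representation, or null when the URI is not cached.
    String lookup(String uri) {
        return entries.get(uri);
    }

    // Called when a resource changes: only its own entry is dropped.
    void invalidate(String uri) {
        entries.remove(uri);
    }
}
```

If `/my_orders` and `/top_sellers` are both stored and `/my_orders` is invalidated, a lookup of `/top_sellers` still hits the cache.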

Sometimes the decision whether to trade latency for scalability might depend on how you think your clients will use your resources: in the human web example above, the developer knew exactly how clients would access the page.

In distributed systems using REST, guessing how resources will be used can be dangerous, as it tightly couples you to that behaviour, while published resources can and will be used in unforeseen ways.

Roy’s dissertation seems to apply here to balance things: “a protocol that requires multiple interactions per user action, in order to do things like negotiate feature capabilities prior to sending a content response, will be perceptively slower than a protocol that sends whatever is most likely to be optimal first and then provides a list of alternatives for the client to retrieve if the first response is unsatisfactory”.

Giving the information that helps in most cases is fine, and providing links to further resource details allows you to balance latency and scalability (due to caching) as you wish.

Dynamic contracts

This is only possible because we have signed dynamic contracts with our clients. They expect us to follow some formal format definition (defined in XHTML) and processes. How our processes are presented within our representations is the dynamic part of the contract.

While the fixed part can be validated with schema validators, the dynamic part (the process), which is guided by our server, needs to be validated by testing the behaviour of our applications: asserting that hypermedia-guided transitions are reflected in our application state.
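Such a behaviour test could look roughly like the sketch below. Everything in it is hypothetical (the OrderResource class, the "payment" relation, the URIs): it stands in for a real server so the example stays self-contained. The links offered by the representation depend on the state, and following a link changes the state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OrderResource {
    private String state = "unpaid";

    String state() {
        return state;
    }

    // Links offered by the current representation; they depend on the state.
    Map<String, String> links() {
        Map<String, String> links = new LinkedHashMap<>();
        links.put("self", "/orders/1");
        if (state.equals("unpaid")) {
            links.put("payment", "/orders/1/payment");
        }
        return links;
    }

    // Follows a link by its relation name; fails if the server did not offer it.
    void follow(String rel) {
        if (!links().containsKey(rel)) {
            throw new IllegalStateException("link not offered: " + rel);
        }
        if (rel.equals("payment")) {
            state = "paid";
        }
    }
}
```

A test would then check that a new order offers the "payment" link, follow it, and assert that the state is "paid" and that the link is no longer offered.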

Nowadays

On the other hand, many contemporary systems use the POST verb and receive a response that includes many representations at once, or use the GET verb without any cache-related headers, thus not profiting from the web infrastructure at all. This could be changed with one (or both) of the following:

  • use the GET verb with cache headers
  • use hypermedia and microformats to describe relations between resources

Doing so might yield results similar to those of hypermedia + GET + cache headers in the human web, and some styles might already support it, even though it is not one of their constraints.

Note that in this case hypermedia is not driving the application state, but helping with scalability issues.

Progressive enhancement

Martin notes that this is a kind of progressive enhancement: although its definition is related to accessibility, its bandwidth benefits are similar to those of the approach described here.

Any other systems that use hyperlinks to “break” representations and scale?


Written by guilhermesilveira

December 10, 2009 at 9:15 am

To break or not to break? Java 7?


There is a short slide show to illustrate some thoughts. There will be better ones in the near future.

When is the right time for a public API to break compatibility with its previous versions?

Well, in the open source communities there is a common understanding that a library is allowed to cause some migration work when there is a minor change (e.g. 1.1.5 to 1.2.0).

Whenever it has a major change (e.g. 1.2.0 to 2.0.0), it might be completely rewritten, in such a way that users can even adopt both versions at the same time.
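The convention can be written down as a small helper. This is a sketch in Java that assumes plain "major.minor.patch" version strings; the class name and the "patch" and "none" labels are my own additions for illustration.

```java
final class VersionBump {

    // Classifies the step between two "major.minor.patch" version strings.
    static String classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return "major";  // may be a complete rewrite
        if (a[1] != b[1]) return "minor";  // may require some migration
        if (a[2] != b[2]) return "patch";
        return "none";
    }

    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }
}
```

Here `VersionBump.classify("1.1.5", "1.2.0")` yields "minor" and `VersionBump.classify("1.2.0", "2.0.0")` yields "major".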

Some projects use the version number as a marketing technique in order to keep themselves up-to-date with their competitors.

Some products are famous for breaking compatibility with existing code whenever a new release appears, requiring all programmers to rewrite part of their code. If you look at Visual Basic’s life, every one to two years there was a major release that was (usually) incompatible with the previous one.

VB programmers were used to that issue and kept coding the old projects using the previous release until the project was finished. Companies just got used to it and learned how to live on.

If your code is well tested through the use of automated tests, updating a library or a compiler/language version is an easier task because all incompatibility issues will be found prior to packing a new release of your product to your clients. testing++!

If you do not write code for your tests, as soon as you update a library/compiler/language, well….. have fun as it will probably be a unique adventure.

The java developers

Unfortunately, there is still a big part of the Java community that does not write automated tests. Given all the legacy Java code in the world that lacks even a single line of automated tests, sticking to compatibility between Java releases can be seen as a good thing for the world, in general.

But for those who already write their tests, all that care about compatibility might be seen as an overestimated issue: thanks to the tests, we are ready to change and embrace those changes.

The Java 7 crew is aware that there is a lot more that could be added to the language, but is afraid that doing so will not preserve high levels of compatibility and usability.

What happens if the language has to worry so much about compatibility? It will evolve so slowly that other languages have the chance to overtake it. This is the danger that the language itself faces. Java might not lose its position, but one can find a lot more people arguing about changes that could be made to the language but are not, because preserving compatibility has been a main issue.

At the same time, some of the proposed changes might create huge incompatibility issues for users who still do not write tests, or for software that was not written using TDD practices. There is also another document on the same issue on the internet.

The proposal is that methods declared with a void return type should be treated as returning this. This way one could easily use them with builder-patterned APIs:


final JFrame jFrame = new JFrame()
.setBounds(100, 100, 100, 100)
.setVisible(true);
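For comparison, Java as it stands only allows this kind of chaining when each method explicitly returns this. The sketch below uses a hypothetical ChainableWindow class, not the real JFrame, whose setBounds and setVisible are declared void.

```java
final class ChainableWindow {
    private int x;
    private int y;
    private int width;
    private int height;
    private boolean visible;

    // Each configuration method returns this explicitly, so calls can be chained.
    ChainableWindow setBounds(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
        return this;
    }

    ChainableWindow setVisible(boolean visible) {
        this.visible = visible;
        return this;
    }

    int getWidth() {
        return width;
    }

    boolean isVisible() {
        return visible;
    }
}
```

With this class, `new ChainableWindow().setBounds(100, 100, 100, 100).setVisible(true)` compiles today, because the return type is part of each method's declared signature.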

There are a few issues with that proposal that do not match the “let’s keep it backwards compatible” saying.

The first thing is that, nowadays, builder (or construction pattern) APIs are already created returning something specific instead of void, as we can see from Hibernate's API and the new date and time API.

The second point is that builders are used nowadays to create Domain Specific Languages, and their implementation in Java does not use a single object because that would create a huge (and nasty) class. DSLs are usually built in Java by using different types, e.g. the criteria API from Hibernate.

Even the given example is actually no builder API… JFrame configuration methods and real-time-usage methods are all within… itself! There is no JFrameBuilder which would build a JFrame, e.g.:


Builder b = new Builder();
b.setTitle("Title").addButtonPanel().add(cancelButton()).add(okButton());
JFrame frame = b.build();

Notice that in the simple example above, it would be a good idea to have two different types (Builder and PanelBuilder), thus the language modification does not achieve what we want our code to look like (or be used like). Instead, it will only allow us to stop the variable name from appearing 10 times in our code, making it easier for programmers to write lines of code like this:


// ugly line which has too much information at once
JFrame frame = new JFrame().setA("a").setB(2,3).setC("c").setD("d").andOn().andOn().andOn();
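The two-type builder mentioned above (Builder and PanelBuilder) could be sketched as follows. The names FrameBuilder, PanelBuilder and end() are hypothetical, and the sketch builds a plain text description instead of a real JFrame so that it runs without a display.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameBuilder {
    private String title = "";
    private final List<String> buttons = new ArrayList<>();

    FrameBuilder setTitle(String title) {
        this.title = title;
        return this;
    }

    // Switches to a second type, so only panel operations are available from here.
    PanelBuilder addButtonPanel() {
        return new PanelBuilder(this);
    }

    // Returns a text description standing in for the finished frame.
    String build() {
        return title + " " + buttons;
    }

    static final class PanelBuilder {
        private final FrameBuilder parent;

        PanelBuilder(FrameBuilder parent) {
            this.parent = parent;
        }

        PanelBuilder add(String button) {
            parent.buttons.add(button);
            return this;
        }

        // Returns to the frame level once the panel is complete.
        FrameBuilder end() {
            return parent;
        }
    }
}
```

Each step returns the type that makes sense at that point, which treating void methods as returning this cannot express: the receiver's own type is the only thing that could come back.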

But why does this proposal go against Java’s saying ‘we should not break compatibility’? Because it creates an even higher degree of coupling between my code and the API I am using.

Well, imagine that I used the Swing API as mentioned above. In a future release of Swing, some of those methods might have their signature changed and therefore break my existing code. Why would an API change the return type of its methods? Well, because if the return type was defined as void so far, no one was using it… so I can change it.

It creates the same type of coupling with APIs that is found when using class inheritance in Java: parent methods being invoked might change their signature.

Well, that was true until today. If this functionality is approved for Java, it will turn the simple task of changing a “void” return type to something useful into a hard one, where I have to think about those who have tightly coupled their code to mine.

The questions and answers which come to my mind are…
a) is the existing Java codebase around the world usually covered by automated tests? unfortunately, no
b) does Java want to be backward compatible? this change will not help it
c) does it want to help the creation of DSLs? this change is not the solution
d) does Java want us to avoid writing the variable name multiple times? my IDE already helps me with that

Written by guilhermesilveira

August 17, 2009 at 1:24 pm