Guilhermesilveira's Blog

as random as it gets

Posts Tagged ‘compatibility’

Hypermedia and dynamic contracts: let my bandwidth rest!

with 10 comments

“Break it” to scale!

Many systems contain web pages that work like user “custom pages”: users configure what they want to see, and every piece is aggregated from different sources into one single page.

In some cases, these are built on widget-based frameworks such as Wicket and GWT, whose widgets can be added to my custom page; in other cases you have aggregating portals.

An example of this kind of application (even though it’s not configurable) is a retail website containing four sections in its home page: the top 10, my orders, random items, and weird items.

In this case, all information comes from the same source, but each part would stay valid for a different length of time if cached. If the page is served as one big chunk of information, it will always be stale due to the random items section. “My orders” is stale only when I place a new order and, in the same way, the top 10 is only stale when an item is bought and surpasses the number of times the 10th item has been bought so far.

One of the main issues with pages that aggregate information from one or many sources with different expiry expectations is that cached versions in proxies and clients become stale faster than they should for some elements: once one of the providing sources publishes new information or is updated, the entire representation becomes stale.

Martin Fowler once described a widespread approach that allows those pages to be partially cached within local proxies and clients, thus sharing requested representations between multiple users.

The approach

Given the coffee scenario, one would create different json representations:

And finally an aggregating page:

<html>
<a class="lazy_load" href="http://restbucks.com/top_sellers">Top sellers</a>
<a class="lazy_load" href="http://restbucks.com/my_orders">My orders</a>
<a class="lazy_load" href="http://restbucks.com/random_items">Random items</a>
<a class="lazy_load" href="http://restbucks.com/weird_items">Weird items</a>

And then, for each lazy_load link, we create a div with its content:

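<script>
// jQuery's each() passes (index, element), so wrap the element before using it
$('.lazy_load').each(function(index, element) {
  var link = $(element);
  var uri = link.attr('href');
  var div = $('<div></div>').load(uri); // cache hits!
  link.after(div);
});
</script>
</html>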

This allows our proxies to cache each component of our page separately from the page itself: whenever one piece of the page’s content becomes stale in a proxy, only that part of the page needs to be updated.

In a web where most data can be cached and does not become stale so fast, this technique should usually lessen the amount of data being transferred between client and server.

All one needs to do is properly use the http headers for caching.
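
As a rough sketch of what “properly using the http headers” could mean for the four sections above, here is one possible mapping from each fragment to a Cache-Control value. The directive values and the class name are my own assumptions for illustration, not something taken from the original example; pick lifetimes that match how your data really changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FragmentCachePolicy {
    // Hypothetical policy: the paths come from the page above, the directives are illustrative.
    static final Map<String, String> CACHE_CONTROL = new LinkedHashMap<>();
    static {
        // shared by all users, tolerates a few minutes of staleness
        CACHE_CONTROL.put("/top_sellers", "public, max-age=300");
        // per-user data: only the browser may keep it, and it must revalidate before reuse
        CACHE_CONTROL.put("/my_orders", "private, no-cache");
        // different on every request, so nobody should store it
        CACHE_CONTROL.put("/random_items", "no-store");
        // shared and rarely changing
        CACHE_CONTROL.put("/weird_items", "public, max-age=3600");
    }

    // Unknown fragments fall back to the most conservative choice.
    static String cacheControlFor(String path) {
        return CACHE_CONTROL.getOrDefault(path, "no-store");
    }

    // True when shared caches (proxies) are allowed to serve the fragment to many users.
    static boolean sharedCacheable(String path) {
        return cacheControlFor(path).startsWith("public");
    }

    public static void main(String[] args) {
        CACHE_CONTROL.forEach((path, value) ->
            System.out.println(path + " -> Cache-Control: " + value));
    }
}
```

With a split like this, only the random items fragment always travels from the server; the other three can be answered by a proxy or by the browser.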

Remember that if your client supports parallel requests to the server, keep-alive connections, or both, the results might be even better.

Distributed systems? Linked resources?

Roy Fielding mentions that in the data view in REST systems, “small or medium-grain messages are used for control semantics, but the bulk of application work is accomplished via large-grain messages containing a complete resource representation.”

Pretty much in the same way as with the human web, a distributed system using the web as its infrastructure will gain the same cache benefits as long as it implements correct caching policies through http headers (and correct http verbs).

When your server provides a resource representation linking to a series of other related resources, the client and the proxies along the way will be allowed to cache each of those resources on its own.

This approach results, again, in changes applied to one resource not affecting cached representations of other resources. A stale representation will not affect those accessing other resources within the same context.
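
To make that concrete, here is a toy in-memory cache keyed by URI. It is only a sketch under my own assumptions: the class, the `markStale` method and the URIs are hypothetical, and a real proxy would decide staleness from the http headers rather than from an explicit call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class PerResourceCache {
    private final Map<String, String> cached = new HashMap<>();
    private final Function<String, String> origin; // stands in for the origin server
    int originHits = 0; // how many requests actually reached the origin

    PerResourceCache(Function<String, String> origin) {
        this.origin = origin;
    }

    // Serve from the cache when possible, otherwise fetch and remember.
    String get(String uri) {
        String hit = cached.get(uri);
        if (hit != null) {
            return hit;
        }
        originHits++;
        String fresh = origin.apply(uri);
        cached.put(uri, fresh);
        return fresh;
    }

    // Only the named resource is dropped; every other entry stays usable.
    void markStale(String uri) {
        cached.remove(uri);
    }

    public static void main(String[] args) {
        PerResourceCache proxy = new PerResourceCache(uri -> "representation of " + uri);
        proxy.get("/orders/1");
        proxy.get("/orders/1/items");
        proxy.markStale("/orders/1");
        proxy.get("/orders/1");       // goes back to the origin
        proxy.get("/orders/1/items"); // still answered from the cache
        System.out.println("origin hits: " + proxy.originHits);
    }
}
```

Had both resources been served as one big representation, the update to the order would have forced the items to be fetched again as well.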

Sometimes the decision whether to trade latency for scalability might depend on how you think your clients will use your resources: in the human web mentioned above, the developer knew exactly how clients would access the page.

In distributed systems using REST, guessing how resources will be used can be dangerous, as it lets you tightly couple yourself to this behaviour while published resources can and will be used in unforeseen ways.

Roy’s dissertation seems to apply here to balance things: “a protocol that requires multiple interactions per user action, in order to do things like negotiate feature capabilities prior to sending a content response, will be perceptively slower than a protocol that sends whatever is most likely to be optimal first and then provides a list of alternatives for the client to retrieve if the first response is unsatisfactory”.

Giving information that will help in most cases is fine, and providing links to further resource details allows you to balance latency and scalability (due to caching) as you wish.

Dynamic contracts

This is only possible because we have signed dynamic contracts with our clients. They expect us to follow some formal format definition (defined in xhtml) and processes. How our processes are presented within our representations is the dynamic part of the contract.

While the fixed part can be validated with the use of schema validators, the dynamic part – the process – which is guided by our server needs to be validated through testing the behaviour of our applications: asserting that hypermedia guided transitions should be reflected in our application state.

Nowadays

On the other hand, many contemporary systems use the POST verb receiving a response including many representations at once or the GET verb without any cache related headers: thus not profiting from the web infrastructure at all. This could be changed with one (or both) of the following:

  • use the GET verb with cache headers
  • use hypermedia and micro formats to describe relations between resources

Doing so might yield results similar to hypermedia+GET+cache headers in the human web, and some styles might already support it, although it is not a constraint.

Note that in this case hypermedia is not driving the application state, but helping with scalability issues.

Progressive enhancement

Martin notes that this is a kind of progressive enhancement: although its definition is related to accessibility, its bandwidth benefits are similar to those of the approach described above.

Any other systems that use hyperlinks to “break” representations and scale?

Written by guilhermesilveira

December 10, 2009 at 9:15 am

Hypermedia: making it easier to create dynamic contracts

with 4 comments

The human web and christmas gifts

You have been buying books at amazon.com for 5 years now: typing http://www.amazon.com in your browser, searching for your book, adding it to the cart and entering your credit card information.

But this year, on December 15th 2009, something new happens. Amazon has launched an entirely new “christmas discount program” and on their front page there is a huge ad notifying their clients about this new item.

How do you react?

“Contract violated! I am not buying anything today.”

The key issue in loosely coupled systems is the ability to evolve one side without requiring any modification on the other side.

As some REST guys agree, hypermedia content was the factor which allowed such situations to happen in the human web without clients screaming “i don’t know what to do now that there is a black friday clearance!” or “there is a new link in this page, let me email the ‘webmaster’ and complain about it”.

In the human web, some contracts are agreed upon and validated through end-to-end tests. Some companies will use tools such as selenium-rc, webdriver or cucumber to drive their tests and ensure that the behaviour their clients expect does not break with a new release of their software.

Those tests do not validate all content, though, giving space for what is called forward-compatibility: the system is free to create new functionalities without breaking previous expected behaviour.

But my rest-client is not human

In the non-human web, the best-known media type is xml, although it is not hypermedia-capable. There are a couple of ways to create forward- or backward-compatible schemas that check xml structures but, unfortunately, fixed schemas usually do not devote part of their contract to making themselves forward-compatible: it’s an optional feature.

One option is to create “polymorphic” types through xsd schemas, which will get nasty if your system evolves continuously – not once every year – and you find yourself in a schema-hell situation.

One easy solution is to accept anything in too many places, which seems odd.

What are we missing then? According to Subbu Allamaraju, in RESTful applications, “only a part of the contract can be described statically, and the rest is dynamic and contextual”: you tell your client that they can believe you will not break the static contract (you might use some schema validation to do that), and it’s up to you on the server side not to break the dynamic part.

Some might think it sounds too loose… let’s recall the human web again:

  • xhtml allows you to validate your system’s fixed contract
  • it’s up to you not to remove an important form used throughout the buying process

So, what are the dynamic parts of my “contract”?

In a RESTful application the contract depends on its context, which is highly affected by three distinct components:

1. your resource’s state

If a person’s application to open an account was denied, your resource representation will not offer a “create_loan” request. A denied application is information about the resource’s state.

As your company and application evolve, it’s common to find yourself in a position where new states appear.
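
A minimal sketch of that idea: the relations a representation offers are computed from the resource’s current state. Only “create_loan” comes from the example above; the statuses, the “self” and “cancel” relations and the class itself are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class AccountApplication {
    // Hypothetical states; a new one can be added later without touching existing clients.
    enum Status { PENDING, APPROVED, DENIED }

    private final Status status;

    AccountApplication(Status status) {
        this.status = status;
    }

    // The links the server would place in the representation for this state.
    List<String> offeredRelations() {
        List<String> rels = new ArrayList<>();
        rels.add("self");
        if (status == Status.PENDING) {
            rels.add("cancel");
        }
        if (status == Status.APPROVED) {
            rels.add("create_loan");
        }
        return rels;
    }

    public static void main(String[] args) {
        for (Status status : Status.values()) {
            System.out.println(status + " offers " + new AccountApplication(status).offeredRelations());
        }
    }
}
```

A client that only follows the relations it finds never needs to know why “create_loan” is missing for a denied application.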

2. your resource’s relations

In a book store (e.g. amazon a few years ago), a book might have a category associated with it so you can access other similar books:

<book>
<name>Rest if you do not want to get tired</name>
<link rel="category" href="http://www.caelumobjects.com/categories/self-help" />
</book>

A couple of years later, your system might add extra relations, such as "clients who bought this book also recommend":

<book>
 <name>Rest if you do not want to get tired</name>
 <link rel="category" href="http://www.caelumobjects.com/categories/self-help" />
 <link rel="recommendation" href="http://www.caelumobjects.com/books/take-a-shower-with-a-good-soap-if-you-need-to-rest" />
</book>

As your company and application evolve, it’s common to find yourself in a position where new relations appear.

3. your resource’s operations

In a REST application, your resource’s operations are represented by HTTP verbs: supporting a new one will not affect clients which use the verbs available so far.

In the RPC/Webservices world, new operations would be implemented creating new remote procedures or services.

But how can my clients be sure that I will not break the dynamic contract?

Pretty much in the same way that you do in the human web: it’s your word.

In the human web, how do we guarantee that we will not remove or break some functionality the user expects to be there? We end-to-end automatically test its behaviour.
Our word (our tests) is the only reason to rest without worries that we will not break our client’s expectations. The same holds on the non-human web.

The dynamic contract should be thoroughly tested in order to not break our client’s expectations.
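
One way such a test could look, as a sketch only: parse a representation, collect the link relations it offers, and check that every relation we promised is still there, while tolerating new ones. The class and method names are hypothetical; the parsing uses the JDK’s standard DOM api.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DynamicContractCheck {
    // Collects the rel attribute of every <link> element in the representation.
    static Set<String> relsIn(String xml) {
        try {
            NodeList links = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("link");
            Set<String> rels = new HashSet<>();
            for (int i = 0; i < links.getLength(); i++) {
                rels.add(((Element) links.item(i)).getAttribute("rel"));
            }
            return rels;
        } catch (Exception e) {
            throw new IllegalStateException("unparseable representation", e);
        }
    }

    // The contract holds when nothing promised is missing; extra relations are fine.
    static boolean honours(String xml, Set<String> promisedRels) {
        return relsIn(xml).containsAll(promisedRels);
    }

    public static void main(String[] args) {
        String book = "<book><name>Rest if you do not want to get tired</name>"
            + "<link rel=\"category\" href=\"http://www.caelumobjects.com/categories/self-help\" />"
            + "<link rel=\"recommendation\" href=\"http://www.caelumobjects.com/books/other\" /></book>";
        Set<String> promised = new HashSet<>();
        promised.add("category");
        System.out.println("contract honoured: " + honours(book, promised));
    }
}
```

Run against every release, a check like this fails as soon as someone removes the “category” link, and stays green when a “recommendation” link shows up.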

There are other approaches (such as client-aware contracts) which might add some extra coupling between both sides.

HTTP+XML+ATOM gives us the possibility to work with both the fixed (schema validated) and dynamic (test validated) contract.

As Bill Burke pointed out in a comment, “you can design your XML schemas to be both flexible and backward compatible” and “companies, users, developers desire this contract”.

Those are the good points of using schemas, but not everyone uses them in a flexible and backward-compatible way. Even those who do might have a hard time supporting them, e.g. having to maintain more than one entry point for each version of their schemas.

That’s when we can use the good points of the schema validation, as Bill pointed out, with the easy evolution advantages of a dynamic contract: as we do in the human web.

By using dynamic contracts such as xml+atom and following the Must Ignore rules, forward and backward compatibility is gained by default, regardless of what the user does, assuming that tests are a must in any solution.

Dynamic contracts also give hints to frameworks, as they guide you on what your user can and cannot do or access, but maybe not to tools, and in a different fashion from fixed contracts: with a fixed schema I would be able to pre-generate my classes, while with dynamic schemas the framework injects methods.

That’s why we try to take an approach which forces programmers to adopt xml+atom. The entry point to the Restfulie framework is loosely coupled evolution.

Its first example, the documentation and its examples do not focus on how easy it is to use nice URIs and the 4 most famous http verbs, but on how easy it is to evolve your system using hypermedia and http: uri’s come soon afterwards.

And it seems to be working fine so far: the first developers using it in live systems have already supported hypermedia content as a way to guide clients through their systems.

Restfulie support in dynamic contracts

Matt Pulver’s extension to Rails allows one to instantiate types with regards to their active record relations and attributes, but it requires every xml element to be present (strong coupling to the data structure presented by the server).

Using Jeokkarak (Korean chopsticks), Restfulie instantiates objects matching your local data structure, filling the fields defined by your attributes and inserting extra fields for those elements unknown to your model.

For example, if you have a model as:

class Bill
  attr_accessor :value, :to_date
end

And the following xml:

<bill>
  <value>100</value>
  <to-date>10/10/2010</to-date>
  <taxes>0.07</taxes>
</bill>

The result is a dynamic object capable of answering to:

bill = Bill.from_web uri
puts bill.value 
puts bill.to_date
puts bill.taxes

If your model is ready to accept such xml, Restfulie will do the job; if it doesn’t recognize an attribute, that attribute will still be available to you.

That’s the default Restfulie behaviour: to allow the other part to evolve their dynamic contract (and even parts of the fixed one) by default, without any extra effort from your side.
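
For readers outside Ruby, the same tolerant behaviour can be sketched in plain Java. This is not how Restfulie or Jeokkarak are implemented, just an illustration of the idea using the JDK’s DOM parser; the class name and the `extras` map are my own assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class MustIgnoreBill {
    String value;   // known to the local model
    String toDate;  // known to the local model
    final Map<String, String> extras = new LinkedHashMap<>(); // everything the model does not know

    static MustIgnoreBill fromXml(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("unparseable representation", e);
        }
        MustIgnoreBill bill = new MustIgnoreBill();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace between elements
            }
            String text = child.getTextContent().trim();
            switch (child.getNodeName()) {
                case "value":   bill.value = text; break;
                case "to-date": bill.toDate = text; break;
                // unknown elements never fail the parse; they are kept for whoever wants them
                default:        bill.extras.put(child.getNodeName(), text);
            }
        }
        return bill;
    }

    public static void main(String[] args) {
        MustIgnoreBill bill = fromXml(
            "<bill><value>100</value><to-date>10/10/2010</to-date><taxes>0.07</taxes></bill>");
        System.out.println(bill.value + " " + bill.toDate + " " + bill.extras);
    }
}
```

The point is the `default` branch: a new element on the server side is neither an error nor lost information on the client side.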

Written by guilhermesilveira

December 8, 2009 at 9:26 am

Restfulie Java: quit pretending, start using the web for real

with 5 comments

It’s time to release Restfulie Java, offering the same power found in its ruby release, through the use of dynamic bytecode generation and request interception using VRaptor.

Serialization framework

Restfulie adopts XStream by default. Its simple usage and configuration gets even easier due to vraptor’s serialization extension built upon XStream – but it allows the usage of other serializers.

The following code will serialize the order object including its child items (much like the include option of Rails’ to_xml):

serializer.from(order).include("items").serialize();

Connected: Hypermedia content creation

In order to guide the client from one application state to another, the server needs to create and dispatch links that can be interpreted by the client machine, thus the need for generating hypermedia aware serialization tools and consumer apis.

It’s the basic usage of the web at a software-to-software communication level.

Pretty much like Restfulie’s ruby implementation, once you implement the getFollowingTransitions method, restfulie will serialize your resource, generating its representation with hypermedia links:

public List getFollowingTransitions(Restfulie control) {
  if (status.equals("unpaid")) {
    control.transition(OrderingController.class).cancel(this);
    control.transition(OrderingController.class).pay(this,null);
  }
  if(status.equals("paid")) {
    control.transition(OrderingController.class).retrieve(this);
  }
  return control.getTransitions();
}

Controller interception

Restfulie for Java goes further, intercepting transition invocations and checking for its status. The following example will only be executed if order is in a valid state for paying:

@Post @Path("/order/{order.id}/pay")
@Consumes("application/xml")
@Transition
public void pay(Order order, Payment payment) {
	order = database.getOrder(order.getId());
	order.pay(payment);
	result.use(xml()).from(order.getReceipt()).serialize();
}
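
To show what such a guard boils down to, here is a plain Java sketch with no framework involved. It is not Restfulie’s implementation: the class is hypothetical, the “unpaid”/“paid” states and the cancel, pay and retrieve transitions mirror the examples above, and the “cancelled” and “delivered” states are assumptions added so the example is complete.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class GuardedOrder {
    private String status = "unpaid";

    // Same rule as getFollowingTransitions above: the state decides what may come next.
    List<String> allowedTransitions() {
        if (status.equals("unpaid")) {
            return Arrays.asList("cancel", "pay");
        }
        if (status.equals("paid")) {
            return Collections.singletonList("retrieve");
        }
        return Collections.emptyList();
    }

    // The guard: a transition that is not offered in the current state is rejected.
    void execute(String transition) {
        if (!allowedTransitions().contains(transition)) {
            throw new IllegalStateException("cannot " + transition + " an order that is " + status);
        }
        switch (transition) {
            case "pay":      status = "paid"; break;
            case "cancel":   status = "cancelled"; break; // assumed final state
            case "retrieve": status = "delivered"; break; // assumed final state
            default:         break;
        }
    }

    String status() {
        return status;
    }

    public static void main(String[] args) {
        GuardedOrder order = new GuardedOrder();
        order.execute("pay");
        System.out.println(order.status() + " -> " + order.allowedTransitions());
    }
}
```

The links offered in the representation and the requests accepted by the server come from the same rule, so they cannot drift apart.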

Why?

Restfulie does not provide a bloated solution with huge api’s, trying to solve every other problem in the world. According to the Richardson Maturity Model, systems are called RESTful when they support this kind of state flow transition through hypermedia content contained within resource representations:

<order>
 <product>basic rails course</product>
 <product>RESTful training</product>
 <atom:link rel="refresh" href="http://www.caelum.com.br/orders/1" xmlns:atom="http://www.w3.org/2005/Atom"/>
 <atom:link rel="pay" href="http://www.caelum.com.br/orders/1/pay" xmlns:atom="http://www.w3.org/2005/Atom"/>
 <atom:link rel="cancel" href="http://www.caelum.com.br/orders/1" xmlns:atom="http://www.w3.org/2005/Atom"/>
</order>

Stateless state

In order to profit even more from the web infrastructure, Restfulie for Java provides a (client state) stateless api.

Addressability

VRaptor’s controller api allows you to specify custom URI’s (and http verbs) to identify resources (and transitions) in your system.

Addressability + client cache + stateless server allows one to achieve REST’s idea of cache usage and the related advantages of layered systems, by allowing other layers to be added between the client and the server.

Unknown usage of my resources

Addressability + hypermedia content allows clients to use your resources in ways that were not thought of at first. Addresses (in our case, URI’s) can be passed around from one application to another, to and from a client’s internal database (as simple as browser favorites, or google gears).

A system built on such a basis becomes unaware of how its resources are used (how their representations are interpreted), allowing clients to create such previously unforeseen systems.

Less and simpler code, more results

Both on the server and client side, restfulie tries to achieve results based on conventions, therefore one can easily access its entry api to insert a resource in a system, e.g.:

  Movie solino = new Movie();
  resource("http://www.caelum.com.br/movies").include("metadata").post(solino);

And after creating a resource, one can actually navigate through your resource’s connections:

  Movie solino = resource("http://www.caelum.com.br/movies/solino").get();
  Comments comments = resource(solino).getTransition("comments").executeAndRetrieve();
  // print all comments

And navigate through your resource’s states:

  Comments comments = resource(solino).getTransition("comments").executeAndRetrieve();

  Comment comment = new Comment("nice movie on generations of immigrants and the hard times that they face when moving to a new place");
  resource(comments).getTransition("add").execute(comment);

A lot of good practices should be put into place in order to avoid creating systems like the REST-RPC hybrids defined by Leonard Richardson.

Those good practices involve simple steps such as avoiding anemic models on the client side. Restfulie+Vraptor already tries to avoid this on the server side, and we will discuss such practices in following posts.

Download and example applications

Beginners could start by downloading restfulie’s client and server example application, ready to run in an eclipse wtp environment or a pure eclipse installation.

Users are encouraged to use either the java or ruby mailing lists.

Since Restfulie was born in Ruby…

Since we released restfulie for ruby (on rails), which can be found at its github page, it has been commented on by Jim Webber both on his blog and in person during QCon in San Francisco, where he and Ian Robinson held a tutorial on restful systems and the web; they will hold another, more hands-on round at QCon London 2010.

Meanwhile, Restfulie was mentioned on ruby5’s podcast, commented on here, at infoq, and in portuguese by Juliana and by Anderson Leite, a fellow Caelum developer.

As some of the comments about restfulie’s ruby implementation put it: restfulie, it’s more than easy.

Written by guilhermesilveira

November 25, 2009 at 11:13 am

To break or not to break? Java 7?

with one comment

There is a short slide show to illustrate some thoughts. There will be better ones in the near future.

When is the right timing to break compatibility of a public api regarding its previous versions?

Well, in the open source communities there is a common understanding that a library is allowed to cause some migration work if there is a minor change (e.g. 1.1.5 to 1.2.0).

Whenever there is a major change (e.g. 1.2.0 to 2.0.0), it might be completely rewritten, in such a way that users can even adopt both versions at the same time.
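
As a tiny sketch of that convention, the code below classifies the jump between two version numbers. It assumes plain major.minor.patch numbers with no suffixes; the class and enum names are hypothetical.

```java
class VersionBump {
    enum Kind { MAJOR, MINOR, PATCH, NONE }

    // Compares the parts from left to right; the first one that differs names the bump.
    static Kind between(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return Kind.MAJOR;
        if (a[1] != b[1]) return Kind.MINOR;
        if (a[2] != b[2]) return Kind.PATCH;
        return Kind.NONE;
    }

    // Assumes exactly three numeric parts, e.g. "1.2.0".
    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2])
        };
    }

    public static void main(String[] args) {
        System.out.println(between("1.1.5", "1.2.0")); // some migration may be needed
        System.out.println(between("1.2.0", "2.0.0")); // possibly a complete rewrite
    }
}
```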

Some projects use the version number as a marketing technique in order to keep themselves up-to-date with their competitors.

Some products are famous for, whenever a new release appears, breaking compatibility with code written so far, requiring all programmers to rewrite part of their code. If you check Visual Basic’s life, every one-to-two years there was a major release with (usually) incompatibility.

VB programmers were used to that issue and kept coding the old projects using the previous release until the project was finished. Companies just got used to it and learned how to live on.

If your code is well tested through the use of automated tests, updating a library or a compiler/language version is an easier task because all incompatibility issues will be found prior to packing a new release of your product to your clients. testing++!

If you do not write code for your tests, as soon as you update a library/compiler/language, well….. have fun as it will probably be a unique adventure.

The java developers

Unfortunately, there is still a big part of the java community who do not write automated tests. Given the amount of legacy java code in the world that lacks any automated tests at all, sticking to compatibility between java releases can be seen as a good thing for the world, in general.

But for those who already write their tests, all that care with compatibility might be seen as an overestimated issue: due to the tests, we are ready to change and embrace those changes.

The java 7 crew is aware that there is a lot more that we can add to the language, but afraid because it will not preserve high levels of compatibility and usability.

What happens if the language has to worry so much about compatibility? It will evolve so slowly that other languages have the chance to overtake it. This is the danger that the language itself faces. Java might not lose its position, but one can find a lot more people arguing about changes that could be made to the language but are not, because preserving compatibility has been a main issue.

At the same time, some of the changes proposed might create huge incompatibility issues for those users who still do not write tests, or for software that was not written using TDD practices. There is also another document on the same issue on the internet.

One such proposal is that methods declaring a void return type should be considered as returning this. This way one can easily use them with Builder-patterned apis:


// proposed syntax: does not compile today, because both methods return void
final JFrame jFrame = new JFrame()
    .setBounds(100, 100, 100, 100)
    .setVisible(true);

There are a few issues with that proposal that do not match the “let’s keep it backwards compatible” saying.

The first thing is that, nowadays, builder (or construction pattern) apis are already created returning something specific instead of void, as we can see from Hibernate’s API and the new date and time api.

The second point is that builders are used nowadays to create Domain Specific Languages, and their implementation in Java does not use a single object because it would create a huge (and nasty) class. DSL’s are usually built in Java by using different types, e.g. the criteria API from Hibernate.

Even the given example is actually no builder api… JFrame configuration methods and real-time-usage methods are all within… itself! There is no JFrameBuilder which would build a JFrame, i.e.:


Builder b = new Builder();
b.setTitle("Title").addButtonPanel().add(cancelButton()).add(okButton());
JFrame frame = b.build();

Notice that in the simple example above, it would be a good idea to have two different types (Builder and PanelBuilder), thus the language modification does not achieve what we want our code to look like (or be used like). Instead, it will only allow us to remove the variable name from appearing 10 times in our code, making it easier for programmers to write lines of code like this:


// ugly line which has too much information at once
JFrame frame = new JFrame().setA("a").setB(2,3).setC("c").setD("d").andOn().andOn().andOn();
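
Coming back to the two-type idea, this is roughly how a fluent api with separate builder types can be written in today’s Java, with every method returning the type that makes sense for the next call. It is only a sketch: the class names, the `done()` method and the string returned by `build()` are hypothetical, and a plain String stands in for the JFrame so that the example runs anywhere.

```java
import java.util.ArrayList;
import java.util.List;

class FrameDsl {
    // Configures the frame itself.
    static class FrameBuilder {
        private String title = "";
        private final List<String> buttons = new ArrayList<>();

        FrameBuilder setTitle(String title) {
            this.title = title;
            return this;
        }

        // Switches the chain to a different type: from here on only panel operations exist.
        PanelBuilder addButtonPanel() {
            return new PanelBuilder(this);
        }

        // A description of the frame; a real builder would create the JFrame here.
        String build() {
            return title + " " + buttons;
        }
    }

    // Configures one panel and hands control back to the frame when finished.
    static class PanelBuilder {
        private final FrameBuilder parent;

        PanelBuilder(FrameBuilder parent) {
            this.parent = parent;
        }

        PanelBuilder add(String button) {
            parent.buttons.add(button);
            return this;
        }

        FrameBuilder done() {
            return parent;
        }
    }

    public static void main(String[] args) {
        String frame = new FrameBuilder()
            .setTitle("Title")
            .addButtonPanel().add("Cancel").add("OK").done()
            .build();
        System.out.println(frame);
    }
}
```

None of this needs void methods to return this: the chain changes type halfway through, which is exactly what a blanket “void means this” rule cannot express.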

But why does it go against Java’s saying ‘we should not break compatibility’? Because it creates an even higher degree of coupling between my code and the api I am using.

Well, imagine that I used the swing api as mentioned above. In a future release of Swing, some of those methods might have their signature changed and therefore break my existing code. Why would an api change their method return type? Well, because if the return type was defined as void so far, no one was using it… so I can change it.

It creates the same type of coupling found when using class inheritance with Java APIs: parent methods being invoked might change their signature.

Well, it was true until today. If this functionality is approved for the Java API, it will make a simple task of changing a “void” return type to something useful a hard task, where I have to think about those who have tightly-coupled their code to mine.

The questions and answers which come to my mind are…
a) is the existing Java codebase around the world usually covered by automated tests? unfortunately, no
b) does Java want to be backward compatible? this change will not help it
c) does it want to help the creation of dsls? this change is not the solution
d) does Java want us to avoid writing the variable name multiple times? my IDE already helps me with that

Written by guilhermesilveira

August 17, 2009 at 1:24 pm