Applying the Web to Enterprise IT: 2010

November 19, 2010

Generic Media Types Potentially Leak The Model

Jean-Jacques Dubray just triggered a thought with a comment he made on Twitter regarding the rough ideas to adopt MVC 'ideas' in JAX-RS 2.0. He was worried that an MVC based approach to creating representations exposes the model to the service consumer.

Interestingly, this is only possible if the service uses generic media types. Specific media types sufficiently limit the kind of information that can be expressed in the representations and so prevent the service-specific model from becoming visible to the outside.

Specific media types enforce the resource boundary and prevent the service's domain model from leaking.

This is probably a much better argument against application/xml or application/json than the usual reference to the violation of message self descriptiveness.
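To make the point a bit more tangible, here is a minimal sketch. The Order class, its fields and the media type application/vnd.example.order+json are all made up for illustration; the only thing the sketch is meant to show is that a generic serializer emits whatever the model holds, while a serializer bound to a specific media type can only emit what that media type defines.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical internal domain model; none of these names come from a real service.
@dataclass
class Order:
    order_id: int
    status: str
    warehouse_row_id: int   # internal detail of the service
    margin_percent: float   # internal detail of the service


# Fields defined by the (made-up) media type application/vnd.example.order+json.
ORDER_MEDIA_TYPE_FIELDS = ("order_id", "status")


def to_generic_json(order):
    """application/json: nothing limits what gets serialized, so the model leaks."""
    return json.dumps(asdict(order), sort_keys=True)


def to_specific_media_type(order):
    """Only what the media type specification defines can be expressed."""
    model = asdict(order)
    visible = {name: model[name] for name in ORDER_MEDIA_TYPE_FIELDS}
    return json.dumps(visible, sort_keys=True)


order = Order(order_id=10029, status="review-pending",
              warehouse_row_id=77, margin_percent=12.5)
generic = json.loads(to_generic_json(order))
specific = json.loads(to_specific_media_type(order))
```

In this sketch the whitelist stands in for the media type specification; in a real service the specification itself plays that role.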

November 2, 2010

"An architectural style called REST (Representational State Transfer) advocates that web applications should use HTTP as it was originally envisioned."

October 23, 2010

Agency Boundary

Post moved

October 13, 2010

Interesting Changes Must Surface

Post moved

October 12, 2010

Artifact Ownership or When Not to Use REST

Here is an indication of situations when applying REST is not appropriate:
When all artifacts that are affected by a possible change are owned by the same project (e.g. stored in the same source code control repository) then REST is not a suitable style.

An example of this is an application that contains a database component for its private use. Usually the database schema is stored in the repository together with the source code. If you face a need to change one, it is easy to change the other (from a developer coordination point of view).

The general take away from this is that artifact ownership can be used as an indicator for how appropriate a candidate style is.

Consider how much the Unix command line benefits from the uniform API of the pipe-and-filter style: completely decentralized developers can contribute components (grep, awk, sed, less, sort, ...) without ever having to agree on how the components talk to each other. The artifacts that make up the Unix toolbox are maintained by many different parties all over the world.

Generic vs. Specific Media Types and Evolution

Posting moved

September 1, 2010

Spotted Alternates Header in the Wild

Post moved: http://jalg.net/2010/09/spotted-alternates-header-in-the-wild/

August 25, 2010

GET /stock-quote/foo vs. getStockQuote("foo")

Consider the good old stock quote example from REST vs. RPC discussions. In both variants two kinds of coupling exist.

On the one hand there is the intentional coupling that causes the client to make the call to that particular remote thing and not just any arbitrary one. The intentional coupling is a human choice, manifested in configuration or code.

On the other hand there is technical coupling because the client software needs to know (aka be coupled to) the provided interface. Otherwise the communication would not be able to happen.

Given those two kinds or layers of coupling it makes no sense whatsoever to repeat the specifics already present in the intentional coupling at the technical level by giving that specific remote thing a specific interface.

Once the intentional choice has been made to talk to that remote thing the interface specifics can be factored away.

Why design, implement, test, maintain, document and explain more stuff if you can do the same things with less?
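A small sketch of the two variants side by side may help. Everything in it is hypothetical: the RESOURCES dictionary is an in-memory stand-in for the remote things, and get() stands in for the generic operation. The intentional coupling lives in one configured URI; the specific function repeats that choice at the technical level.

```python
# Hypothetical in-memory stand-in for "the remote things"; no network involved.
RESOURCES = {
    "/stock-quote/foo": "42.10",
    "/weather/hamburg": "rainy",
}


def get(uri):
    """The one generic operation; it works for any resource."""
    return RESOURCES[uri]


# Intentional coupling: the human choice of which remote thing to talk to,
# manifested in configuration.
QUOTE_URI = "/stock-quote/foo"


def get_stock_quote(symbol):
    """RPC style: a specific interface that repeats the intentional choice."""
    return RESOURCES["/stock-quote/" + symbol]


quote_via_uniform_interface = get(QUOTE_URI)
quote_via_specific_interface = get_stock_quote("foo")
```

Both calls yield the same quote, but only get() can be reused unchanged for the next remote thing the client decides to talk to.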

August 23, 2010

PUT and Content-Location

Entry moved to PUT and Content-Location.

May 21, 2010

Amazingly Close

While looking up some stuff in the book "Integration Patterns" I spotted a tiny sentence that is amazingly close to REST. In chapter one on page 3 the authors (among them Gregor Hohpe, BTW) emphasize that from the point of view of integration the whole must be considered instead of solely designing for each integration between two applications in turn. The sentence is

However, if you approach this same problem from an integration architecture perspective, the ideal application is a thin layer of presentation that consumes shared functionality or data at the enterprise level.

I find this amazingly close to the notion of a user agent component consuming representations and capabilities provided by resources.

May 12, 2010

Serendipitous Reuse vs. Reusable Services

The notion of a reusable service builds on the anticipation of its uses.

The notion of reusable components focusses on using the same component for the same purpose often. There is no focus on use in unanticipated contexts.

Serendipitous reuse, on the other hand, focusses on building components (that expose functionality) and using those components in unanticipated ways.

May 6, 2010

IRC Conversation on User, User Agent, Media Types, Application etc.

Currently I am trying to figure out the significance of the various aspects of a RESTful architecture with regard to modeling: aspects such as user agent, media type, steady state, user, application, and application state. I tried to build up an explanatory train of thought on that basis in an exchange with Philipp Meier on #rest IRC yesterday. It is the typical, hard to read, IRC conversation but you might find it useful nevertheless.

April 4, 2010

Steady-State

I have been meaning for some time to write something up about the notion of steady state. Now it ended up in a rest-discuss posting.

Some references that I found insightful:

Steady-State in Roy's dissertation

Posting dealing with application entry points

Web applications can have many entry points

Steady-states are to be understood in isolation

Steady-State, bookmarks, cool URIs (last two paragraphs)

Slightly related from this blog: In Fear of Sub Requests.

March 31, 2010

Why 'Action Resources' are a REST Anti Pattern

REST's uniform interface constraint requires the operations that can be invoked on a resource to be generic, meaning that the applicability of an operation must not depend on the actual nature of the target resource. The uniform interface does not prohibit additional methods to be defined but requires any extension method to be generic. PATCH or MONITOR (slide 16) for example are valid extensions, while ORDER or PAY are not.

Our OO-biased brains are trained to think in terms of classes and associated operations (Cart.order()) and it apparently takes a considerable amount of time for our brains to re-wire and think in terms of transferring representations to modify resource state.

As a result, people are tempted to come up with ways to map non-uniform operations onto HTTP's uniform interface. One offspring of such endeavor is the REST anti pattern of Action Resources.

I have not tracked it back to its origin, but the general form goes something like this:

Given an operation foo(), define a link semantic 'foo' that enables the server to tell the client the URI of the 'foo-action resource' of some other resource R. Knowing the foo-action resource R_foo, the client is then able to invoke the foo() operation on R by means of an empty POST to R_foo:

Find foo-action resource of R:


HEAD /items/56

200 Ok
Link: </items/56/foo>;rel=foo

Invoke foo() on R:


POST /items/56/foo
Content-Length: 0

204 No Content

So, why am I calling it an anti pattern?

Action resources are an anti pattern because the approach violates REST's self descriptive messages constraint. How so? Because the meaning of the POST request depends on resource state at the time the client learned that /items/56/foo is the foo-action resource of /items/56. There is nothing in the request that allows the server to understand the actual intention of the client at the time the server handles the POST request.

Suppose the client issues the above HEAD request at time T₁, the server replies at T₂ and the client receives the response at T₃. By the time the client sends the POST request to the action resource (T₄) and the server actually receives it (T₅), the server state might have changed. The server is now looking at a POST with an empty body, which translates to the client intention of telling the resource /items/56/foo to process this [empty body].

The server does not know that the request semantics depend on the server state at T₂ when the server created the HEAD response and therefore cannot detect any mismatch between client intention and its own interpretation.
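The timeline can be played through with a toy model. This is only a sketch under my own assumptions: the Server class, its action_behind_foo attribute and the 'order' and 'cancel' meanings are invented for illustration. What it shows is that the empty POST succeeds with 204 although the server's interpretation no longer matches what the client intended when it learned the link.

```python
class Server:
    """Hypothetical in-memory server for the /items/56 example."""

    def __init__(self):
        # What an empty POST to /items/56/foo currently triggers.
        self.action_behind_foo = "order item 56"
        self.performed = []

    def head(self, uri):
        # The Link header tells the client where the foo-action resource is.
        return {"Link": "<" + uri + "/foo>;rel=foo"}

    def post(self, uri, body=b""):
        # The request carries only a URI and an empty body, so the server
        # can do nothing but apply its current interpretation.
        self.performed.append(self.action_behind_foo)
        return 204


server = Server()

# T1..T3: the client learns the action resource and forms its intention.
link = server.head("/items/56")["Link"]
action_uri = link[1:link.index(">")]
client_intention = "order item 56"

# Between T3 and T5 the server changes.
server.action_behind_foo = "cancel item 56"

# T4..T5: the empty POST arrives and is handled without complaint.
status = server.post(action_uri)
mismatch_detected_by_server = status != 204
```

Nothing in the request gives the server a chance to notice that client_intention and the performed action differ.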

February 27, 2010

Classifying the CouchDB API

In the context of my Classification of HTTP-based APIs, @hanonymity asked me today how I would classify the CouchDB API.

Ok, let's see. Going to the HTTP Document API immediately reveals that the API definitely violates the hypermedia constraint (see last paragraph) because there is API documentation in the first place. The only thing one would expect to see for a RESTful API is a set of media type specifications along the lines of "The CouchDB API uses the following media types and link relations....which are specified here...".

Next, let's check if the API can be classified as HTTP-based Type II. The fastest way to verify this is usually to look for the use of only specified media types and it is immediately obvious that CouchDB uses the generic media type application/json and not a specific one that would make the messages self-descriptive. The CouchDB API fails the test for HTTP-based Type II, too.

This leaves us with the question whether the API is HTTP-based Type I or if we have to let go all hope because it must be classified as RPC URI-Tunneling. The thing to look out for is of course the use of action names in URIs. It does not take a lot of browsing through the API documentation to reveal that the CouchDB API designers knew what they were doing. The API very thoroughly leverages HTTP mechanics and we can happily conclude that the API is an HTTP-based Type I API.

Is it a problem that the CouchDB API violates two out of four of REST's interface constraints and is therefore not REST at all? I do not think so, because I would not consider achieving loose coupling between a database (backend) and the component that uses the database to be a very useful goal. At least not at the cost that you have to pay on the client side and also because there is strong coupling around the schema anyway between a database and the code that is using it.

However, I think CouchDB API shows quite nicely how an API can still benefit from the simplicity induced by HTTP-based Type I even if we cannot label the API as REST.

A note on the COPY method: It would be helpful to say in the API documentation that the COPY extension method is actually WebDAV's COPY method. And while we are at it, it also makes sense to note that COPY does not really fit HTTP because COPY is a method that works on two resources (Source and Destination) while HTTP does not support such method semantics. For example, caches would not understand that they need to flush the representations of the (now overwritten) destination resource.

This is not a question of RESTfulness though. It would be entirely possible to design an architecture that adheres to the REST style and provides methods that work on two resources.
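The cache problem with COPY mentioned above can be sketched with a toy cache. The NaiveCache class and the /db/doc-a and /db/doc-b URIs are hypothetical; the sketch assumes a cache that, on a method it does not treat as safe, invalidates the entry for the request URI only and knows nothing about a Destination header.

```python
class NaiveCache:
    """Hypothetical cache that reasons only about the request URI."""

    def __init__(self):
        self.store = {}

    def put_response(self, uri, representation):
        self.store[uri] = representation

    def on_request(self, method, uri, headers=None):
        # Anything other than GET or HEAD invalidates the target URI only.
        # The Destination header is ignored because the cache does not
        # understand two-resource method semantics.
        if method not in ("GET", "HEAD"):
            self.store.pop(uri, None)


cache = NaiveCache()
cache.put_response("/db/doc-a", "A v1")
cache.put_response("/db/doc-b", "B v1")

# COPY overwrites /db/doc-b on the origin server, but the cache only
# sees /db/doc-a as the request URI.
cache.on_request("COPY", "/db/doc-a", headers={"Destination": "/db/doc-b"})

stale_destination = cache.store.get("/db/doc-b")
```

After the COPY, the cache still holds the old representation of the destination resource.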

February 15, 2010

Service Types Revisited

Working on the RESTifying Procurement show case I realized that it looks as if I had to revisit my approach towards service types. I have argued that a service type is constituted by the set of hypermedia semantics it makes use of. This seemed reasonable since a client developer needs to know at least a minimal set of the possible hypermedia semantics to expect from a service in order to write a client for a service of that kind.

Unfortunately this approach has some problems when services of different kinds use the same set of hypermedia semantics because the differentiating aspect is lost. I realized this because in the procurement example I am basically using a single media type but still have a range of services, for example supplier or carrier.

A possible solution to this issue is to have the all-encompassing media type define the service types. Such types are still necessary to enable lookup based on type, for example in order to find the carrier service of some external business partner. Looking for the procurement service doesn't make that much sense.

February 13, 2010

Three Aspects of Steady-States

The #rest IRC channel (transcripts here and here) has recently become for me a valuable source for thought stimulation (come and visit). Yesterday we had a discussion regarding steady-states and the following observation has been made:

URIs refer to a certain application state; at least in the sense that one can use a URI to go back to a certain application state or that one can pass a URI to another party to bring this party into that application state.

However, 'that application state' (despite the notion that I can use the URI to get back to it) is not stable over time. The semantics of the mapping are, but the transitions available from that state can change. So, what is the significance of 'that application state'?

I have not figured that out yet, but here is a thought I had this morning as a reaction to the discussion:

A Web application comprises a state machine that can change over time. Each state (aka steady-state) of that state machine has three aspects:

Its semantics (what the state means)

The serialized state of the associated domain concept (e.g. order 10029)

The outgoing transitions to other states

The first of the three is constrained to remain stable over time; the other two vary depending on the state of the associated domain concept and the actual state machine the server intends to provide to (that particular) client.
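As a rough sketch of that thought, the three aspects can be written down as a data structure. The SteadyState class, the link relation names and the URIs are assumptions made for illustration; only the order number and the two order states are taken from this post.

```python
from dataclasses import dataclass, field


@dataclass
class SteadyState:
    semantics: str        # what the state means; constrained to stay stable
    domain_state: dict    # serialized state of the domain concept; may vary
    transitions: dict = field(default_factory=dict)  # link relation -> URI; may vary


def same_state_over_time(earlier, later):
    """Only the semantics are constrained to remain the same."""
    return earlier.semantics == later.semantics


monday = SteadyState(
    semantics="order 10029",
    domain_state={"status": "review-pending"},
    transitions={"cancel": "/orders/10029/cancellation"},
)
friday = SteadyState(
    semantics="order 10029",
    domain_state={"status": "shipment-initiated"},
    transitions={"track": "/orders/10029/tracking"},
)
```

Following the same URI on Monday and on Friday leads to the 'same' state in the sense of same_state_over_time(), even though both the serialized order and the outgoing transitions differ.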

UPDATE: A notion of transient states or ephemeral URIs has come up in a number of places. As far as I understand, the issue circles around the idea of giving distinct URIs to different states of application states. With this approach, an order in some 'review-pending' state would be represented by a different resource (and hence different URI) than the same order in the state 'shipment-initiated'. Please correct me if I miss the point here.

My response to that can be found in this comment on Ian's blog.

I'd highly appreciate if someone could shed more light on this issue. I think it is a deep one.

February 4, 2010

Service Type == Set of Possible Application States

As I have said earlier, in my opinion service types are defined by the set of hypermedia semantics (media types, link relations, ..) they use. Because representations sent by services of a given type are composed of hypermedia semantics from the service type's set it can be said that the service type defines the set of possible representations.

Given that a representation returned by a server corresponds to an application state it follows that a service type defines the set of possible application states.

I like that.

February 2, 2010

Mac OS X Productivity Apps

Recently I upgraded to Snow Leopard and naturally this made me review what's in my Applications folder. Here are the more interesting products I use more or less regularly:

iWork 09
I have loved iWork since it came out in 2005 (?). I use Pages a lot for personal stuff because client work usually requires true Office compatibility. I always use Keynote though, never Power Point. Keynote is just great. I prefer Excel over Numbers but mostly because I know it better I guess.

Office 2008
This is a must have on a consultant's machine and I am mostly satisfied with the Mac OS version. Though Word remains as painful as ever. I love Excel - I think it is the best product Microsoft ever made.

Merlin
I am a true lover of this project management application but the occasions where I really need it are rare. I like the look and feel and the millions of export formats. It handles MS Project files nicely.

Nova Mind Platinum
The best mind mapping software ever. The Platinum version includes a script writing feature you can use to prepare presentations in a Beyond Bullet Points way.

Oxygen XML Editor
It took some research to find a nice XML editor for Mac OS X but with Oxygen I found a tool that handles everything you might want to do with XML. I mostly use it for applying XSLTs to some XML and to pretty-print one-line XML files. It is also good for generating example documents from schemas (who understands an XML schema without an example anyway?) and for schema conversions, e.g. RelaxNG to XSD.

Aqua Data Studio
Finding a really good database query tool for Mac OS X was even harder than finding an XML editor. What I wanted was a product that would connect to nearly all common databases and would provide at least the functionality of Toad for serious relational work. Aqua Data Studio is excellent and even has a relational diagramming tool and very sophisticated export and import capabilities. Very good for cycles where you need to work on relational data with a set of Unix tools.

Magic Draw Enterprise
Magic Draw is an excellent UML tool with code and database engineering support and very nice report generation facilities. I mostly use UML for documentation and I love the database reverse engineering feature to extract relational diagrams from database schemas. It is fast and has a very nice look and feel for a Java-on-Mac application. But it is unfortunately not a bargain. NoMagic really does maintain and extend the product frequently so paying for the updates has always been worthwhile for me.
Magic Draw has optional plugins for SysML and DoDAF.

Eclipse Classic
Trying to stick to 'standard' Eclipse setup with no plugins except for SVN and the gorgeous vi plugin.

Omni Outliner
The Mac OS X classic outliner. Always open to manage my thoughts. 'Nuff said.

Omni Focus
My current ToDo items application. Though I really want to look at Things.

Screenflow
My favorite tool for creating screen casts. I use this to record code walk-throughs with clients or workshops to have something to hand over afterwards.

Parallels Desktop
Virtual Machine for running Windows if I have to. I used it a couple of years ago to develop on a client's host. So it was: Remote Desktop over VPN over Parallels - worked quite nicely actually. Nowadays of course, Apple's RDT would do the job.

BBEdit
My editor if I really can't use vi or need to convert between encodings.

Quicksilver
Application launcher. Everybody must have Quicksilver, definitely!

Snippet
A little tool for managing code snippets. I often forget about using it, but it's a clever thing to have.

UPDATE:

Monkey Office
After looking at a bunch of accounting software products for the Mac, this is the one I picked.

Adobe CS5
It is sometimes good to be able to edit a designer's work directly, for example to create translations. This saves the roundtrip of having the designer replace the texts. InDesign is an amazing piece of software.

Mondrianum
Yesterday, I learned about Mondrianum. This is a color picker that works together with Adobe's Kuler. Difficult to explain what it does - check out for yourself.

Not worth describing, but there also is EyeTV, Toast, Colloquy, Twitterific, Photoshop, Quicktime Pro, Skype.

January 30, 2010

Fond Topic Maps Memories

Today I have done some Web Archeology and dug up the 2004 version of the GooseWorks.org Web site. The project was intended to support the original idea of the Topic Maps paradigm as developed by Steven R. Newcomb when two camps of ideas started to emerge from the community.

I remember doing a ridiculous amount of coding to verify and track Steve's ideas and was joined by Sam Hunting who evangelized on the various markup conferences.

Well, that's history, but I thought it might be worthwhile to put the site back online with some of the source code. Might also be that I have done so much thinking this week about steady-states, bookmarks and cool URIs that I felt I had to bring the site back to life.

In case you are wondering what inspired the name, have a look at the posting that started it all. Besides, it had the cool double-O in it that was said back then to be a guarantee for a site's success :-)

January 27, 2010

In Fear of Sub-Requests

Suppose we design a service that is to provide data about persons, including whom they are linked to. Let's say person representations look like this:


GET /persons/4/full

200 Ok
Content-Type: application/vnd.me.person+xml
Cache-Control: no-cache


  Karl Heinz
  Kieler Strasse 7,20556 Hamburg

In addition, the service provides minimal person data that looks like this (note that its time-to-live is longer):


GET /persons/4

200 Ok
Content-Type: application/vnd.me.person+xml
Cache-Control: max-age=3600


  Karl Heinz
  Kieler Strasse 7,20556 Hamburg

Now suppose we define the processing semantics of application/vnd.me.person+xml in such a way that the user agent should follow all the friend links and retrieve the referenced persons' data. Looks like rather bad design due to all the sub-requests, doesn't it?

On the other hand, nobody complains about HTML pages that link to massive amounts of images and other media.

January 26, 2010

4xx Client Error (But Not Its Fault)

Erik responds to 4xx Client Error. He writes:

"i don't think it appropriate to phrase it like it's the client who necessarily did something wrong when, for example, a 404 is returned. if you follow a URI that was supposed to be persistent and you get a 404 because the server did something wrong, it's quite a stretch to argue that it's your fault. it's pretty clearly not."

I agree that my emphasis on the client doing something wrong wasn't appropriate (but an extreme position is sometimes nice :-). However, I do interpret RFC2616 to be emphasizing that the burden is on the client. While it is obviously not the fault of the client if some resource currently has no representation available (404) I would not agree that "the server did something wrong". The server is free to stop providing representations for a resource and the client just has to account for that to happen.

What is interesting about this is that the conversation between client and server is not considered broken at all because the fact that 4xx errors might occur is part of the contract (and does not indicate a broken contract). This is one of the advantages of a uniform interface with uniform status codes: it can sustain the conversation even when there is a (temporary) gap in expectations.
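In client code, treating 4xx as part of the contract means planning for it explicitly instead of only checking for 200. The following is a minimal sketch; the next_step function and the follow-up actions it names are my own hypothetical policy, not something prescribed by HTTP.

```python
def next_step(status):
    """Map a response status to what the client does next (hypothetical policy)."""
    if 200 <= status < 300:
        return "process representation"
    if status in (404, 410):
        # The server is free to stop providing representations.
        return "drop bookmark and return to an entry point"
    if status == 406:
        return "retry with a different Accept header"
    if 400 <= status < 500:
        # Any other client error: the burden is on the client.
        return "fix the request before retrying"
    if 500 <= status < 600:
        return "retry later"
    return "follow protocol rules for this status class"


steps = {status: next_step(status) for status in (200, 404, 406, 418, 503, 302)}
```

The point is not the particular actions but that every status class has a planned continuation, so the conversation survives a gap in expectations.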

January 25, 2010

4xx Client Error

HTTP is actually quite clear on how decoupled the server is from the client by calling all non-technical errors simply Client Errors. Dang! So simple! 404 - who broke the contract? The client! 406 - who broke the contract? The client! ...

RFC2616 uses a softer tone though:

"The 4xx class of status code is intended for cases in which the client seems to have erred."

I guess testing a RESTful system means testing the clients...

REST Doesn't Lie

Have been reading about separation of concerns today and came to think that it might be interesting to shift the angle of looking at service evolution: Instead of asking what aspects of services might change it could actually be useful to ask whether there is anything that a service can guarantee not to change, no matter what future requirements it might encounter.

Shifting the point of view revealed a REST constraint that is probably so much taken for granted that it is not talked about very often:

"The only thing that is required to be static for a resource is the semantics of the mapping, since the semantics is what distinguishes one resource from another."

(Section 5.2.2.1 Resources and Resource Identifiers of Roy's dissertation)

This is a guarantee the server has to make that is actually possible to adhere to because no evolution scenario of a server can possibly force it to change that mapping. The other two things that are not subject to change are (obviously) the uniform interface and the use of descriptive message semantics.

Besides these three aspects there is nothing in a networked system that servers can guarantee to clients and REST emphasizes that in an incredibly honest way. The client may depend on the above three aspects to remain true for the entire lifetime of a service but must plan for anything else to change.

What is the take-away? Two things mostly: you are not doing REST unless your system adheres to the above, and you should exploit the stability of the resource semantics as much as possible in your service design.

January 22, 2010

Why if(status == 200) Is Not Enough

In another Developer Works article I read yesterday, Bruce Sun provides a code snippet that acts as an HTTP client to a service

January 21, 2010

REST Design Mistake with Apache Wink

Just came across an article on service implementation with Apache Wink by Vishnu Vettrivel. While it is primarily covering the service implementation aspects, it also has a section on RESTful design that too easily creates a wrong understanding of how a RESTful service would have to be designed.

Given that REST has entered the hype cycle, it now happens too often that actually unRESTful design is promoted. It is probably time to spend more energy on preventing REST from being doomed before enough adopters have realized and, more importantly, experienced the benefits. REST done badly might actually cause more harm than WS-* done well, so let's be wary!

What is wrong about the article? Three things mostly:

Likely violation of the hypermedia constraint: the article proposes a URI structure but fails to make clear that this knowledge must not leak to the client developer but that the client should discover the appropriate links or forms at runtime. Coupling on the URI structure must be avoided to allow the server to control (and change) its own URI space without the risk of breaking clients.

Misplaced authentication information: the proposed design places user name and password as parameters into the URI while they should be sent to the server using standard HTTP authentication mechanisms. This can negatively impact caching since caches understand the special meaning of the Authorization header but not of credentials hidden in a URI. In the worst case, a public cache might keep a representation that should only be visible to an authenticated client.

Limited visibility: the article proposes the use of application/json but fails to make clear that this causes coupling on the out-of-band knowledge of the actual JSON structure being used.

I am aware that RESTful design is not at the heart of this article, but we should make sure that whoever reads what we write does not walk away with a wrong impression of what a RESTful design is.
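Regarding the second point, here is a small sketch of keeping credentials out of the URI. It uses Python's standard urllib.request and base64 modules and only builds the request without sending it; the URI, user name and password are placeholders, and HTTP Basic is just one of the standard mechanisms one might pick.

```python
import base64
import urllib.request


def build_authenticated_request(uri, user, password):
    """Build (not send) a GET request with HTTP Basic credentials in a header."""
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(uri, method="GET")
    # Credentials travel in the Authorization header, not in the URI.
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder values for illustration only.
request = build_authenticated_request("http://example.org/items/56", "alice", "secret")
```

The URI stays free of user-specific data, and intermediaries can see from the header that the request is an authenticated one.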

The Protocol Hammer

It's not news, but Anne's statement just popped out when I scanned through the archives:

"If I knew what I know now back in 2000, I would have pushed for a RESTful registry with free-form search, and such a beast would have been a lot more valuable than UDDI. But we had this really spiffy, state-of-the-art protocol hammer called SOAP, and we saw everything as a nail."

Reminds me that REST officially turns 10 this year!

January 8, 2010

Service Type Specifications II

UPDATE: My thinking around this issue has evolved.

Wow, this question was much harder to get straight than I expected:

What constitutes the specification of a service type?

First things first, though! In order to provide a principled answer to that question, another one needs to be answered before:

What is the purpose of a service type specification?

Purpose of Service Type Specifications

1. Service type specifications provide the information client developers need to implement a client for services of the described type. If you hand the service type specification to a developer she should be able to know exactly what to do and what to reasonably expect from any instance of that service type. There should be no further knowledge required (except for following any included references, for example, to media type specifications, of course).

2. Service type specifications provide the information necessary for implementing instances of the specified service type. There should not be any further information required except for implementation specific details behind the service boundary of course.

3. Service type specifications provide the information service owners and maintainers need in order to understand in which way the server can evolve without breaking clients. This is redundant with 2. but mentioning it explicitly emphasizes where exactly the contract is established between client and server owners. Anything that is not specified in the service type specification or referenced material is not part of the contract and constitutes no obligation by either party.

4. Service type specifications enable service discovery by type. Clients that wish to interact with a certain kind of service can use the information provided by the service type specification to detect when they see a service that is of the desired type. The necessary information should be available as a response to the published service URI either by analyzing the set of initial transitions (goals) provided by the service or by looking at the service document's media type (see below).

Having laid out the purposes of a service type specification, we can now make principled decisions about what should be part of a service type specification.

Elements of Service Type Specifications

Here is the short answer (rationale follows below):

Service Type Specifications define the set of hypermedia specifications (media types, link relations, etc.) used by the service and information about the minimal initial set of available transitions (goals).

The latter can optionally be expressed as a media type, too (a service type specific service document type), which simplifies the definition to 'a set of media types' but leads to the creation of a new media type for a given kind of service.

The essential aspect of the above definition is that the client needs to know which media types it must understand in order to interact with the service.

So, why is that?

The ideal situation would be to say nothing about the service type at all, just agree on a set of media types that make sense to be understood in general and implement all or any of them in the clients as desired. The problem with this approach is that it does not address purpose 1. above; a client developer would not have any clue what a service is doing or how to interact with it. There would not be any notion of a service type at all; just individual hypermedia semantics (media types, link relations etc.).

But even if the client developer was provided with some means of a service type description (in the form of a dedicated service media type or a set of initial transitions) - see purpose 4. above - there would still be no way for the client developer to have any clue what can be done with the service beyond the initial transitions. Knowing the set of media types provides that clue.

The issue of guiding the service developer (purpose 2.) is addressed because the service type tells the service developer exactly what media types etc. are available to him to solve the given implementation task.

Purpose 3. above is addressed by the fact that it is not possible to remove a hypermedia specification from the specified set without incompatibly changing the semantics of the service type. Service owners are therefore free to evolve the service by adding hypermedia specifications (or extending extensible ones) but may not remove any.

The purpose of discovery (4. above) can be addressed by using a generic service description media type such as Atom Service documents (application/atomsvc+xml) and describing a minimal set of available resources. For example, an ITIL-conforming help desk service type could be specified as providing at least three collections (identified by categories) containing and accepting submissions of Incidents, Problems and Change Requests. Clients (including service registry crawlers) would know they have come across an instance of that 'ITIL helpdesk service' when they see an Atom Service with the specified collections.

Alternatively, a new service document media type could be minted (e.g. application/helpdesksrv+xml) to identify the service type. The format of that type would be defined to provide links to the necessary resources. This leads to a simpler discovery/registry mechanism but also might cause an explosion of media types.

Using a combination of both might actually be best, such as application/atomsvc+xml;profile=helpdesk.
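Discovery via an Atom Service document could look roughly like the sketch below. The category terms in REQUIRED_TERMS, the collection URIs and the helper functions are assumptions made for illustration; only the two XML namespaces are the ones defined for the Atom Publishing Protocol and Atom. A client or registry crawler checks whether the collections it sees cover the categories the service type specification requires.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Hypothetical category terms a help desk service type might require.
REQUIRED_TERMS = {"incident", "problem", "change-request"}


def collection_terms(document):
    """Collect the category terms of all collections in a service document."""
    root = ET.fromstring(document)
    path = ".//app:collection/app:categories/atom:category"
    return {category.get("term") for category in root.findall(path, NS)}


def is_helpdesk_service(document):
    """True if every required category is offered by some collection."""
    return REQUIRED_TERMS <= collection_terms(document)


def build_service_document(terms):
    """Build a tiny example service document with one collection per term."""
    collections = "".join(
        '<collection href="/%s"><atom:title>%s</atom:title>'
        '<categories fixed="yes"><atom:category term="%s"/></categories>'
        "</collection>" % (term, term, term)
        for term in terms
    )
    return (
        '<service xmlns="http://www.w3.org/2007/app" '
        'xmlns:atom="http://www.w3.org/2005/Atom">'
        "<workspace><atom:title>Help desk</atom:title>%s</workspace>"
        "</service>" % collections
    )


helpdesk = build_service_document(["incident", "problem", "change-request"])
something_else = build_service_document(["incident"])
```

Here is_helpdesk_service(helpdesk) holds while is_helpdesk_service(something_else) does not, which is all a registry crawler would need to classify the two services.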