January 30, 2010

Fond Topic Maps Memories

Today I have done some Web Archeology and dug up the 2004 version of the GooseWorks.org Web site. The project was intended to support the original idea of the Topic Maps paradigm as developed by Steven R. Newcomb when two camps of ideas started to emerge from the community.

I remember doing a ridiculous amount of coding to verify and track Steve's ideas, and I was joined by Sam Hunting, who evangelized at the various markup conferences.

Well, that's history, but I thought it might be worthwhile to put the site back online with some of the source code. Might also be that I have done so much thinking this week about steady-states, bookmarks and cool URIs that I felt I had to bring the site back to life.

In case you are wondering what inspired the name, have a look at the posting that started it all. Besides, it had the cool double-O in it that was said back then to be a guarantee for a site's success :-)

January 27, 2010

In Fear of Sub-Requests

Suppose we design a service that is to provide data about persons, including whom they are linked to. Let's say person representations look like this:

GET /persons/4/full

200 OK
Content-Type: application/vnd.me.person+xml
Cache-Control: no-cache

Karl Heinz
Kieler Strasse 7,20556 Hamburg

In addition, the service provides minimal person data that looks like this (note that its time-to-live is longer):

GET /persons/4

200 OK
Content-Type: application/vnd.me.person+xml
Cache-Control: max-age=3600

Karl Heinz
Kieler Strasse 7,20556 Hamburg

Now suppose we define the processing semantics of application/vnd.me.person+xml in such a way that the user agent should follow all the friend links and retrieve the referenced persons' data. Looks like rather bad design due to all the sub-requests, doesn't it?

On the other hand, nobody complains about HTML pages that link to massive amounts of images and other media.
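To make the comparison with HTML images a bit more concrete, here is a minimal Python sketch of a user agent that follows friend links and keeps a small cache keyed by URI. The link markup, the fake_fetch function and the CachingClient class are all assumptions made up for illustration; they are not part of the media type described above. The only point is that sub-requests for the minimal representations (max-age=3600) can be answered from a cache on repeat visits, much like images are.

```python
import xml.etree.ElementTree as ET

# Hypothetical full representation; element and attribute names are assumed.
FULL_PERSON = """<person>
  <name>Karl Heinz</name>
  <link rel="friend" href="/persons/5"/>
  <link rel="friend" href="/persons/6"/>
</person>"""


def friend_links(xml_text):
    """Return the href of every link marked rel="friend"."""
    root = ET.fromstring(xml_text)
    return [link.get("href") for link in root.findall("link")
            if link.get("rel") == "friend"]


class CachingClient:
    """Tiny client-side cache that honors a max-age given in seconds."""

    def __init__(self, fetch, clock):
        self.fetch = fetch      # fetch(uri) -> (max_age_seconds, body)
        self.clock = clock      # clock() -> current time in seconds
        self.cache = {}         # uri -> (expires_at, body)
        self.origin_hits = 0

    def get(self, uri):
        now = self.clock()
        entry = self.cache.get(uri)
        if entry is not None and now < entry[0]:
            return entry[1]     # still fresh, no sub-request needed
        max_age, body = self.fetch(uri)
        self.origin_hits += 1
        if max_age > 0:
            self.cache[uri] = (now + max_age, body)
        return body


def fake_fetch(uri):
    """Stand-in for the network: minimal person data, fresh for one hour."""
    return 3600, "<person><name>Friend at %s</name></person>" % uri


now = [0]                         # mutable clock for the demonstration
client = CachingClient(fake_fetch, lambda: now[0])

for _ in range(2):                # process the full person twice
    for href in friend_links(FULL_PERSON):
        client.get(href)

print(client.origin_hits)         # two friends, each fetched only once
```

The second pass over the friend links never reaches the origin, so the cost of the sub-requests depends heavily on how cacheable the linked representations are.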

January 26, 2010

4xx Client Error (But Not Its Fault)

Erik responds to 4xx Client Error. He writes:

"i don't think it appropriate to phrase it like it's the client who necessarily did something wrong when, for example, a 404 is returned. if you follow a URI that was supposed to be persistent and you get a 404 because the server did something wrong, it's quite a stretch to argue that it's your fault. it's pretty clearly not."

I agree that my emphasis on the client doing something wrong wasn't appropriate (but an extreme position is sometimes nice :-). However, I do interpret RFC2616 to be emphasizing that the burden is on the client. While it is obviously not the fault of the client if some resource currently has no representation available (404) I would not agree that "the server did something wrong". The server is free to stop providing representations for a resource and the client just has to account for that to happen.

What is interesting about this is that the conversation between client and server is not considered broken at all because the fact that 4xx errors might occur is part of the contract (and does not indicate a broken contract). This is one of the advantages of a uniform interface with uniform status codes: it can sustain the conversation even when there is a (temporary) gap in expectations.
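A minimal sketch of what "accounting for that to happen" could look like in a client, written in Python. The status classes are the ones RFC 2616 defines; the next_step policy and its action strings are purely my own assumptions for illustration, not something the specification prescribes.

```python
def status_class(status):
    """Map an HTTP status code to its class as named in RFC 2616."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes.get(status // 100, "unknown")


def next_step(status):
    """Hypothetical client policy; the action strings are assumptions."""
    kind = status_class(status)
    if kind == "informational":
        return "wait for the final response"
    if kind == "success":
        return "process the representation"
    if kind == "redirection":
        return "follow the Location header"
    if status in (404, 410):
        # No representation available: not a broken contract, just a
        # case the client has to plan for.
        return "drop the stale link and rediscover it from an entry point"
    if status == 406:
        return "retry with a different Accept header"
    if kind == "client error":
        return "fix the request before retrying"
    if kind == "server error":
        return "retry later"
    return "give up"


for code in (200, 404, 406, 503):
    print(code, status_class(code), "->", next_step(code))
```

Every branch keeps the conversation going; none of them treats a 4xx response as an exceptional, contract-breaking event.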

January 25, 2010

4xx Client Error

HTTP is actually quite clear on how decoupled the server is from the client by calling all non-technical errors simply Client Errors. Dang! So simple! 404 - who broke the contract? The client! 406 - who broke the contract? The client! ...

RFC2616 uses a softer tone though:

"The 4xx class of status code is intended for cases in which the client seems to have erred."

I guess testing a RESTful system means testing the clients...

REST Doesn't Lie

Have been reading about separation of concerns today and came to think that it might be interesting to shift the angle of looking at service evolution: Instead of asking what aspects of services might change it could actually be useful to ask whether there is anything that a service can guarantee not to change, no matter what future requirements it might encounter.

Shifting the point of view revealed a REST constraint that is probably taken for granted so much that it is not talked about very often:

"The only thing that is required to be static for a resource is the semantics of the mapping, since the semantics is what distinguishes one resource from another."

(Section Resources and Resource Identifiers of Roy's dissertation)

This is a guarantee the server has to make that is actually possible to adhere to because no evolution scenario of a server can possibly force it to change that mapping. The other two things that are not subject to change are (obviously) the uniform interface and the use of descriptive message semantics.

Besides these three aspects there is nothing in a networked system that servers can guarantee to clients and REST emphasizes that in an incredibly honest way. The client may depend on the above three aspects to remain true for the entire lifetime of a service but must plan for anything else to change.

What is the take-away? Two things mostly: you are not doing REST unless your system adheres to the above, and you should exploit the stability of the resource semantics as much as possible in your service design.

January 22, 2010

Why if(status == 200) Is Not Enough

In another Developer Works article I read yesterday, Bruce Sun provides a code snippet that acts as an HTTP client to a service

January 21, 2010

REST Design Mistake with Apache Wink

Just came across an article on service implementation with Apache Wink by Vishnu Vettrivel. While it is primarily covering the service implementation aspects, it also has a section on RESTful design that too easily creates a wrong understanding of how a RESTful service would have to be designed.

Given that REST has entered the hype cycle, it now happens too often that actually unRESTful design is promoted. It is probably time to spend more energy on preventing REST from being doomed before enough adopters have realized and, more importantly, experienced the benefits. REST done badly might actually cause more harm than WS-* done well, so let's be wary!

What is wrong about the article? Three things mostly:

  • Likely violation of the hypermedia constraint: the article proposes a URI structure but fails to make clear that this knowledge must not leak to the client developer but that the client should discover the appropriate links or forms at runtime. Coupling on the URI structure must be avoided to allow the server to control (and change) its own URI space without the risk of breaking clients.

  • Misplaced authentication information: the proposed design places user name and password as parameters into the URI while they should be sent to the server using standard HTTP authentication mechanisms. Putting credentials into the URI can negatively impact caching, since caches understand the special meaning of the Authorization header but know nothing about credentials hidden in a URI. In the worst case, a public cache might keep a representation that should only be visible to an authenticated client.

  • Limited visibility: the article proposes the use of application/json but fails to make clear that this causes coupling on the out-of-band knowledge of the actual JSON structure being used.
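As a rough illustration of the authentication point, here is a small Python sketch that sends credentials with standard HTTP Basic authentication instead of URI parameters. The URL, user name and password are made-up placeholders, and authenticated_request is a hypothetical helper of mine, not something taken from the article or from Apache Wink.

```python
import base64
import urllib.request


def authenticated_request(url, user, password):
    """Build a request that carries credentials in the Authorization header.

    The URI stays free of user name and password, so intermediaries can
    recognize the request as an authenticated one.
    """
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder values, for illustration only.
req = authenticated_request("http://example.org/library/books", "alice", "secret")
print(req.full_url)
print(req.get_header("Authorization"))
```

Basic authentication only encodes the credentials, it does not encrypt them, so a real deployment would presumably combine this with TLS.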

I am aware that RESTful design is not at the heart of this article, but we should make sure that whoever reads what we write does not walk away with a wrong impression of what a RESTful design is.

The Protocol Hammer

It's not news, but Anne's statement just popped out when I scanned through the archives:
"If I knew what I know now back in 2000, I would have pushed for a RESTful registry with free-form search, and such a beast would have been a lot more valuable than UDDI. But we had this really spiffy, state-of-the-art protocol hammer called SOAP, and we saw everything as a nail."

Reminds me that REST officially turns 10 this year!

January 8, 2010

Service Type Specifications II

UPDATE: My thinking around this issue has evolved.

Wow, this question was much harder to get straight than I expected:

What constitutes the specification of a service type?

First things first, though! In order to provide a principled answer to that question, another one needs to be answered first:

What is the purpose of a service type specification?

Purpose of Service Type Specifications

1. Service type specifications provide the information client developers need to implement a client for services of the described type. If you hand the service type specification to a developer she should be able to know exactly what to do and what to reasonably expect from any instance of that service type. There should be no further knowledge required (except for following any included references, for example, to media type specifications, of course).

2. Service type specifications provide the information necessary for implementing instances of the specified service type. There should not be any further information required, except for implementation-specific details behind the service boundary, of course.

3. Service type specifications provide the information service owners and maintainers need in order to understand in which way the server can evolve without breaking clients. This is redundant with 2. but mentioning it explicitly emphasizes where exactly the contract is established between client and server owners. Anything that is not specified in the service type specification or referenced material is not part of the contract and constitutes no obligation by either party.

4. Service type specifications enable service discovery by type. Clients that wish to interact with a certain kind of service can use the information provided by the service type specification to detect when they see a service that is of the desired type. The necessary information should be available as a response to the published service URI either by analyzing the set of initial transitions (goals) provided by the service or by looking at the service document's media type (see below).

Having laid out the purposes of a service type specification, we can now make principled decisions about what should be part of a service type specification.

Elements of Service Type Specifications

Here is the short answer (rationale follows below):

Service Type Specifications define the set of hypermedia specifications (media types, link relations, etc.) used by the service and information about the minimal initial set of available transitions (goals).

The latter can optionally be expressed as a media type, too (a service type specific service document type), which simplifies the definition to 'a set of media types' but leads to the creation of a new media type for a given kind of service.

The essential aspect of the above definition is that the client needs to know which media types it has to understand in order to interact with the service.

So, why is that?

The ideal situation would be to say nothing about the service type at all, just agree on a set of media types that make sense to be understood in general and implement all or any of them in the clients as desired. The problem with this approach is that it does not address purpose 1. above; a client developer would not have any clue what a service is doing or how to interact with it. There would not be any notion of a service type at all; just individual hypermedia semantics (media types, link relations etc.).

But even if the client developer was provided with some means of a service type description (in the form of a dedicated service media type or a set of initial transitions) - see purpose 4. above - there would still be no way for the client developer to have any clue what can be done with the service beyond the initial transitions. Knowing the set of media types provides that clue.

The issue of guiding the service developer (purpose 2.) is addressed because the service type tells the service developer exactly what media types etc. are available to him to solve the given implementation task.

Purpose 3. above is addressed by the fact that it is not possible to remove a hypermedia specification from the specified set without incompatibly changing the semantics of the service type. Service owners are therefore free to evolve the service by adding hypermedia specifications (or extending extensible ones) but may not remove any.
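The evolution rule can be sketched as a simple set comparison. This is only an illustration in Python under my own assumptions: the media type names in PUBLISHED are examples (the vnd.example ones are invented), and is_compatible_evolution is a hypothetical helper name.

```python
# Hypermedia specifications named in a (hypothetical) service type specification.
PUBLISHED = frozenset({
    "application/atom+xml",
    "application/vnd.example.incident+xml",   # invented for illustration
})


def removed_specs(published, evolved):
    """Return the specifications that the evolved service no longer uses."""
    return sorted(set(published) - set(evolved))


def is_compatible_evolution(published, evolved):
    """Adding specifications is fine; removing any of them breaks the type."""
    return not removed_specs(published, evolved)


added = PUBLISHED | {"application/vnd.example.problem+xml"}
dropped = PUBLISHED - {"application/atom+xml"}

print(is_compatible_evolution(PUBLISHED, added))    # only additions
print(is_compatible_evolution(PUBLISHED, dropped))  # one removal
print(removed_specs(PUBLISHED, dropped))
```

Extending an extensible media type in place would not show up in such a comparison at all, which matches the idea that extension is allowed.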

The purpose of discovery (4. above) can be addressed by using a generic service description media type such as Atom Service documents (application/atomsvc+xml) and describing a minimal set of available resources. For example, an ITIL-conforming help desk service type could be specified as providing at least three collections (identified by categories) containing and accepting submissions of Incidents, Problems and Change Requests. Clients (including service registry crawlers) would know they have come across an instance of that 'ITIL helpdesk service' when they see an Atom Service with the specified collections.
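Here is a hedged Python sketch of how a client or registry crawler might run that check against an Atom Service document. The namespaces are the ones defined for the Atom Publishing Protocol and Atom; the category terms (incident, problem, change-request), the sample document and the function names are assumptions of mine, since the actual terms would have to be fixed by the service type specification.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Hypothetical category terms that identify the three required collections.
REQUIRED_TERMS = {"incident", "problem", "change-request"}

# Made-up sample service document for the demonstration.
HELPDESK_SERVICE = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Help Desk</atom:title>
    <collection href="/incidents">
      <atom:title>Incidents</atom:title>
      <categories><atom:category term="incident"/></categories>
    </collection>
    <collection href="/problems">
      <atom:title>Problems</atom:title>
      <categories><atom:category term="problem"/></categories>
    </collection>
    <collection href="/changes">
      <atom:title>Change Requests</atom:title>
      <categories><atom:category term="change-request"/></categories>
    </collection>
  </workspace>
</service>"""


def collection_terms(service_xml):
    """Collect the category terms of all collections in the document."""
    root = ET.fromstring(service_xml)
    path = "app:workspace/app:collection/app:categories/atom:category"
    return {cat.get("term") for cat in root.findall(path, NS) if cat.get("term")}


def is_helpdesk_service(service_xml):
    """True if all required collections are present; extra ones are allowed."""
    return REQUIRED_TERMS <= collection_terms(service_xml)


print(is_helpdesk_service(HELPDESK_SERVICE))
```

Because the check only asks for a minimal set, a service can add further collections later without falling out of the type.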

Alternatively, a new service document media type could be minted (e.g. application/helpdesksrv+xml) to identify the service type. The format of that type would be defined to provide links to the necessary resources. This leads to a simpler discovery/registry mechanism but might also cause an explosion of media types.

Using a combination of both might actually be best, such as application/atomsvc+xml;profile=helpdesk.
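A client could recognize such a combined type by splitting the Content-Type value into the media type and its parameters. The following Python sketch assumes the profile parameter proposed above (it is not a registered parameter of the Atom Service document type, whose registered name is application/atomsvc+xml); parse_media_type and is_service_of_type are hypothetical helper names, and the parser is deliberately naive.

```python
def parse_media_type(value):
    """Split a Content-Type value into (media_type, parameters).

    Naive on purpose: it does not handle semicolons inside quoted strings.
    """
    parts = [part.strip() for part in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            name, _, raw = part.partition("=")
            params[name.strip().lower()] = raw.strip().strip('"')
    return media_type, params


def is_service_of_type(content_type, base_type, profile):
    """True if the value names the generic base type with the given profile."""
    media_type, params = parse_media_type(content_type)
    return media_type == base_type and params.get("profile") == profile


header = "application/atomsvc+xml; profile=helpdesk"
print(parse_media_type(header))
print(is_service_of_type(header, "application/atomsvc+xml", "helpdesk"))
```

A generic client that ignores the parameter still sees a plain Atom Service document, which is what makes the combination attractive.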