Posted in SOA by AST on Saturday, December 16th, 2006
Even though I had to temporarily drop out of the ongoing discussion on the service-orientated-architecture Yahoo group/mailing list, which prompted my last post, to focus on a few high-priority interrupts for a while, my brain hasn’t fully disengaged from the discussion.
One of the light bulbs that went on in my head during the aforementioned discussion was that it finally clicked what the uniform interface constraint “hypermedia as the engine of application state” (Fielding 2000, §5.1.5) actually means (to me, anyway). Now that I think I understand it, I also understand why it is so hard for people to really get what it means: most of the people trying to figure this out are experienced, traditional programmers. What do traditional programmers do? If they’re any good and they’re dealing with large-scale distributed systems, they spend an awful lot of time on the design of the “perfect” API for their remote components. Perfect in this context means that it is the optimal trade-off between all of the architectural constraints and system requirements to deliver an efficient distributed system.
API Brain Damage
I’m going to go out on a limb here, but I think this “API brain damage” is part of why we haven’t been able to significantly advance the state of the art of software system design over the last 30 years. We all have it, because from the moment you learn what a compiler is and get started with programming, APIs are all around you. As you develop your own skills and experience, you’re exposed to both good and bad API design, and this all gets mixed in together in our little brains so that each of us develops our own perspective on what is a “good” vs. a “bad” API. Most of us can, by this stage, pass such a judgment in 30 minutes or less.
However, I think the effects of this brain damage are to create some pretty fundamental assumptions about how computer systems work. If you need programmatic access (here it is again, reinforced by the terms we use to describe our requirements) to a set of functionality (for kicks, let’s call it a service), then the first thing we as programmers want to know is “where’s the API?”. It’s ingrained in our very being because of years and years of positive reinforcement. We all have it…and I think it’s a tragic mistake.
Even people who have a pretty good understanding of what REST is get pulled back into the primordial slime just as they’re about to sprout legs and walk upright. Joe Gregorio’s really good article on building a RESTful system, Constructing or Traversing URIs?, is a prime example of this type of thinking. I’m not criticizing Joe here, I’m criticizing the way we, as software architects and developers, have been trained to think.
The point of Joe’s article is that hypermedia is about link traversal, but that because the possible URI space is nearly infinite, it’s reasonable to publish recipes for link construction under the argument that this is what HTML Forms using the GET method do anyway (emphasis added). This “optimization” undermines what to me is the whole point of “hypermedia as the engine of application state”: link traversal, and this is recognized by the rest of the article Joe wrote. As soon as you start down this slippery slope, you’ve lost all of the advantages I see in even bothering with hypermedia at all. Hypermedia is then no longer part of the application; it’s just data that’s shipped around via some transfer mechanism, in this case HTTP.
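To make the difference between constructing and traversing concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the in-memory SITE_V1 and SITE_V2 dictionaries stand in for a server before and after it reorganizes its URIs, fetch stands in for an HTTP GET, and the link name "headlines" is a made-up relation. It is not taken from Joe's article.

```python
# Hypothetical in-memory "site": maps a URI to a document with named links.
# All URIs and link names here are illustrative assumptions.
SITE_V1 = {
    "/": {"links": {"headlines": "/headlines/"}},
    "/headlines/": {"body": "Top stories", "links": {}},
}

# The same service after the provider reorganizes its URI space.
SITE_V2 = {
    "/": {"links": {"headlines": "/news/latest"}},
    "/news/latest": {"body": "Top stories", "links": {}},
}


def fetch(site, uri):
    """Stand-in for an HTTP GET; raises KeyError for an unknown URI (a 404)."""
    return site[uri]


def constructing_client(site):
    """Builds the URI from a published recipe that is baked into the client."""
    return fetch(site, "/" + "headlines" + "/")["body"]


def traversing_client(site):
    """Starts at the entry point and follows the link named 'headlines'."""
    entry = fetch(site, "/")
    return fetch(site, entry["links"]["headlines"])["body"]
```

Both clients work against SITE_V1. Against SITE_V2 the traversing client still finds the headlines, while the constructing client raises KeyError because the recipe it was built around no longer matches the server.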
Hypermedia Applications
Many things have contributed to my opinion about this topic, including this statement by Eric Newcomer recently on the mailing list:
“I should just note that comparing the Web to WS-* is an apples-to-oranges comparison (one being an application and the other being a collection of specifications).”
With apologies to Eric, I initially sorta dismissed this statement because I was focused on my conversation with Steve Jones—but I shouldn’t have. Eric is exactly right here. The World Wide Web is a hypermedia application, not just a collection of specifications. Again, our software development training doesn’t help us here very much. Good ol’ functional decomposition makes us (well, me at least) want to see the Web as HTTP (RFC 2616), URIs (RFC 3986), MIME types (Wikipedia entry to set of related RFCs), HTML/XHTML (W3C markup specifications) and HTML forms (W3C recommendation and RFC 2388).
In actual fact, it’s all of those working together to provide, as Eric said, the Web as a distributed hypermedia application (which, if I remember correctly, is a point also made by Roy Fielding himself in the dissertation). The Web works because both user agents (browsers) and the server applications use all of these specifications together to expose a set of functionality to the interactive user. There is no a priori agreement between the browser and the Web server as to how the information service built on these specifications is used, but because of agreement on how these specifications are used together, as an application, it doesn’t matter if today CNN.com is built using Microsoft ASP, and tomorrow it’s built on PHP. Apart from the application of the “Cool URIs Don’t Change” principle, if a user starts from http://www.cnn.com, they will always be able to utilize the CNN news service via their browser’s implementation of those specifications and the implicit agreement of CNN to publish its service in accordance with them too.
The moment that CNN or any other service provider publishes a recipe, guideline or specification of how to access specific parts of that service, e.g. the latest headlines may be found at http://www.cnn.com/headlines/ or http://headlines.cnn.com/ or whatever, as Joe points out in the article, they’ve made an implicit commitment to support that API (because that’s what it essentially is) for a period of time. When someone comes along and implements a specialized headline grabber that follows that API, and CNN decides to change it due to idle whim or genuine business need, that headline grabber client is now broken. If it’s just a single user, maybe this isn’t a big deal, but if it is every 3rd-party trading partner of your organization, the impact is a bit more significant.
A hypermedia application such as Atom or RSS, combined with content negotiation via MIME types, HTTP Accept headers and embedded <link/> tags, means that this sort of evolution could happen without breaking the client, provided there is agreement on the application semantics of how those things should be both used and interpreted between clients and servers. Anything else means you’re back to brittle, API-based systems that can no longer evolve independently of each other.
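As a rough sketch of what that agreement buys you, the Python below discovers a feed from an embedded <link/> tag instead of hard-coding its address. It uses only the standard library's html.parser module. The two pages and their href values are invented for the example, and a real client would also resolve relative URIs and handle more than one candidate feed.

```python
from html.parser import HTMLParser

# Hypothetical pages for the same site before and after the feed moves.
PAGE_BEFORE = ('<html><head><link rel="alternate" '
               'type="application/atom+xml" href="/headlines/feed" /></head></html>')
PAGE_AFTER = ('<html><head><link rel="alternate" '
              'type="application/atom+xml" href="/news/latest.atom" /></head></html>')


class FeedLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> tags for one media type."""

    def __init__(self, media_type):
        super().__init__()
        self.media_type = media_type
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <link/> via the default handle_startendtag.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if ("alternate" in rels
                and attributes.get("type") == self.media_type
                and attributes.get("href")):
            self.hrefs.append(attributes["href"])


def discover_feed(html, media_type="application/atom+xml"):
    """Return the first advertised feed URI for media_type, or None."""
    finder = FeedLinkFinder(media_type)
    finder.feed(html)
    finder.close()
    return finder.hrefs[0] if finder.hrefs else None
```

The client's only fixed knowledge is the meaning of rel="alternate" and the Atom media type. discover_feed(PAGE_BEFORE) and discover_feed(PAGE_AFTER) return different URIs, and the client follows whichever one the server currently advertises.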
Closing Thoughts
I don’t have all of the answers here, but I think that the notion of a “REST API” is an oxymoron because REST is about dynamic evolvability of clients and servers based on codified understandings of previously agreed application semantics. This means that HTML browsers and Web servers agree to provide the Web application in a way that both can understand and use, but it also means that RSS/Atom feed readers and server feeds agree on the way they interact to both access and provide the syndication application.
What I’m saying is that the nature and specification of the hypermedia application is as key to REST as how you use the HTTP verbs. However, since the HTTP verbs are essentially an API that programmers can get their heads around, that’s where everyone’s focus is at the moment. I think this is a diversion from where people should be thinking about REST. As long as you agree the semantics of the hypermedia application (HTML+Forms+MIME+HTTP or Atom+XHTML+MIME+HTTP), the way that application is implemented on the server should be an implementation detail and not something exposed to clients in terms of an API.
If the hypermedia constructs being used to describe the interaction between the clients and servers are not rich enough to abstract these things so the client needs to know that it’s supposed to POST data to URI x rather than being able to simply traverse hypermedia provided by the server (meaning the operation, data and location are provided to the client in a way it can understand rather than having any of this hard-coded as client implementation logic), then the hypermedia application being used (and not the information service) has not been sufficiently defined. Using the existing and emerging specifications for describing content and interaction, it should be possible to specify the application. If it isn’t, then we need to be spending our efforts on a way to do that rather than arguing about the RESTfulness of so-and-so’s latest HTTP-based API.
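One way to picture this is a client that is handed the operation, data and location by the server rather than carrying them around as code. The sketch below is only that, a sketch: the "controls" vocabulary (rel, method, href, fields) is something I made up for illustration, not an existing media type, and the order URIs are hypothetical.

```python
# Hypothetical representations a server might return for the same order.
# The "controls" structure is an assumed format, not a standard.
ORDER_V1 = {
    "status": "open",
    "controls": [
        {"rel": "cancel", "method": "POST", "href": "/orders/42/cancel",
         "fields": ["reason"]},
    ],
}

# Later the server changes how cancellation is implemented.
ORDER_V2 = {
    "status": "open",
    "controls": [
        {"rel": "cancel", "method": "DELETE", "href": "/v2/orders/42",
         "fields": []},
    ],
}


def build_request(representation, rel, values):
    """Turn the server-described control named `rel` into a request tuple.

    The client knows only what `rel` means; the method, the target and the
    fields to send all come from the representation itself.
    """
    for control in representation["controls"]:
        if control["rel"] == rel:
            payload = {name: values[name]
                       for name in control["fields"] if name in values}
            return control["method"], control["href"], payload
    return None  # the server is not offering this transition right now
```

With ORDER_V1 the client ends up with a POST to /orders/42/cancel carrying a reason; with ORDER_V2 the same client code produces a DELETE to /v2/orders/42 with no payload. Nothing in the client had to change, because nothing about "POST to URI x" was ever hard-coded in it.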
To me, this is the real problem to be solved in implementing RESTful systems. I think there are people who are starting to realize this need implicitly, but I think it’s time we made that need an explicit requirement of systems implemented in the REST style. If you can’t describe the interaction via hypermedia and link traversal semantics only, then I don’t think the system truly meets the requirements of REST as I understand them today. The uniform interface is hypermedia, not HTTP. Focusing on HTTP is not seeing the forest for the trees.
Comments, flames and discussion are more than welcome.
Posted in SOA by AST on Saturday, December 9th, 2006
I can appreciate Gervas’ position as a “neutral, non-technical observer” to the whole ROA/SOA thread, but I think the root of the problems in bright people having difficulty clarifying basic issues about REST is entirely one of “what they know” and “where they are coming from”.
I have tremendous respect for Steve and everyone else on the list [the service-oriented-architecture list] that I’ve interacted with, so this isn’t personal in any way. I think it is important to understand a bit of industry history in light of lots of smart people and vendors trying to figure out how to field an SOA that works.
A lot of us on this list have been doing distributed computing for a long time. Most of us have done a lot of one or more of CORBA and DCOM before RMI/EJB came on the scene and certainly before XML-RPC and SOAP came on the scene (some people have been doing it earlier than that).
The thing about a programming paradigm is that to get any good at actually doing something with it, it takes a lot of time and effort to learn how to think and design in a way that takes advantage of it. CORBA, DCOM and EJB and the like are about extending the local programming model to remote systems in a more-or-less coherent way.
All of them are object-oriented in that you create a service with a defined set of capabilities and a given interface. This interface is normally designed in similar ways to local interfaces in that it exposes a fairly rich and domain-specific API for interaction between clients and servers. Most of the early mistakes people make in developing CORBA, DCOM and EJB projects are in the granularity of those interfaces because they forget or don’t consider the effect of the cost and overhead of communicating over the network vs the costs within the same address space, e.g. “normal” objects.
Learning how to optimize the tradeoff between a rich, domain-specific interface and one that is efficient is one of the key things in learning how to design and develop successful distributed object systems.
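The granularity tradeoff is easy to see if you simply count round trips. The following Python sketch assumes a made-up CountingTransport that stands in for the network hop of any of these technologies; it is not CORBA, DCOM or EJB code, and the operation names get_field and get_summary are hypothetical.

```python
class CountingTransport:
    """Stand-in for a network hop; counts round trips instead of sending bytes."""

    def __init__(self, record):
        self.record = record
        self.round_trips = 0

    def call(self, operation, *args):
        self.round_trips += 1
        if operation == "get_field":
            return self.record[args[0]]
        if operation == "get_summary":
            return {name: self.record[name] for name in args[0]}
        raise ValueError(f"unknown operation: {operation}")


def fine_grained_read(transport, names):
    # One remote call per attribute, as a local-style getter interface would do.
    return {name: transport.call("get_field", name) for name in names}


def coarse_grained_read(transport, names):
    # One remote call that returns everything the client asked for.
    return transport.call("get_summary", names)
```

Reading three attributes through fine_grained_read costs three round trips, while coarse_grained_read returns the same data in one. Within a single address space that difference is noise; across a network it is often the difference between a usable interface and an unusable one.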
If you take a look at the history involved in developing these systems, formalization of CORBA started in 1990 at the OMG, COM surfaced around 1993 (with DCOM following in 1996) and RMI and EJB emerged in 1997. Getting all of these technologies implemented took a lot of work because most of them are naturally fairly complex. It isn’t easy trying to make a remote system look like it is a local one. Lots of vendors produced a lot of products, and some companies were founded around some of these technologies.
While each of these technologies is good (to varying degrees) at providing a distributed object computing platform within a local physical environment, they didn’t scale very well over long distances or between enterprises. Most of them required a large number of proprietary ports to be opened in company networks, which has security implications not to mention just the operational issues of making it happen.
On the other hand, HTTP and Web pages nicely sailed through port 80 which, in most cases, was already open. Both vendors and customers said, “Wouldn’t it be great if we could do things like CORBA, but using HTTP?” Enter XML, XML-RPC and SOAP in 1999-2000.
Now, if you were a vendor that had spent millions in R&D in getting distributed objects computing working in CORBA, DCOM and EJB but had come up against limiting factors such as complexity of deployment (all those ports), lack of interoperability between CORBA, DCOM and EJB and the way the Web was influencing the development of applications, what would you do?
I bet you’d figure out how to take all those things you’d been doing and make them work over ubiquitous Web protocols. I’m not saying this is necessarily bad and doesn’t have its place, but there are two other big reasons why you might think it would be reasonable to do this:
- it is the way major software vendors had been developing systems since as early as 1990, meaning
- there was a legion of software developers who understood how to develop distributed systems using those concepts and mechanisms
Vendors are protecting their investment because they need to stay in business and keep their shareholders happy, but they also have to somehow make their distributed computing technologies work together, as more and more people are running heterogeneous environments not only internally, but across trading partners.
The Web is different, however.
In the same way that messaging-oriented middleware (MOM) isn’t the same way of thinking about solving distributed computing problems as using distributed objects, building successful distributed hypermedia applications using REST for either human/computer or computer/computer interaction requires a shift in the way you think about the problem.
If you can’t suspend your assumptions about how things ought to work to understand how they do work in a different environment, e.g. MOM or REST, you’ll forever be frustrated and not understand the advantages and disadvantages of this approach over any other. From my perspective on the recent ROA/SOA thread, this is where we are and why reaching any sort of common understanding is and will continue to be so difficult.