One of the first decisions you make when implementing an SOA or even an individual Web service will affect the rest of the decisions you need to make. This decision is: which architectural style will be the foundation of your SOA? This article will attempt to provide some starting points for making this decision for yourself. What makes this decision so hard is that, contrary to what you may read elsewhere, there really is no 100% right answer. As stated by Barry Boehm in Software Engineering Economics:
“The most important software engineering skills we must learn are the skill involved in dealing with a plurality of goals which may be at odds with each other, and the skill of coordinating the application of a plurality of means, each of which provides a varying degree of help or hindrance in achieving a given goal, depending on the situation.”
Those of you who have read some of the other articles here will know that this is just another way to say what David Bohm (the physicist) is saying about the importance of context. While the importance of this decision is crucial in any software system, the inherent goal of SOA of providing a flexible architecture on which you can build your business now and allow it to adapt and grow to meet future needs means that whatever trade-offs you make now may be around for quite some time.
Bear in mind that I’m not talking about particular techniques for service design or the selection of a platform or technology, although they do overlap to some degree. What I’m talking about is the fundamental decision you must make when you define the underlying architecture of your SOA. Of course, like most things, making the fundamental decision consists of first making a couple of related decisions about your architecture.
The first part of the decision is probably the most profound. The second part is about how you fill in the details. There are two main options:
- Distributed Objects
- Messaging
Each of these will be discussed in the following sections.
One option is to base your SOA around the distributed objects paradigm. This means that essentially your developers don’t need to think about building applications differently (false assumption #1), and that since you already have distributed objects in your applications (EJBs or [D]COM/CORBA components), you’re nearly there with this whole SOA thing (false assumption #2).
Just to refresh everyone’s memory about distributed object fundamentals, in a distributed object system the client of the object and the implementation of the object are generally on separate machines. Because of this, each time a client invokes a remote method call, it needs to “flatten” or marshal the data in the parameters (probably objects) into something that can be sensibly transmitted over the network. When this is done, the server listens for the request, reads the data from the network, unmarshals the data stream into objects and finally invokes the appropriate method on the object running in some kind of container on the server.
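To make the marshal/unmarshal round trip concrete, here is a minimal sketch. It assumes JSON as the wire format and skips the network entirely. The names QuoteRequest, QuoteServer, marshal and dispatch are made up for illustration and do not come from any particular distributed object framework.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class QuoteRequest:
    """Hypothetical parameter object passed to a remote method."""
    symbol: str
    currency: str


def marshal(method: str, params: QuoteRequest) -> bytes:
    """Client side: flatten the call into bytes that could cross a network."""
    envelope = {"method": method, "params": asdict(params)}
    return json.dumps(envelope).encode("utf-8")


class QuoteServer:
    """Stands in for the object living in a container on the server."""

    def get_quote(self, request: QuoteRequest) -> str:
        return f"{request.symbol} quoted in {request.currency}"


def dispatch(server: QuoteServer, wire_data: bytes) -> str:
    """Server side: unmarshal the bytes, rebuild the object, invoke the method."""
    envelope = json.loads(wire_data.decode("utf-8"))
    request = QuoteRequest(**envelope["params"])
    method = getattr(server, envelope["method"])
    return method(request)


# The bytes in `wire` are all that would travel between the two machines.
wire = marshal("get_quote", QuoteRequest(symbol="IBM", currency="USD"))
result = dispatch(QuoteServer(), wire)
```

Note that both sides need the QuoteRequest definition, which is exactly the kind of shared knowledge that comes back to haunt us later.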
As you well know, this is an expensive operation. It is also more complex than making a local method call because there could be all manner of failures between the client and the server: the object data may not be something that can be marshaled, the cat may have tripped over the network cable, the server could be down or overloaded, etc. Toto, we’re definitely not in Kansas anymore.
From a performance perspective, it was quickly apparent that minimizing the number of these expensive calls was the only way to write an efficient distributed application. This realization led to the collective wisdom of coarse-grained methods. A coarse-grained method is one which tries to pack as much information as is sensible into the parameters so that the server has all of the data it needs to perform some kind of task. For more details or a refresher see Data Transfer Object, Gateway, Remote Facade and Service Layer in Martin Fowler’s Patterns of Enterprise Application Architecture.
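A rough sketch of the difference, assuming that every method call on the remote object costs one network round trip. RemoteCustomer and CustomerDTO are hypothetical names, and the counter only simulates the cost; nothing here actually crosses a network.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CustomerDTO:
    """Hypothetical Data Transfer Object carrying everything in one trip."""
    name: str
    street: str
    city: str


class RemoteCustomer:
    """Pretend remote object: each method call counts as one round trip."""

    def __init__(self) -> None:
        self.round_trips = 0
        self.data: Dict[str, str] = {}

    # Fine-grained interface: one round trip per field.
    def set_name(self, name: str) -> None:
        self.round_trips += 1
        self.data["name"] = name

    def set_street(self, street: str) -> None:
        self.round_trips += 1
        self.data["street"] = street

    def set_city(self, city: str) -> None:
        self.round_trips += 1
        self.data["city"] = city

    # Coarse-grained interface: one round trip for the whole task.
    def update(self, dto: CustomerDTO) -> None:
        self.round_trips += 1
        self.data.update(name=dto.name, street=dto.street, city=dto.city)


chatty = RemoteCustomer()
chatty.set_name("Ada")
chatty.set_street("1 Main St")
chatty.set_city("Springfield")

coarse = RemoteCustomer()
coarse.update(CustomerDTO("Ada", "1 Main St", "Springfield"))
```

Both objects end up in the same state, but the chatty client paid for three trips and the coarse-grained one paid for a single trip.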
Great, so I saw “service layer” in there, that must mean that I’m ready for SOA, right?
No, not quite.
The second option has been around for a long time, but most developers would not be familiar with it or may not realize they are using it. With the advent of JMS in the Java platform, more developers would at least have seen it, but, even if developers have seen it and have written applications using JMS, they still may not completely understand some of the more subtle things about how it works.
While most people immediately think about synchronicity vs. asynchronicity when they think of messaging, this isn’t really what’s most important. The most important thing about messaging is the message. Messaging is another way to build distributed systems, but generally the difference between it and distributed objects is that with distributed objects, you have more of a “conversation” between the client and the server. With messaging, while you can do this too, the main thing is that you send a complete message from the sender to the receiver and that’s generally all there is to it. These messages are called self-describing if everything the receiver needs to know to perform the requested task is included in the message. Messaging systems have lots of capabilities and options about how the message transfer can be accomplished, but the focus on self-describing messages is the most important part of messaging. The asynchronous nature of most messaging systems is a side-effect of this way of thinking about system design rather than being a necessary requirement. One of the best references on messaging would have to be Hohpe and Woolf’s Enterprise Integration Patterns.
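As a sketch of what "self-describing" can look like in practice, the message below carries everything the receiver needs, so the handler never has to call back to the sender. The field names and the PlaceOrder/OrderAccepted types are assumptions made up for this example, and JSON is used only because it is easy to read.

```python
import json
from decimal import Decimal

# Everything needed to perform the task travels inside the message itself.
order_message = json.dumps({
    "type": "PlaceOrder",
    "order_id": "A-1001",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [
        {"sku": "WIDGET", "quantity": 2, "unit_price": "9.50"},
        {"sku": "GADGET", "quantity": 1, "unit_price": "20.00"},
    ],
    "currency": "USD",
})


def handle(raw: str) -> dict:
    """Receiver: works from the message alone, with no follow-up conversation."""
    message = json.loads(raw)
    total = sum(
        Decimal(item["unit_price"]) * item["quantity"]
        for item in message["items"]
    )
    return {
        "type": "OrderAccepted",
        "order_id": message["order_id"],
        "total": str(total),
        "currency": message["currency"],
    }


reply = handle(order_message)
```

Whether handle runs immediately or an hour later makes no difference to the result, which is why asynchronous delivery falls out of this design almost for free.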
Astute readers will notice that there’s not too much difference between coarse-grained methods on distributed objects and the self-describing messages used in a messaging system. This will be important in the next few sections.
Now that you have an idea about the options for step one, it’s time to fill in more of the blanks. Unfortunately, this is really where things start to get confusing. SOA is, by implication if not by definition, a distributed system. The goals of SOA are arguably best described in Doug Kaye’s Loosely Coupled: The Missing Pieces of Web Services: loosely-coupled, discrete units of functionality (services) which are interoperable and independent of vendor or platform dependencies. Once these services exist, the “agility” of the enterprise is increased because these discrete units of functionality can be combined in interesting ways to meet the changing business environment (with apologies for playing buzz-word bingo).
Working from this definition, we can infer some additional requirements for SOA:
- We need to have a way to describe how to interact with this functionality.
- If it is supposed to be composable, then we need to be able to invoke this functionality in a consistent way.
- Loose coupling means that we may have more than one place to get the same function (think everyone’s favorite: stock quotes)
- We need to carefully consider if we want to be dependent on the availability of the service (If it’s remote, do we want to be down if it is? Can we afford to be?)
- Interoperability is also about the data that’s exchanged. Are we going to map between formats (n-squared potential), or are we going to try and get everyone to agree on a common format?
These are only a few examples of the implied requirements in an SOA based on the above goals. We could easily come up with more. However, for now, these are sufficient for the current discussion.
Evaluating the Distributed-Objects Approach
If we look at how these goals can be met using distributed objects, we notice a few key things:
- Normally, the client and the server need to share the same technology
- Since it is, by definition, object-oriented, the same definitions of objects must be available on both the client and the server
- Some distributed object implementations are tied to a specific platform
Other people noticed these issues some time ago, and decided that they were solvable. This resulted in XML-RPC and SOAP. The idea behind these was that HTTP was ubiquitous and XML could be parsed by nearly any environment, so these two things would satisfy the interoperability and platform-independence problems. Conveniently, these are two of the goals for SOA.
The problem of the same object definitions available on the client and server is also addressed by these two approaches. Both XML-RPC and SOAP define a remote invocation interface (solves the behavior problem) and the data description (solves the marshalling problem). This means that you now have interoperable distributed objects using XML and HTTP (note: there are a lot of issues I just glossed over here).
Cool! That’s a good thing, right?
Yes and no.
While XML-RPC and SOAP provide transport and data interoperability, the problem they don’t solve is one of both the strengths and weaknesses of objects: no standard invocation mechanism. In this case, I’m not talking about HTTP and XML as an invocation mechanism; I’m talking about the fact that you have a nearly infinite number of combinations of method names, parameters and ways to expose a given operation. Some of these issues are discussed in much more detail in Chapter 3 of Fielding’s dissertation.
In a traditional distributed objects system, this really doesn’t matter that much. Either the designers of the system are publishing an external API which matches some specification for general use (e.g. CORBA Naming/Trader), or it’s primarily an internal system with a controlled number of clients. It was just understood that you would normally have to write code to invoke the remote objects. However, there are some solutions which use CORBA’s Dynamic Invocation Interface (DII) to dynamically discover and invoke these methods and parameters based on a connection to a running ORB (e.g. CorbaScript).
With distributed objects, this problem can never really go away. You can hide it (adding more complexity to the infrastructure), or you can just accept it (binding you that much more tightly to the particular implementation). This problem is what WSDL is trying to solve by allowing toolkits to take the first approach.
The implications of using a distributed-objects approach to SOA are that you must deal with the complexity of these exposed service interfaces within the clients of those interfaces. You need to know that the remote interface is called GetQuote, it takes one parameter, symbol:string, and returns a fixed-point number.
That number is probably going to represent the currency value in U.S. Dollars, but it might not. Maybe you don’t know the lookup symbol, so then you’ll have to either look it up first, or maybe you can add a new parameter to the method. If you do need to change the interface, you’re going to probably need to update the other clients. You might be able to add a new method that the clients don’t know about, but then your interface may not be quite as clear as it was before. Either way, it is likely to require recompile/redeploy of one or more service implementations.
This is the interface versioning problem, and is exactly the same as if you were using COM/CORBA or RMI. There are ways you can minimize the impact or check to make sure that the service supports the version of the interface you expect, but there’s not really an easy way to manage this issue.
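Here is a small sketch of how that breakage shows up. The two service classes are hypothetical stand-ins for successive versions of a GetQuote interface, and the price is a hard-coded placeholder; a real RMI, CORBA or SOAP stack would fail in its own way, but the shape of the problem is the same.

```python
from decimal import Decimal


class QuoteServiceV1:
    """Original interface: one parameter, returns a fixed-point number."""

    def get_quote(self, symbol: str) -> Decimal:
        return Decimal("101.25")  # placeholder value for illustration


class QuoteServiceV2:
    """Changed interface: callers must now also name the currency."""

    def get_quote(self, symbol: str, currency: str) -> Decimal:
        return Decimal("101.25")  # placeholder value for illustration


def old_client(service) -> Decimal:
    # Written against the V1 signature and never updated.
    return service.get_quote("IBM")


v1_result = old_client(QuoteServiceV1())

try:
    old_client(QuoteServiceV2())
    v2_outcome = "ok"
except TypeError:
    # The call no longer matches the interface the server exposes.
    v2_outcome = "broken"
```

The client did nothing wrong; the interface simply moved underneath it, and every other client written against V1 is in the same position.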
Congratulations, now you have a Web-enabled, brittle interface instead of a COM, CORBA or RMI one that is still difficult to align with the loosely-coupled goals of SOA. Look at all that progress we’ve made in the last 20 years.
Evaluating the Messaging Approach
Since the focus of a messaging approach is the message, the invocation mechanism can be standardized. This is a very powerful concept because it means that even though you’ve moved the complexity around rather than eliminating it, you are now able to send messages to any destination in a uniform manner. Anywhere. Any service–as long as it follows the same messaging invocation mechanism.
Readers familiar with the Unix shell environment will know the potential that this unified interface provides you. Fielding calls the Unix shell interface a Uniform Pipe and Filter style, but the key aspect is the uniformity. The filters provide discrete functions and the user plugs them together into a process (or pipeline) which is more useful than any of the individual parts because it solves the problem at hand.
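The same idea can be sketched in a few lines, assuming that every filter has one uniform shape: it takes an iterable of lines and returns an iterable of lines. The filter names mimic their Unix namesakes, but they are simplified stand-ins written for this example.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator, List

# The uniform interface: every filter maps lines to lines.
Filter = Callable[[Iterable[str]], Iterable[str]]


def grep(pattern: str) -> Filter:
    """Build a filter that keeps only lines containing the pattern."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if pattern in line)
    return run


def sort_lines(lines: Iterable[str]) -> List[str]:
    return sorted(lines)


def uniq(lines: Iterable[str]) -> Iterator[str]:
    """Drop adjacent duplicate lines."""
    previous = None
    for line in lines:
        if line != previous:
            yield line
        previous = line


def pipeline(*filters: Filter) -> Filter:
    """Plug filters together; this works only because they share one interface."""
    def run(lines: Iterable[str]) -> Iterable[str]:
        return reduce(lambda data, step: step(data), filters, lines)
    return run


log = ["error: disk", "info: ok", "error: disk", "error: net"]
report = list(pipeline(grep("error"), sort_lines, uniq)(log))
```

None of the filters knows about the others, and pipeline knows nothing about what any filter does. All of the composition power comes from the shared shape.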
The uniform interface provides part of the loose coupling necessary for SOA, but there are still the issues of temporal and spatial loose coupling. Temporal loose coupling means that the sender and the receiver of the message do not need to be active at the same time. In order for this to work, the message sender needs to be able to put the message somewhere until the receiver is available to pick it up. The only thing that is required is that both the sender and receiver know how to deposit messages and pick them up; they do not need to know anything else about each other.
Traditional messaging systems like MQSeries or JMS generally provide a client library which connects to a messaging server that will buffer the messages until the intended receiver is available (if desired). However, the messaging approach can be applied to any location which will accept messages from one party and make them available to another. In this way, even a Web server could fulfill the role of holding messages “in transit” between a sender and receiver, as long as both knew where to look and the Web server was available. In this case, the Web server acts as an intermediary between the message sender and the receiver or receivers, decoupling them from each other in time.
Spatial loose coupling is where the message sender does not know the location of the ultimate message receiver. In the case of something like JMS, the message sender writes to an “out” queue. As soon as the message is written successfully, it is up to the messaging system to deliver that message to the intended recipients. This can involve one or more intermediate “hops”, or it may simply be an event-driven process listening on the other end of the channel.
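Both kinds of loose coupling can be sketched with a toy in-memory intermediary. This Broker class is an assumption made up for illustration; it is not the JMS or MQSeries API, and it ignores persistence, acknowledgement and everything else a real messaging server has to worry about.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class Broker:
    """Toy intermediary that holds messages until someone picks them up."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)

    def deposit(self, queue_name: str, message: str) -> None:
        # The sender knows only the queue name, not who or where the receiver is.
        self._queues[queue_name].append(message)

    def pick_up(self, queue_name: str) -> Optional[str]:
        # The receiver collects messages whenever it happens to be running.
        queue = self._queues[queue_name]
        return queue.popleft() if queue else None


broker = Broker()

# The sender runs first; no receiver exists yet (temporal loose coupling).
broker.deposit("orders.out", "order A-1001")
broker.deposit("orders.out", "order A-1002")

# Later, some receiver the sender has never heard of drains the queue
# (spatial loose coupling).
first = broker.pick_up("orders.out")
second = broker.pick_up("orders.out")
third = broker.pick_up("orders.out")  # nothing left to collect
```

The only things the two parties share are the queue name and the knowledge of how to deposit and pick up messages.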
The intermediary nature and unified interface of the messaging approach provide the high level of loose coupling between the sender and receiver required for SOA. The approach can also address the interoperability aspects of being able to separate the sender and receiver by platform and technology; however, this may not always be the case. As a software designer, you are responsible for picking tools and platforms to ensure they provide these things.
If the messaging system provides a uniform invocation interface for sending and receiving messages, it also avoids the interface versioning problem mentioned earlier. The only way your interface will change is if you change aspects of the underlying messaging system. Composition of services with a uniform interface is straightforward; order doesn’t matter and the person putting them together needs to only figure out the sequence of the steps.
All this is great, but it still leaves us a problem: the message. With a uniform invocation mechanism, it is straightforward to get the message from place to place, but interpreting the contents can be a whole separate level of interoperability problems. XML partially solves this problem by providing consistent parsing semantics, but the interpretation of the data can still be a problem. The message formats must be agreed between the sender and receiver or the message must be transformed from one vocabulary to another. Like I said, you’ve moved the problem around, but this has helped. We now only need to worry about one problem at a time, we can do it within our own address space, and we should have all the data we actually need (and sometimes more) to perform the desired task.
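A minimal sketch of vocabulary transformation, assuming flat messages where only the field names differ. The field names and both mapping tables are invented for illustration. The two counting functions just spell out the arithmetic behind mapping every format to every other format versus mapping each format to and from one common format.

```python
from typing import Dict


def translate(message: Dict[str, str], field_map: Dict[str, str]) -> Dict[str, str]:
    """Rename fields from one vocabulary to another; unmapped fields are dropped."""
    return {field_map[key]: value for key, value in message.items() if key in field_map}


# Hypothetical vocabularies: sender -> agreed common format -> receiver.
SENDER_TO_COMMON = {"ticker": "symbol", "px": "price", "ccy": "currency"}
COMMON_TO_RECEIVER = {
    "symbol": "instrument",
    "price": "last_price",
    "currency": "currency_code",
}

sender_message = {"ticker": "IBM", "px": "101.25", "ccy": "USD"}
common = translate(sender_message, SENDER_TO_COMMON)
receiver_message = translate(common, COMMON_TO_RECEIVER)


def point_to_point_mappings(formats: int) -> int:
    """One mapping for every ordered pair of distinct formats."""
    return formats * (formats - 1)


def common_format_mappings(formats: int) -> int:
    """One mapping into and one out of the common format per format."""
    return 2 * formats
```

With ten formats, that is 90 point-to-point mappings against 20 via a common format, which is why the "get everyone to agree" option keeps coming up despite how hard agreement is.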
Styles in Action
As I mentioned, you can realize any of these approaches using a number of tools and techniques. This section briefly examines some of the approaches currently in use.
- Object-Oriented SOA
Object-oriented SOA can be implemented with coarse-grained method calls using any cross-platform invocation mechanism. Common applications of this are the “vanilla” WS-* style of Web services or using a more self-contained, but still cross-platform technology such as CORBA or Michi Henning’s Ice. The litmus test to detect this style being used is the absence of a uniform interface to transmit data between services. I would argue that if even one service in the SOA presented a non-uniform interface, the SOA was using an object-oriented approach. As you may guess, there is a fine line between a “normal” distributed application environment and SOA using this criterion, but the current deciding factor for the industry seems to be whether the transport is HTTP.
- Message Transfer
Savas Parastatidis and Jim Webber coined the name MEST to refer to this particular style. They have based their implementation around the use of SOAP and a single, unified interface exposed by all of their services, ProcessMessage. This particular style is where the line between coarse-grained method and messaging disappears. If you have a unified interface, you still get the benefits of the messaging approach because you can send self-describing messages around. I would imagine that if you looked under the covers of a few JMS implementations, you’d find something similar going on except I suspect they’re using RMI and proprietary interfaces.
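To show the shape of a single unified operation, here is a sketch in which every service exposes only process_message and decides what to do from the message content. It is not Parastatidis and Webber's implementation: JSON stands in for SOAP, and the Service class, the message types and the placeholder price are all assumptions made for this example.

```python
import json
from typing import Callable, Dict


class Service:
    """Every service exposes the same single operation: process_message."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], dict]] = {}

    def on(self, message_type: str, handler: Callable[[dict], dict]) -> None:
        """Register internal logic; callers never see this part."""
        self._handlers[message_type] = handler

    def process_message(self, raw: str) -> str:
        # Behavior is selected by the message content, not by a method name.
        message = json.loads(raw)
        handler = self._handlers.get(message.get("type"))
        if handler is None:
            reply = {"type": "Fault", "reason": "unknown message type"}
        else:
            reply = handler(message)
        return json.dumps(reply)


quotes = Service()
quotes.on(
    "QuoteRequest",
    lambda m: {
        "type": "QuoteResponse",
        "symbol": m["symbol"],
        "price": "101.25",  # placeholder value for illustration
        "currency": "USD",
    },
)

reply = json.loads(
    quotes.process_message(json.dumps({"type": "QuoteRequest", "symbol": "IBM"}))
)
fault = json.loads(quotes.process_message(json.dumps({"type": "Mystery"})))
```

Adding a new message type changes what the service understands, but the interface that clients bind to stays exactly the same.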
I would also say that using this approach with any distributed computing technology to send messages from sender to receiver is consistent with the MEST style. I think most people claiming to have implemented SOA using CORBA have taken a very similar approach–it is really the only way you can manage the complexity.
MEST may be messaging in intent, but it falls down slightly in the loose coupling category. There is no temporal loose coupling because the sender must interact directly with the receiver. It is also difficult to provide spatial loose coupling with this approach because of the nature of the distributed objects foundation. You can do things with redirects, DNS, and interceptors to provide failover and load balancing, but these are not an inherent part of the model. You can also add intermediary services to increase the loose coupling, but it is not clear if this is part of the model either.
Everyone loves Mom. Well, OK, maybe not this MOM (Stacy’s mom? No?). Message-oriented middleware is what most people think of when they think of messaging systems. As previously mentioned, it has the capabilities of providing full loose coupling; however, there are generally issues getting different messaging systems to interoperate. There are several ways to solve this problem, but most of them are proprietary and ultimately limit the flexibility of deploying most messaging systems across different organizations. It is possible to implement a successful SOA based on traditional messaging middleware, but it is important to understand the potential limitations of depending completely on a MOM-based SOA.
Several of the reliable messaging specifications are really intended to bring MOM capabilities to the WS-* space. The ebXML specification also provides a reliable transport, along with Microsoft’s BizTalk Framework. While it is possible to use these as the basis of a SOA, most architects and developers do not think about integration from a messaging point of view. I do have hope that these specifications will provide manageability within SOA environments. Eric Newcomer and Greg Lomow provide good coverage of the relevant WS-* specifications in this area in Understanding SOA with Web Services.
Like everything, there’s both a positive side and a negative side. The negative side of MOM is that it provides only asynchronous message delivery. If you are doing high-performance, real-time lookups or other transactions with humans on the other end of the client, MOM is not the best choice. MOM-based SOAs generally end up being a basis for hybrid synchronous and asynchronous service environments. However, it is important to remember that synchronous does not necessarily mean it isn’t messaging. Remember MEST.
If you have been hiding under a rock for the last couple of years, you may not know that Representational State Transfer (REST) is the name Roy Fielding coined in his 2000 Ph.D. dissertation. The easiest way to describe REST is to think about the World Wide Web. Each Web page you see in your browser is actually just a representation of an abstract resource, like a product catalog. Depending on what kind of application you were using, if you accessed the same resource using the same URI, you might be able to view the catalog on your WAP phone, in your Atom or RSS content aggregator, your Web browser or even download a glitzy PDF version. Each one represents a resource or concept–in this case, a product catalog.
Another thing which REST provides is a uniform interface: HTTP, in particular the operations of GET, POST, PUT and DELETE. Any system implemented using the REST architectural style should support this uniform interface, even though most applications may only make use of GET and POST (e.g. Web browsers). Using this unified interface, you can move content between clients and servers in any format, even though most people think about HTTP in the context of HTML, GIF, JPEG and maybe PDF files.
REST is a pretty powerful architectural style for implementing SOA. The biggest problem with it is that it requires a bit of a shift in how you think about data and services. The important thing is that it will work. Scalability and performance are not really unknown issues in a REST system as we’ve had many years to figure out how to build performant websites. The loose coupling is there, primarily due to the uniform interface, but also because of the various intermediaries for HTTP which already exist like proxy servers and caches. You can’t use them all the time, but they do already exist. If you want to provide the same level of loose coupling available in most MOM implementations, you simply insert a highly-available Web server to mediate between senders and receivers. The senders upload self-describing messages using HTTP to be retrieved (and optionally deleted) at some future point by the intended recipient.
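The mailbox idea can be sketched without a real Web server. The class below is an in-process model of the four uniform operations; it speaks no actual HTTP, and the URI layout, the numeric message identifiers and the returned status codes are assumptions chosen for illustration.

```python
from typing import Dict, Optional, Tuple


class MailboxServer:
    """In-process stand-in for a Web server holding messages in transit."""

    def __init__(self) -> None:
        self._resources: Dict[str, str] = {}
        self._next_id = 1

    def post(self, collection_uri: str, body: str) -> Tuple[int, str]:
        """Create a new message resource under the collection."""
        uri = f"{collection_uri}/{self._next_id}"
        self._next_id += 1
        self._resources[uri] = body
        return 201, uri

    def put(self, uri: str, body: str) -> int:
        """Create or replace the resource at an exact URI."""
        status = 200 if uri in self._resources else 201
        self._resources[uri] = body
        return status

    def get(self, uri: str) -> Tuple[int, Optional[str]]:
        """Return one resource, or a newline-separated listing of a collection."""
        if uri in self._resources:
            return 200, self._resources[uri]
        members = [u for u in self._resources if u.startswith(uri + "/")]
        if members:
            return 200, "\n".join(members)
        return 404, None

    def delete(self, uri: str) -> int:
        if uri in self._resources:
            del self._resources[uri]
            return 204
        return 404


server = MailboxServer()

# The sender uploads a self-describing message and is finished.
status, location = server.post(
    "/mailbox/billing", '{"type": "Invoice", "order_id": "A-1001"}'
)

# At some future point, the recipient looks in the mailbox.
_, listing = server.get("/mailbox/billing")
waiting = listing.split("\n")
_, body = server.get(waiting[0])
delete_status = server.delete(waiting[0])
after_status, _ = server.get(waiting[0])
```

Sender and receiver never talk to each other directly; each one only uses the same handful of operations against an agreed URI.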
The REST architectural style isn’t the solution to every SOA problem, but it does provide an attractive base on which to build SOA services. It leverages technology that most people understand how to implement, manage and tune, and it is also possible to implement services fairly quickly and without large investments in tools and technology. The only main disadvantage (which is getting less every day) is that it does take an adjustment in how to look at problems, because the traditional programming techniques of object-oriented design don’t really fit. Other than that, nearly all of the implicit and explicit requirements for SOA can be implemented using REST.
There was a lot of information here, and I appreciate you making it this far. The problem with these techniques and approaches is that the various tradeoffs don’t really become apparent until you become familiar with each of the alternatives. Most architects and developers don’t have the opportunity to do this in the course of their normal activities. Hopefully, this will help acquaint you with these issues enough to understand the main motivations behind some of the current “religious wars” about SOA and Web services raging in some parts of the blogosphere.
Understand that even though there was a lot of material presented here, this is only a high-level discussion of a few of the issues to be addressed in implementing a successful SOA. I hope to discuss some of these in more detail in future articles (I’ll try not to make them this long, though), but I wanted to provide a common starting point on which to build. If you’re really serious about understanding the issues, spend some time on the various mailing lists, at least read the above referenced books and be wary of anyone who says they have all the answers. No one knows all of the answers yet, and I doubt that they ever will for all situations. This is why it is so critical to understand the tradeoffs Boehm was mentioning at the beginning of this article. It always boils down to the individual judgment of the person making the design and implementation decisions. Good luck.