The internet is chock-full of articles, blog posts, and discussions about RPC and REST. Most are targeted at answering a question about using RPC or REST for a particular application, which in itself is a false dichotomy. The answers that are provided generally leave something to be desired and give me the impression that there are a slew of developers plugging RESTful architectures because they’re told that REST is cool, but without understanding why. Ironically, Roy Fielding took issue with this type of “design by fad” in the dissertation in which he introduced and defined REST:
“Consider how often we see software projects begin with adoption of the latest fad in architectural design, and only later discover whether or not the system requirements call for such an architecture.”
If you want to deeply understand REST and can read only one document, don’t read this post. Stop here and read Fielding’s dissertation. If you don’t have time to read his dissertation or you’re just looking for a high level overview on RPC and REST, read on. To get started, let’s take a look at RPC, REST, and HTTP each in some detail.
Remote Procedure Call (RPC) is way to describe a mechanism that lets you call a procedure in another process and exchange data by message passing. It typically involves generating some method stubs on the client process that makes making the call appear local, but behind the stub is logic to marshall the request and send it to the server process. The server process then unmarshalls the request and invokes the desired method before repeating the process in reverse to get whatever the method returns back to the client. HTTP is sometimes used as the underlying protocol for message passing, but nothing about RPC is inherently bound to HTTP. Remote Method Invocation (RMI) is closely related to RPC, but it takes remote invocation a step further by making it object oriented and providing the capability to keep references to remote objects and invoke their methods.
Representational State Transfer (REST) is an architectural style that imposes a particular set of constraints on an interface to achieve a set of goals. REST enforces a client/server model, where the client is interested in gaining information and acting on a set of resources that are managed by the server. The server tells the client about resources by providing a representation of one or more resources at a time and providing actions that can either get a new representation of resources or manipulate resources. All communication between the client and the server must be stateless and cachable. Implementations of a REST architecture are said to be RESTful.
Hypertext Transfer Protocol (HTTP) is a RESTful protocol for exposing resources across distributed systems. In HTTP the server tells clients about resources by providing a representation about those resources in the body of an HTTP request. If the body is HTML, legal subsequent actions are given in A tags that let the client either get new representations via additional GET requests, or act on resources via POST/PUT or DELETE requests.
Given those definitions, there are a few important observations to make:
- It doesn’t make sense to talk about RPC vs REST. In fact you can implement a RESTful service on top of any RPC implementation by creating methods that conform to the constraints of REST. You can even create an HTTP style REST implementation on top of an RPC implementation by creating methods for GET, POST, PUT, DELETE that take in some metadata that mirrors HTTP headers and return a string that mirrors the body of an HTTP request.
- HTTP doesn’t map 1:1 to REST, it’s an implementation of REST. REST is a set of constraints, but it doesn’t include aspects of the HTTP specific implementation. For example your service could implement a RESTful interface that exposes methods other than the ones that HTTP exposes and still be RESTful.
What people are really asking when they ask whether they should use RPC or REST is: “Should I make my service RESTful by exposing my resources via vanilla HTTP, or should I build on a higher level abstraction like SOAP or XML-RPC to expose resources in a more customized way?”. To answer that question, let’s start by exploring some of the benefits of REST and HTTP. Note that there are separate arguments for why you would want to make a service RESTful and why you would want to use vanilla HTTP, although all the arguments for the former apply to the latter (but not vice versa).
The beauty of REST is that the set of legal actions from any state is always controlled by the server. The contract with the client is very minimal; in the case of HTTP’s implementation of REST the contract is a single top level URI that I can send a GET request to. From that point on the set of legal actions is given to the client by the server at run time. This is in direct contrast to typical RPC implementations where the set of legal actions are much more rigid; as defined by the procedures that are consumed by the client and explicitly called out at build time. To illustrate the point, consider the difference between finding a particular product by navigating to a webpage and following a set of links (for things like product category) that you’re given until you find what you’re looking for, versus calling a procedure to get a product by name. In the first case changes to the API can be made on the server only, but in the second case a coordinated deployment to the client and server is required. The obvious issue with this example is that computers aren’t naturally good at consuming API’s in a dynamic fashion, so it’s difficult (but possible in most cases) to get the benefit of server controlled actions if you’re building a service to be consumed by other services.
One main advantage of using vanilla HTTP to expose resources is that you can take advantage of a whole bunch of technologies that are built for HTTP without any additional effort. To consider one specific example, let’s look at caching. Because the inputs of an HTTP response are well defined (querystring, HTTP headers, POST data) I can stand up a Squid server in front of my service and it will work out of the box. In RPC even if I’m using HTTP for message passing under the hood, there isn’t any guarantee that messages map 1:1 with procedure calls, since a single call may pass multiple messages depending on implementation. Also, the inputs to RPC procedures may or may not be passed in a standard way that makes it possible to map requests to cached responses, and requests may or may not include the appropriate metadata for to know how to handle caching (like HTTP does). It’s nice to not have to reinvent the wheel, and caching is just one example of an area where HTTP gives us the ability to stand on the shoulders of others.
Another advantage to HTTP that I’ll just touch on briefly is that it exposes resources in a very generic way via GET, POST/PUT, and DELETE operations which have stood the test of time. In practice when you roll your own ways to expose resources you’re likely to expose things in a way that is too narrow for future use cases, which results in an explosion in the number of methods, interfaces, or services that end up becoming a part of your SOA. This point really warrants a blog post of it’s own, so I’ll avoid going deeper in this particular post.
There is a lot of additional depth that I could go into, but in the interest of keeping this post reasonably brief I’ll stop here. My hope is that at least some folks find the material here to be helpful when deciding how to best expose resources via a service! If I’ve missed anything or if you feel that any of the details are inaccurate, please feel free to comment and I’ll do my best to update the material accordingly.