A data communication standard - OData vs. GData vs. homebrew

Today I discovered OData. Then I discovered GData. Then I discovered that I don't like either one.

The idea of a data standardization protocol is appealing on the surface, but the two "official" implementations of a common web protocol for data consumption are not my cup of tea. I do like tea, however. But I'm picky about my tea and I'm picky about how I communicate with web servers.

First off, I don't like complicated protocols. I find them annoying, as do most other programmers. This is one of the reasons why SOAP slipped on itself and fell on its derriere at the starting gate. /hahagreatjoke

I've been working with JSON a lot lately. It benefits greatly from having native support in every popular scripting language and easy-to-integrate libraries in the few languages that don't. So, if we want to develop a standard protocol, JSON is the starting and ending point for modern development.

For some strange, bizarre reason, when a company builds a "standardization protocol", they also waste a lot of time building into said protocol a "discovery service". Let's face it: The concept of a "discovery service" is just broken. It is a LOT easier to skip that mess and just apply a healthy amount of documentation, which works equally well. "Works equally well" (aka Laziness) trumps innovation.

One of the things I do is build a success vs. failure indicator into my JSON replies:

{
  "success" : false,
  "error" : "The API key is invalid."
}

Or:

{
  "success" : true,
  "results" : [ ... ]
}

This makes it easy to test, from any programming language, whether or not a specific reply was successful without having to check for the existence of an error message.
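Here is a minimal sketch of what that check might look like on the client side, in Python. The names `parse_reply` and `ApiError` are made up for illustration; only the "success", "error", and "results" keys come from the replies above.

```python
import json


class ApiError(Exception):
    """Raised when the server reports "success": false (hypothetical name)."""


def parse_reply(raw):
    """Decode a JSON reply and return its results, or raise ApiError."""
    reply = json.loads(raw)
    # Compare against True explicitly so a missing or malformed flag counts as failure.
    if reply.get("success") is not True:
        raise ApiError(reply.get("error", "Unknown error."))
    return reply.get("results", [])
```

The caller never has to guess whether an "error" key happens to be present; one boolean decides which branch to take.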

My JSON isn't very complex, but I imagine that if I were pulling a ton of data down, I'd break it up with pagination. However, OData's "__next" implementation is not what I'd do. I'd return the LIMIT-ed set of result IDs and offer a mechanism to select 'x' results, based on a subset of those IDs, with another query. A server has no business dictating to a client how it should paginate results, and the server would benefit from performance gains. After all, most pagination results in running the same expensive query multiple times, whereas selecting a bunch of rows by ID can normally use an index.
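To make the ID-based idea concrete, here is a rough sketch in Python using an in-memory SQLite database. Everything in it is an assumption for illustration: the `items` table, its columns, and the function names `find_ids` and `fetch_by_ids` are not part of any real API of mine.

```python
import sqlite3


def find_ids(db, min_price, limit=1000):
    """Run the (potentially expensive) filter once and return only matching IDs."""
    rows = db.execute(
        "SELECT id FROM items WHERE price >= ? ORDER BY id LIMIT ?",
        (min_price, limit),
    )
    return [row[0] for row in rows]


def fetch_by_ids(db, ids):
    """Fetch full rows for a client-chosen subset of IDs via the primary key."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT id, name, price FROM items WHERE id IN ({placeholders}) ORDER BY id",
        list(ids),
    )
    return rows.fetchall()


def make_demo_db():
    """Build a throwaway table with 20 rows (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    db.executemany(
        "INSERT INTO items (id, name, price) VALUES (?, ?, ?)",
        [(i, f"item{i}", float(i)) for i in range(1, 21)],
    )
    return db


db = make_demo_db()
ids = find_ids(db, min_price=11)   # one filtered query, IDs only
page_size = 4                      # the client picks its own page size
first_page = fetch_by_ids(db, ids[:page_size])
```

The filter runs once, and every "page" after that is a plain primary-key lookup on whatever slice of `ids` the client wants.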

Another thing I do is version many of the APIs that I write. "RESTful" interfaces are overrated and are kind of annoying when a query string works just as well, perhaps better. So I just stick the version of the client API into my query string. I generally do version breakage - if I upgrade the version of the server, I immediately break all clients using the older version. If I were a large organization like Google, I might keep old API versions around for a few months. However, I'm just one programmer, so trying to support multiple versions of an API would be a waste of my time. I want you on the latest and greatest version, not some old version.

But I did just come up with a good idea: Add an "api_expires" field to every response from the server. Set it to "never" initially. When you want to deploy a new version of an API or terminate an obsolete version, change it to some date/time in the near future. This gives application developers fair warning and lets them programmatically do something about it (e.g. send an e-mail to themselves).
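A quick sketch of how a client might handle both pieces, in Python. The query string parameter names ("ver", "action"), the 30-day warning window, and the ISO 8601 timestamp format are all assumptions on my part; the only thing settled above is the "api_expires" field and its "never" value.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_url(base, action, version):
    """Put the client API version in the query string (parameter names are assumed)."""
    return base + "?" + urlencode({"ver": version, "action": action})


def expiry_warning(raw, now, warn_within=timedelta(days=30)):
    """Return a warning string if the API version expires soon, else None."""
    reply = json.loads(raw)
    expires = reply.get("api_expires", "never")
    if expires == "never":
        return None
    # Assumption: the server sends an ISO 8601 timestamp with a UTC offset.
    deadline = datetime.fromisoformat(expires)
    if deadline - now <= warn_within:
        return f"API version expires on {deadline.isoformat()}"
    return None


# Example call: expiry_warning(raw_reply, datetime.now(timezone.utc))
```

Whatever the client does with the warning (log it, e-mail it, nag the developer) is up to the application.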

The last thing I do is encrypt some of my JSON requests and replies using shared secrets. This allows for more secure communication between client and server even over plain-ol' HTTP. I'll save this discussion for a separate post.

So there you have it - my homebrew protocol. Simple is better.

Cubic
