On Exposing Internal Identifiers on Services

September 12th, 2006

Pretty much every application uses some sort of identifier to keep track of things, and services are no different most of the time. Sometimes those IDs need to be externalized across service boundaries.

One key point to keep in mind, though, is that if you do decide to externalize those IDs, consumers should regard them as opaque entities, and you'll want to clearly specify what those IDs mean with respect to your service operations: What do they identify? What can they be used for? Should they only be regarded as informational data with no strings attached? And so on.

However, a different thing altogether can come back and bite you if you're not careful: data types. What is the data type of the identifiers you're externalizing through your services? Are you externalizing the identifier's data type as well? For example, if you have a GUID as your ID, do you expose it from your service typed as a GUID?

If you expose identifiers in such a way that they may be used in both directions (i.e. in requests to your services and in responses from them), then it can be quite convenient to expose the data type as well, since many tools (including ASP.NET Web Services and WCF) can do the type validation work and reject messages with invalid data. Sometimes, however, it might be preferable to make identifiers more opaque by exposing them as a less-constrained data type, such as a string, that leaves you some flexibility and room for extension in the future, even if that's not the real underlying data type of the identifier. We ran into such a case last week when working on extending a service we had built for one of our clients earlier this year.

The scenario we had was like this: The original service we built received requests to perform some business-level operations. As part of those requests, the consumer would send along a request ID which was completely opaque to our service (it was, in fact, a string), as we didn't want to constrain what kind of ID generation mechanism the consumer used. This was pretty important because the service was built with interoperability in mind and it would have multiple consumers implemented using different technologies.

As part of the response, our service would reply with a new identifier, this time one controlled by the service itself, that basically said "hey, this request you sent me, I'm going to keep track of it using this new identifier of my own" (we called this a ConfirmationID). With both the consumer and service IDs in place, we had a pretty good base for reconciling the set of transactions done by a given consumer over a period of time (again, something pretty important for anything that deals with money one way or another). All of this actually worked very nicely.

A couple of weeks ago we started to build new extensions and a front-end intermediary for the service so that we could support a new consumer that came with a new and important business deal our client was working on. From a high-level view, everything was fine and dandy. While the protocols were different, the basic semantics of all the operations involved were pretty much covered by our existing service without too much fuss.

One sticky problem, however, was that the ConfirmationIDs our service gave out had been "strongly-typed" on the service contract (WSDL) as 32-bit integers, which they were. In fact, they were generated by an identity column in the database where we recorded transactions! In hindsight this was less than optimal. Even though consumers could still treat our ConfirmationIDs as opaque references, they could now do to us what we had not done to them with their own identifiers: constrain us. And that's exactly what happened. The new consumer that was brought in had a legacy protocol that restricted confirmation IDs to just 6 characters, using only [0-9A-Z]. Now we had to find a way to connect those two together.

The optimal solution would've been to just pump out IDs from our service in a format compliant with the restriction the new client had in place (because of the way those IDs are used). However, because the service contract specified the ConfirmationID data type as a 32-bit integer value, this would not have been possible without breaking existing clients (a very big deal). Hindsight is always 20/20, or so they say, and it applied to our dilemma just right: had we hidden our real ID data type, this would not have been a problem, as we could've fixed it with a simple internal change that left the service interface untouched.
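To make that point concrete, here is a rough sketch (in Python, not the .NET code of the actual service; `ConfirmationService`, `issue`, `lookup`, and the two formatter functions are hypothetical names invented for illustration) of a contract that only ever promises a string. Since consumers never see the integer, how the ID is rendered stays an internal detail that can change without touching the interface:

```python
from itertools import count
from typing import Callable, Dict


def decimal_format(internal_id: int) -> str:
    """One internal choice: the identity value rendered as decimal text."""
    return str(internal_id)


def prefixed_format(internal_id: int) -> str:
    """A different internal choice; the contract type (string) is unchanged."""
    return f"C{internal_id:08d}"


class ConfirmationService:
    """Hypothetical service whose contract exposes confirmation IDs as strings."""

    def __init__(self, formatter: Callable[[int], str] = decimal_format) -> None:
        self._formatter = formatter
        self._next_id = count(1)  # stands in for a database identity column
        self._by_confirmation: Dict[str, str] = {}

    def issue(self, request_id: str) -> str:
        """Record the consumer's request ID and return an opaque confirmation ID."""
        confirmation_id = self._formatter(next(self._next_id))
        self._by_confirmation[confirmation_id] = request_id
        return confirmation_id

    def lookup(self, confirmation_id: str) -> str:
        """Return the request ID associated with a confirmation ID."""
        return self._by_confirmation[confirmation_id]
```

A consumer that only stores and echoes back the returned string works the same with either formatter, which is the flexibility a 32-bit integer in the WSDL took away.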

Fortunately, we were able to work around this without actually having to affect the service itself, but it was still a bit more painful than I would've liked. Basically, we took advantage of the fact that a six-character string over the [0-9A-Z] alphabet can encode 36^6 = 2,176,782,336 distinct values, which is just enough to cover the 2^31 = 2,147,483,648 non-negative values of a regular 32-bit signed integer. We were then able to generate a 6-character string by simply encoding our original ConfirmationID in base 36 (which is fully reversible and easy to calculate).
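As a minimal sketch of that encoding (written in Python rather than the .NET code the intermediary actually used; the function names `encode_confirmation_id` and `decode_confirmation_id` are invented for illustration, and zero-padding to a fixed width of six is an assumption), the conversion could look something like this:

```python
import string

# Alphabet allowed by the legacy protocol: digits followed by uppercase letters.
ALPHABET = string.digits + string.ascii_uppercase
BASE = len(ALPHABET)   # 36
WIDTH = 6              # fixed length required by the legacy protocol
MAX_INT32 = 2**31 - 1  # largest value of a signed 32-bit integer


def encode_confirmation_id(value: int) -> str:
    """Encode a non-negative 32-bit integer as a 6-character base-36 string."""
    if not 0 <= value <= MAX_INT32:
        raise ValueError("value must be in the range 0..2**31 - 1")
    chars = []
    for _ in range(WIDTH):
        value, digit = divmod(value, BASE)
        chars.append(ALPHABET[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_confirmation_id(text: str) -> int:
    """Reverse encode_confirmation_id; rejects anything but 6 chars of [0-9A-Z]."""
    if len(text) != WIDTH or any(c not in ALPHABET for c in text):
        raise ValueError("expected exactly 6 characters from [0-9A-Z]")
    value = 0
    for c in text:
        value = value * BASE + ALPHABET.index(c)
    if value > MAX_INT32:
        raise ValueError("decoded value does not fit in a signed 32-bit integer")
    return value
```

For example, `encode_confirmation_id(2**31 - 1)` yields `"ZIK0ZJ"`, and `decode_confirmation_id("ZIK0ZJ")` returns the original integer. The range check in the decoder matters because a few six-character strings (those above `ZIK0ZJ`) name values that no 32-bit ConfirmationID can have.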

Technorati: Architecture, Web Services, Services, Service Orientation

Tomas Restrepo

Software developer located in Colombia.
