03 October 2014

The Immutable Truth

The good folks over at simple-talk did publish my musing on the benefits of an immutable RDBMS. They ain't none. But the ensuing comment stream leads me to conclude that a couple of aspects remain cloudy. Thus, since the comments are already as long as the piece and this endeavor is the source of all that is my paean to Codd, I'll take a few paragraphs to discuss.


Accounting

The patient zero of immutable databases is the "accountants don't use erasers" meme. The notion is that immutability of accounting data, keep writing transactions serially, is what makes accounting accurate. If only RDBMS engines did just that, all of their data would be accurate. Baloney. In the first place, accounting is anything but accurate as a matter of structure. "I have just one word for you Ben: Enron (or Lehman or AIG or Madoff or ...)." I recall reading, during one of the many corporate scandals a few years back, someone asserting that a CFO who couldn't hide $10 million in ten minutes shouldn't have the job. And, of course, the Lehman Repo 105 gag; is financial engineering at its finest. That sort of thing.

What makes accounting accurate, to the extent that it is done as prescribed, is double entry bookkeeping, invented (from today's perspective) by the British (or Dutch?) East India Company. Immutability has nothing to do with it. This is the source of balance, which is the notion of "accuracy" that occurs. The simple requirement of each transaction to balance (debits must be matched with credits, netting to 0) is the basis for accuracy. The accounting books/ledgers thus add up "correctly", but offer no guarantee that the business is actually as described by these transactions. The books will be accurate, which is to say internally consistent, but can hide all manner of criminality. The same is true (well, without the felon bit) of RDBMS. Both operate with a closed world assumption; neither can claim that their data is congruent to the real world per se.


MVCC and immutability

MVCC can be seen as immutability lite. What isn't necessarily clear is that, due to the implementation of such optimistic locking, normalcy is more significant with MVCC databases. With an Organic Normal Form™ schema and MVCC engine, collisions boil down to: "last writer of non-key data wins", and since there won't be intra-row (or intra-table) dependencies, writers just keep on truckin'. That is a good thing. Traditional locking assumes that collisions must be avoided, so row level locking blocks writes. However, with less than ONF in an MVCC database, collisions become more likely and have to be "fixed up" after the fact, either with client-side code or by tossing the transaction back to the user's screen. As the Fram filter advert guy used to say, "you can pay me now, or pay me later". Later being the more expensive opportunity, of course.

No comments: