12 February 2013

I See the World in Black and White

Once again, thanks to "Revolutionary" Dave for finding something interesting in the world of quant. The post-mortem is too long for the average trip to the loo, but the appendix on Value at Risk (VaR), the metric being modeled, can be read on its own. VaR is just a number of dollars assumed to be lost under varying circumstances (wiki piece). Dave's take on the Excel/R conflict is spot on.
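For the concrete-minded, a minimal sketch of the idea in Python, assuming the textbook historical-simulation flavor at 99% confidence and fabricated P&L (the bank's actual model was nothing so tidy):

    import numpy as np

    # One-day 99% VaR by historical simulation: the dollar loss exceeded
    # on only 1% of the observed trading days. Data here are fabricated.
    def historical_var(daily_pnl, confidence=0.99):
        losses = -np.asarray(daily_pnl)      # express losses as positive dollars
        return np.percentile(losses, confidence * 100)

    rng = np.random.default_rng(0)
    pnl = rng.normal(loc=0.0, scale=1_000_000, size=252)   # one fake trading year
    print(f"99% one-day VaR: ${historical_var(pnl):,.0f}")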

The trader who had instructed the modeler to develop the new VaR model (and to whom the modeler reported at the time), CIO Market Risk, and the modeler himself also believed that the Basel I model was too conservative -- that is, that it was producing a higher VaR than was appropriate.
This is the heart of the continuing quant problem: unlike natural/mechanical processes, human processes (accounting and finance especially, since they only exist as games with arbitrary rules) are subject to "unexplained" perturbations. This is what Black Swans are. Black Swans exist in nature, the archetype being the Alvarez/Yucatan asteroid. The difference is that such natural Black Swans are exogenous to the process, while human Black Swans are largely endogenous; they happen because participants can and do change the rules for self-benefit. Such manipulation is generally impossible to model.
The Model Review Group performed only limited back-testing of the model, comparing the VaR under the new model computed using historical data to the daily profit-and-loss over a subset of trading days during a two-month period. The modeler informed the Model Review Group that CIO lacked the data necessary for more extensive back-testing of the model (running the comparison required position data for the 264 previous trading days, meaning that a back-test for September 2011 would require position data from September 2010). Neither the Model Review Group nor CIO Market Risk expressed concerns about the lack of more extensive historical position data.
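The mechanics described there are easy enough to sketch. A rough cut, again using historical-simulation VaR as a stand-in for the model's output and fabricated P&L in place of the 264 days of position data the report says the exercise required:

    import numpy as np

    # Rolling back-test: for each day, compute VaR from the prior 264
    # days of P&L, then count the days the realized loss exceeded it.
    def backtest(daily_pnl, window=264, confidence=0.99):
        pnl = np.asarray(daily_pnl)
        breaches = 0
        for t in range(window, len(pnl)):
            var_t = np.percentile(-pnl[t - window:t], confidence * 100)
            if -pnl[t] > var_t:
                breaches += 1
        return breaches, len(pnl) - window

    rng = np.random.default_rng(1)
    pnl = rng.normal(0.0, 1_000_000, size=528)     # two fake trading years
    print("breaches/days tested: %d/%d" % backtest(pnl))

At 99% confidence, a well-calibrated model should breach on roughly one day in a hundred, so a two-month subset (forty-odd trading days) can hardly distinguish a good model from a bad one.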

Again, using more historical data may only have been useful for face-validity purposes. As always, the fiddlers fiddled:
There is some evidence the Model Review Group accelerated its review as a result of this pressure, and in so doing it may have been more willing to overlook the operational flaws apparent during the approval process.

While Excel is really not appropriate to any activity beyond vanilla accounting, the failure here (as with The Great Recession) was the result of Rand-ian self-interest. The real Adam Smith posited an economy in which each actor is not only autonomous, but also bereft of market power to impose collateral damage. Finance is all about collateral damage; that's how they earn it.

Here, it gets really interesting:
On January 26, the Model Review Group discovered that, for purposes of a pricing step used in the VaR calculation, CIO was using something called the "West End" analytic suite rather than Numerix, an approved vendor model that the Model Review Group had thought was being used. The Model Review Group had never reviewed or approved West End, which (like Numerix) had been developed by the modeler.

Turns out that Numerix is built on Excel (at least in part): "All Numerix functions are available via Microsoft ® Excel® as an add-in".

If you've ever inherited an analysis application executed in 1-2-3/Excel macros, you have my condolences; been there, done that.

In sum: the use of Excel is problematic, even when a major analytics vendor also builds on it; but the cause, as with The Great Recession, was corruption, and in this case not only by the Suits. What's especially galling is that these quants are often some sort of math-y Ph.D., yet use Excel in haphazard ways. The poor will always be with us.

05 February 2013

The Eyes Have It

Boy Howdy! Web adverts are just like TV ads:
One of the most important things that we need to do in the market is help educate people that the click is not really the most important metric for us. In fact now that we've been able to work with companies to look at in-store sales data, we find that of the people who saw a Facebook ad and then purchased the product in the store, 99% of them never clicked on an ad, so re-educating the market what the metrics are that are right for us.

That's from Sheryl Sandberg, COO of Facebook (you can get the transcript from Seeking Alpha). One might conclude that this was a bit of magician's misdirection (don't look at the hat, look at the pretty girl), or it might just be true. The efficacy of adverts has been studied pretty much forever. Ph.D. dissertations have been written.
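The arithmetic behind Sandberg's 99% figure is worth spelling out, with made-up counts (the names and numbers below are hypothetical, not Facebook's):

    # Hypothetical attribution counts: of in-store purchasers who were
    # shown the ad, how many ever clicked it?
    exposed_purchasers = 100_000    # saw the ad, later bought in store
    clicking_purchasers = 1_000     # saw the ad, clicked, later bought

    never_clicked = 1 - clicking_purchasers / exposed_purchasers
    print(f"{never_clicked:.0%} of exposed purchasers never clicked")
    # Judged by clicks alone, the ad looks worthless; judged by the
    # purchases it preceded, it doesn't.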

Yahoo! has this to say:
For example, in their important paper on adfx, Abraham et al. (1990) open with the line, "Until recently, believing in the effectiveness of advertising and promotion was largely a matter of faith" -- a first sentence that might otherwise seem a bit peculiar given that before they penned it, approximately 4 trillion dollars had been spent on advertising.

If it's true that clicks really don't measure effectiveness, then what? Could the whole advert-driven web go poof?

As postulated here more than once: an economy based on adverts is inherently unstable. Adverts are a useful adjunct to production, but what happens when adverts become what the economy produces? Real life becomes a version of Second Life (you may remember it). "I'm sorry Dave. I'm afraid I can't do that."

04 February 2013

The Light Dawns

Whilst looking for a web version of Date's puzzling "dropping ACID" muse (I don't feel like typing it all in), I came across something at least as interesting. As the subtitle says, "all things xml". That was penned before NoSql had reared its ugly head, when xml "document stores" were all the rage.

On occasion, I've railed against the NoSql saga. If data doesn't really matter, then any file datastore will do. If it does matter, then you either use an existing TPM (CICS is still around), use an RDBMS (TPM built in), or roll your own. Since TPMs have been around since before CICS, and there are decades of experience figuring out how to do them, a few kiddie koders with php ain't likely to get there any time soon.
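To make the point concrete, a minimal sketch, using Python's built-in sqlite3, of the one guarantee any "roll your own" store has to reinvent first: atomicity. Either the whole transaction lands or none of it does, which no bare file datastore gives you for free.

    import sqlite3

    # Either both rows change, or neither does.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    con.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
    con.commit()

    try:
        with con:  # one atomic transaction: commits on success, rolls back on error
            con.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
            con.execute("UPDATE account SET balance = balance + 150 WHERE id = 2")
            overdrawn = con.execute(
                "SELECT balance < 0 FROM account WHERE id = 1").fetchone()[0]
            if overdrawn:
                raise ValueError("insufficient funds")
    except ValueError:
        pass  # the whole transfer was rolled back, not half of it

    print(con.execute("SELECT id, balance FROM account").fetchall())
    # -> [(1, 100), (2, 0)]  (unchanged: atomicity preserved)

Durability, isolation, and the rest come along in the same package; that's the decades of experience talking.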
But recently there's been a dawning recognition among NoSQL practitioners and those working in Big Data that the fast-iterating data they process needs to be demonstrably reliable, too. The result has been NoSQL databases adding more relational functionality to their software.

"D'oh!" quoth Homer.

It gets better:
"We think one of the reasons there are so many NoSQL databases on the market is because unless your application perfectly matches their data model, it becomes difficult to build data abstractions."

Well, of course. The NoSql approach is tied to file defs and implementation language. This is the Achilles' heel of API-centric coding. The RDBMS/SQL is client-language agnostic and, so long as one sticks to reasonably standard SQL, engine agnostic as well. For those eager to port across engines, there's the likes of SwisSQL.
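A small illustration of the agnosticism point, using Python and sqlite3 only as a convenient host (the schema and data are invented): the query below is plain standard SQL, and the identical string could be handed to DB2, Postgres, or anything else that speaks the standard, from any client language.

    import sqlite3

    # The data abstraction lives in the schema, not in one
    # application's code; the SQL string is engine-neutral.
    QUERY = """
        SELECT dept, COUNT(*) AS headcount, AVG(salary) AS avg_salary
        FROM employee
        GROUP BY dept
        ORDER BY dept
    """

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
    con.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                    [("a", "quant", 200), ("b", "quant", 180), ("c", "ops", 90)])
    for row in con.execute(QUERY):
        print(row)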

"A distributed data store without concurrency control is a toy and makes building things on it a lot harder."

What is left unsaid, even by Rosenthal, who is, by his own admission, just a coder, is that industrial-strength RDBMS (DB2, of course) have had distributed transactions for decades. Gray and Reuter devote much of chapter 16 to distributed CICS, for crying out loud. Nothing as much fun as inventing the wheel. Again. And worser (isn't that a city in Poland?).
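For the curious, the shape of the thing being reinvented: a minimal sketch of two-phase commit, the distributed-transaction protocol at the center of that Gray and Reuter material. The coordinator and participants here are hypothetical in-memory stand-ins, not any real engine's API.

    # Two-phase commit: all participant stores commit, or none do.
    class Participant:
        def __init__(self, name, will_vote_yes=True):
            self.name, self.will_vote_yes = name, will_vote_yes
            self.committed = False

        def prepare(self):            # phase 1: promise to commit if asked
            return self.will_vote_yes

        def commit(self):             # phase 2a: make it permanent
            self.committed = True

        def rollback(self):           # phase 2b: undo everywhere
            self.committed = False

    def two_phase_commit(participants):
        if all(p.prepare() for p in participants):   # phase 1: voting
            for p in participants:                   # phase 2: commit everywhere
                p.commit()
            return True
        for p in participants:                       # any "no" aborts everywhere
            p.rollback()
        return False

    nodes = [Participant("ny"), Participant("london", will_vote_yes=False)]
    print(two_phase_commit(nodes))                   # -> False, nothing committed

Getting that right across real networks, crashes, and recovery logs is the hard part, and it is exactly the part that was solved decades ago.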