17 March 2010

NoSql <<- No Brainer

The NoSql squad is on the loose again, so it is time once more to slay that particular dragon. Andy Oram, of O'Reilly fame, writes about a recent NoSql conference (his piece was written prior to it). The usual suspects are listed.

The hallmark (or carbuncle) of the NoSql "databases" is that they have no structured access, no transactional support, and no data reduction. You get all those, and more, with relational (SQL, mostly) databases. What these NoSql (Oram's list: Cassandra, CouchDB, HBase, HypergraphDB, Hypertable, Memcached, MongoDB, Neo4j, Riak, SimpleDB, Voldemort) datastores do is just what files did with your grandfather's COBOL code: store data in an application-specific format. You can browse through them at your leisure.

What they have in common is the use of key/value pairs for data. The proponents fail to comprehend that their "data" is just what an index is in a relational database. And it's nothing new. Back in the 1960s, random access was supported by the "fully inverted file" paradigm. Here's what Joe Celko has to say ("Joe Celko's Data and Databases"): "In an inverted file, every column in a table has an index on it. It is called an inverted file structure because columns become files." Nothing new here, please move on. Gad, these young-uns persist in re-inventing square wheels.
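For the skeptics, here is a minimal sketch of the point, in Python. It is only an illustration under my own assumptions: the table, the column names, and the `invert` helper are all made up for the example and come from no particular product. It builds one index per column, per Celko's description, and each of those indexes turns out to be nothing more than a key/value lookup.

```python
from collections import defaultdict

# Hypothetical table: row id -> record. Names and data are invented.
rows = {
    1: {"name": "Codd", "city": "Portland"},
    2: {"name": "Celko", "city": "Austin"},
    3: {"name": "Date", "city": "Austin"},
}

def invert(table):
    """Build one index per column: column -> {value -> set of row ids}."""
    indexes = defaultdict(lambda: defaultdict(set))
    for row_id, record in table.items():
        for column, value in record.items():
            indexes[column][value].add(row_id)
    return indexes

indexes = invert(rows)

# Each per-column index is a plain key/value lookup: value in, row ids out.
austin_ids = sorted(indexes["city"]["Austin"])
austin_names = [rows[i]["name"] for i in austin_ids]
print(austin_ids, austin_names)
```

The difference is who maintains the thing. In a relational engine the index is kept in step with the table for you; in the sketch, as in a key/value store, that job falls to the application code.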

5 comments:

Roboprog said...

Found a somewhat interesting link on a (the?) Cassandra promotion site via "hacker news" today:
https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/
(relevant to me since they switched from PostgreSQL)

Related to what you have been promoting, this seemed to be a particularly relevant quote:

>> ... we faced issues of needing a much bigger machine and faster rotational disks ...

i.e. - if your "disks" don't rotate, would that be fast enough?

It's interesting to note that they have some SQL systems still, but they are MySQL. This suggests to me that this is their default "database", but they were forced to use PostgreSQL since MySQL does not scale well on multiprocessor machines.

About 10 years ago, when I was "shopping" for an open source database, I compared MySQL to PostgreSQL. MySQL was really scary (how about "ROLLBACK is not implemented", for starters) at the time. Most of the "ACID" stuff was left out then (around 2001). The MySQL team gradually added ACID support, and other extensions that PostgreSQL already had, but they pretty much lost all credibility in my eyes from then on. I had already done my share of dBASE / XBase apps in the late 80s. Slapping SQL on top of pretty much the same kind of technology is just making things worse, in my mind.

Along comes the Cassandra project: Oracle buys ... the MySQL support team, and many of them leave. They start a child project to strip down the current version of MySQL even more.

Why do I want this???

OK, so it's fun to preach to the choir. Any thoughts on this (particular link and/or Cassandra)?

Roboprog said...

Some empirical fuel for the fire:

http://www.yafla.com/dforbes/The_Impact_of_SSDs_on_Database_Performance_and_the_Performance_Paradox_of_Data_Explodification/

This guy puts together a test of one of these "NoSQL" claims. The target site claimed big improvements when they switched from MySQL (why is it always MySQL???). So this guy rigs up a simulation using humble SQL-Server (not my cup of tea, but that's the point -- it's pretty run of the mill), and stomps on their stats.

Then, he sets up an SSD and blows off the doors. Check it out for the numbers and methodology.

Robert Young said...

I'll spend some time tonight with Cassandra and the test. Been a bit busy trading.

Robert Young said...

Well, Cassandra looks a lot like dBaseII. That's not a compliment. Once again, moving data control away from the data and back to the siloed application. Very Carnaby Street.

What's so infuriating about such zealots is that they come, by and large, from the OO world. And the prime directive of the OO world is that method and data are co-located, an idea which Dr. Codd was the first to define.

Robert Young said...

As to Dennis Forbes, I've added his site to Good Stuff. While he says he is not a DB guy, he shows more insight than I've usually come across in a member of the Younger Generation. Recommended.