03 October 2012

The Best Laid Plans

For those interested in how real-world databases and operating systems get along (or not) in today's post-single-CPU, sorta-parallel regime, there is an instructive piece on lwn.net. I had originally inserted the link, but it's not that simple. The link I used was supplied by a post on the Postgres/Performance email group. For better or worse, lwn.net is by subscription, but it appears that subscribers are free to link to content. So, go subscribe to the Performance email group (you should anyway), and you'll get the link in today's posts. It has "20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets" in its title.

The discussion in the post revolves (that's a pun) around the inherent conflict in scheduling between a (multi-client) operating system and a (multi-client) database engine. Although the PG folks flagged it, any engine that attempts its own scheduling would have been caught, to some degree, by the kernel patch. Who said parallel was easy?

Here's a quote which describes the issue:
So what is different about PostgreSQL that caused it to slow down in response to this change? It seems to come down to the design of the PostgreSQL server and the fact that it does a certain amount of its own scheduling with user-space spinlocks. Carrying its own spinlock implementation does evidently yield performance benefits for the PostgreSQL project, but it also makes the system more susceptible to problems resulting from scheduler changes in the underlying system. In this case, restricting the set of CPUs on which a newly-woken process can run increases the chance that it will end up preempting another PostgreSQL process. If the new process needs a lock held by the preempted process, it will end up waiting until the preempted processes manages to run again, slowing things down. Possibly even worse is that preempting the PostgreSQL dispatcher process -- also more likely with Mike's patch -- can slow the flow of tasks to all PostgreSQL worker processes; that, too, will hurt performance.
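
For anyone who hasn't looked at one lately, here is a minimal sketch of a user-space spinlock in C, assuming C11 atomics and POSIX threads. It is not PostgreSQL's implementation; the names demo_spinlock, demo_lock, demo_unlock, worker, and run_spin_demo are all made up for illustration. The point is only to show why a preempted lock holder hurts: a waiter never sleeps, it just burns CPU until the holder gets to run again.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 16

/* Hypothetical test-and-set spinlock; not PostgreSQL's code. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

static demo_spinlock lock = { ATOMIC_FLAG_INIT };
static long counter = 0;            /* shared state protected by the lock */

static void demo_lock(demo_spinlock *l)
{
    /* Spin until the flag was previously clear. The kernel does not know
     * this thread is waiting, so it keeps being scheduled while it spins. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void demo_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void *worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        demo_lock(&lock);
        counter++;                  /* critical section */
        demo_unlock(&lock);
    }
    return NULL;
}

/* Starts nthreads workers, each incrementing the counter iters times.
 * Returns the final counter value, or -1 for an invalid thread count. */
long run_spin_demo(int nthreads, long iters)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    counter = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &iters) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return counter;
}
```

If all the threads start, calling run_spin_demo(4, 10000) should return 40000, since every increment happens under the lock. If the scheduler preempts a thread between demo_lock and demo_unlock, every other worker that reaches demo_lock spins for its whole time slice and gets nothing done, which is the effect the quote describes. Real implementations usually add backoff or eventually fall back to sleeping; this sketch leaves that out.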

Spinlocks have been around for at least two dogs' ages. And the description reminds me of IBM's AS/400 (now called something else) solution, and M$'s abortive (so far) effort to replicate it, WinFS, in which the database engine and the operating system are merged. Those with even longer memories could be reminded of PICK.
