Posts

Brittle, poorly designed pipelines

One of the more powerful aspects of cluster and cloud computing is that it effectively requires you to build fault tolerance of some sort into a computational pipeline. You have to assume, in a wide computation scenario, that some aspect of your system may become unavailable. Which means you need a sane way to save state at critical points in your workflow. You need sane distribution and management of the workflow. You need to be able to route around errors.
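To make "save state at critical points" concrete, here is a minimal sketch, not anything from a real pipeline of ours. The names `run_pipeline`, `stages`, and `state_path` are made up for illustration, and it assumes each stage's output is small enough to serialize as JSON. The idea: checkpoint after every stage, so a rerun after a failure picks up at the stage that died rather than at the beginning.

```python
import json
import os
import tempfile


def run_pipeline(stages, state_path, initial=None):
    """Run stages in order, checkpointing after each so a rerun resumes.

    Hypothetical helper: `stages` is a list of one-argument callables and
    `state_path` is where the checkpoint file lives.
    """
    # Resume from the last checkpoint if one exists.
    if os.path.exists(state_path):
        with open(state_path) as fh:
            state = json.load(fh)
    else:
        state = {"done": 0, "data": initial}

    for index in range(state["done"], len(stages)):
        # If this call raises, the checkpoint on disk still points at
        # the last stage that completed.
        state["data"] = stages[index](state["data"])
        state["done"] = index + 1

        # Write to a temp file, then rename over the old checkpoint, so a
        # crash mid-write never leaves a half-written state file behind.
        directory = os.path.dirname(os.path.abspath(state_path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as fh:
            json.dump(state, fh)
        os.replace(tmp, state_path)

    return state["data"]
```

Routing around the error (retrying on another node, say) would sit on top of this: the scheduler just calls `run_pipeline` again with the same `state_path`, and the stages that already completed are skipped.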

Two "new" projects

Hyperspace and 26. One does something wholly unholy, and the other takes our significant advantage in a particular area and makes it … well … even more of an advantage. Hyperspace may be with us at HPC on Wall Street. Working on it with our partner in … er … alternative dimensions. Yeah. That's the ticket! Assuming everything works out, you will hear about 26 before SC12, and probably see a few there.

SmartOS now booting from Tiburon

Ok … took a little bit of hacking on Tiburon to add a capability I had long wanted to add in. And it's not completely doing things the SmartOS way … but it works for the moment. [Screenshot: SmartOS booted from Tiburon](/images/SmartOS-booted-from-Tiburon.png) Have some additional testing to do, drivers to test, yadda yadda yadda. But the message should be clear. We can boot SmartOS from Tiburon (Scalable Informatics siCluster Storage and Computing cluster infrastructure).

A code to measure IOPs/Bandwidth

Many testing codes for storage systems report various values by shoving IO down the pipe, then measuring the amount shoved and the interval between the first IO call and the “end” of the last IO call. This is all well and good for some cases, but caching and many other effects get in the way of accurate measurement. Systems eventually settle down to an approximate steady state with small perturbations around it. The problem is that most tools don’t quite report this.
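A rough sketch of what reporting the settled state could look like, as opposed to one total divided by one elapsed time. This is an illustration only, not the code the post describes: the function name `steady_state_rates`, the `(completion_time, bytes)` sample format, and the choice to simply throw away a fixed number of warm-up intervals are all assumptions. It bins IO completions into fixed intervals, drops the warm-up intervals (where cache effects dominate) and the last, usually partial, interval, then reports the mean and spread of what is left.

```python
import statistics


def steady_state_rates(samples, interval=1.0, warmup_intervals=2):
    """Summarize per-interval IOPS and bandwidth after warm-up.

    samples: iterable of (completion_time_seconds, bytes_transferred).
    Returns None if there are not enough intervals to say anything.
    """
    samples = sorted(samples)
    if not samples:
        return None

    # Bin each completed IO into a fixed-width time slot.
    start = samples[0][0]
    bins = {}
    for when, nbytes in samples:
        slot = int((when - start) // interval)
        ops, total = bins.get(slot, (0, 0))
        bins[slot] = (ops + 1, total + nbytes)

    # Skip the warm-up slots and the final slot, which is usually partial.
    last = max(bins)
    slots = range(warmup_intervals, last)
    iops = [bins.get(s, (0, 0))[0] / interval for s in slots]
    bw = [bins.get(s, (0, 0))[1] / interval for s in slots]
    if not iops:
        return None

    # The standard deviation is the "small perturbations" part.
    return {
        "iops_mean": statistics.mean(iops),
        "iops_stdev": statistics.pstdev(iops),
        "bytes_per_sec_mean": statistics.mean(bw),
        "bytes_per_sec_stdev": statistics.pstdev(bw),
    }
```

A fixed warm-up cutoff is the crudest possible choice. Something that watches for the per-interval rate to stop drifting would be better, but even this keeps a burst of cache hits in the first second from inflating the headline number.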

I see benchmarketing back in full swing

I’ve read quite a few storage press releases talking about how “product X is capable of performance Y and IOPs Z.” I also notice that they didn’t say “we measured this, this way, and this is what we found.” I wonder why. I look at it this way: if we reported numbers the way lots of these folks report numbers, our JackRabbit JR5 machine would have a bandwidth of 6.2GB/s read and 5GB/s write.

Rereading posts from 6 years ago ...

NFS sucked then as well. We’ve got a customer who occasionally pushes their hardware a wee bit too hard. And stuff comes crashing down. Basically it looks like a kernel bug, one I’ve not been able to ID for a number of reasons, and I can’t find a mechanism to reliably tickle it. This is the definition of a Heisenbug. Basically the problem is this. They use NFS, extensively. NFS is great for low level IO rates.

started playing with SmartOS for the day job

This is a very cool concept, something that meshes perfectly with our Tiburon based siCluster philosophy. That is, compute nodes should boot diskless, there should be very little state on each node, and stuff that you need to do should be made absolutely as simple as possible. SmartOS is a project of Joyent. Joyent, for those not familiar with them, are a cloud company, building a nice public cloud for end users to build on.

Dear DEA ...

According to this, you don’t have enough high performance storage for your analyses. First off, no, it's not expensive. You are just using the wrong vendors. Second off … please … PLEASE … call us. We’d be happy to hook you up with Petabytes for the price you are likely paying for Terabytes. Seriously. Our units are inexpensive enough that you could buy them, replicate the data across them, and then store them.

one of the curious features of our history

This is about learning, not from mistakes, but from a … well … empirical approach to “partnerships”. When I started up the company 10 years ago, we weren’t on anyone's radar. Self funded, running out of my basement. Yeah, real big threat there. I noticed something though. During our time operating, first as an LLC, then as an Inc., we attracted a range of … er … partners and others. Many of whom would come to try to, for lack of a more accurate way to phrase this, pry ideas, plans, and IP/designs out of us.

grub drive enumeration

So there you are, helping a customer out with a problem. They’ve just added in a replacement OS disk using your process. At the end of the process is a bit of … well … an insurance procedure. Make sure grub is correctly on each drive in the RAID1. The grub.conf file has

    root (hd0,0)
    kernel .... root=/dev/md0 ...
    initrd ...

Makes sense, right? Cause hd0 enumerates to the first bios drive used for booting in the boot list.