Posts
First new Unison product sold
We were showing off the Unison units at #SC13, and while on the show floor, we managed to sell a storage cluster. Well, technically, the sale occurred after the show (last week in reality), but most of the configuration back and forth was during the show. I can’t say anything about the configuration or stack on it … yet … but you’ll be hearing about it fairly soon. It’s one we talk about quite a bit.
Violin's (and other pure flash array vendors') post-IPO struggles continue
There’s a story on The Register right now about Violin Memory losing its CTO. But that’s not the really interesting story. In the article, Chris Mellor does a pretty good job of laying bare the issues around Violin.
There are several different threads running through this. First, they don’t have much real software IP. Their hardware IP is a different story, but fundamentally, we’ve found that it’s best to have a very simple and effective hardware design, coupled with intelligent software.
You can tell you are a little nuts if ...
… you get really annoyed at the performance of grep on file IO (seriously folks? 32k or page-sized IO? What is this … 1992?) so you rewrite it in 20 minutes in Perl, and increase the performance by 5-8x or so. If I get angry enough, I might just go all out, use direct IO, multiple parallel readers, and some other bits. I’ve got these huge disk pipes, awesome bandwidths, and this tiny little filter tool.
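For a rough idea of what "bigger IO" means here, below is a minimal sketch in Python. It is not the Perl rewrite mentioned above. The function name `grep_chunked` and the 8 MiB default for `chunk_size` are assumptions made for illustration. It reads the file in large unbuffered chunks, carries any partial line over to the next chunk, and yields the lines that match.

```python
import re


def grep_chunked(path, pattern, chunk_size=8 * 1024 * 1024):
    """Yield lines (as bytes) that match pattern, using large reads.

    chunk_size is a hypothetical tuning knob; 8 MiB is a guess,
    not a measured optimum.
    """
    regex = re.compile(pattern.encode())
    leftover = b""  # partial line carried over from the previous chunk
    # buffering=0 gives a raw file object, so each read() asks the OS
    # for up to chunk_size bytes instead of a small default buffer.
    with open(path, "rb", buffering=0) as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            lines = (leftover + chunk).split(b"\n")
            leftover = lines.pop()  # last piece may be an incomplete line
            for line in lines:
                if regex.search(line):
                    yield line
    # A final line without a trailing newline still needs checking.
    if leftover and regex.search(leftover):
        yield leftover
```

Something like `for line in grep_chunked("big.log", "ERROR"): print(line.decode())` would drive it. Direct IO and multiple parallel readers are left out of this sketch.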
The most popular data analytics language
… appears to be R
[ ](http://revolution-computing.typepad.com/.a/6a010534b1db25970b019b00077267970b-popup)
This is in line with what I’ve heard, though I thought SAS was comparable in primary or secondary tool usage. This said, it’s important to note that in this survey, we don’t see mention of Python. Working against this is that it is a small (1300-ish) self-selecting sample, and the reporting company has a stake in the results. Also of importance is that R is a package with an embedded programming language, and Python is a programming language with add-ons.
And the SC13 video from InsideHPC is up
As usual, Rich and the team at InsideHPC have done a tremendous job. If you don’t know InsideHPC and its sibling, InsideBigData, I highly recommend both publications. They are on my go-to list as information sources/summaries. The video shows a well-caffeinated Joe, talking through our new products. The problem for us was there simply wasn’t sufficient time to go into detail on everything. Which is a shame IMO, but one we’ll look at rectifying later.
The 60 second guide to big data by gogrid
The GoGrid folks have put together a nice marketing slide on big data, in the sense that they are explaining the features of it without explaining it, or how/where they fit. It’s implied that they provide all you need for Big Data, but it’s their points along the way that make a great case for the day job and especially our new Fast Path Big Data Appliances. Our argument has always been that you can’t approach Big Data with last millennium’s architecture.
Big data languages: the reason for the tests
In a number of recent articles, I’ve seen/read that “Python is displacing R”, and other similar things. Something about this intrigued me, as I had heard many years ago that “Python was displacing Perl”. Only, it wasn’t. And others are questioning the supplantation premise quite strongly. It seems that there is little actual evidence of this. Mostly hyperbole, guesses, and dare I say, wishful thinking. It seems that this is the modus operandi for Python advocates, and their latest object of attention is R.
Riemann zeta function in parallel/vector data languages
Continuing the work of the previous post, I looked into rewriting the serial code to run in parallel/vector data languages. My original supposition about what would make a good data language is now in doubt as a result. First, I used PDL in Perl. But it’s Perl, right? It can’t possibly be fast. That would be … like, I dunno … wrong? (yes, this is sarcasm). This completes the task in 12s.
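To show what "vector" means in this context, here is a minimal sketch using NumPy rather than PDL. It is not the code from the tests, and it makes no timing claim. The function name `zeta_partial_sum` and the choice of a plain partial sum of 1/k^s are assumptions for illustration. The point is that the loop over terms becomes a single array expression.

```python
import numpy as np


def zeta_partial_sum(s, n_terms):
    """Approximate zeta(s) by summing 1/k**s for k = 1..n_terms.

    Hypothetical helper: the whole series is evaluated as one
    vector expression instead of an explicit per-term loop.
    """
    k = np.arange(1, n_terms + 1, dtype=np.float64)
    return float(np.sum(k ** (-s)))
```

For s = 2 the partial sum approaches pi^2/6 as `n_terms` grows, which gives an easy sanity check on the result.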
Knights Landing
Over at InsideHPC, Rich has a short take on Knights Landing with a link to the longer article. This is implicitly the direction I thought things would be going in … drop-in replacement CPUs to provide acceleration. Probably some big-small designs to handle OS tasks on specific cores (and reduce OS-based jitter). This said, 2x such sockets gets you to 72 lanes of PCIe gen 3. A little light for us, but we’ll figure something out (our current units are more than this).