Slides from HPC on Wall Street Spring 2014 are up

See here. Very good conference, lots of good discussion.


Hate to be an alarmist, but Heartbleed is worse than I had thought it was

TL;DR: Run, as in now, before you finish reading this, to update vulnerable OpenSSL packages. Restart your OpenSSL-using services (ssh, https, openvpn). Then nuke your keys, and start all over again.

Yeah, it’s that bad.

I had hoped, incorrectly, that no one would start asking, “hey, can we exploit this in the wild?” any time soon.

Unfortunately … exploits are live and out there. Have a look at this session hijacking done using the bug.

Understand this leaves no trace, no fingerprints. Since the server memory, with private key data (yes, the secret key used to encrypt bits), is completely vulnerable, and obtainable … yeah … this is not good.

Patch, restart services or machines, nuke your keys, hide your young and get your rifle ready. The barbarians are at the gates and are exploiting a weakness.
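For a quick first pass at the "is this box even in the affected range?" question, here is a minimal sketch that checks an OpenSSL version string against the upstream-affected releases (1.0.1 through 1.0.1f; 1.0.1g carries the fix). The function name is made up for illustration. Treat the answer as a hint only: distributions often backport the fix without changing the version string, and the 1.0.2 betas are ignored here.

```python
import re
import ssl


def looks_heartbleed_vulnerable(version_text):
    """Return True if version_text names an upstream OpenSSL release
    in the Heartbleed-affected range 1.0.1 .. 1.0.1f.

    Assumption: we only look at the upstream version number. Distro
    packages may be patched while still reporting an old version.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)([a-z]*)", version_text)
    if match is None:
        raise ValueError("no OpenSSL-style version found in %r" % version_text)

    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    suffix = match.group(4)  # the trailing letter, e.g. "e" in 1.0.1e

    # Only the 1.0.1 series is considered; 0.9.8 and 1.0.0 never had the bug.
    if (major, minor, patch) != (1, 0, 1):
        return False

    # Plain 1.0.1 and letters a through f are affected; g and later are fixed.
    return suffix == "" or (len(suffix) == 1 and suffix <= "f")


if __name__ == "__main__":
    # This reports the OpenSSL that Python itself is linked against,
    # which is not necessarily the one your sshd/httpd/openvpn uses.
    print(ssl.OPENSSL_VERSION, looks_heartbleed_vulnerable(ssl.OPENSSL_VERSION))
```

Even if this says you are fine now, any keys that lived on a vulnerable server should still be considered burned.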


Sometimes things work far better than one might expect

The day job builds a storage product which integrates Ceph as the storage networking layer.

What happened was, in idiomatic American English: We made very tasty lemonade out of very bitter lemons.

For the rest of the world, this means we had a bad situation during our setup at the booth. 3 boxes of drives and SSDs. 2 of them arrived. The 3rd may have been stolen, or gone missing, or wound up in a shallow grave somewhere. Either way, it wasn’t there.

So our demo “couldn’t” run. I asked Russell to do what he could to salvage this, but I was kind of pissed off.

Thus the bitter lemons.

Russell had built the storage system as an object target, with replication and erasure coding.

We lost 1/3 of our drives. 20 out of 60.

And Russell revived the storage, simply telling the storage layers that those OSDs were permanently gone.

No bother, and off it went. No rebuild required. Storage was available.

This … is … HUGE.

Under circumstances that would render other systems completely unsalvageable, it was quite easy to bring this back up.

This comes in part from the Ceph design. Also in part from the replication and the erasure coding. Erasure coding should be considered the logical replacement of RAID 5 or RAID 6. There’s really much more to it than that, but for new designs, users need to really start looking at deploying erasure coding.
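To put rough numbers on why erasure coding is attractive next to plain replication, here is a small back-of-the-envelope sketch. The function names and the `k=4, m=2` layout are illustrative assumptions, not the configuration we ran at the booth. It assumes an MDS code (Reed-Solomon style) where any `k` of the `k + m` chunks are enough to rebuild an object.

```python
def replication_profile(copies):
    """Raw-to-usable overhead and loss tolerance for N-way replication."""
    return {
        "overhead": float(copies),       # N bytes stored per usable byte
        "tolerated_losses": copies - 1,  # one surviving copy is enough
    }


def erasure_profile(k, m):
    """Same figures for a k data + m coding chunk layout (MDS code assumed)."""
    return {
        "overhead": (k + m) / k,   # (k + m) chunks stored per k chunks of data
        "tolerated_losses": m,     # any k of the k + m chunks can rebuild
    }


if __name__ == "__main__":
    print("3x replication:", replication_profile(3))
    print("4+2 erasure   :", erasure_profile(4, 2))
```

In this example both layouts survive two lost chunks per object, but the 4+2 layout stores 1.5 bytes per usable byte instead of 3. Note that these are per-object (per placement group) figures, not a statement about how many drives a whole cluster can lose.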

But we recovered, easily, trivially, from a loss of 1/3 of our devices. On a show floor, under very much less than an ideal situation.

Tremendous kudos to the Ceph team. As we get our own erasure coding bits finished we’ll likely incorporate them within Ceph, but for the moment, leveraging what’s in the existing stack, it’s working extraordinarily well.

And while we are at it … a booth visitor asked if Ceph was reliable yet. I think this is backwards. Ceph is extraordinarily reliable. The performance was excellent, in large part due to our Unison design. I know that our partners need to be non-committal in their hardware discussions, but our argument has been that putting good quality software stacks on “meh” hardware gets you “meh” results, at best. Good quality hardware and stacks get you great results. Performance is an enabling feature, and when you are moving around as much data as we were, you simply can’t afford to use slow/inefficient designs. More to the point, in order to make the most efficient use of resources, you need to start with efficient resources. Not poor ones.


Sometimes the right level of caffeination helps in work

I had an opportunity to review an old post I had written about playing with prime numbers. In it, I wrote out an explicit formula for a number, expressed as a product of primes. This goes to the definition of a composite or a prime number.

What’s interesting is what leaps out at you when you look at something you wrote a while ago.

Looking at the formula I wrote down, there is a very easy way to define if a number is prime or composite.

For the $$e(i) \in \{\text{Integers}\}$$, a number is prime if $$\sum_{i=1}^{\infty} e(i) = 1$$. It is composite if $$\sum_{i=1}^{\infty} e(i) > 1$$.

All you need are the e(i) signatures of the number. And technically speaking, you only have to search them up to $$\varphi_i > \sqrt{N} + 1$$. Which if you think about it, is an embarrassingly parallel factorization problem.
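Here is a minimal serial sketch of that test, assuming plain trial division to build the exponent signature (the function names are mine, and this is nowhere near what you would use on very big integers). It stops searching once the trial divisor passes the square root of what is left of N; any leftover cofactor greater than 1 is itself a prime with exponent 1.

```python
def exponent_signature(n):
    """Return {prime: exponent} for n >= 1, found by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    exponents = {}
    p = 2
    # Only search divisors up to sqrt of the remaining cofactor.
    while p * p <= n:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    # Whatever remains (if > 1) is a single prime factor.
    if n > 1:
        exponents[n] = exponents.get(n, 0) + 1
    return exponents


def classify(n):
    """Prime if the exponents sum to 1, composite if they sum to more."""
    total = sum(exponent_signature(n).values())
    if total == 0:
        return "unit"  # n == 1: neither prime nor composite
    return "prime" if total == 1 else "composite"


if __name__ == "__main__":
    for n in (97, 91, 360):
        print(n, exponent_signature(n), classify(n))
```

The parallel version would hand each worker its own range of candidate divisors to test against N, which is where the embarrassingly parallel part comes in.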

Shor’s algorithm it isn’t. But I might see if I can do some quick MPI (or even better, Julia!!) bits with this to try this on very, very big integers. See how it performs …

Yeah, I know, I am supposed to be writing my talk. Working on that too …


Doing what we are passionate about

I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will, again, pat themselves on the back for their intestinal fortitude.

Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.

I’ve been, and remain, a speed demon. No, not the way I drive … but the way we design and build systems. We’ve shown again and again how well designed and implemented systems can demolish more general systems which aren’t designed to the problems at hand.

I enjoy this, and have been doing things like this since the mid-1980s in one form or the other. Optimizing code, rebuilding hardware (yeah, I put NEC V20s or V30s in my IBM PCs for “free” speedup). Tweaking OSes. Tuning drivers. Though our current Linux kernel patch sets are smaller than they’ve been in the past, they are still important. Tuning apps …

These are the things we’re passionate about. Building hellaciously fast hardware and software stacks.

We’ve done great things over the last 12.7 years. Really incredible things. Things that no small company ought to be able to do. We’ve come a long way from being a one man shop operating out of my basement. But we’ve never lost that passion, the drive to go faster. Or bigger.

If anything, that drive has gotten worse. We see data growth rates driving 60+% growth per year in capacities. We see data motion, which I referred to as being “hard” in 2002, as being one of the most critical factors going forward in any storage, computing, and networking system. I had my eye on that exponential curve 12 years ago, worried about when it would start flattening out. It hasn’t yet. It has to some time soon though.

Because we have to store and process all that data, that means we have to move all that data. That realization ~10 years ago led me to explore cluster and scale out file systems … ones without single central points of information flow. Ones where when you add capacity you add bandwidth. These are the only systems that matter now, as the filer/array model is a rapidly declining legacy platform … suitable for replacement, not refresh. It’s that way, as you can’t scale this type of design out horizontally, and that simply doesn’t work for gargantuan data volumes. Which must be scaled out in that manner. Object storage is a great case-in-point for this.

It was the realization 8 years ago that RAID builds and rebuilds were highly problematic for reliability due to potential correlated device failures, and that recomputing block parity/CRCs on unused blocks made no sense (yet this is what all hardware RAID does), that led us to research FEC methods, and develop new ways of thinking about these things. We can implement very space and performance efficient designs leveraging smart algorithms and acceleration technologies.

It was the realization 10 years ago that HPC had, to a degree, hit a clock speed wall, that drove us to look at accelerators, what I called APU (Accelerator Processor Units) as a riff on CPU, back then. AMD appropriated the name (which is fine as they paid us to write white papers for them, where I used the term), and slightly changed its meaning. But the concept was that a very fast processing system, designed to perform one type of task very well, could do an outstanding job offloading those tasks from the CPU. You needed a powerful software stack, and as I had learned developing CT-BLAST (SGI’s GenomeCluster product), you need to make it drop-in easy to deploy. It has to be simply faster (hey, that’s our tagline!) which means you can’t make it hard to use … it has to be bloody easy, and demonstrably faster. We got there with accelerators for specific problem sets, but all our attempts (pre-2006) to raise capital to build them and the ecosystem failed. No one cared about this space, we were told, and accelerators were unimportant. But we were passionate about this. And we kept banging on this until we could see that no one really was interested … in funding it. Today, 2014, where we predicted accelerators would hit about a 5-10% penetration on the computing market … we could be low in that estimate.

Everything we’ve done, we’ve done because we’ve believed our future is data and processing rich. Moving data is hard, so you need to have that motion occur in parallel, and simultaneously be as local as possible. Computing on this data is hard and often slow, so you need acceleration technology. Extracting useful insights is demanding, so we’ve developed very high performance appliances focused upon enabling people to seamlessly use massive quantities of data, very very rapidly for their computations.

It’s because performance is an enabling technology. It really is. It’s also a green technology. I’ll explain in a moment.

For an enabling technology, this is something that opens up completely new possibilities, that would have been simply out of reach in the past. For any reason … cost prohibitive, performance restricted, etc. Accelerated processing enables many more and more efficient processor cycles per unit time. So it reduces the energy cost per cycle (hey, doesn’t that help make it greener?). This makes computationally expensive problems often more tractable. Heck, if you have the right gear, it takes things that could be hard and makes them possible as part of day to day usage, and not merely for special cases.

It’s these things that change the landscape. We gave examples of what we could do with one of the technologies to a few customers, and most of them went away thinking “meh”. Every one of them came back later on and used words to the effect of “you can really do that? That would change everything for us …”. Really … they are right. It does change everything.

I wasn’t kidding about greener. I didn’t mention costs, but it’s fairly trivial to show that you need (often far) less hardware to provide the same performance with a well designed computing and storage system. Which lowers your acquisition and TCOs. But for a fixed cost, you get more performance. So you can play that either way. And again, it’s that efficiency on a per processor cycle basis that drives the relative green-ness.

What? Joe is concerned about “green”? Damn right. I want all of my systems to use less power, and use what they use more efficiently. I want to be able to pack my systems denser, use less cooling for the same performance/capacity as I’ve done before. It’s an “elegance” in engineering bias on my part; I like more efficient systems. And more efficient systems should cost less to deploy over their lives. Not always (CF lights anyone? Good concept, terrible implementation, and the added Hg? Not so smart).

Why all of this? Well, past is prologue as Shakespeare said. All of these bits are needed for the next acts.

And they will be a doozy.

The first hints of them are going to be discussed this year. Never mind the next set of benchmarks and other bits. We’ll get those done, and we’ll probably set a few more records. Guessing at this … :D

But what I am talking about is enabling something … It’s far more than fast storage, fast computing, etc. This isn’t a solution looking for a problem. This is much different.

None of this would have been possible without that passion to build something that enables people to think and work differently. It’s the passion that matters. Building things that people can use to solve effectively intractable problems … yeah, that makes us feel good.


Negative latencies

I’ve been thinking for a while that our obsession with reduction of latency in computing and storage could be ameliorated by exploiting a negative latency design. A negative latency design would be one where a hypothetical message would arrive at a receiver before the sender completed sending it.

There are a few issues with this. First off is how on earth, or elsewhere, is this possible? Second, aren’t there issues with causality violations? Third, won’t the FBI/SEC/DOJ take actions against organizations using negative latency trading systems?

Ok … to start with question 1, we need a quick review of some “elementary” special relativity.

Einstein had postulated that the energy associated with a particle had rest mass and momentum components. His infamous equation is written as

$$E^2 = p^2 c^2 + m_0^2 c^4 $$

For which there are, as every student of basic algebra knows, two solutions. The positive energy solution, everyone is familiar with. It’s the negative energy solution that has interesting properties. For this solution to exist, given that $$p^2 \ge 0$$ and $$c^2 \ge 0$$, we would need a so-called imaginary mass, as in a mass carrying a factor of $$i = \sqrt{-1}$$. This would be an imaginary number. Not a made up number or a fantasy number.

These solutions are called tachyons, and have interesting properties. The most interesting property is the superluminal transmission rate.



HPC on Wall Street session on low latency cloud

See here for the program sheet.

The session is here: HPC on Wall Street Flyer

Description is this:

Wall Street and the global financial markets are building low latency infrastructures for processing and timely response to information content in massive data flows. These big data flows require architectural design patterns at a macro- and micro-level, and have implications for users of cloud systems. This panel will discuss, from macro to micro, how new capabilities and technologies are making a positive impact. This includes software defined networking, computing, and tightly-coupled computing and storage designs to enable low latency trading platforms.

Please do come and visit, stay for the talks and let us know what you think!


Arista files for IPO

From Dan Primak’s Term Sheet email

Arista Networks Inc., a Santa Clara, Calif.-based provider of cloud networking solutions, has filed for a $200 million IPO. It plans to trade on the NYSE under ticker symbol ANET, with Morgan Stanley and Citigroup leading a 17-bank underwriter group. The company reports $42 million of net income on $361 million in revenue for 2013, compared to $21 million in net income on $193 million in revenue for 2012. Shareholders include Andy Bechtolsheim.


Intel ditches own Hadoop distro in favor of Cloudera

Last year, Intel started building its own distro of Hadoop. Their argument was that they were optimizing it for their architecture (as compared to, say, ARM). Today came word that they are switching to Cloudera.

This makes perfect sense to me. Intel couldn’t really optimize Hadoop by compiler options to use new instruction capability (part of their selling point), as Hadoop is a Java thing. And Java has its own VM, and many performance touch points that have nothing to do with processor architecture. Indeed, it’s very hard to optimize Java for a particular microarchitecture, as Java does its utmost to hide the details of that microarchitecture from you. And push you up stack. Fine for apps, not so fine for hard core high performance.

There is a bigger picture/thread that big data is not defined to be Hadoop, but we don’t need to touch that here. Hadoop is one of the tools used in large scale analytics. Optimizing is more a function of IO/network design and higher level job distribution/layout than it is of processor microarchitecture. Thus this tie-up makes perfect sense, as Intel can continue to do what it does best, and have the Cloudera folks look at doing a better job in the core at making use of the microarchitecture (which, as I noted, is very hard on a system that tries to hide it from you).


Nice interview with Freeman Dyson

Freeman Dyson is an incredible scientist. I imagine he, Terence Tao, Paul Erdős and a number of others are all woven from the same cloth. Dyson has done some amazing work, and probably will do some more amazing work. The interview is here.

One of the comments he made really struck me as being dead on correct …

You became a professor at Cornell without ever having received a Ph.D. You seem almost proud of that fact.

Oh, yes. I’m very proud of not having a Ph.D. I think the Ph.D. system is an abomination. It was invented as a system for educating German professors in the 19th century, and it works well under those conditions. It’s good for a very small number of people who are going to spend their lives being professors. But it has become now a kind of union card that you have to have in order to have a job, whether it’s being a professor or other things, and it’s quite inappropriate for that. It forces people to waste years and years of their lives sort of pretending to do research for which they’re not at all well-suited. In the end, they have this piece of paper which says they’re qualified, but it really doesn’t mean anything. The Ph.D. takes far too long and discourages women from becoming scientists, which I consider a great tragedy. So I have opposed it all my life without any success at all.

I’ve used similar language, describing a Ph.D. as a union card. And I agree it takes far too long in physics. You are in your late 20s to early 30s when you finish up. Then you start your postdoc(s). Which are just queuing and filtering systems for the very small number of open faculty spots every year. So often, you are in your late 30s, early 40s before you start your tenure track, and if things work out, you’ll get tenure 5-ish years later. If not, you get to find a job in the real world, with little practical experience, and a set of degrees and training which mean you are an expensive resource.

For 20+ years I’ve been thinking we are approaching this incorrectly, that the system in place was engineered for a different time. It’s nice to see that confirmed.

Oh … and if you are a strong supporter of/believer in catastrophic anthropogenic global warming (e.g. you are of the opinion that IPCC and related bits are actually representative of reality), you will be quite disappointed in Mr Dyson. But that is life, and in real science, which is never, ever settled, you have opinions across the spectrum from strong support to strong rejection. It’s refreshing to see a bold statement from such an esteemed scientist.
