Military Supercomputers: Paying only 3-6 times more than they have to

News across the wire this morning that the DoD is buying a Linux supercluster to do things like play the world’s largest Deathmatch of Quake 3.

Since this is my line of work, and since folks who sell computers have a great track record of ripping off people who don’t understand computers (in this case, Congress), I was kinda curious to find out how much the generals had paid for this puppy. Federal Computer Weekly didn’t seem to care. The vendor’s not telling. Finally, I learn what I need to know from—the cost of the system is undisclosed.

At least now I know I can stop looking.

The new computer will run 2,132 Xeon 3.6 GHz processors and will have a theoretical top speed of 14.2 teraflops, which all sounds very impressive. A flop is a floating-point operation per second. A teraflop is a trillion of those. By way of comparison, the fastest computer on Earth is rated at 35.9 teraflops, give or take 40 billion flops.

But I still want to know what the thing costs. Because if there’s one thing I know about my industry, when the technical details and hand-waving start, that’s usually when the salesmen are getting the unsavvy to reach for their wallets. The more impenetrable you can be, the more likely you are to make the sale. Case in point: do you really know what it means that your computer runs at 2.0 GHz? But you thought “ooooh” when you bought it, didn’t you?

Actually, it’d make much more sense to buy a computer based on its gigaflop rating. But I digress.

Having gotten no love from the press coverage, I wondered if there was anything to learn from the laboratory that bought it. That would be the Army Research Laboratory Major Shared Resource Center, whose motto is, “We’re going to scare the hell out of you before we give you any information.” Here’s what’s plastered on the bottom of every page:

WARNING!! This Department of Defense interest computer system is subject to monitoring at all times. Unauthorized access is prohibited by Public Law 99-474 (The Computer Fraud and Abuse Act of 1986). Users are advised to read and agree to the following Security Notice.

The security notice repeats the warning above, then repeats that this is a DoD “interest computer system”, then repeats that you’re being monitored, and then tells you that if you break the law, you’ll be reported to the appropriate authorities. In other words, what could happen on any web site, including the ones that don’t try to petrify you.

Incidentally, I was glad to be advised to agree with the Security Notice, which told me that the act of reading it—or visiting any part of the site whatsoever—constituted my agreement. Apparently, if I want to disagree with the Security Notice, I need to use clairvoyance to do so.

Anyway, once I changed my shorts, I hit the site. The helpful navigation system they provide is reproduced to the right. On the home page, ARL treats me to such wondrous prose as, “First, the unclassified IBM p690 system was upgraded on three different axes: processors – from 64 to 128, processor speed – from 1.3 GHz to 1.7 GHz, and switch – from dual-plane colony to IBM’s new High Performance Switch (HPS).” This project was part of (you’re really not going to believe this) the Technology Insertion 2003. Wow, if I’d known that I was inserting technology for a living, I’d have tried to find out if I needed a permit or something.

Anyway, if I’m reading this right, the story is that they’re very excited to have upgraded to a brand-new system only a few months ago, which is now wholly obsolesced by the system announced today. Makes one wonder what the ROI is on Technology Insertion 2003.

Still no information on the cost of the new whizbangery, but I note back on the TOP500 list that Los Alamos National Laboratory (known the world over as the place to go whether you’re trying to burn down a neighborhood in Japan or New Mexico) has its own sweet setup from the same vendor. Okay, if I can’t get details, I’ll get reference points.

Lesson 1: after spending all that money on building nukes and fighting fires, apparently there’s no money left over for a copy of Photoshop. The Community Relations page at LANL has some nicely unedited graphics straight off the scanner, and weighs in at 4.1 megs—not a big deal until you note that New Mexico has only 8 percent broadband penetration, among the lowest in the nation. Perhaps they need some technology insertion? In any case, I’m not too convinced about the relating LANL is doing with their community.

After some sleuthing here, I finally get some payback for my short-term obsession, when in a total reversal a press release actually tells me what something costs. Los Alamos’ supercluster, which has a theoretical peak of 11.26 teraflops, cost just under $10 million. Although they actually say “trillion flops”, because you never know when a member of Congress might be reading.

But the TOP500 list, annoyingly, doesn’t work based on theoretical maxima—which you can generally only get if you immerse your entire computer and all employees in liquid nitrogen—they actually test the darned things. And the actual usage for the Lightning system came out to 8.1 teraflops. Which means that we can probably expect today’s new system to come out at around 10 teraflops, despite the 14.2 theoretical terafloppery of the press release.
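For anyone who wants to check my guess, here's a minimal sketch of the arithmetic. It assumes the new system will deliver the same fraction of its theoretical peak that Lightning does, which is purely my assumption and not something anyone has measured. The function and variable names are made up for illustration.

```python
# Figures quoted in this post, all in teraflops.
LIGHTNING_PEAK_TF = 11.26     # Lightning's theoretical peak
LIGHTNING_MEASURED_TF = 8.1   # Lightning's tested result on the TOP500 list
NEW_SYSTEM_PEAK_TF = 14.2     # theoretical peak from today's press release


def efficiency(measured_tf: float, peak_tf: float) -> float:
    """Fraction of the theoretical peak that the machine actually delivers."""
    return measured_tf / peak_tf


def estimate_measured(peak_tf: float, assumed_efficiency: float) -> float:
    """Guess a tested figure by scaling a peak by an assumed efficiency."""
    return peak_tf * assumed_efficiency


# Assumption: the new system is about as efficient as Lightning.
lightning_eff = efficiency(LIGHTNING_MEASURED_TF, LIGHTNING_PEAK_TF)
new_system_estimate_tf = estimate_measured(NEW_SYSTEM_PEAK_TF, lightning_eff)

if __name__ == "__main__":
    print(f"Lightning efficiency: {lightning_eff:.1%}")
    print(f"Estimated tested speed of new system: {new_system_estimate_tf:.1f} TF")
```

Lightning delivers roughly 72 percent of its peak, and applying that same fraction to 14.2 teraflops is where the "around 10" comes from. A different interconnect or processor mix could move that number in either direction.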

So doing a little math, LANL’s baby cost about $1.2 million per teraflop, and that’s apparently good enough to keep handing contracts to Linux Networx.

On the other hand, if you want to get a piece of this yourself, you can pick up an Apple Xserve G5, rated at 30 gigaflops, for about $3,000. That would come out to $100,000 per teraflop.

But it’s not fair to compare one to the other; these things simply don’t scale up like that. So instead, I’ll compare Lightning, #6 on the TOP500 list, to Virginia Tech’s System X, #3 on the TOP500 list. That monster is also built around the Apple G5, and weighs in at 10.3 teraflops at a cost of $5.2 million. Which works out to about $500,000 per teraflop, or less than half the cost of the government solution.

Hmmm… but that $5.2 million number includes the cost of building the air-conditioning and other special systems required to house it. I have no idea if LANL’s number includes that; if they don’t, then the comparable cost is closer to $200,000 per teraflop, or 1/6th the cost of the LANL solution.
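If you'd rather not take my long division on faith, here's a rough sketch of the cost-per-teraflop comparison using only the numbers quoted above. Two caveats: the LANL price is "just under $10 million", so I'm treating $10 million as a ceiling, and the $200,000 figure is my own ballpark for System X without its building costs, not a published number. The names in the code are just illustrative.

```python
# (cost in dollars, teraflops) for each machine, as quoted in this post.
SYSTEMS = {
    "LANL Lightning": (10_000_000, 8.1),         # "just under $10 million", tested speed
    "Apple Xserve G5": (3_000, 0.030),           # about $3,000, rated 30 gigaflops
    "Virginia Tech System X": (5_200_000, 10.3), # includes air-conditioning etc.
}

# Assumption: my ballpark for System X with the facilities costs stripped out.
SYSTEM_X_HARDWARE_ONLY_PER_TF = 200_000


def cost_per_teraflop(cost_dollars: float, teraflops: float) -> float:
    """Dollars spent per teraflop of performance."""
    return cost_dollars / teraflops


per_tf = {name: cost_per_teraflop(cost, tf) for name, (cost, tf) in SYSTEMS.items()}

# How many times more LANL paid per teraflop than Virginia Tech did.
ratio_with_facilities = per_tf["LANL Lightning"] / per_tf["Virginia Tech System X"]
ratio_hardware_only = per_tf["LANL Lightning"] / SYSTEM_X_HARDWARE_ONLY_PER_TF

if __name__ == "__main__":
    for name, dollars in per_tf.items():
        print(f"{name}: ${dollars:,.0f} per teraflop")
    print(f"LANL vs. System X (with facilities): {ratio_with_facilities:.1f}x")
    print(f"LANL vs. System X (hardware-only guess): {ratio_hardware_only:.1f}x")
```

The spread between those two ratios is the whole argument: the answer depends heavily on what each price tag actually includes, and neither press release is eager to say.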

Military computing: a teraflop here, a teraflop there, pretty soon you’re talking about real money.
