The Google Cluster Architecture
Via Sadagopan's weblog: Here's an article, written for IEEE by three of Google's architects, that's been floating around for a year. It has a lot of detail on why the cluster works the way it does, and how they can scale up to tens of thousands of servers without exponential costs in management tools.
Grid is huge hype right now, and no one is delivering it right into businesses. But it is coming... and one of the primary applications for it will be databases, as the explosion of RFID and transactional data continues to build and BI moves further toward unlocking valuable information from it. When it does become a topic in IT departments, you'll be glad that you read this PDF and understand the Google model inside and out. Surely, anyone who wishes to be considered knowledgeable in the area of Grid computing will be expected to know how things are done there.
How will "Grid computing" look when it is ready for prime time in the world's largest companies? Only a couple of things are for sure: when it comes there will be a huge demand for skills and knowledge, and solutions won't look anything like they do right now. My personal belief is that the Google path - open source components and commodity hardware - will win out over proprietary solutions that are burdened with millions in hardware and software requirements.
Consider:
The cost advantages of using inexpensive, PC-based clusters over high-end multiprocessor servers can be quite substantial, at least for a highly parallelizable application like ours. The example $278,000 rack [from RackSpace.com] contains 176 2-GHz Xeon CPUs, 176 Gbytes of RAM, and 7 Tbytes of disk space. In comparison, a typical x86-based server contains eight 2-GHz Xeon CPUs, 64 Gbytes of RAM, and 8 Tbytes of disk space; it costs about $758,000 [2]. In other words, the multiprocessor server is about three times more expensive but has 22 times fewer CPUs, three times less RAM, and slightly more disk space. Much of the cost difference derives from the much higher interconnect bandwidth and reliability of a high-end server, but again, Google’s highly redundant architecture does not rely on either of these attributes.
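To make those ratios concrete, here's a quick back-of-the-envelope check in Python. The prices and hardware counts are the ones quoted from the article; the dictionary layout and variable names are just my own illustration, and the per-CPU cost is a figure I derived rather than one the authors give.

```python
# Figures as quoted from the article; the structure and names here are illustrative only.
rack = {"cost_usd": 278_000, "cpus": 176, "ram_gb": 176, "disk_tb": 7}
server = {"cost_usd": 758_000, "cpus": 8, "ram_gb": 64, "disk_tb": 8}


def ratio(numerator, denominator, key):
    """Return how many times larger numerator[key] is than denominator[key]."""
    return numerator[key] / denominator[key]


cost_ratio = ratio(server, rack, "cost_usd")  # server price relative to the rack
cpu_ratio = ratio(rack, server, "cpus")       # rack CPU count relative to the server
ram_ratio = ratio(rack, server, "ram_gb")     # rack RAM relative to the server
disk_ratio = ratio(server, rack, "disk_tb")   # server disk relative to the rack

# Derived figure (not stated in the article): dollars paid per CPU.
rack_cost_per_cpu = rack["cost_usd"] / rack["cpus"]
server_cost_per_cpu = server["cost_usd"] / server["cpus"]

print(f"Server costs {cost_ratio:.2f}x the rack")
print(f"Rack has {cpu_ratio:.0f}x the CPUs and {ram_ratio:.2f}x the RAM")
print(f"Server has {disk_ratio:.2f}x the disk")
print(f"Cost per CPU: rack ${rack_cost_per_cpu:,.0f}, server ${server_cost_per_cpu:,.0f}")
```

The numbers line up with the article's rounding: the server is about 2.7 times the price, with 22 times fewer CPUs and about 2.75 times less RAM. Per CPU, that works out to roughly 60 times the cost, which only matters if your workload parallelizes as well as Google's does.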
When one of the world's fastest, most reliable, and most heavily used sites on the internet does not use a SAN (or even SCSI drives), doesn't pay for redundant power supplies, and pretty much uses all the same parts you and I could buy on Pricewatch, it's something that will be felt across the high availability industry in the future. And our industry is probably near the top of the list where changes will be seen first.
