There’s hardly an Internet user in the world who doesn’t know who Google is. Even people who use a different search engine often refer to a web search as a “Google” search. In fact, Google still holds the lead in search, with a whopping 65% share of the market. So there’s no doubt in my mind about the sheer volume of data that Google’s servers have to move and crunch at any given second. Imagine receiving, sending, decrypting, encrypting, unpacking, and packing 65% of the world’s Internet searches, not to mention all the Gmail customers, cloud services, and everything else Google provides.
Google is like any other successful company: it doesn’t always reveal its entire hand, especially when it’s holding a good one. But now new information on Google’s network infrastructure is being released to the public, and it’s pretty amazing what they’ve done. In a blog post, Googler Amin Vahdat announced the release of a paper detailing five generations of Google’s in-house data center network architecture. From the post:
From relatively humble beginnings, and after a misstep or two, we’ve built and deployed five generations of datacenter network infrastructure. Our latest-generation Jupiter network has improved capacity by more than 100x relative to our first generation network, delivering more than 1 petabit/sec of total bisection bandwidth. This means that each of 100,000 servers can communicate with one another in an arbitrary pattern at 10Gb/s.
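To put those numbers in perspective, here is a quick back-of-the-envelope check of the quoted figure. It is only a sketch of the arithmetic, assuming decimal units (1 Gb = 10^9 bits, 1 Pb = 10^15 bits); the function name is made up for illustration and has nothing to do with Google's tooling.

```python
# Back-of-the-envelope check: 100,000 servers at 10 Gb/s each.
# Assumes decimal units; names here are illustrative only.
GIGABIT = 10**9    # bits
PETABIT = 10**15   # bits


def aggregate_bandwidth_bits_per_sec(servers: int, per_server_gbps: int) -> int:
    """Total bandwidth if every server sends at its full rate at once."""
    return servers * per_server_gbps * GIGABIT


total = aggregate_bandwidth_bits_per_sec(servers=100_000, per_server_gbps=10)
print(total / PETABIT)  # 1.0, i.e. one petabit per second
```

So the "1 petabit/sec" and "100,000 servers at 10Gb/s" figures in the quote are two ways of stating the same capacity.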
One of the most impressive points in the post is that Google has been building its own networking hardware for roughly a decade. They anticipated their growth and concluded that no networking hardware on the market could handle the traffic and transfer speeds they would require. So they built their own hardware and manage it centrally through software, an approach known as software defined networking (SDN). With SDN, you spend less hands-on time configuring individual networking devices and instead maintain the network through software. This is supposedly more cost effective than the traditional way. From the post:
We adopted a set of principles to organize our networks that is now the primary driver for networking research and industrial innovation, Software Defined Networking (SDN). We observed that we could arrange emerging merchant switch silicon around a Clos topology to scale to the bandwidth requirements of a data center building. The topology of all five generations of our data center networks follow the blueprint below. Unfortunately, this meant that we would potentially require 10,000+ individual switching elements. Even if we could overcome the scalability challenges of existing network protocols, managing and configuring such a vast number of switching elements would be impossible.
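To get a feel for why the switch count climbs so quickly, here is a rough sketch that sizes a textbook three-tier "fat-tree", one common form of Clos topology built from identical k-port switches. This is not Google's actual design (the post does not give that level of detail); the formulas are the standard fat-tree ones, and the class and function names are my own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FatTreeSize:
    """Element counts for a textbook three-tier fat-tree of k-port switches."""
    ports: int
    hosts: int
    edge_switches: int
    aggregation_switches: int
    core_switches: int

    @property
    def total_switches(self) -> int:
        return self.edge_switches + self.aggregation_switches + self.core_switches


def fat_tree_size(ports: int) -> FatTreeSize:
    """Size a fat-tree: k pods, each with k/2 edge and k/2 aggregation switches."""
    if ports < 2 or ports % 2:
        raise ValueError("ports must be a positive even number")
    half = ports // 2
    return FatTreeSize(
        ports=ports,
        hosts=ports * half * half,         # k pods * k/2 edge switches * k/2 hosts
        edge_switches=ports * half,        # k pods * k/2
        aggregation_switches=ports * half, # k pods * k/2
        core_switches=half * half,         # (k/2)^2
    )


def smallest_fat_tree_for(hosts: int) -> FatTreeSize:
    """Smallest even port count whose fat-tree holds at least `hosts` servers."""
    ports = 2
    while fat_tree_size(ports).hosts < hosts:
        ports += 2
    return fat_tree_size(ports)


size = smallest_fat_tree_for(100_000)
print(size.ports, size.hosts, size.total_switches)  # 74 101306 6845
```

Even in this idealized model, 100,000 servers need thousands of switches. Building from chips with fewer ports means adding more stages and more elements, which is where the management problem described in the quote comes from.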
The amount of information in Google’s blog post alone is mind-boggling, but pull up the data presented in the paper and you’ll really be blown away. Networking geeks are going to be up late reading this tonight, I can tell you that. For the rest of you who might not be that excited about networking and infrastructure, it’s still a great read, even if you have to take your time with it. I could go on, but you have the links below to read it all for yourself in Google’s words. You should also check out these videos for more info on Google’s data centers.