Friday, October 16, 2009

Google's Competitive Advantage

Anybody who knows how internet companies build and manage their backend infrastructure knows that no other company's infrastructure is a match for Google's, both in sheer size and in how efficiently it is managed.

Their system - an automated distributed system based on networks of cheap Intel-based servers - is the key to their success. Researchers can develop some fancy algorithms, but those are of no use if they cannot collect, host, and process data - lots and lots and lots of data - and Google's system can do that job better and more cost-effectively.

Google's system can store and process far more data than anybody else's. Sure, one might say, throw in some mainframes, push the data into them, and run the algorithm, but that would be very costly. There is also a limit on how much data one mainframe can host and process. Costs aside, a mainframe may be good enough for the data generated inside a single enterprise, but it can never host and process the data generated by millions of people on the internet.

Google's simple objective was to host, index, and process all the data on the internet, so they designed their system from the ground up to meet that goal. They didn't have much money to buy fancy big equipment like mainframes, either... But even if they had had the money, they probably wouldn't have bought them anyway, because they probably knew that no mainframe or supercomputer could host the internet's data.

How many servers do they have? Some say over a million. Some say over a few million. Nobody knows.
They probably don't have the exact number either. They probably don't care. They just throw in some cheap servers loaded with their customized software (e.g. Google File System, MapReduce, BigTable, etc.) into some datacenter that they own and just have them running.
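
For readers who haven't seen the MapReduce programming model mentioned above, here is a minimal single-process sketch of the idea, using a word count as the example. This is not Google's code or API. The function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and everything that makes the real system interesting (spreading the work over many cheap machines, handling failures, storing data in a distributed file system) is deliberately left out.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # On a real cluster each document (or chunk of one) would be mapped on a
    # different machine; here everything runs sequentially in one process.
    pairs = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

The point of the model is that map_phase and reduce_phase don't depend on each other's state, so in principle you can run them on as many servers as you have data for.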

Are there any other companies in the world that run as many servers as Google?
Not a chance, I would say. Sure, there are now companies like Yahoo! and Facebook that deploy Hadoop-based distributed systems (Hadoop is an Apache open source project that cloned Google's system software), but their server counts are in the thousands, not in the millions.

Google's true competitive advantage, in my opinion, is not in their algorithms or in how many bright Ph.D.s they employ.
It's in their simple, cost-effective, yet very scalable and powerful infrastructure.
