Google processing over 20 petabytes data per day

Google currently processes over 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters. The average MapReduce job ran across approximately 400 machines in September 2007, crunching approximately 11,000 machine years in a single month. These are just some of the facts about the search giant's […]

Share online:

Google currently processes over 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters. The average MapReduce job ran across approximately 400 machines in September 2007, crunching approximately 11,000 machine years in a single month. These are just some of the facts about the search giant's computational processing infrastructure revealed in an ACM paper by Google Fellows Jeffrey Dean and Sanjay Ghemawat.


Twenty petabytes (20,000 terabytes) per day is a tremendous amount of data processing and a key contributor to Google's continued market dominance. Competing search storage and processing systems at Microsoft (Dyrad) and Yahoo! (Hadoop) are still playing catch-up to Google's suite of GFS, MapReduce, and BigTable.

Full Article

Google, Data, Processing, Comuting, Clustering, Data Processing, Petabytes

About The Author

Deepak Gupta is a IT & Web Consultant. He is the founder and CEO of diTii.com & DIT Technologies, where he's engaged in providing Technology Consultancy, Design and Development of Desktop, Web and Mobile applications using various tools and softwares. Sign-up for the Email for daily updates. Google+ Profile.