Introduction to Apache Hadoop Nowadays, people are living in the data world. It’s not easy to measure the total volume of data stored electronically, but an IDC estimate put the size of the “digital universe” at 0.18 zettabytes in 2006, and is forecasting a tenfold growth by 2011 to 1.8 zettabytes. A zettabyte is 〖10〗^21 bytes, or equivalently one thousand exabytes, one million petabytes, or one billion terabytes. That’s roughly the same order of magnitude as one disk drive for every person in the
& Kruschwitz, 2011). History of Hadoop Technology Hadoop technology was created and developed by Doug Cutting who is recognized as the brain behind the Apache Lucene, a popularly known text search library. Originally, Hadoop had its source from Apache Nutch which was an Open Source search engine and also formed part of different Lucene projects. According to the project creator, Doug Cutting, the name Hadoop was not an acronym but just a makeup name. For instance, the ‘contrib’ module and other