Hadoop BigData

MapReduce is a programming model and software framework, introduced by Google, for processing large amounts of data in parallel on large clusters of commodity hardware.
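To make the model concrete, here is a minimal in-process sketch of a word count, the usual introductory MapReduce example. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a thread pool stands in for the cluster nodes on the assumption that each line can be mapped independently.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(mapped_batches):
    """Shuffle: group all emitted values by key across every mapper's output."""
    groups = defaultdict(list)
    for batch in mapped_batches:
        for key, value in batch:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines, workers=4):
    # Threads stand in for cluster nodes here (an assumption for illustration);
    # a real framework would distribute these tasks across machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, lines))
        groups = shuffle(mapped)
        reduced = pool.map(lambda item: reduce_phase(*item), groups.items())
        return dict(reduced)


if __name__ == "__main__":
    sample = ["big data on big clusters", "data in parallel"]
    print(word_count(sample))
```

The map and reduce functions never share state, which is what lets a framework run many copies of them in parallel on separate machines.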

Apache Pig is a tool for analyzing huge amounts of data. It is an abstraction over MapReduce: Pig scripts are converted internally into Map and Reduce tasks.
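The idea of converting a high-level step into Map and Reduce tasks can be sketched roughly as follows. This is an illustration only, not Pig's actual compiler or API: `compile_group_sum` and `run` are hypothetical helpers, and the sample records are made up. It assumes a single "group by one field, sum another" step.

```python
from collections import defaultdict


def compile_group_sum(key_field, value_field):
    """Turn a declarative 'group by key, sum value' step into a mapper and a reducer."""

    def mapper(record):
        # Map task: project each record down to a (key, value) pair.
        return record[key_field], record[value_field]

    def reducer(key, values):
        # Reduce task: aggregate every value that shares the key.
        return key, sum(values)

    return mapper, reducer


def run(records, mapper, reducer):
    """Run the generated tasks in-process: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        key, value = mapper(record)
        groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sales = [
        {"region": "east", "amount": 10},
        {"region": "west", "amount": 5},
        {"region": "east", "amount": 7},
    ]
    mapper, reducer = compile_group_sum("region", "amount")
    print(run(sales, mapper, reducer))
```

The caller only states what to group and what to sum; the map and reduce functions are generated for it, which is the sense in which a script layer sits above MapReduce.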

Apache Spark is an open-source big data processing framework that runs on a distributed cluster. It is used to handle large-scale data processing workloads.

Hive is a data warehousing framework built on top of Hadoop. It was
developed at Facebook to analyze the large amounts of data arriving
every day.

HBase is an open-source, non-relational, distributed database modeled after Google’s Bigtable and written in Java. It runs on top of HDFS, providing Bigtable-like capabilities for Hadoop.