1. Hive is a part of the Apache Hadoop project that provides a SQL-like interface for data processing
2. Hive is one component of the Hadoop framework that allows for collecting data together into an external repository
4. Hive is a part of the Apache Hadoop project that enables in-memory analysis of real-time streams of data
Hive is a project initially developed by Facebook for people with very strong SQL skills and weaker Java skills who want to query data in Hadoop.
Question :
What is Pig?
1. Pig is a subset of the Hadoop API for data processing
2. Pig is a part of the Apache Hadoop project that provides a C-like scripting language interface for data processing
4. None of the above
Pig is a project that was developed by Yahoo for people with very strong skills in scripting languages. From a script written in its scripting language, Pig Latin, it automatically generates MapReduce jobs.
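To make the idea of "script in, MapReduce job out" more concrete, here is a minimal sketch in plain Python (not Pig or Hadoop code) of the map, shuffle, and reduce phases that a simple group-and-count script would roughly correspond to. The function names and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical input: (user, page) records standing in for rows stored in Hadoop.
RECORDS = [("ann", "/home"), ("bob", "/home"), ("ann", "/cart"), ("ann", "/home")]


def map_phase(records):
    """Emit a (user, 1) pair for every input record."""
    for user, _page in records:
        yield user, 1


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_user(records):
    """Chain the three phases into one toy job."""
    return reduce_phase(shuffle_phase(map_phase(records)))


print(count_by_user(RECORDS))
```

The point of tools such as Pig and Hive is that the user writes only the grouping and counting intent; the three phases above are produced for them.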
Question :
How can you disable the reduce step?
1. The Hadoop administrator has to set the number of reducer slots to zero on all slave nodes. This will disable the reduce step.
2. It is impossible to disable the reduce step, since it is a critical part of the MapReduce abstraction.
4. While you cannot completely disable reducers, you can set the output to one. There needs to be at least one reduce step in the MapReduce abstraction.
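For reference, Hadoop's Java API lets an individual job request zero reduce tasks (`Job.setNumReduceTasks(0)`). In that case the map output is written directly as the job output, with no shuffle or sort. The sketch below imitates that behaviour in plain Python. `run_job`, `upper_mapper`, and the sample lines are hypothetical names for illustration, not part of any Hadoop API.

```python
from collections import defaultdict


def run_job(records, mapper, reducer=None, num_reduce_tasks=1):
    """Toy job runner: when num_reduce_tasks is 0 the reduce step is skipped."""
    mapped = [pair for record in records for pair in mapper(record)]
    if num_reduce_tasks == 0:
        # Map-only job: map output is the final output, ungrouped and unsorted.
        return mapped
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Keys are sorted before reduction, mirroring what reducers receive.
    return [(key, reducer(key, values)) for key, values in sorted(groups.items())]


def upper_mapper(line):
    """Hypothetical mapper: emit each word upper-cased with a count of 1."""
    for word in line.split():
        yield word.upper(), 1


LINES = ["b a", "a c"]
map_only = run_job(LINES, upper_mapper, num_reduce_tasks=0)
with_reduce = run_job(LINES, upper_mapper, reducer=lambda key, values: sum(values))
```

In the map-only run every emitted pair appears in the output in emission order, while the run with a reducer returns one aggregated pair per key.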
1. No reducer can start until the last mapper has finished
2. If a mapper is running slowly, Hadoop will start another instance of that mapper on another machine
3. Hadoop will kill the slow mapper if it is still running when the new one finishes
4. The result of whichever mapper finishes first will be used
5. All of the above
1. Runs locally on a single mapper's output
2. Using a combiner can reduce network traffic
3. Generally, the combiner and reducer code is the same
4. None of 1, 2, and 3
5. All of 1, 2, and 3 apply to the combiner
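The network-traffic claim in option 2 can be illustrated with a small sketch in plain Python (not Hadoop code). A combiner pre-aggregates each mapper's output locally, so fewer records have to cross the network to the reducer. The function names and sample splits are hypothetical. Reusing the reducer logic as the combiner works here only because summation is associative and commutative.

```python
from collections import Counter

# Hypothetical input splits, one per mapper.
SPLITS = [["a", "b", "a", "a"], ["b", "b", "c"]]


def map_words(split):
    """Emit a (word, 1) pair for every word in the split."""
    return [(word, 1) for word in split]


def sum_counts(pairs):
    """Sum values per key; used both as combiner (per mapper) and as reducer."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def run(splits, use_combiner):
    """Return (number of records sent to the reducer, final reduced result)."""
    shuffled = []
    for split in splits:
        output = map_words(split)
        if use_combiner:
            # Local pre-aggregation on this single mapper's output.
            output = sum_counts(output)
        shuffled.extend(output)
    return len(shuffled), sum_counts(shuffled)


plain_records, plain_result = run(SPLITS, use_combiner=False)
combined_records, combined_result = run(SPLITS, use_combiner=True)
```

Both runs produce the same final counts; the only difference is how many intermediate records reach the reducer (`plain_records` versus `combined_records`).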