Question : Which of the following methods of the JobControl object can be used to track the execution state of jobs? 1. allFinished() 2. getFailedJobs() 3. Access Mostly Uused Products by 50000+ Subscribers 4. All of the above
The JobControl object is used to control a group of jobs. The class encapsulates a set of MapReduce jobs and their dependencies, and it provides several methods for reading their status. It tracks the states of the jobs by placing them into different tables according to their states. The class has a thread that submits jobs when they become ready, monitors the states of the running jobs, and updates the states of jobs based on state changes in the jobs they depend on. Examples: getFailedJobs(), getReadyJobs(), getRunningJobs(), getState(), getSuccessfulJobs(), getWaitingJobs().
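The state tracking described above can be illustrated with a small plain-Java model. This is only a sketch and does not use the Hadoop JobControl API: the class JobStateSketch, the nested SketchJob type, and the helper methods refresh, namesIn and allFinished are all hypothetical names made up for illustration, and jobs are "run" by setting their state by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of dependency-driven job states; NOT the Hadoop JobControl API.
public class JobStateSketch {

    enum State { WAITING, READY, RUNNING, SUCCESS, FAILED, DEPENDENT_FAILED }

    // Hypothetical stand-in for a job with a list of jobs it depends on.
    static class SketchJob {
        final String name;
        final List<SketchJob> dependsOn;
        State state = State.WAITING;

        SketchJob(String name, SketchJob... dependsOn) {
            this.name = name;
            this.dependsOn = Arrays.asList(dependsOn);
        }
    }

    // Re-evaluate a waiting job from the states of the jobs it depends on.
    static void refresh(SketchJob job) {
        if (job.state != State.WAITING) {
            return;
        }
        boolean allSucceeded = true;
        for (SketchJob dep : job.dependsOn) {
            if (dep.state == State.FAILED || dep.state == State.DEPENDENT_FAILED) {
                job.state = State.DEPENDENT_FAILED;
                return;
            }
            if (dep.state != State.SUCCESS) {
                allSucceeded = false;
            }
        }
        if (allSucceeded) {
            job.state = State.READY;
        }
    }

    // Collect the names of the jobs currently in one state (one "table" per state).
    static List<String> namesIn(List<SketchJob> jobs, State state) {
        List<String> names = new ArrayList<>();
        for (SketchJob job : jobs) {
            if (job.state == state) {
                names.add(job.name);
            }
        }
        return names;
    }

    // Finished means no job is waiting, ready, or running any more.
    static boolean allFinished(List<SketchJob> jobs) {
        for (SketchJob job : jobs) {
            if (job.state == State.WAITING || job.state == State.READY
                    || job.state == State.RUNNING) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SketchJob extract = new SketchJob("extract");
        SketchJob clean = new SketchJob("clean", extract);
        SketchJob report = new SketchJob("report", clean);
        List<SketchJob> jobs = Arrays.asList(extract, clean, report);

        jobs.forEach(JobStateSketch::refresh); // extract has no dependencies, so it becomes READY
        extract.state = State.FAILED;          // pretend the first job ran and failed
        jobs.forEach(JobStateSketch::refresh); // the failure propagates to the dependent jobs

        System.out.println("failed: " + namesIn(jobs, State.FAILED));
        System.out.println("dependent failed: " + namesIn(jobs, State.DEPENDENT_FAILED));
        System.out.println("all finished: " + allFinished(jobs));
    }
}
```

In this model a failed job marks every job downstream of it as DEPENDENT_FAILED, so allFinished() can return true even though nothing succeeded. That is why a caller would normally check the failed list as well as the finished flag.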
The ChainMapper class allows the use of multiple Mapper classes within a single Map task. The Mapper classes are invoked in a chained (or piped) fashion: the output of the first becomes the input of the second, and so on until the last Mapper, whose output is written to the task's output.
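The piping behaviour can be sketched in plain Java, without any Hadoop classes. The names ChainSketch, RecordMapper and runChain below are hypothetical, and records are simplified to plain strings rather than key/value pairs. The point of the sketch is that each record flows through every mapper in memory and nothing is written between stages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Plain-Java model of mapper chaining; NOT the Hadoop ChainMapper API.
public class ChainSketch {

    // Hypothetical mapper type: one input record yields zero or more output records.
    interface RecordMapper extends Function<String, List<String>> {}

    // Pipe every input record through all mappers in order, entirely in memory.
    static List<String> runChain(List<String> input, List<RecordMapper> chain) {
        List<String> output = new ArrayList<>();
        for (String record : input) {
            List<String> current = List.of(record);
            for (RecordMapper mapper : chain) {
                List<String> next = new ArrayList<>();
                for (String r : current) {
                    next.addAll(mapper.apply(r));
                }
                current = next; // output of this mapper is the input of the next one
            }
            output.addAll(current); // only the last mapper's output is emitted
        }
        return output;
    }

    public static void main(String[] args) {
        RecordMapper splitWords = line -> List.of(line.split("\\s+"));
        RecordMapper lowerCase = word -> List.of(word.toLowerCase());
        RecordMapper dropShort = word -> word.length() > 2 ? List.of(word) : List.of();

        List<String> out = runChain(List.of("The Cat", "is ON a Mat"),
                List.of(splitWords, lowerCase, dropShort));
        System.out.println(out); // prints [the, cat, mat]
    }
}
```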
The ChainReducer class allows multiple Mapper classes to be chained after a Reducer within the Reducer task.
For each record output by the Reducer, the Mapper classes are invoked in a chained (or piped) fashion: the output of the first becomes the input of the second, and so on until the last Mapper, whose output is written to the task's output.
Note: Running all the pre- and post-processing in a single job leaves no intermediate files, which dramatically reduces I/O.
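The reduce-then-map pattern can be modelled the same way in plain Java. This is a sketch under simplifying assumptions, not the Hadoop ChainReducer API: the class ChainReduceSketch and the method reduceThenMap are hypothetical, the "Reducer" is hard-coded as a word count, and each post-processing mapper is one-to-one (one record in, one record out).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

// Plain-Java model of mappers chained after a reducer; NOT the Hadoop ChainReducer API.
public class ChainReduceSketch {

    // Count each word (the "Reducer"), then pipe every reduced record
    // through the post-processing mappers in order.
    static List<String> reduceThenMap(List<String> words,
                                      List<UnaryOperator<String>> postMappers) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by key, like reducer input
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            String record = entry.getKey() + "\t" + entry.getValue(); // reducer output
            for (UnaryOperator<String> mapper : postMappers) {
                record = mapper.apply(record); // output of one mapper feeds the next
            }
            output.add(record); // only the last mapper's output is emitted
        }
        return output;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> postMappers = List.of(
                r -> r.toUpperCase(),
                r -> r.replace('\t', '='));
        List<String> out = reduceThenMap(List.of("cat", "mat", "cat"), postMappers);
        System.out.println(out); // prints [CAT=2, MAT=1]
    }
}
```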
Question : Is data joining (like an RDBMS join) possible in Hadoop MapReduce? 1. Yes 2. No