Question : What are map files and why are they important?
1. Map files are stored on the NameNode and capture the metadata for all blocks on a particular rack. This is how Hadoop is "rack aware".
2. Map files are the files that show how the data is distributed in the Hadoop cluster.
4. Map files are sorted sequence files that also have an index. The index allows fast data lookup.
Correct Answer : 4
The Hadoop MapFile is a variation of the SequenceFile. It is important for the map-side join design pattern.
A MapFile is a sorted SequenceFile with an index that permits lookups by key. A MapFile can be thought of as a persistent form of java.util.Map (although it doesn't implement this interface) that is able to grow beyond the size of a Map kept in memory.
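To make the index-based lookup concrete, here is a minimal sketch using the Hadoop 2.x MapFile API; the path demo.map and the key/value contents are illustrative, not part of the question.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // A MapFile is really a directory holding two SequenceFiles: "data" and "index".
        Path dir = new Path("demo.map");

        // Keys must be appended in sorted order; the writer throws IOException otherwise.
        try (MapFile.Writer writer = new MapFile.Writer(conf, dir,
                MapFile.Writer.keyClass(Text.class),
                MapFile.Writer.valueClass(IntWritable.class))) {
            writer.append(new Text("hadoop"), new IntWritable(1));
            writer.append(new Text("yarn"), new IntWritable(2));
        }

        // The in-memory index lets the reader seek near a key, then scan the data file.
        try (MapFile.Reader reader = new MapFile.Reader(dir, conf)) {
            IntWritable value = new IntWritable();
            reader.get(new Text("yarn"), value);
            System.out.println("yarn -> " + value); // prints: yarn -> 2
        }
    }
}

In a map-side join, the smaller, sorted dataset can be stored as a MapFile so each mapper can look up matching keys directly, without a reduce phase.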
Refer to HadoopExam.com Recorded Training, Module 7.
Question : Assume you have the following files in an HDFS directory called merge:

Test1.txt: hadoopexam.com Hadoop Training 1
Test2.txt: www.hadoopexam.com Hadoop YARN Training
Test3.txt: http://hadoopexam.com Amazon WebService Training

Now you run the following command:

hadoop fs -getmerge -nl merge/ output2.txt

What is the content of the output2.txt file?
1. hadoopexam.com Hadoop Training 1
   www.hadoopexam.com Hadoop YARN Training
   http://hadoopexam.com Amazon WebService Training
Correct Answer : 1
getmerge usage: hadoop fs -getmerge [-nl] <src> <localdst>. Takes a source directory and a destination file as input and concatenates the files in src into the destination local file. The -nl option (addnl in older releases) adds a newline character at the end of each file, so output2.txt contains each file's content on its own line.
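The same merge can also be done programmatically. Below is a minimal sketch assuming the Hadoop 2.x FileUtil.copyMerge helper (this method was removed in Hadoop 3); the paths match the question.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class GetMergeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);
        FileSystem local = FileSystem.getLocal(conf);
        // Concatenate every file under merge/ into one local file,
        // appending "\n" after each source file (the -nl behavior).
        FileUtil.copyMerge(hdfs, new Path("merge"),
                local, new Path("output2.txt"),
                false /* keep the source files */, conf, "\n");
    }
}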
Question : In the regular WordCount MapReduce example, you have the following driver code.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCount extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new WordCount(), args);
        System.exit(res);
    }

    public int run(String[] args) throws Exception {
        Path inputPath = new Path("shakespeare1");
        Path outputPath = new Path("" + System.currentTimeMillis());

        Configuration conf = getConf();
        Job job = new Job(conf, this.getClass().toString());

        FileInputFormat.setInputPaths(job, inputPath);
        FileOutputFormat.setOutputPath(job, outputPath);

        job.setJobName("Word Count");
        job.setJarByClass(WordCount.class);
        // WordMapper and SumReducer are provided elsewhere in wc.jar.
        job.setMapperClass(WordMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setNumReduceTasks(2); // exactly two reduce tasks

        return job.waitForCompletion(true) ? 0 : 1;
    }
}
Now you run the command below on a single-node cluster, where wc.jar is the jar file containing the driver, mapper, and reducer classes.

hadoop jar wc.jar WordCount
Select the correct statement from below.
1. It will run 2 mappers and 2 reducers.
2. It will run 2 reducers, but the number of mappers is not known.
4. There is not enough information to tell the number of reducers.
Correct Answer : 2
As the driver code shows, job.setNumReduceTasks(2) fixes the number of reduce tasks, so exactly two reducers will run. The number of map tasks, by contrast, is not set in the driver: it equals the number of input splits computed from the "shakespeare1" input at submission time, which cannot be determined from the information given.
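The split-driven mapper count can be inspected before submission. Below is a minimal sketch assuming the new-API TextInputFormat (the default input format for a job like this); the class SplitCount and helper report are hypothetical names, not part of the question.

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class SplitCount {
    // Hypothetical helper: given a fully configured Job (such as the WordCount
    // driver's job before submission), report how many tasks would launch.
    static void report(Job job) throws IOException {
        // One map task is created per input split.
        List<InputSplit> splits = new TextInputFormat().getSplits(job);
        System.out.println("map tasks = " + splits.size());
        // The reduce task count is whatever the driver set explicitly (2 here).
        System.out.println("reduce tasks = " + job.getNumReduceTasks());
    }
}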