
MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Distributing the values associated with the key, in sorted order, to the reducer is defined as?
1. Map and Reduce
2. Shuffle and Sort
3. Access Mostly Uused Products by 50000+ Subscribers
4. None of the above

Correct Answer : 2 (Shuffle and Sort)





Question : You have written a Mapper which invokes the following five calls to the OutputCollector.collect() method:
output.collect(new Text("Apple"), new Text("Red"));
output.collect(new Text("Banana"), new Text("Yellow"));
output.collect(new Text("Apple"), new Text("Yellow"));
output.collect(new Text("Cherry"), new Text("Red"));
output.collect(new Text("Apple"), new Text("Green"));
How many times will the Reducer's reduce() method be invoked?
1. 6
2. 3
3. Access Mostly Uused Products by 50000+ Subscribers
4. 0
5. 5

Correct Answer : 2
Explanation: reduce() gets called once for each [key, (list of values)] pair. Here there are three distinct keys (Apple, Banana and Cherry), so reduce() is invoked three times. To illustrate, suppose you called:
out.collect(new Text("Car"), new Text("Subaru"));
out.collect(new Text("Car"), new Text("Honda"));
out.collect(new Text("Car"), new Text("Ford"));
out.collect(new Text("Truck"), new Text("Dodge"));
out.collect(new Text("Truck"), new Text("Chevy"));
Then reduce() would be called twice, with the pairs:
reduce(Car, [Subaru, Honda, Ford])
reduce(Truck, [Dodge, Chevy])
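
To make the grouping concrete, here is a minimal sketch of a reducer using the new mapreduce API (the class name and output format are illustrative); the framework calls reduce() once per distinct key, handing it all of that key's values as a single Iterable:

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ColorReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Invoked once per distinct key ("Apple", "Banana", "Cherry"),
        // with all of that key's values grouped into one Iterable.
        StringBuilder colors = new StringBuilder();
        for (Text value : values) {
            if (colors.length() > 0) {
                colors.append(",");
            }
            colors.append(value.toString());
        }
        context.write(key, new Text(colors.toString()));
    }
}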




Question : What data does a Reducer's reduce() method process?
1. All the data in a single input file.
2. All data produced by a single mapper.
3. Access Mostly Uused Products by 50000+ Subscribers
4. All data for a given value, regardless of which mapper(s) produced it.

Correct Answer :
Explanation: Reducing lets you aggregate values together. A reducer function receives an iterator of input values, combines those values, and returns a single output value. All values with the same key are presented to a single reduce task.
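
As a concrete illustration of that aggregation, here is a minimal sketch of a word-count-style sum reducer (the class name is illustrative); every count for a given word arrives in the same reduce() call, no matter which mapper emitted it:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // All values sharing this key arrive together, regardless of which
        // mapper(s) produced them; combine them into a single output value.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}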


Related Questions


Question : You are running a MapReduce job, and inside the Mapper you want to get the name of the file that is currently being processed.
Which code snippet correctly fetches the filename in the Mapper?

1. String fileName = ((FileStatus) context.getFileStatus()).getPath().getName();
2. String fileName = context.getPath().getName();
3. Access Mostly Uused Products by 50000+ Subscribers
4. All of the above
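
For reference, the usual way to obtain the current file name inside a Mapper with the new mapreduce API is through the input split; a minimal sketch, assuming a FileInputFormat-based job (the class name is illustrative):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class FileNameMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // For FileInputFormat-based jobs the InputSplit is a FileSplit,
        // which carries the path of the file currently being processed.
        FileSplit split = (FileSplit) context.getInputSplit();
        String fileName = split.getPath().getName();
        context.write(new Text(fileName), value);
    }
}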



Question : In a MapReduce word count job, you know your file contains at most three different words, and after the job completes you want one output file to be created per reducer. You have therefore written a custom partitioner. Which is the correct code snippet for the above requirement?
1. A
2. B
3. Access Mostly Uused Products by 50000+ Subscribers
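
Snippets A and B are not reproduced here, but a custom partitioner for this requirement would typically look like the following sketch (the three words are hypothetical); with three reduce tasks configured, each word is routed to its own reducer and therefore its own output file:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class WordPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // Route each of the three known words to its own partition.
        String word = key.toString();
        if (word.equals("hadoop")) {
            return 0;
        }
        if (word.equals("mapreduce")) {
            return 1 % numPartitions;
        }
        return 2 % numPartitions;
    }
}

The driver would also need job.setPartitionerClass(WordPartitioner.class) and job.setNumReduceTasks(3) so that three output files are actually produced.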


Question : The input file size is given (in KB) and the block size is given (in MB). What is the size occupied by the intermediate data?

1. 47KB
2. 83KB
3. Access Mostly Uused Products by 50000+ Subscribers
4. Job Fails


Question : For HadoopExam.com user profiles you need to analyze roughly ,, JPEG files. Each file is no more than 3kB.
Because your Hadoop cluster isn't optimized for storing and processing many small files,
you decide to group the files into a single archive. The toolkit that will be used to process
the files is written in Ruby and requires that it be run with administrator privileges.
Which of the following file formats should you select to build your archive?

1. TIFF
2. SequenceFiles
3. Access Mostly Uused Products by 50000+ Subscribers
4. MPEG
5. Avro

Ans : 5

Exp : The two formats that are best suited to merging small files into larger archives for processing in Hadoop are Avro and SequenceFiles. Avro has Ruby bindings; SequenceFiles are only supported in Java.

JSON, TIFF, and MPEG are not appropriate formats for archives. JSON is also not an appropriate format for image data.




Question : SequenceFiles are flat files consisting of binary key/value pairs. SequenceFile provides Writer, Reader and SequenceFile.Sorter classes for writing, reading and sorting respectively.
There are three SequenceFile Writers, based on the SequenceFile.CompressionType used to compress key/value pairs: the uncompressed Writer, the RecordCompressWriter and the BlockCompressWriter.
You have created a SequenceFile (MAIN.PROFILE.log) with custom key and value types. What command displays the contents of a
SequenceFile named MAIN.PROFILE.log in your terminal in human-readable format?

1. hadoop fs -decrypt MAIN.PROFILE.log
2. hadoop fs -text MAIN.PROFILE.log
3. Access Mostly Uused Products by 50000+ Subscribers
4. hadoop fs -encode MAIN.PROFILE.log
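
For background, here is a minimal sketch of writing a SequenceFile with the Hadoop 2.x API (the path and key/value types are illustrative); a file written this way can then be dumped in human-readable form from the terminal:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileWriteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("MAIN.PROFILE.log");  // illustrative path
        SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(IntWritable.class));
        try {
            // Each record in the file is a binary key/value pair.
            writer.append(new Text("profile-1"), new IntWritable(42));
        } finally {
            writer.close();
        }
    }
}

hadoop fs -text reads SequenceFiles (and compressed text) and prints each key/value pair using its toString() representation, which is why it, unlike hadoop fs -cat, produces human-readable output for binary SequenceFiles. With custom key and value types, the corresponding classes must be available on the classpath for the command to decode the records.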




Question : Speculative execution is an optimization technique where a computer system performs
some task that may not actually be needed. The main idea is to do work before it is known whether that work will be needed at all,
so as to prevent a delay that would otherwise be incurred by doing the work only after it is known to be needed. If it turns out the work was not needed
after all, any changes made by the work are reverted and the results are ignored. In an ETL MapReduce job, Mappers process the data
and the Reducers then push it directly to an Oracle database using DBOutputFormat. Select the correct statement that applies to
speculative execution.

1. Disable speculative execution for the data insert job
2. Enable speculative execution for the data insert job
3. Access Mostly Uused Products by 50000+ Subscribers
4. Configure only single mapper for the data insert job
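
For context, speculative execution can be switched off per job so that duplicate task attempts cannot write the same rows to the external database twice; a minimal sketch using the Hadoop 2.x property names (the job name is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class EtlJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Disable speculative execution so each record is written to the
        // database by exactly one task attempt.
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);
        Job job = Job.getInstance(conf, "etl-to-oracle");
        // ... set mapper, reducer, DBOutputFormat and submit as usual ...
    }
}

The same effect can also be achieved programmatically with job.setSpeculativeExecution(false).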




Question : Apache MRUnit is a Java library that helps developers unit test Apache Hadoop MapReduce jobs.
The MRUnit testing framework is based on JUnit and can test MapReduce programs written for the 0.20, 0.23.x, 1.0.x and 2.x versions of Hadoop.
You have a Reducer which simply sums up the values for any given key. You write a unit test in MRUnit to test the Reducer, with this code:
@Test
public void testETLReducer() {
List < IntWritable > values = new ArrayList < IntWritable > ();
values.add(new IntWritable(1));
values.add(new IntWritable(1));
List < IntWritable > values2 = new ArrayList < IntWritable > ();
values2.add(new IntWritable(1));
values2.add(new IntWritable(1));
reduceDriver.withInput(new LongWritable("5673"), values);
reduceDriver.withInput(new LongWritable("109098"), values2);
reduceDriver.withOutput(new LongWritable("109098"), new IntWritable(2));
reduceDriver.runTest();
}
What is the result?


1. The test will pass with warning and error
2. The test will pass with no warning and error
3. Access Mostly Uused Products by 50000+ Subscribers
4. Code will not compile
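
For comparison, here is a sketch of a correctly typed MRUnit test for a sum reducer, assuming MRUnit's mapreduce ReduceDriver API, JUnit 4, and a SumReducer that extends Reducer<Text, IntWritable, Text, IntWritable> and sums the values for each key (as sketched earlier):

import java.util.Arrays;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class SumReducerTest {
    private ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

    @Before
    public void setUp() {
        // The driver's type parameters must match the reducer's input and output types.
        reduceDriver = ReduceDriver.newReduceDriver(new SumReducer());
    }

    @Test
    public void testSumsValuesForAKey() throws Exception {
        reduceDriver
                .withInput(new Text("apple"),
                        Arrays.asList(new IntWritable(1), new IntWritable(1)))
                .withOutput(new Text("apple"), new IntWritable(2))
                .runTest();
    }
}

Note that the Writable constructors take values of the proper type, and every expected output registered with withOutput must correspond to an input registered with withInput.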