Question : In the MRv Driver class, a new Job object is created. What else is true of the Driver class?
1. Always use ToolRunner class
2. Always provide the input file
3. It checks the command-line syntax
4. It also sets values for the driver, mapper, and reducer classes used.
Correct Answer : 4 Explanation: The Driver creates and configures the Job object, setting the driver (job JAR), mapper, and reducer classes to be used, as in the sketch below.
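A minimal driver sketch illustrating answer 4, using the new MapReduce API. The class names MyDriver, MyMapper, and MyReducer are placeholders, not classes from the question.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MyDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "my job");
            job.setJarByClass(MyDriver.class);      // driver (job JAR) class
            job.setMapperClass(MyMapper.class);     // placeholder mapper class
            job.setReducerClass(MyReducer.class);   // placeholder reducer class
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }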
Question : What are the TWO main components of the YARN ResourceManager process? Choose two answers.
A. JobTracker
B. TaskTracker
C. Scheduler
D. ApplicationsManager
1. A,B
2. B,C
3. C,D
4. A,D
5. B,D
Correct Answer : 3 Explanation: The ResourceManager has two main components: the Scheduler, which allocates cluster resources among running applications, and the ApplicationsManager, which accepts job submissions and negotiates the first container for each application's ApplicationMaster.
Question : Given a directory of files with the following structure: line number, tab character, string. Example:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfsikj
You want to send each line as one record to your Mapper. Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?
Correct Answer : 3 Explanation: KeyValueTextInputFormat. TextInputFormat's keys, being simply the offset within the file, are not normally very useful. It is common for each line in a file to be a key-value pair, separated by a delimiter such as a tab character. For example, this is the kind of output produced by TextOutputFormat, Hadoop's default OutputFormat. To interpret such files correctly, KeyValueTextInputFormat is appropriate. You can specify the separator via the mapreduce.input.keyvaluelinerecordreader.key.value.separator property (or key.value.separator.in.input.line in the old API); it is a tab character by default. Consider the following input file, where the gap after each lineN label represents a horizontal tab character:
line1	On the top of the Crumpetty Tree
line2	The Quangle Wangle sat,
line3	But his face you could not see,
line4	On account of his Beaver Hat.
As in the TextInputFormat case, the input is a single split comprising four records, although this time the keys are the Text sequences before the tab in each line:
(line1, On the top of the Crumpetty Tree)
(line2, The Quangle Wangle sat,)
(line3, But his face you could not see,)
(line4, On account of his Beaver Hat.)
SequenceFileInputFormat: to use data from sequence files as the input to MapReduce, you use SequenceFileInputFormat. The keys and values are determined by the sequence file, and you need to make sure that your map input types correspond.
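A minimal old-API driver sketch matching the conf.setInputFormat(...) line in the question; the class name KVDriver and the input/output paths are placeholders.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.KeyValueTextInputFormat;

    public class KVDriver {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(KVDriver.class);
            // Each record reaches the Mapper as (key = text before the tab, value = text after it).
            conf.setInputFormat(KeyValueTextInputFormat.class);
            // Old-API separator property; tab is already the default, shown here for clarity.
            // New-API equivalent: mapreduce.input.keyvaluelinerecordreader.key.value.separator
            // together with job.setInputFormatClass(KeyValueTextInputFormat.class).
            conf.set("key.value.separator.in.input.line", "\t");
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(Text.class);
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));
            JobClient.runJob(conf);
        }
    }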
1. Four, all files will be processed
2. Three, the pound sign is an invalid character for HDFS file names
3.
4. None, the directory cannot be named jobdata
5. One, no special characters can prefix the name of an input file
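The question text for these options is not present; they appear to concern which files FileInputFormat picks up from an input directory. As background only, here is a minimal new-API sketch of setting a directory as the job input; the class name InputPathSketch is a placeholder and the directory name jobdata is taken from the options above.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class InputPathSketch {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance();
            job.setJarByClass(InputPathSketch.class);
            // Point the job at the whole directory named in the options.
            FileInputFormat.setInputPaths(job, new Path("jobdata"));
            // FileInputFormat's built-in hidden-file filter skips any file whose
            // name begins with "_" or "." (e.g. _logs, .crc files); other files
            // in the directory are handed to the job as input.
        }
    }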
1. Have your system administrator copy the JAR to all nodes in the cluster and set its location in the HADOOP_CLASSPATH environment variable before you submit your job. What else is required of the class that uses these libjars? (See the sketch below.)
2. Have your system administrator place the JAR file on a Web server accessible to all cluster nodes and then set the HTTP_JAR_URL environment variable to its location.
3.
4. Package your code and the Apache Commons Math library into a zip file named JobJar.zip
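The fragment in option 1 asks what else is required of a job class that relies on -libjars. That generic option is only handled when the driver's arguments pass through GenericOptionsParser, which is what ToolRunner does, so such a driver is normally written as a Tool. A minimal sketch, with LibJarsDriver, MyMapper, and MyReducer as placeholder names:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class LibJarsDriver extends Configured implements Tool {
        @Override
        public int run(String[] args) throws Exception {
            // getConf() already reflects generic options such as -libjars and -D.
            Job job = Job.getInstance(getConf(), "libjars example");
            job.setJarByClass(LibJarsDriver.class);
            job.setMapperClass(MyMapper.class);     // placeholder Mapper
            job.setReducerClass(MyReducer.class);   // placeholder Reducer
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            // ToolRunner invokes GenericOptionsParser, which strips -libjars, ships the
            // listed JARs with the job, and passes the remaining arguments to run().
            System.exit(ToolRunner.run(new Configuration(), new LibJarsDriver(), args));
        }
    }

An invocation would then look something like: hadoop jar myjob.jar LibJarsDriver -libjars commons-math3.jar input output (the JAR and path names here are illustrative).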