Each block is replicated multiple times - Default is to replicate each block three times - Replicas are stored on different nodes - This ensures both reliability and availability
Correct Ans : 5 Exp : Meaning of regex as . any char, exactly 1 time (Please remember the regex) * any char, 0-8 times + any char, 1-8 times ? any char, 0-1 time ^ start of string (or line if multiline mode) $ end of string (or line if multiline mode) | equivalent to OR (there is no AND, check the reverse of what you need for AND). First Record passed the regular expression Second record also pass the expression third record does not pass the expression, because hours part is in single digit as you can sse in the expression first two d's are there. It is expected that each record should have at least all five character as digit. Which no record suffice. Hence in total matching records are 0 and non-matching records are 6 Please learn java regular expression it is mandatrory. Consider using Hadoop Professional Training Provided by HadoopExam.com if you face the problem.
Question
What happens when you run the below job twice , having each input directory as one of the data file called data.csv. with following command. Assuming there were no output directory exist
hadoop job HadoopExam.jar HadoopExam inputdata_1 output hadoop job HadoopExam.jar HadoopExam inputdata_2 output 1. Both the job will write the output to output directoy and output will be appended 2. Both the job will fail, saying output directory does not exist. 3. Access Mostly Uused Products by 50000+ Subscribers 4. Both the job will successfully completes and second job will overwrite the output of first.
Ans : 3 Exp : First job will successfully run and second one will fail, because, if (output directory already exist then it will not run and throws exception, complaining output directory already exist.
Question You have a an EVENT table with following schema in the Oracle database.
PAGEID NUMBER USER VARCHAR2 EVENTTIME DATE PLACE VARCHAR2
Which of the following command creates the correct HIVE table named EVENT 1. 2. 3. Access Mostly Uused Products by 50000+ Subscribers 4. Ans : 2 Exp : The above is correct because it correctly uses the Sqoop operation to create a Hive table that matches the database table. Option 3rd is not correct because --hive-table option for Sqoop requires a parameter that names the target table in the database.
Question : Which of the following command will delete the Hive table nameed EVENTINFO
1. hive -e 'DROP TABLE EVENTINFO' 2. hive 'DROP TABLE EVENTINFO' 3. Access Mostly Uused Products by 50000+ Subscribers 4. hive -e 'TRASH TABLE EVENTINFO' 1. 2. 3. Access Mostly Uused Products by 50000+ Subscribers 4. Ans :1 Exp : Sqoop does not offer a way to delete a table from Hive, although it will overwrite the table definition during import if the table already exists and --hive-overwrite is specified. The correct HiveQL statement to drop a table is "DROP TABLE tablename". In Hive, table names are all case insensitives
Question There is no tables in Hive, which command will import the entire contents of the EVENT table from the database into a Hive table called EVENT that uses commas (,) to separate the fields in the data files? 1. 2. 3. Access Mostly Uused Products by 50000+ Subscribers 4. Ans :2 Exp : --fields-terminated-by option controls the character used to separate the fields in the Hive table's data files.