What is the result of the following command (the database username is foo and the password is bar)?
$ sqoop list-tables --connect jdbc:mysql://localhost/databasename --table --username foo --password bar
1. Sqoop lists only those tables in the specified MySQL database that have not already been imported into HDFS
2. Sqoop returns an error
4. Sqoop imports all the tables from SQL into HDFS
Which best describes the primary function of Flume?
1. Flume is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with an infrastructure consisting of sources and sinks for importing and evaluating large data sets
2. Flume acts as a Hadoop filesystem for log files
4. Flume provides a query language for Hadoop similar to SQL
5. Flume is a distributed server for collecting and moving large amounts of data into HDFS as it is produced from streaming data flows
You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn't optimized for storing and processing many small files, you decide to take the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop streaming
Which data serialization system gives you the flexibility to do this?
A. CSV
B. XML
C. HTML
D. Avro
E. Sequence Files
F. JSON
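The small-files premise of the image question can be made concrete with some quick arithmetic. The sketch below is only a rough sizing aid, not part of the question: the 128 MB HDFS block size and the helper name `sizing` are assumptions introduced here, and the calculation ignores any overhead added by whichever container format is chosen.

```python
# Back-of-the-envelope sizing for the image data set in the question.
NUM_IMAGES = 60_000_000                 # from the question
IMAGE_SIZE_BYTES = 25 * 1024            # "approximately 25 KB" per image
BLOCK_SIZE_BYTES = 128 * 1024 * 1024    # assumption: 128 MB HDFS block size


def sizing(num_files, file_size, block_size):
    """Return (total bytes, files that fit in one block, blocks if packed).

    Hypothetical helper for illustration; ignores container-format overhead.
    """
    total_bytes = num_files * file_size
    files_per_block = block_size // file_size
    # Ceiling division: number of full-size blocks needed to hold everything.
    blocks_if_packed = -(-total_bytes // block_size)
    return total_bytes, files_per_block, blocks_if_packed


total_bytes, files_per_block, blocks_if_packed = sizing(
    NUM_IMAGES, IMAGE_SIZE_BYTES, BLOCK_SIZE_BYTES
)

print(f"Total data:             {total_bytes / 1e12:.3f} TB")
print(f"Images per block:       {files_per_block}")
print(f"Files stored one-by-one: {NUM_IMAGES}")
print(f"Blocks if packed:       {blocks_if_packed}")
```

Under these assumptions, storing each image separately means tens of millions of filesystem entries, while packing them into block-sized files reduces that to a count in the low tens of thousands, which is the motivation behind step 1 of the question.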