Hortonworks Apache-Hadoop-Developer Dumps

Hadoop 2.0 Certification exam for Pig and Hive Developer Questions and Answers

Question 1

Which TWO of the following statements are true regarding Hive? Choose 2 answers

Options:

A.

Useful for data analysts familiar with SQL who need to do ad-hoc queries

B.

Offers real-time queries and row level updates

C.

Allows you to define a structure for your unstructured Big Data

D.

Is a relational database

Question 2

MapReduce v2 (MRv2/YARN) is designed to address which two issues?

Options:

A.

Single point of failure in the NameNode.

B.

Resource pressure on the JobTracker.

C.

HDFS latency.

D.

Ability to run frameworks other than MapReduce, such as MPI.

E.

Reduce complexity of the MapReduce APIs.

F.

Standardize on a single MapReduce API.

Question 3

Which one of the following statements describes the relationship between the ResourceManager and the ApplicationMaster?

Options:

A.

The ApplicationMaster requests resources from the ResourceManager

B.

The ApplicationMaster starts a single instance of the ResourceManager

C.

The ResourceManager monitors and restarts any failed Containers of the ApplicationMaster

D.

The ApplicationMaster starts an instance of the ResourceManager within each Container

Question 4

Which Hadoop component is responsible for managing the distributed file system metadata?

Options:

A.

NameNode

B.

Metanode

C.

DataNode

D.

NameSpaceManager

Question 5

You have user profile records in your OLTP database that you want to join with web logs you have already ingested into the Hadoop file system. How will you obtain these user records?

Options:

A.

HDFS command

B.

Pig LOAD command

C.

Sqoop import

D.

Hive LOAD DATA command

E.

Ingest with Flume agents

F.

Ingest with Hadoop Streaming

Question 6

In Hadoop 2.0, which TWO of the following processes work together to provide automatic failover of the NameNode? Choose 2 answers

Options:

A.

ZKFailoverController

B.

ZooKeeper

C.

QuorumManager

D.

JournalNode

Question 7

All keys used for intermediate output from mappers must:

Options:

A.

Implement a splittable compression algorithm.

B.

Be a subclass of FileInputFormat.

C.

Implement WritableComparable.

D.

Override isSplitable.

E.

Implement a comparator for speedy sorting.

Question 8

You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the Mapper’s map method?

Options:

A.

Intermediate data is streamed across the network from the Mapper to the Reducer and is never written to disk.

B.

Into in-memory buffers on the TaskTracker node running the Mapper that spill over and are written into HDFS.

C.

Into in-memory buffers that spill over to the local file system of the TaskTracker node running the Mapper.

D.

Into in-memory buffers that spill over to the local file system (outside HDFS) of the TaskTracker node running the Reducer

E.

Into in-memory buffers on the TaskTracker node running the Reducer that spill over and are written into HDFS.

Question 9

You have the following key-value pairs as output from your Map task:

(the, 1)

(fox, 1)

(faster, 1)

(than, 1)

(the, 1)

(dog, 1)

How many keys will be passed to the Reducer’s reduce method?

Options:

A.

Six

B.

Five

C.

Four

D.

Two

E.

One

F.

Three
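
One way to check an answer to Question 9 is to group the map output by key, since the framework calls reduce once per distinct key. The following is a minimal Python sketch, not Hadoop code; it assumes a single reducer, and the helper name group_by_key is made up for illustration.

```python
from collections import defaultdict

# Map output pairs copied from the question.
map_output = [
    ("the", 1),
    ("fox", 1),
    ("faster", 1),
    ("than", 1),
    ("the", 1),
    ("dog", 1),
]


def group_by_key(pairs):
    """Collect every value under its key, roughly what the shuffle does before reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


grouped = group_by_key(map_output)

# reduce is invoked once per distinct key; keys reach the reducer in sorted order.
for key in sorted(grouped):
    print(key, grouped[key])
print("reduce calls:", len(grouped))
```

Running the sketch prints each key with its list of values, followed by the number of reduce calls.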

Question 10

Which one of the following statements is FALSE regarding the communication between DataNodes and a federation of NameNodes in Hadoop 2.0?

Options:

A.

Each DataNode receives commands from one designated master NameNode.

B.

DataNodes send periodic heartbeats to all the NameNodes.

C.

Each DataNode registers with all the NameNodes.

D.

DataNodes send periodic block reports to all the NameNodes.

Question 11

Given the following Hive command:

Which one of the following statements is true?

Options:

A.

The files in the mydata folder are copied to a subfolder of /apps/hive/warehouse

B.

The files in the mydata folder are moved to a subfolder of /apps/hive/warehouse

C.

The files in the mydata folder are copied into Hive's underlying relational database

D.

The files in the mydata folder do not move from their current location in HDFS

Question 12

Given the following Hive commands:

Which one of the following statements is true?

Options:

A.

The file mydata.txt is copied to a subfolder of /apps/hive/warehouse

B.

The file mydata.txt is moved to a subfolder of /apps/hive/warehouse

C.

The file mydata.txt is copied into Hive's underlying relational database

D.

The file mydata.txt does not move from its current location in HDFS

Question 13

Which one of the following statements regarding the components of YARN is FALSE?

Options:

A.

A Container executes a specific task as assigned by the ApplicationMaster

B.

The ResourceManager is responsible for scheduling and allocating resources

C.

A client application submits a YARN job to the ResourceManager

D.

The ResourceManager monitors and restarts any failed Containers

Question 14

Your client application submits a MapReduce job to your Hadoop cluster. Identify the Hadoop daemon on which the Hadoop framework will look for an available slot to schedule a MapReduce operation.

Options:

A.

TaskTracker

B.

NameNode

C.

DataNode

D.

JobTracker

E.

Secondary NameNode

Question 15

Review the following data and Pig code.

M,38,95111

F,29,95060

F,45,95192

M,62,95102

F,56,95102

A = LOAD 'data' USING PigStorage(',') as (gender:chararray, age:int, zip:chararray);

B = FOREACH A GENERATE age;

Which one of the following commands would save the results of B to a folder in HDFS named myoutput?

Options:

A.

STORE A INTO 'myoutput' USING PigStorage(',');

B.

DUMP B using PigStorage('myoutput');

C.

STORE B INTO 'myoutput';

D.

DUMP B INTO 'myoutput';
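
To see what relation B contains before it is saved, the two Pig statements in Question 15 can be imitated in plain Python. This is only a rough sketch: it assumes the fields are split on commas, as the sample data suggests, and the functions load and project_age are hypothetical stand-ins for LOAD and FOREACH ... GENERATE, not part of Pig.

```python
# Records copied from the question; a Pig LOAD would read them from a file.
raw_lines = [
    "M,38,95111",
    "F,29,95060",
    "F,45,95192",
    "M,62,95102",
    "F,56,95102",
]


def load(lines, delimiter=","):
    """Rough analogue of LOAD ... USING PigStorage(delimiter) with a three-field schema."""
    relation = []
    for line in lines:
        gender, age, zip_code = line.split(delimiter)
        relation.append((gender, int(age), zip_code))
    return relation


def project_age(relation):
    """Rough analogue of FOREACH A GENERATE age: keep only the second field."""
    return [(age,) for _, age, _ in relation]


A = load(raw_lines)
B = project_age(A)
for row in B:
    print(row)
```

Each row of B is a one-field tuple holding an age, which is what the chosen command would write to the myoutput folder.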

Question 16

In a MapReduce job, the reducer receives all values associated with the same key. Which statement best describes the ordering of these values?

Options:

A.

The values are in sorted order.

B.

The values are arbitrarily ordered, and the ordering may vary from run to run of the same MapReduce job.

C.

The values are arbitrarily ordered, but multiple runs of the same MapReduce job will always have the same ordering.

D.

Since the values come from mapper outputs, the reducers will receive contiguous sections of sorted values.
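
The ordering in Question 16 can be explored with a toy model of the shuffle. The sketch below is not Hadoop code: the map task names and values are invented for illustration, and a seeded shuffle stands in for the order in which a reducer happens to fetch map outputs.

```python
import random

# Hypothetical outputs of three map tasks that all emit the key "k".
map_task_outputs = {
    "map_0": [("k", "a1"), ("k", "a2")],
    "map_1": [("k", "b1")],
    "map_2": [("k", "c1"), ("k", "c2")],
}


def values_seen_by_reducer(outputs, rng):
    """Concatenate the values for key "k" in the order the map outputs arrive."""
    arrival_order = list(outputs)
    rng.shuffle(arrival_order)  # stands in for nondeterministic fetch timing
    values = []
    for task in arrival_order:
        values.extend(value for key, value in outputs[task] if key == "k")
    return values


# Two simulated runs of the same job with different arrival orders.
run_1 = values_seen_by_reducer(map_task_outputs, random.Random(1))
run_2 = values_seen_by_reducer(map_task_outputs, random.Random(2))
print(run_1)
print(run_2)
```

Both runs deliver the same set of values for the key; comparing the two printed lists shows whether the order of those values is something a reducer could rely on in this model.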