Test Information:
Total Questions: 110
Test Number: APACHE-HADOOP-DEVELOPER
Vendor Name: Hortonworks
Cert Name : HCAHD
Test Name: Hadoop 2.0 Certification exam for Pig and Hive Developer
Official Site: http://www.certsgrade.com
For More Details: http://www.certsgrade.com/pdf/apache-hadoop-developer/
Question: 1
Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application containers and monitoring application resource usage.
A. ResourceManager
B. NodeManager
C. ApplicationMaster
D. ApplicationMasterService
E. TaskTracker
F. JobTracker
Answer: B
Reference:
Apache Hadoop YARN - Concepts & Applications
Question: 2
You want to run Hadoop jobs on your development workstation for testing before you submit them to your production cluster. Which mode of operation in Hadoop allows you to most closely simulate a production cluster while using a single machine?
A. Run all the nodes in your production cluster as virtual machines on your development workstation.
B. Run the hadoop command with the -jt local and the -fs file:/// options.
C. Run the DataNode, TaskTracker, NameNode and JobTracker daemons on a single machine.
D. Run simldooop, the Apache open-source software for simulating Hadoop clusters.
Answer: C
Question: 3
You have the following key-value pairs as output from your Map task:
(the, 1) (fox, 1) (faster, 1) (than, 1) (the, 1) (dog, 1)
How many keys will be passed to the Reducer's reduce method?
A. Six
B. Five
C. Four
D. Two
E. One
F. Three
Answer: B
Explanation:
The two (the, 1) pairs share one key and are grouped into a single reduce call, so the reducer receives five distinct keys: the, fox, faster, than, and dog.
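The grouping behind this answer can be sketched in a few lines of Python. This is only an illustration of how values are collected per key before the reduce step, not Hadoop code; the `shuffle` function name is an assumption made up for this example.

```python
from collections import defaultdict

def shuffle(pairs):
    """Group map output values by key, roughly as the shuffle phase does."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

# Map output taken from the question.
map_output = [("the", 1), ("fox", 1), ("faster", 1),
              ("than", 1), ("the", 1), ("dog", 1)]

grouped = shuffle(map_output)
print(len(grouped))     # number of distinct keys seen by reduce: 5
print(grouped["the"])   # both values for "the" arrive together: [1, 1]
```

Each entry in `grouped` corresponds to one call to the reduce method, which is why six pairs yield five keys.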
Question: 4
Which project gives you a distributed, scalable data store that allows you random, realtime read/write access to hundreds of terabytes of data?
A. HBase
B. Hue
C. Pig
D. Hive
E. Oozie
F. Flume
G. Sqoop
Answer: A
Explanation:
Use Apache HBase when you need random, realtime read/write access to your Big Data.
Note: This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
Features:
Linear and modular scalability.
Strictly consistent reads and writes.
Automatic and configurable sharding of tables.
Automatic failover support between RegionServers.
Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
Easy to use Java API for client access.
Block cache and Bloom Filters for real-time queries.
Query predicate push down via server side Filters.
Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options.
Extensible jruby-based (JIRB) shell.
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX.
Reference:
http://hbase.apache.org/ (when would I use HBase? First sentence)
Question: 5
Which one of the following statements describes a Pig bag, tuple, and map, respectively?
A. Unordered collection of maps, ordered collection of tuples, ordered set of key/value pairs
B. Unordered collection of tuples, ordered set of fields, set of key/value pairs
C. Ordered set of fields, ordered collection of tuples, ordered collection of maps
D. Ordered collection of maps, ordered collection of bags, and unordered set of key/value pairs
Answer: B
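As a rough analogy only, the three Pig types in answer B can be mimicked with Python built-ins. The sample values and variable names below are hypothetical, and Python's types are not identical to Pig's; the sketch just shows which properties (field order, no ordering among tuples, key lookup) matter.

```python
from collections import Counter

# Tuple: an ordered set of fields, so position is significant.
row1 = ("M", 62, "95102")
row2 = ("F", 45, "95103")

# Bag: an unordered collection of tuples that may contain duplicates.
# A Counter used as a multiset ignores insertion order but keeps counts.
bag_a = Counter([row1, row2, row1])
bag_b = Counter([row2, row1, row1])
same_bag = (bag_a == bag_b)          # order of tuples does not matter

# Map: a set of key/value pairs looked up by key.
pig_map = {"gender": "M", "zip": "95102"}

print(same_bag)                      # True
print(row1[2], pig_map["zip"])       # field by position vs. value by key
```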
Question: 6
Which HDFS command copies an HDFS file named foo to the local filesystem as localFoo?
A. hadoop fs -get foo localFoo
B. hadoop -cp foo localFoo
C. hadoop fs -ls foo
D. hadoop fs -put foo localFoo
Answer: A
Question: 7
You are developing a MapReduce job for sales reporting. The mapper will process input keys representing the year (IntWritable) and input values representing product identifiers (Text).
Identify what determines the data types used by the Mapper for a given job.
A. The key and value types specified in the JobConf.setMapInputKeyClass and JobConf.setMapInputValuesClass methods
B. The data types specified in the HADOOP_MAP_DATATYPES environment variable
C. The mapper-specification.xml file submitted with the job determines the mapper's input key and value types.
D. The InputFormat used by the job determines the mapper's input key and value types.
Answer: D
Explanation:
The input types fed to the mapper are controlled by the InputFormat used. The default input format, "TextInputFormat," will load data in as (LongWritable, Text) pairs. The long value is the byte offset of the line in the file. The Text object holds the string contents of the line of the file.
Note: The data types emitted by the reducer are identified by setOutputKeyClass() and setOutputValueClass(). By default, it is assumed that these are the output types of the mapper as well. If this is not the case, the setMapOutputKeyClass() and setMapOutputValueClass() methods of the JobConf class will override these.
Reference:
Yahoo! Hadoop Tutorial, THE DRIVER METHOD
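To make the (byte offset, line) record shape concrete, here is a small Python sketch that imitates what the explanation says TextInputFormat hands to the mapper. It is a simulation, not the Hadoop implementation; `text_input_records` and the sample data are assumptions invented for this example.

```python
def text_input_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, mimicking (LongWritable, Text)."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is the offset of the line start; the value is the line
        # contents without its trailing newline.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)

# Hypothetical input file contents: year, tab, product identifier.
sample = b"2014\twidget\n2015\tgadget\n"

records = list(text_input_records(sample))
for key, value in records:
    print(key, repr(value))
```

The mapper would have to parse the year out of the line text itself; the key it receives is only a position in the file, which is why the InputFormat, not the job configuration, fixes the mapper's input types.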
Question: 8
All keys used for intermediate output from mappers must:
A. Implement a splittable compression algorithm.
B. Be a subclass of FileInputFormat.
C. Implement WritableComparable.
D. Override isSplitable.
E. Implement a comparator for speedy sorting.
Answer: C
Explanation:
The MapReduce framework operates exclusively on <key, value> pairs, that is, the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output of the job, conceivably of different types.
The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.
Reference:
MapReduce Tutorial
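The two requirements described above, serialization and ordering, can be illustrated with a Python stand-in. This is not the Hadoop WritableComparable interface; the `YearKey` class and its `write` and `read_fields` methods are hypothetical names chosen to echo the idea.

```python
import struct
from functools import total_ordering

@total_ordering
class YearKey:
    """Hypothetical key that can be serialized and compared."""

    def __init__(self, year: int):
        self.year = year

    def write(self) -> bytes:
        # Serialize as a 4-byte big-endian signed integer.
        return struct.pack(">i", self.year)

    @classmethod
    def read_fields(cls, data: bytes) -> "YearKey":
        (year,) = struct.unpack(">i", data)
        return cls(year)

    def __eq__(self, other):
        return self.year == other.year

    def __lt__(self, other):
        return self.year < other.year

    def __hash__(self):
        return hash(self.year)

keys = [YearKey(2015), YearKey(2013), YearKey(2014)]
# Round-trip through bytes (as if sent between tasks), then sort.
restored = [YearKey.read_fields(k.write()) for k in keys]
sorted_years = [k.year for k in sorted(restored)]
print(sorted_years)
```

Without the comparison methods the keys could still be serialized, but they could not be sorted between the map and reduce phases.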
Question: 9
Review the following data and Pig code:
What command to define B would produce the output (M,62,95102) when invoking the DUMP operator on B?
A. B = FILTER A BY (zip == '95102' AND gender == 'M');
B. B = FOREACH A BY (gender == 'M' AND zip == '95102');
C. B = JOIN A BY (gender == 'M' AND zip == '95102');
D. B = GROUP A BY (zip == '95102' AND gender == 'M');
Answer: A
Question: 10
Assuming the following Hive query executes successfully:
Which one of the following statements describes the result set?
A. A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the input data A1 table.
B. An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the inputdata table.
C. A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines column of the inputdata table.
D. A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines column of the inputdata table.
Answer: D
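The result described by answer D, a frequency distribution of the words that follow "you are", can be approximated in plain Python. This sketch does not reproduce the Hive query, which is not shown here; the `words_after` function, its tokenizer, and the sample lines are assumptions for illustration, and it tokenizes per line rather than per sentence.

```python
import re
from collections import Counter

def words_after(lines, context=("you", "are"), top_k=80):
    """Count the word that follows each occurrence of the context words."""
    counts = Counter()
    n = len(context)
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for i in range(len(tokens) - n):
            if tuple(tokens[i:i + n]) == context:
                counts[tokens[i + n]] += 1
    # Most frequent following words first, limited to top_k entries.
    return counts.most_common(top_k)

# Hypothetical contents of a "lines" column.
sample_lines = [
    "You are welcome here.",
    "I think you are right.",
    "You are right, you are kind.",
]

top = words_after(sample_lines)
print(top)
```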