MC Which components does the base Hadoop stack include?
HDFS, Map and Reduce. incorrect
NDFS, MapReduce, and YARN. incorrect
HDFS, MapReduce and YARN. correct
HDFS, Spark and YARN. incorrect

MC Which of the following is NOT one of the reasons why Spark programs are generally faster than MapReduce operations?
Because Spark tries to keep its RDDs in memory as long as possible. incorrect
Because Spark uses a directed acyclic graph instead of MapReduce. incorrect
Because Mesos can be used as a resource manager instead of YARN. correct
Because RDD transformations are "lazily" applied. incorrect

MC Which statement is NOT CORRECT?
Spark SQL exposes DataFrame and Dataset APIs, which use RDDs under the hood together with a performant SQL query engine. incorrect
Spark SQL can be used through ODBC and JDBC interfaces. incorrect
Spark SQL can be used from within Java, Python, Scala and R. incorrect
Spark SQL DataFrames need to be created by loading a file. correct

MC Which statement is CORRECT?
The HDFS NameNode sends regular heartbeat messages to its DataNodes. incorrect
HDFS is composed of a NameNode, DataNodes, and an optional SecondaryNameNode. correct
DataNodes in HDFS store a registry of metadata. incorrect
Both the SecondaryNameNode and primary NameNode can simultaneously handle requests from clients. incorrect

MC Which statement is CORRECT?
HDFS is composed of a NameNode, DataNodes, and an optional SecondaryNameNode. correct
Both the SecondaryNameNode and primary NameNode can simultaneously handle requests from clients. incorrect
DataNodes in HDFS store a registry of metadata. incorrect
The HDFS NameNode sends regular heartbeat messages to its DataNodes. incorrect

MC Which statement is CORRECT?
HBase works well on large clusters as well as on small ones with only a few nodes. incorrect
MapReduce programs cannot be used with HBase. Data is accessed using simple put and get commands instead. incorrect
HBase offers a SQL engine to query its data. incorrect
HBase can be considered a NoSQL database. correct

MC Which statement is NOT CORRECT?
RDDs offer failure protection by tracking the lineage of operations that are applied on them. incorrect
In MapReduce, the reduce phase can start before the mappers have finished. incorrect
Hive queries run much faster than hand-written MapReduce programs. correct
HDFS's strong emphasis on fault tolerance results in data replication. incorrect

MC Which statement is CORRECT?
Spark SQL can be used through ODBC and JDBC interfaces. correct
The YARN ApplicationMaster contains a scheduler which will hold submitted jobs in a queue until they are deemed ready to start. incorrect
Hive queries run much faster than hand-written MapReduce programs. incorrect
Apache Spark doesn't enforce a relatively linear and fixed data flow structure. incorrect

MC Which statement is CORRECT?
The 5 Vs of Big Data do not take economic value into account. incorrect
Hive applies a 'schema on query' approach. incorrect
A MapReduce program can be implemented in an easy, straightforward manner. incorrect
YARN's JobHistoryServer keeps a log of all finished jobs. correct

MC Which of the following is NOT one of the reasons why Spark programs are generally faster than MapReduce operations?
Because Spark uses a directed acyclic graph instead of MapReduce. incorrect
Because Mesos can be used as a resource manager instead of YARN. correct
Because RDD transformations are "lazily" applied. incorrect
Because Spark tries to keep its RDDs in memory as long as possible. incorrect
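
Study note: several of the Spark answers above rely on two ideas, namely that RDD transformations are applied lazily and that lineage is tracked so results can be recomputed. The plain-Python sketch below illustrates both ideas only in spirit. It is not Spark's API: the `LazyDataset` class and the `double` helper are hypothetical names made up for this illustration.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, not executed."""

    def __init__(self, source, lineage=()):
        self._source = list(source)      # the original input data
        self._lineage = tuple(lineage)   # ordered record of transformations

    def map(self, fn):
        # Transformation: return a new dataset with one more lineage step.
        return LazyDataset(self._source, self._lineage + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only recorded, never run here.
        return LazyDataset(self._source, self._lineage + (("filter", pred),))

    def collect(self):
        # Action: replay the recorded lineage over the source data.
        data = self._source
        for kind, fn in self._lineage:
            if kind == "map":
                data = [fn(x) for x in data]
            else:
                data = [x for x in data if fn(x)]
        return data


calls = []  # records every element that double() actually processes


def double(x):
    calls.append(x)
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = list(calls)  # still empty: nothing has run yet
result = ds.collect()              # the action triggers the computation
print(result)                      # [6, 8]
```

Because `ds` stores only the source and its lineage, calling `collect()` again recomputes the same result from scratch, which is the intuition behind lineage-based recovery of a lost partition.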