MC Which statement is NOT CORRECT?
Hive queries run much faster than hand-written MapReduce programs. correct
Hive's query language is not as feature-complete as the full SQL standard. incorrect
Hive offers a JDBC interface. incorrect
Hive offers an SQL engine to query Hadoop data. incorrect

MC Which statement is CORRECT?
A traditional RDBMS applies a 'schema on read' approach. incorrect
A company with a large volume of data should always implement a Big Data setup rather than an RDBMS. incorrect
Apache Spark is fault tolerant and efficient. correct
Hive does not support full ACID transaction management. incorrect

MC Which statement is NOT CORRECT?
Hive queries run much faster than hand-written MapReduce programs. correct
Hive offers a JDBC interface. incorrect
Hive offers an SQL engine to query Hadoop data. incorrect
Hive's query language is not as feature-complete as the full SQL standard. incorrect

MC Which statement is NOT CORRECT?
RDDs allow for two forms of operations: transformations and actions. incorrect
RDDs offer failure protection by tracking the lineage of operations that are applied to them. incorrect
RDDs are structured and represent a collection of columnar objects. correct
RDDs represent an abstract, immutable data structure. incorrect

MC Which statement is NOT CORRECT?
Veracity in Big Data refers to data "in change". correct
Variety in Big Data refers to data "in many forms". incorrect
Velocity in Big Data refers to data "in movement". incorrect
Volume in Big Data refers to data "at rest". incorrect

MC Which statement is NOT CORRECT?
A reducer in Hadoop reduces a collection of elements to one or more output elements. incorrect
Reducer workers in Hadoop will start once all mapper workers have finished. correct
A mapper in Hadoop maps each element in a collection to one or more output elements. incorrect
A MapReduce pipeline in Hadoop can include an optional Sorter to sort the final output. incorrect

MC Which statement is NOT CORRECT?
NodeManagers in YARN are responsible for setting up containers on the node hosting a particular (sub)task. incorrect
YARN's JobHistoryServer keeps a log of all finished jobs. incorrect
The YARN ApplicationMaster contains a scheduler which will hold submitted jobs in a queue until they are deemed ready to start. correct
Apart from handling MapReduce programs, YARN can also be used to manage other types of applications. incorrect

MC Which statement is CORRECT?
One of the disadvantages of Spark is that its streaming API does not allow joining multiple streams. incorrect
One of the disadvantages of Spark is that its streaming and machine learning APIs are still mostly RDD-based. correct
One of the disadvantages of Spark is that it has no way to deal with graph-based data. incorrect
One of the disadvantages of Spark is that it does not support streaming data. incorrect

MC Which statement is CORRECT?
One of the disadvantages of Spark is that its streaming API does not allow joining multiple streams. incorrect
One of the disadvantages of Spark is that it does not support streaming data. incorrect
One of the disadvantages of Spark is that it has no way to deal with graph-based data. incorrect
One of the disadvantages of Spark is that its streaming and machine learning APIs are still mostly RDD-based. correct

MC Which statement is CORRECT?
HBase works well on large clusters as well as small ones having a few nodes. incorrect
HBase offers an SQL engine to query its data. incorrect
HBase can be considered as a NoSQL database. correct
MapReduce programs cannot be used with HBase. Data is accessed using simple put and get commands instead. incorrect
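
Study note: the mapper and reducer roles described in the MapReduce question above can be illustrated with a minimal sketch in plain Python. This is not Hadoop code and uses no Hadoop API; the function names (mapper, shuffle, reducer, word_count) and the sample input are assumptions made only for illustration. It shows a mapper turning each input element into one or more key-value pairs, a grouping step between the two phases, a reducer collapsing each group into an output element, and an optional final sort.

```python
from collections import defaultdict


def mapper(line):
    # Map one input element (a line of text) to zero or more (key, value) pairs.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group all values by key; this stands in for the framework's
    # shuffle step between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    # Reduce the collection of values for one key to a single output element.
    return (key, sum(values))


def word_count(lines):
    mapped = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(mapped)
    # Optional final sort of the reduced output.
    return sorted(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    for word, count in word_count(["Hive runs on Hadoop", "Spark runs on YARN or Hadoop"]):
        print(word, count)
```

The sketch runs the phases strictly one after another in a single process, which is a simplification; it says nothing about how a real cluster schedules mapper and reducer workers.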