MC Given the following five transactions:
T1 {K, A, D, B}
T2 {D, A, C, E, B}
T3 {C, A, B, D}
T4 {B, A, E}
T5 {B, E, D}
Consider the association rule R: A -> BD.
Which statement is correct? The support of R is 60% and the confidence is 100%. incorrect The support of R is 60% and the confidence is 75%. correct The support of R is 100% and the confidence is 75%. incorrect The support of R is 75% and the confidence is 60%. incorrect
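The support and confidence of rule R can be checked by counting transactions. The following is a minimal Python sketch; the function name `support_confidence` is illustrative and does not come from the source.

```python
# Transactions T1..T5 from the question above.
transactions = [
    {"K", "A", "D", "B"},
    {"D", "A", "C", "E", "B"},
    {"C", "A", "B", "D"},
    {"B", "A", "E"},
    {"B", "E", "D"},
]

def support_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Transactions containing the antecedent (here: A).
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    # Transactions containing antecedent and consequent together (here: A, B, D).
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

support, confidence = support_confidence(transactions, {"A"}, {"B", "D"})
print(f"support={support:.0%}, confidence={confidence:.0%}")
# prints: support=60%, confidence=75%
```

A, B and D occur together in T1, T2 and T3 (3 of 5 transactions), while A occurs in T1 to T4, so the confidence is 3/4.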
MC Which statement is CORRECT? The geodesic represents the longest path between two nodes. incorrect The betweenness counts the number of times that a node or edge occurs in the geodesics of the network. correct The graph-theoretic center is the node with the highest minimum distance to all other nodes. incorrect The closeness is always higher than the betweenness. incorrect
MC Which statement is NOT CORRECT? Although the benefit component is usually not that difficult to approximate, the costs are much harder to precisely quantify. correct Negative ROI of analytics often boils down to the lack of good quality data, management support and a company-wide data-driven decision culture. incorrect ROI analysis offers a common firm-wide language to compare multiple investment opportunities and decide which one(s) to go for. incorrect For companies like Facebook, Amazon, Netflix and Google, a positive ROI is obvious since they essentially thrive on data and analytics. incorrect
MC Bootstrapping refers to: Drawing samples with replacement. correct Drawing samples without replacement. incorrect
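Drawing with replacement can be sketched in a few lines of Python. This is an illustration only: the helper `bootstrap_sample`, the toy data and the seed are assumptions, not part of the source.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) observations from data WITH replacement."""
    # Each draw picks from the full data set again, so an observation
    # can appear several times and others may not appear at all.
    return [rng.choice(data) for _ in data]

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
data = [1, 2, 3, 4, 5]
sample = bootstrap_sample(data, rng)
print(sample)
```

Sampling without replacement at the same size would merely reorder the data, which is why the bootstrap relies on replacement.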
MC Clustering, association rules and sequence rules are examples of: Predictive analytics incorrect Descriptive analytics correct
MC Consider a data set with a multiclass target variable as follows: 25% bad payers, 25% poor payers, 25% medium payers and 25% good payers. In this case, the entropy will be: Minimal incorrect Maximal correct
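The entropy claim can be verified numerically. The sketch below assumes Shannon entropy with a base-2 logarithm; the skewed and pure distributions are made-up comparison cases, not from the source.

```python
import math

def entropy(proportions):
    """Shannon entropy in bits; classes with zero proportion contribute nothing."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

balanced = entropy([0.25, 0.25, 0.25, 0.25])  # the distribution from the question
skewed = entropy([0.70, 0.10, 0.10, 0.10])    # hypothetical comparison case
pure = entropy([1.0, 0.0, 0.0, 0.0])          # hypothetical single-class case

print(f"balanced={balanced:.3f} bits, skewed={skewed:.3f} bits")
```

With four equally likely classes the entropy reaches its upper bound of log2(4) = 2 bits; any less even split gives a lower value, and a pure node gives 0.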
MC Which of the following costs should be included in a Total Cost of Ownership (TCO) analysis? Acquisition costs incorrect Ownership and operation costs incorrect Post-ownership costs incorrect All of these costs correct
MC Decision trees can be used in the following applications: Credit risk scoring, churn prediction, customer profile segmentation and market basket analysis. incorrect Credit risk scoring and churn prediction correct Credit risk scoring incorrect Credit risk scoring, churn prediction and customer profile segmentation incorrect
MC Which statement is CORRECT? The big footprint access to data management and analytics capabilities is a serious drawback of cloud-based solutions. incorrect On-premise solutions catalyze improved collaboration across business departments and geographical locations. incorrect An important advantage of cloud-based solutions concerns the scalability and economies of scale offered. More capacity (e.g. servers) can be added on the fly whenever needed. correct When using on-premise solutions, maintenance or upgrade projects may even go unnoticed. incorrect
MC Which statement is NOT CORRECT? The lift curve can be summarized by reporting top decile lift. incorrect There is a linear relation between AR and AUC: AR = 2 × AUC − 1. incorrect Data preprocessing activities such as handling missing values, duplicate data or outliers are preventive measures for dealing with data quality issues. correct Data stewards are the data quality experts who oversee assessing data quality by performing extensive and regular data quality checks. incorrect
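The linear relation between the accuracy ratio (AR) and the AUC can be illustrated with a small Python sketch. The scores are made up for the example, and the helper names `auc_pairwise` and `accuracy_ratio` are hypothetical; the AUC is computed here as the share of correctly ranked (positive, negative) pairs.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def accuracy_ratio(auc):
    """Accuracy ratio (Gini coefficient) from the AUC: AR = 2 * AUC - 1."""
    return 2 * auc - 1

# Made-up model scores: higher should mean "more likely positive".
auc = auc_pairwise([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
ar = accuracy_ratio(auc)
print(f"AUC={auc:.3f}, AR={ar:.3f}")
```

A random model (AUC = 0.5) maps to AR = 0 and a perfect model (AUC = 1) maps to AR = 1.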