MC Given the following five transactions:
T1 {K, A, D, B}
T2 {D, A, C, E, B}
T3 {C, A, B, D}
T4 {B, A, E}
T5 {B, E, D}
Consider the association rule R: A -> BD.
Which statement is correct? The support of R is 60% and the confidence is 75%. correct The support of R is 75% and the confidence is 60%. incorrect The support of R is 60% and the confidence is 100%. incorrect The support of R is 100% and the confidence is 75%. incorrect
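The marked answer can be checked by counting. The sketch below assumes the usual definitions: support(A -> BD) is the fraction of transactions containing A, B and D, and confidence is that count divided by the number of transactions containing A. The helper names `count_containing`, `rule_support` and `rule_confidence` are illustrative and do not come from the quiz.

```python
# The five transactions from the question, as sets of items.
transactions = [
    {"K", "A", "D", "B"},       # T1
    {"D", "A", "C", "E", "B"},  # T2
    {"C", "A", "B", "D"},       # T3
    {"B", "A", "E"},            # T4
    {"B", "E", "D"},            # T5
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


antecedent = {"A"}
consequent = {"B", "D"}

# Transactions containing A, B and D: T1, T2, T3.
both = count_containing(antecedent | consequent, transactions)
# Transactions containing A: T1, T2, T3, T4.
only_antecedent = count_containing(antecedent, transactions)

rule_support = both / len(transactions)   # 3 / 5
rule_confidence = both / only_antecedent  # 3 / 4

print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

T4 contains A but not D, which is why the confidence falls below 100%; T5 contains neither A nor the full itemset, so it only affects the support denominator.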
MC Outlying observations which represent erroneous data are treated using... missing value procedures. correct truncation or capping. incorrect
MC Featurization in the context of neural networks refers to... adding more nodes to the network. incorrect adding more local features to the data set. incorrect making features (=inputs) out of the network characteristics. correct selecting the most predictive features. incorrect
MC OLAP (On-Line Analytical Processing) can help in which of the following steps of the analytics process? Data transformation incorrect Data denormalization incorrect Data collection incorrect Data visualization correct
MC The GIGO-principle mainly relates to which aspect of the analytics process? Data selection incorrect Data transformation incorrect Data cleaning incorrect The GIGO-principle applies to all of these listed aspects correct
MC To guarantee maximum independence and organizational impact of analytics, it is important that... the Chief Data Officer (CDO) or Chief Analytics Officer (CAO) reports to the CIO or CFO. incorrect analytics is supervised only locally in the business units. incorrect the CIO takes care of all analytical responsibilities. incorrect a Chief Data Officer or Chief Analytics Officer who reports directly to the CEO is added to the executive committee. correct
MC Which of the following strategies can be used to deal with missing values? Keep incorrect Delete incorrect Replace/impute incorrect All of these strategies can be applied correct
MC Which statement is CORRECT? Data preprocessing activities such as handling missing values, duplicate data or outliers are preventive measures for dealing with data quality issues. incorrect Data stewards can request data scientists to check or complete the value of a field. incorrect Data owners are the data quality experts who are in charge of assessing data quality by performing extensive and regular data quality checks. incorrect Quality of data is key to the success of any analytical exercise since it has a direct and measurable impact on the quality of the analytical model and hence its economic value. correct
MC Consider a data set with a multiclass target variable as follows: 25% bad payers, 25% poor payers, 25% medium payers and 25% good payers. In this case, the entropy will be: Minimal incorrect Maximal correct
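To see why a uniform class distribution gives maximal entropy, the sketch below assumes Shannon entropy with base-2 logarithms, H = -sum(p_i * log2(p_i)). The function name `entropy` and the skewed comparison distribution are illustrative assumptions, not part of the question.

```python
import math


def entropy(proportions):
    """Shannon entropy in bits; classes with zero proportion contribute nothing."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)


# The distribution from the question: four equally likely classes.
balanced = entropy([0.25, 0.25, 0.25, 0.25])

# Hypothetical comparison cases (not from the question).
skewed = entropy([0.70, 0.10, 0.10, 0.10])
pure = entropy([1.0, 0.0, 0.0, 0.0])

# With four classes the upper bound is log2(4) = 2 bits.
upper_bound = math.log2(4)

print(balanced, skewed, pure, upper_bound)
```

The balanced case reaches the upper bound of log2(4) = 2 bits, any skewed distribution falls below it, and a data set with a single class has zero entropy.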