MC Which of the following measures cannot be used to make the splitting decision in a regression tree?
Entropy correct
Mean Squared Error (MSE) incorrect
ANOVA/F-test incorrect

MC What is the correct ranking of the following analytics applications in terms of maturity?
HR Analytics (most mature), Marketing Analytics (medium mature), Risk Analytics (least mature). incorrect
Marketing Analytics (most mature), Risk Analytics (medium mature), HR Analytics (least mature). incorrect
Risk Analytics (most mature), HR Analytics (medium mature), Marketing Analytics (least mature). incorrect
Risk Analytics (most mature), Marketing Analytics (medium mature), HR Analytics (least mature). correct

MC Which of the following strategies can be used to deal with missing values?
Keep incorrect
Delete incorrect
Replace/impute incorrect
All of these strategies can be applied correct

MC The GIGO principle mainly relates to which aspect of the analytics process?
Data selection incorrect
Data transformation incorrect
Data cleaning incorrect
The GIGO principle applies to all of these listed aspects correct

MC Given the following decision tree:



According to the decision tree, an applicant with Income > $50,000 and High Debt = Yes is classified as:
Good Risk incorrect
Bad Risk correct

MC Which statement is CORRECT?
Featurization in the context of social networks refers to making features (= inputs) out of the network characteristics. correct
The coefficient of determination R² is often used to measure the performance of classification models. incorrect
The adjacency matrix representing a social network is sparse since it contains a lot of non-zero elements. incorrect
The big footprint access to data management and analytics capabilities is a serious drawback of cloud-based solutions. incorrect

MC Outlying observations which represent erroneous data are treated using...
missing value procedures. correct
truncation or capping. incorrect

MC Which of the following is not an advantage of open source software for analytics?
It has been thoroughly engineered and extensively tested, validated and completely documented. correct
It is available for free. incorrect
It can be used in combination with commercial software. incorrect
A world-wide network of developers can work on it. incorrect

MC Which of the following are interesting data sources to consider to boost the performance of analytical models?
Network data incorrect
External data incorrect
Unstructured data such as text data and multimedia data incorrect
All of the above correct

MC Which of the following is not an advantage of open source software for analytics?
It is available for free. incorrect
It can be used in combination with commercial software. incorrect
A world-wide network of developers can work on it. incorrect
It has been thoroughly engineered and extensively tested, validated and completely documented. correct