Refer to the exhibit.
What is the approximate R-squared value for a linear regression model fitted to the data associated with this scatterplot?
After which phase of the data analytics lifecycle should you determine if the model needs any recalibration?
What type of variable is the dependent variable in a logistic regression?
In K-means clustering, what is a graph of the WSS versus the value of K used to help determine?
When using association rules, what is an itemset?
When building a K-means clustering model, you notice that the clusters did not segment on variables that you expected. What should you do?
Refer to the exhibit.
To predict whether or not a customer will renew their annual property insurance policy, an insurance company built and operationalized a naïve Bayes classification model. In the model, there are two class labels, renewal and non-renewal, that are assigned to each customer based on their attributes.
A subset of the key attributes, their values, and corresponding conditional probabilities are provided in the exhibit.
A customer has the following attributes:
● Age is greater than 65 years
● Owns their own home
● Renewal month is August
If 20% of customers do not renew the policy every year, what is the score for a renewal in the naïve Bayesian model for the customer described above?
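The form of the calculation asked for above can be sketched as follows. This is a minimal illustration, not the worked answer: the exhibit with the actual conditional probabilities is not reproduced here, so the three values in `hypothetical_conditionals` are placeholders, and the function name `naive_bayes_score` is likewise an assumption made for this sketch. The only figure taken from the question is the 20% non-renewal rate, which gives a renewal prior of 0.80.

```python
from math import prod


def naive_bayes_score(prior, conditionals):
    """Unnormalized naive Bayes score: P(class) times the product of P(attribute | class)."""
    return prior * prod(conditionals)


# From the question: 20% of customers do not renew, so P(renewal) = 1 - 0.20.
p_renewal = 1 - 0.20

# HYPOTHETICAL placeholders, NOT the exhibit's values. Substitute, in order:
# P(age > 65 | renewal), P(owns home | renewal), P(renewal month = August | renewal).
hypothetical_conditionals = [0.5, 0.5, 0.5]

score = naive_bayes_score(p_renewal, hypothetical_conditionals)
print(score)
```

With the exhibit's real conditional probabilities substituted for the placeholders, the same function gives the renewal score; computing the non-renewal score the same way (prior 0.20) would allow the two classes to be compared.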
What metrics are used to help calculate relevance in text analysis?
Which SQL OLAP grouping extension returns a result for each output row with 1 identifying a summary row and 0 identifying grouped rows?
What does “MAD” in MADlib stand for?
What data asset is an example of quasi-structured data?
What is the purpose of applying the naïve Bayes conditional independence assumption?
Match each task to its description.
What is a key consideration when preparing a presentation intended for analysts?
What action occurs during feature selection in the model building phase of the data analytics lifecycle?
Which R function plots a distribution of a single variable along two different axes?