Data Mining: Module Four Exercise 4

Published October 2017

Page length requirements: 2–3 pages

This exercise is a continuation of the data mining project introduced in the Module Two Exercise.

Your Assignment

Open the Bubba Gump survey data in JMP. Examine the data set and prepare an analytics project plan that describes the survey data set and how it will be used to address the stated business problem. Specifically, the summary should:

* Include a description of the population from which the sample was drawn, the sources of data that were combined to construct the sample, the number of customers in the sample, and descriptions of the variables that exist in the data set.
* From plots and graphs (generated using JMP, with continuous variables appropriately binned) of the distribution of values for each of the variables in the Bubba Gump sample, describe instances where data may be missing or defective or where variables may contain extreme outliers that affect the usefulness of the survey in a data mining exercise.
* Identify correlations and associations, using pairwise correlations and principal components analysis, that would be useful to measure as part of the pre-analytics process, including descriptions of the benefits of each.
* Describe how the data set supports analyses that address the stated business problem, and also describe any shortcomings in the data set that might limit its usefulness in a data mining exercise.
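The assignment calls for JMP, but the same pre-analytics checks can be illustrated outside it. The following is a minimal sketch in Python with NumPy, not the course's method. The toy array and its column meanings (visits, spend, satisfaction) are hypothetical stand-ins, since the actual Bubba Gump survey variables are not listed here. It counts missing values, bins a continuous variable, flags outliers with the common 1.5 × IQR rule, and computes a correlation matrix and the share of variance per principal component. For simplicity it drops every row that has a missing value before computing correlations, which is cruder than estimating each pair from all observations available for that pair.

```python
import numpy as np

def profile_column(values, bins=5):
    """Count missing values, bin the rest, and flag 1.5 * IQR outliers."""
    values = np.asarray(values, dtype=float)
    missing = int(np.isnan(values).sum())
    present = values[~np.isnan(values)]
    counts, _edges = np.histogram(present, bins=bins)  # equal-width bins
    q1, q3 = np.percentile(present, [25, 75])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = present[(present < low) | (present > high)]
    return {
        "missing": missing,
        "bin_counts": counts.tolist(),
        "outliers": outliers.tolist(),
    }

def correlation_matrix(data):
    """Pearson correlations using only rows with no missing values."""
    complete = data[~np.isnan(data).any(axis=1)]
    return np.corrcoef(complete, rowvar=False)

def pca_explained_variance(data):
    """Share of variance per principal component (correlation-based PCA)."""
    eigenvalues = np.linalg.eigvalsh(correlation_matrix(data))[::-1]  # descending
    return eigenvalues / eigenvalues.sum()

# Hypothetical toy data; columns: visits, spend, satisfaction.
survey = np.array([
    [2, 40.0, 3],
    [4, 85.0, 4],
    [1, 22.0, 2],
    [5, 110.0, 5],
    [3, np.nan, 4],   # missing spend
    [2, 900.0, 3],    # implausibly large spend
    [6, 120.0, 5],
    [3, 60.0, 3],
])

spend_profile = profile_column(survey[:, 1])
corr = correlation_matrix(survey)
variance_share = pca_explained_variance(survey)
```

In this toy data, `spend_profile` reports one missing value and flags the 900.0 entry as an outlier. A first component that carries most of `variance_share` suggests the variables are largely redundant, which is the kind of observation the project plan is asked to describe.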

