Data Mining

This coursework must be completed in SAS 9.2. I will send the data as an email attachment, together with an e-book that should be helpful. The Pearson International Edition of Introduction to Data Mining will also be useful; a picture of the book will be in the email. I am in the UK, but the country selection does not include the UK. I must achieve a first-class mark.

Problem Description

The objective of this coursework is to develop a suitable model to classify the German credit dataset. Submit a report of maximum length 20 pages that documents the process you adopted and the reasons why, as well as the obtained results.

Focus your answers on explaining and rationalising the process you followed (why you made the different choices and how you evaluated different alternatives). This coursework is not meant to assess your skills in using SAS EM, but rather your understanding of the data mining process. Therefore, details of how different tasks are performed in SAS EM are irrelevant. It is also insufficient to report the outcome from executing different SAS EM nodes without commenting upon it and explaining the implications for the problem at hand.

Without partitioning the dataset, perform an exploratory analysis of the data using visualisation and statistical analysis tools. In particular:

• Consider the distribution of the target variable. Are the two classes balanced in the dataset? Given the nature of the task and the distribution of the target variable, which performance evaluation measures would you propose to use for this type of problem? Rationalise your

• Explore the distribution of each independent variable, and comment on whether and why a transformation should be considered. Using visualisation methods like histograms, as well as measures like information value and weights of evidence, quantify the predictive power of each variable separately.
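
As an illustration of the weights-of-evidence and information-value measures mentioned above, here is a minimal Python sketch for a single categorical variable. It is not the SAS EM procedure. The function name `woe_iv`, the toy data, the 0 = good / 1 = bad coding and the smoothing constant are all assumptions made for the example.

```python
import math
from collections import Counter


def woe_iv(values, targets, good_label=0, smoothing=0.5):
    """Return ({category: WoE}, information value) for one categorical variable.

    WoE(category) = ln(share of goods in category / share of bads in category)
    IV            = sum over categories of (share_good - share_bad) * WoE

    `smoothing` is added to every cell count to avoid log(0); it is an
    illustrative assumption, not something the coursework prescribes.
    """
    good = Counter()
    bad = Counter()
    for value, target in zip(values, targets):
        if target == good_label:
            good[value] += 1
        else:
            bad[value] += 1

    categories = sorted(set(good) | set(bad))
    total_good = sum(good[c] + smoothing for c in categories)
    total_bad = sum(bad[c] + smoothing for c in categories)

    woe = {}
    iv = 0.0
    for c in categories:
        share_good = (good[c] + smoothing) / total_good
        share_bad = (bad[c] + smoothing) / total_bad
        woe[c] = math.log(share_good / share_bad)
        iv += (share_good - share_bad) * woe[c]
    return woe, iv


# Hypothetical toy data: one categorical variable and a 0 = good / 1 = bad target.
status = ["low", "low", "low", "high", "high", "high", "high", "low"]
target = [1, 1, 0, 0, 0, 0, 1, 1]
woe, iv = woe_iv(status, target)
print(woe, iv)
```

A category with negative WoE holds relatively more bad cases than the dataset as a whole, and a larger information value suggests a stronger univariate relationship with the target. Continuous variables would first need to be binned, and the choice of bins affects both measures.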

Assign 30% of the data as test set, 50% as training, and 20% as validation.

To ensure your analysis is unique set your own seed at the Data Partition stage of your analysis. Use as seed the number 38099.
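
The effect of a fixed seed on the partition can be sketched in Python as follows. This only illustrates the idea: the function `partition` and the placeholder rows are assumptions, and Python's random number generator differs from the one in SAS EM, so the same seed will not reproduce the SAS EM partition.

```python
import random


def partition(rows, seed=38099, train=0.5, validation=0.2):
    """Shuffle row indices with a fixed seed and cut them into three sets.

    Whatever is left after the training and validation shares becomes the
    test set (30% with the default shares).
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # same seed -> same shuffle every run

    n_train = round(len(rows) * train)
    n_valid = round(len(rows) * validation)

    train_idx = indices[:n_train]
    valid_idx = indices[n_train:n_train + n_valid]
    test_idx = indices[n_train + n_valid:]
    return train_idx, valid_idx, test_idx


rows = list(range(1000))  # placeholder for the data rows (hypothetical size)
train_idx, valid_idx, test_idx = partition(rows)
print(len(train_idx), len(valid_idx), len(test_idx))  # 500 200 300
```

This sketch draws a simple random partition. If the target classes are unbalanced, stratifying the split on the target keeps the class proportions similar across the three sets.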

Use variable selection methods as well as logistic regression to determine an appropriate subset of variables that can be used to build a classifier for this problem. Comment on the limitations of the different approaches and discuss your findings. For which of the classification methods that you will develop later do you expect the proposed subset to be most relevant, and why?

Develop and evaluate a classification model for each of the methods that you have been taught. For each method identify up to 3 parameters of the method that you believe are important for the final performance of a model and consider different settings. Report your findings and comment on the sensitivity of performance with respect to each parameter setting.
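
Comparing parameter settings requires a performance measure that is not dominated by the majority class. The Python sketch below scores one example parameter, the classification cut-off, with accuracy, sensitivity, specificity and balanced accuracy. The scores, labels, cut-off values and function names are hypothetical, and the choice of cut-off as the parameter is only an illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarise(actual, predicted, positive=1):
    """Return accuracy, sensitivity, specificity and balanced accuracy."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical validation-set labels (1 = bad credit) and model scores.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

# Evaluate the same model at several cut-offs to see how sensitive each
# measure is to this setting.
results = {}
for cutoff in (0.35, 0.5, 0.65):
    predicted = [1 if s >= cutoff else 0 for s in scores]
    results[cutoff] = summarise(actual, predicted)
    print(cutoff, results[cutoff])
```

Tabulating the measures per setting in this way makes it easier to comment on sensitivity: a setting that barely moves accuracy can still shift the balance between sensitivity and specificity considerably.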

Reach a final recommendation for a classification model for each method.

Discuss the strengths and weaknesses of different classifiers. (This should not just repeat the material from the course. Try to relate this to your experience from building a classification model for this problem.) Justify
