Assignment 10
This assignment provides you with an opportunity to implement a classification model using kNN.
Problem (100 Points)
- (10 Pts) Download the data set for the tutorial and load it into R.
- (50 Pts) Follow this tutorial on applying kNN to prostate cancer detection and implement all of the steps in an R Notebook. Make sure to explain each step and what it does. (Note: The data set provided as part of this assignment has been slightly modified from the one used in the tutorial, so small deviations in the result can be expected.)
- (30 Pts) Once you've completed the tutorial, try a second kNN implementation from another package, such as the caret package.
- (10 Pts) Use the confusionMatrix function from the caret package to determine the accuracy of both algorithms. Here's a tutorial on confusion matrices and what they mean.
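The general shape of the workflow described above can be sketched as follows. This is only an illustrative outline and not the tutorial's code: it uses two species from R's built-in iris data as a stand-in for the assignment's data set, and the helper `normalize`, the 70/30 split, the seed, and the choice of k are all assumptions made for the example. The caret portion runs only if caret is installed.

```r
library(class)  # provides knn(); ships with standard R distributions

set.seed(42)  # arbitrary seed so the random split is reproducible

# Stand-in data (assumption): two iris species, NOT the assignment's data set
dat <- iris[iris$Species != "setosa", ]
dat$Species <- droplevels(dat$Species)

# Min-max normalization so that no single feature dominates the distance
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
features <- as.data.frame(lapply(dat[, 1:4], normalize))

# 70/30 train/test split (the proportion is an assumption)
train_idx <- sample(nrow(features), size = round(0.7 * nrow(features)))
train_x <- features[train_idx, ]
test_x  <- features[-train_idx, ]
train_y <- dat$Species[train_idx]
test_y  <- dat$Species[-train_idx]

# kNN from the class package; k near sqrt(training rows) is a common rule of thumb
k <- round(sqrt(nrow(train_x)))
pred_class <- knn(train = train_x, test = test_x, cl = train_y, k = k)

# Confusion table and accuracy computed by hand
conf_tab <- table(Predicted = pred_class, Actual = test_y)
accuracy <- sum(diag(conf_tab)) / sum(conf_tab)
print(conf_tab)
print(accuracy)

# Second implementation via caret, guarded in case the package is missing
if (requireNamespace("caret", quietly = TRUE)) {
  fit <- caret::train(x = train_x, y = train_y, method = "knn",
                      tuneGrid = data.frame(k = k),
                      trControl = caret::trainControl(method = "none"))
  pred_caret <- predict(fit, newdata = test_x)
  # confusionMatrix reports accuracy plus sensitivity, specificity, etc.
  print(caret::confusionMatrix(data = pred_caret, reference = test_y))
}
```

In the notebook itself, each of these stages would normally sit in its own chunk with an explanation of what it does, and the stand-in data would be replaced by the downloaded data set.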
Submission Details
- Your submission must contain two files: the .Rmd notebook and a PDF or HTML file knitted from that notebook. Name the R Notebook DA5020.A11.LastName.Rmd and the knitted file DA5020.A11.LastName.{pdf,html}, where LastName is *your* last name.
- The .Rmd file must contain fully commented, properly "chunked" R code along with detailed explanations. Make sure that it is easy to recognize which question each part answers and that your code runs from beginning to end, because that is how we will test it. Code that does not execute, stops, or throws errors will, naturally, receive no points. If the graders have to "debug" your code or spend any effort getting it to run, substantial points will be deducted.
- Not submitting the knitted PDF or HTML file will result in a reduction of 30 points.
- Not submitting the .Rmd file (or submitting neither file) will result in a score of 0.
Useful Resources