Tutorial: STATS 4014, R, Data Science, RJa
STATS 4014 Advanced Data Science
Assignment 4

CHECKLIST

- Have you shown all of your working, including probability notation where necessary?
- Have you given all numbers to 3 decimal places unless otherwise stated?
- Have you included all R output and plots to support your answers where necessary?
- Have you included all of your R code?
- Have you made sure that all plots and tables each have a caption?
- If before the deadline, have you submitted your assignment via the online submission on MyUni?
- Is your submission a single PDF file - correctly orientated, easy to read? If not, penalties apply.
- Penalties for more than one document - 10% of final mark for each extra document. Note that you may resubmit and your final version is marked, but the final document should be a single file.
- Penalties for late submission - within 24 hours: 40% of final mark. After 24 hours, the assignment is not marked and you get zero.
- Assignments emailed instead of submitted by the online submission on MyUni will not be marked and will receive zero.
- Have you checked that the assignment submitted is the correct one, as we cannot accept other submissions after the due date?

Due date: Friday 17th May 2019 (Week 9), 5pm.

Q1. Natural splines

Consider the data
    (x_1, y_1), (x_2, y_2), ..., (x_n, y_n).
Suppose that g(x) is a natural cubic spline with knots
    [...]
Let g̃(x) be any other twice continuously differentiable function such that
    [...]

a. If h(x) = g̃(x) − g(x) then use integration by parts to show that [...] if h(x) = 0 for all a [...]

c. Show that the solution to the problem of finding a smoothing spline:
    [...]
   must be a natural cubic spline with knots at x_1, x_2, ..., x_n.

Q2. ROC class

a. Create an S3 class that deals with ROC curves. For complete marks, you will need
    i.   a constructor,
    ii.  a print function,
    iii. a plot function, and
    iv.  a generic confusion matrix function that takes a ROC object and a cutoff and returns the confusion matrix.

To give an example, code using my S3 class is given below.

    data(starwars)
    starwars <- starwars %>%
      mutate(human = ifelse(species == "Human", 1, 0)) %>%
      na.omit()
    starwars_lr <- [...]
    starwars_roc <- [...] pred = predict(starwars_lr),
      obs = starwars$human)
    starwars_roc
    ## The number of observations is 29.
    ## The number of positives is 18.
    ## The number of negatives is 11.
    ##
    ## First rows of data
    ## # A tibble: 6 x 2
    ##    pred   obs
    ## 1 0.705     1
    ## 2 2.31      1
    ## 3 0.184     1
    ## 4 2.37      1
    ## 5 0.836     1
    ## 6 0.665     1
    ##
    ## First row of summary data frame:
    ##   TP FP FN TN     Score        FPR        TPR precision     recall
    ## 1  0  0 18 11 2.3652725 0.00000000 0.00000000       NaN 0.00000000
    ## 2  1  0 17 11 2.3093987 0.00000000 0.05555556 1.0000000 0.05555556
    ## 3  2  0 16 11 1.6933920 0.00000000 0.11111111 1.0000000 0.11111111
    ## 4  2  1 16 10 0.8576164 0.09090909 0.11111111 0.6666667 0.11111111
    ## 5  2  2 16  9 0.8357629 0.18181818 0.11111111 0.5000000 0.11111111
    ## 6  3  2 15  9 0.7668831 0.18181818 0.16666667 0.6000000 0.16666667

    [Plot output not recoverable; surviving axis label: TPR]

    plot(starwars_roc, type = "PR")
    conf_matrix(starwars_roc)
    ## # A tibble: 2 x 3
    ##      HC   `0`   `1`
    ## 1     0     7     6
    ## 2     1     4    12
    conf_matrix(starwars_roc, cutoff = 0.9)
    ## # A tibble: 2 x 3
    ##      HC   `0`   `1`
    ## 1     0    10    16
    ## 2     1     1     2
    conf_matrix(1:10, cutoff = 0.9)
    ## [1] "I do not know how to deal with the class default"

Q3. Titanic dataset

The data in titanic.csv contains the details for 712 passengers on the ship Titanic.
The following variables are given:

    Variable   Definition                                    Key
    survival   Survival                                      0 = No, 1 = Yes
    pclass     Ticket class                                  1 = 1st, 2 = 2nd, 3 = 3rd
    sex        Sex
    Age        Age in years
    sibsp      # of siblings / spouses aboard the Titanic
    parch      # of parents / children aboard the Titanic
    ticket     Ticket number
    fare       Passenger fare
    cabin      Cabin number
    embarked   Port of Embarkation                           C = Cherbourg, Q = Queenstown, S = Southampton

pclass: A proxy for socio-economic status (SES)
    1st = Upper
    2nd = Middle
    3rd = Lower

age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5

sibsp: The dataset defines family relations in this way...
    Sibling = brother, sister, stepbrother, stepsister
    Spouse = husband, wife (mistresses and fiancés were ignored)

parch: The dataset defines family relations in this way...
    Parent = mother, father
    Child = daughter, son, stepdaughter, stepson
    Some children travelled only with a nanny, therefore parch = 0 for them.

a. Read in the dataset and clean it.
b. Fit a MARS model.
c. Fit a CART.
d. Using both models, predict which is more likely to survive: a first class 24 year old male travelling alone or a first class 24 year old female travelling alone.
e. According to both models, which class and sex are least likely to survive?

Mark scheme

    Part    Marks   Difficulty   Area        Type       Comments
    Q1
    1a      7       0.29         Splines     proof      7 for proof
    1b      7       0.29         Splines     proof      7 for proof
    1c      5       0.00         Splines     proof      5 for proof
    Total   19
    Q2
    2ai     5       0.00         S3 OOP      coding     5 for code
    2aii    5       0.00         S3 OOP      coding     5 for code
    2aiii   6       0.50         S3 OOP      coding     6 for code
    2aiv    6       0.50         S3 OOP      coding     6 for code
    Total   22
    Q3
    3ab     4       0.00         MARS/CART   analysis   4 for analysis
    3c      2       0.00         MARS/CART   analysis   2 for analysis
    3d      4       0.00         MARS/CART   analysis   4 for analysis
    3e      3       0.00         MARS/CART   analysis   3 for analysis
    Total   13

    Assignment total   54

Source: http://www.7daixie.com/2019051621116636.html
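
Note on Q1. The formulas in Q1 did not survive in this copy, so the following is only a sketch of the usual integration-by-parts argument for natural cubic splines, under assumptions the text does not confirm: the knots satisfy a <= x_1 < x_2 < ... < x_n <= b, the function g̃ agrees with g at every knot (so h(x_i) = 0 for all i), and the integrals run over [a, b]. A natural cubic spline is linear outside [x_1, x_n], so g'' vanishes at a and b, and g''' is constant on each interval (x_j, x_{j+1}) and zero outside [x_1, x_n].

```latex
\begin{align*}
\int_a^b g''(x)\,h''(x)\,dx
  &= \bigl[\,g''(x)\,h'(x)\,\bigr]_a^b - \int_a^b g'''(x)\,h'(x)\,dx \\
  &= 0 - \sum_{j=1}^{n-1} g'''(x_j^{+}) \int_{x_j}^{x_{j+1}} h'(x)\,dx \\
  &= -\sum_{j=1}^{n-1} g'''(x_j^{+})\,\bigl(h(x_{j+1}) - h(x_j)\bigr) \;=\; 0, \\[1ex]
\int_a^b \tilde g''(x)^2\,dx
  &= \int_a^b g''(x)^2\,dx + 2\int_a^b g''(x)\,h''(x)\,dx + \int_a^b h''(x)^2\,dx \\
  &= \int_a^b g''(x)^2\,dx + \int_a^b h''(x)^2\,dx \;\ge\; \int_a^b g''(x)^2\,dx .
\end{align*}
```

Under these assumptions, equality holds only when h'' is zero on [a, b], which makes h linear; since h vanishes at the knots (at least two of them), h(x) = 0 throughout, matching the surviving fragment "if h(x) = 0 for all a".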
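
Note on Q2. The four S3 pieces listed in Q2a (constructor, print method, plot method, generic confusion matrix function) could be laid out roughly as below. This is a minimal base-R sketch, not the lecturer's class: the constructor name `roc`, the class name "roc", the list fields `data` and `summary`, the default cutoff of 0 and the toy scores are all assumptions made for illustration. The strict "score > cutoff" rule is chosen because it is consistent with the summary rows printed in Q2.

```r
# Hypothetical constructor: the name `roc` and its fields are assumptions.
roc <- function(pred, obs) {
  stopifnot(length(pred) == length(obs), all(obs %in% c(0, 1)))
  # One candidate cutoff per distinct score, highest first
  cutoffs <- sort(unique(pred), decreasing = TRUE)
  # Confusion-matrix cells when "score > cutoff" is classed as positive
  counts <- t(sapply(cutoffs, function(s) {
    c(TP = sum(pred > s & obs == 1), FP = sum(pred > s & obs == 0),
      FN = sum(pred <= s & obs == 1), TN = sum(pred <= s & obs == 0))
  }))
  tab <- data.frame(counts, Score = cutoffs)
  tab$FPR <- tab$FP / (tab$FP + tab$TN)
  tab$TPR <- tab$TP / (tab$TP + tab$FN)
  tab$precision <- tab$TP / (tab$TP + tab$FP)  # NaN when nothing is predicted positive
  tab$recall <- tab$TPR
  structure(list(data = data.frame(pred = pred, obs = obs), summary = tab),
            class = "roc")
}

# print method: dispatched automatically when a "roc" object is printed
print.roc <- function(x, ...) {
  cat("The number of observations is ", nrow(x$data), ".\n", sep = "")
  cat("The number of positives is ", sum(x$data$obs == 1), ".\n", sep = "")
  cat("The number of negatives is ", sum(x$data$obs == 0), ".\n", sep = "")
  invisible(x)
}

# plot method: ROC curve by default, precision-recall curve for type = "PR"
plot.roc <- function(x, type = c("ROC", "PR"), ...) {
  type <- match.arg(type)
  s <- x$summary
  if (type == "ROC") {
    plot(s$FPR, s$TPR, type = "s", xlab = "FPR", ylab = "TPR", ...)
  } else {
    plot(s$recall, s$precision, type = "l",
         xlab = "recall", ylab = "precision", ...)
  }
  invisible(x)
}

# Generic plus methods for the confusion matrix
conf_matrix <- function(x, ...) UseMethod("conf_matrix")

# The default cutoff of 0 is an assumption (0 on the log-odds scale)
conf_matrix.roc <- function(x, cutoff = 0, ...) {
  predicted <- factor(as.integer(x$data$pred > cutoff), levels = c(0, 1))
  observed <- factor(x$data$obs, levels = c(0, 1))
  table(HC = predicted, obs = observed)
}

# Fallback for any object without its own method
conf_matrix.default <- function(x, ...) {
  "I do not know how to deal with the class default"
}

# Toy scores and labels, invented purely to exercise the methods
toy <- roc(pred = c(2.1, 1.3, 0.4, -0.2, -1.5), obs = c(1, 1, 0, 1, 0))
print(toy)
conf_matrix(toy, cutoff = 0.9)
conf_matrix(1:10, cutoff = 0.9)
```

The sketch returns a base `table` rather than the tibble shown in the Q2 output, to stay free of package dependencies; converting it is left to the reader.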