Introduction to Statistical Learning - Chap10 Solutions. These are the solutions to the exercises of chapter 10 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap9 Solutions. These are the solutions to the exercises of chapter 9 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap8 Solutions. These are the solutions to the exercises of chapter 8 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap7 Solutions. These are the solutions to the exercises of chapter 7 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap6 Solutions. These are the solutions to the exercises of chapter 6 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap5 Solutions. These are the solutions to the exercises of chapter 5 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap4 Solutions. These are the solutions to the exercises of chapter 4 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap3 Solutions. These are the solutions to the exercises of chapter 3 of the excellent book "Introduction to Statistical Learning".

Introduction to Statistical Learning - Chap2 Solutions. These are the solutions to the exercises of chapter 2 of the excellent book "Introduction to Statistical Learning".

We perform best subset, forward stepwise, and backward stepwise selection on a single data set. Explain your answers.

The smallest training RSS will be for the model chosen by the best subset approach. This is because best subset selection considers every possible model with k predictors and keeps the one with the lowest training RSS.
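The training-RSS claim above can be checked numerically. The following is a minimal sketch, assuming synthetic data and numpy only; the helper names (`rss`, `best_subset`, `forward_stepwise`) and the simulated coefficients are illustrative choices, not part of the original solutions.

```python
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    """Training RSS of a least squares fit on an intercept plus the chosen columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def best_subset(X, y):
    """For each size k, the lowest training RSS over ALL subsets of k predictors."""
    p = X.shape[1]
    return {k: min(rss(X, y, c) for c in combinations(range(p), k))
            for k in range(1, p + 1)}


def forward_stepwise(X, y):
    """Greedy search: at each step add the single predictor that lowers RSS most."""
    p = X.shape[1]
    chosen, out = [], {}
    for k in range(1, p + 1):
        rest = [j for j in range(p) if j not in chosen]
        best_j = min(rest, key=lambda j: rss(X, y, chosen + [j]))
        chosen.append(best_j)
        out[k] = rss(X, y, chosen)
    return out


# Synthetic data (hypothetical coefficients), with two correlated predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
X[:, 1] = X[:, 0] + 0.3 * rng.normal(size=60)
y = 2 * X[:, 0] - X[:, 2] + 0.5 * X[:, 4] + rng.normal(size=60)

best = best_subset(X, y)
fwd = forward_stepwise(X, y)
for k in best:
    print(k, round(best[k], 3), round(fwd[k], 3))
```

For every k the best subset RSS is never larger than the forward stepwise RSS, because the stepwise path is one of the subsets that best subset already examined. The two agree at k = 1 and at the full model.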

This is not true for either backward stepwise or forward stepwise selection, which only search a restricted sequence of models. The same reasoning applies to both the forward stepwise and the backward stepwise approach. However, the model with the smallest training RSS does not necessarily have the smallest test RSS.

More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.

More flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.

Less flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.

Less flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.

For parts (a) through (e), indicate which of i. Justify your answer.

This will lead to decreased RSS. As the model becomes more and more flexible, the test RSS will decrease at first and then start increasing once overfitting sets in.

This will lead to increased RSS. As the model becomes less and less flexible, the test RSS will decrease at first and then start increasing once the model is too constrained and begins to underfit.

It is well-known that ridge regression tends to give similar coefficient values to correlated variables, whereas the lasso may give quite different coefficient values to correlated variables. We will now explore this property in a very simple setting. Consider 6. Your plot should confirm that 6.
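Both statements above (less flexibility raises training RSS, and ridge treats correlated variables alike) can be illustrated with the closed-form ridge solution. This is a sketch under stated assumptions: the data are synthetic, the two predictors are exact copies of each other, and the `ridge` helper and the grid of penalty values are hypothetical.

```python
import numpy as np


def ridge(X, y, lam):
    """Closed-form ridge estimate on centred data: (X'X + lam*I)^(-1) X'y.

    Centring X and y absorbs the intercept, so only the slopes are penalised.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


rng = np.random.default_rng(1)
x = rng.normal(size=50)
X = np.column_stack([x, x])          # two perfectly correlated predictors
X = X - X.mean(axis=0)
y = 3 * x + rng.normal(size=50)      # hypothetical true slope of 3
y = y - y.mean()

lams = [0.1, 1.0, 10.0, 100.0]       # increasing penalty = decreasing flexibility
coefs = [ridge(X, y, lam) for lam in lams]
train_rss = [float(np.sum((y - X @ b) ** 2)) for b in coefs]

for lam, b, r in zip(lams, coefs, train_rss):
    print(lam, b, round(r, 3))
```

By symmetry the two ridge coefficients come out equal for every penalty value, and the training RSS grows as the penalty grows. The lasso has no such unique answer when predictors are identical, which is why it may assign them quite different values.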

In this exercise, we will generate simulated data, and will then use this data to perform best subset selection. Use the regsubsets function to perform best subset selection in order to choose the best model containing the predictors X,X2. Show some plots to provide evidence for your answer, and report the coefficients of the best model obtained.

Note you will need to use the data. Repeat (c), using forward stepwise selection and also using backwards stepwise selection. How does your answer compare to the results in (c)? Now fit a lasso model to the simulated data, again using X,X2.
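The lasso step of this exercise can be sketched without any dedicated library. The block below is an illustrative implementation of cyclic coordinate descent, not the routine used by the original solutions; the simulated coefficients, the polynomial degree of 10, and the function names are all assumptions made for the example.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma, returning 0 when |z| <= gamma."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)


def lasso_objective(X, y, b, lam):
    """(1/(2n)) * ||y - Xb||^2 + lam * ||b||_1."""
    n = len(y)
    return float(np.sum((y - X @ b) ** 2) / (2 * n) + lam * np.sum(np.abs(b)))


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise the lasso objective by cyclic coordinate descent.

    Assumes the columns of X are standardised and y is centred.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that leaves predictor j out of the current fit.
            r_j = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r_j / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b


# Simulated data: a cubic signal (hypothetical coefficients) plus noise.
rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + 0.5 * x ** 3 + rng.normal(size=100)

# Polynomial predictors up to an assumed degree of 10, standardised.
P = np.column_stack([x ** d for d in range(1, 11)])
P = (P - P.mean(axis=0)) / P.std(axis=0)
yc = y - y.mean()

lam_max = np.max(np.abs(P.T @ yc)) / len(yc)   # smallest penalty that zeroes everything
b_null = lasso_cd(P, yc, 1.01 * lam_max)       # slightly above lam_max: all zeros
lam_small = 0.1 * lam_max
b_fit = lasso_cd(P, yc, lam_small)
print(np.round(b_fit, 3))
```

In practice the penalty would be chosen by cross-validation rather than fixed at a fraction of `lam_max`; the fixed value here only keeps the sketch short.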


Disadvantages are that such models are hard to interpret and prone to overfitting. A more flexible approach might be preferred if the underlying data is very complex (a simple linear fit doesn't suffice) or if we mainly care about the result and not inference. A less flexible model is preferred if the underlying data has a simple shape or if inference is important. Non-parametric methods don't make explicit assumptions on the shape of the data.

This can have the advantage of not needing to make an assumption on the form of the function and can more accurately fit a wider range of shapes for the underlying data. The key disadvantage is that they need a large number of observations to fit an accurate estimate.
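K-nearest neighbours (KNN), which these notes return to further on, is one example of a non-parametric method: it assumes no functional form and predicts directly from nearby observations. The sketch below uses a tiny hand-made data set and hypothetical helper names to show both the regression and the classification variant.

```python
from collections import Counter

import numpy as np


def k_nearest(X_train, x0, k):
    """Indices of the k training points closest to x0 (Euclidean distance)."""
    d = np.linalg.norm(X_train - x0, axis=1)
    return np.argsort(d)[:k]


def knn_regress(X_train, y_train, x0, k):
    """KNN regression: average the responses of the k nearest neighbours."""
    idx = k_nearest(X_train, x0, k)
    return float(np.mean(y_train[idx]))


def knn_classify(X_train, labels, x0, k):
    """KNN classification: majority vote among the k nearest neighbours."""
    idx = k_nearest(X_train, x0, k)
    return Counter(labels[i] for i in idx).most_common(1)[0][0]


# Tiny made-up data set with one predictor.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
y_train = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
labels = ["a", "a", "a", "b", "b"]

x0 = np.array([4.0])
pred_value = knn_regress(X_train, y_train, x0, k=3)   # mean of 3.0, 2.0, 1.0
pred_class = knn_classify(X_train, labels, x0, k=3)   # votes: b, a, a
print(pred_value, pred_class)
```

With only five observations the neighbourhoods are very coarse, which reflects the disadvantage noted above: non-parametric methods need many observations before their estimates become accurate.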

Advantages of a very flexible model include better fit to data and fewer prior assumptions. For parametric methods, we make an assumption about the shape of the underlying data, select a model form, and fit the data to our selected form.

Describe the null hypotheses to which the p-values given in Table 3.

Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model. The p-values for the intercept, TV and radio fall below the significance threshold, whereas the p-value for newspaper is above it. Thus we can conclude that TV and radio are significant in predicting sales but newspaper is not.

The response is starting salary after graduation in thousands of dollars.

Justify your answer. A small coefficient does not by itself indicate that the interaction term has little effect. Its statistical significance should be checked by looking at the p-value of the coefficient. I then fit a linear regression model to the data, as well as a separate cubic regression, i.

Suppose that the true relationship between X and Y is linear, i. Consider the training residual sum of squares (RSS) for the linear regression, and also the training RSS for the cubic regression. Would we expect one to be lower than the other, would we expect them to be the same, or is there not enough information to tell? We cannot say which test RSS will be lower without knowing how far the true relationship is from linear.

What is the predicted mpg associated with a horsepower of 98?

Plot the response and the predictor. Use the abline function to display the least squares regression line. Use the plot function to produce diagnostic plots of the least squares regression fit. Comment on any problems you see with the fit. Compute the matrix of correlations between the variables using the function cor. You will need to exclude the name variable, which is qualitative.

Use the lm function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors.
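For readers following along in Python rather than R, the two steps above (a correlation matrix, then a multiple regression of mpg on the numeric predictors) might look roughly like this. The data are synthetic stand-ins with made-up values, not the Auto data set, and only two predictors are simulated to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200

# Synthetic numeric columns (hypothetical values; the qualitative name column is left out).
weight = rng.normal(3000, 800, n)
horsepower = 0.03 * weight + rng.normal(0, 15, n)
mpg = 45 - 0.006 * weight - 0.05 * horsepower + rng.normal(0, 2, n)

# Rough analogue of R's cor(): columns are variables, rows are observations.
data = np.column_stack([mpg, weight, horsepower])
corr = np.corrcoef(data, rowvar=False)

# Rough analogue of lm(mpg ~ weight + horsepower): least squares with an intercept.
A = np.column_stack([np.ones(n), weight, horsepower])
beta, *_ = np.linalg.lstsq(A, mpg, rcond=None)
fitted = A @ beta
r2 = 1 - np.sum((mpg - fitted) ** 2) / np.sum((mpg - mpg.mean()) ** 2)

print(np.round(corr, 3))
print(np.round(beta, 4), round(float(r2), 3))
```

A full summary table with standard errors and p-values would normally come from a package such as statsmodels; plain numpy is used here only to keep the example self-contained.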

### ISLR EXERCISES

Use the summary function to print the results. Use the plot function to produce diagnostic plots of the linear regression fit. Do the residual plots suggest any unusually large outliers?

Does the leverage plot identify any observations with unusually high leverage? Do any interactions appear to be statistically significant? Try a few different transformations of the variables, such as log(X), √X, X^2. Comment on your findings.
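The leverage question above can also be answered numerically rather than from a plot. The following sketch computes leverage statistics as the diagonal of the hat matrix on synthetic data with one deliberately extreme observation; the cutoff of twice the average leverage is a common rule of thumb assumed here, not something stated in the exercise.

```python
import numpy as np


def leverages(A):
    """Diagonal of the hat matrix H = A (A'A)^(-1) A', computed via a QR decomposition."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q ** 2, axis=1)


rng = np.random.default_rng(4)
x = rng.normal(size=40)
x[0] = 8.0                                   # one observation far from the others
A = np.column_stack([np.ones_like(x), x])    # design matrix: intercept + predictor

h = leverages(A)
threshold = 2 * A.shape[1] / len(x)          # assumed rule of thumb: 2 * (p + 1) / n
flagged = np.where(h > threshold)[0]

print(round(float(h.sum()), 6))              # leverages sum to the number of columns
print(flagged)
```

Each leverage lies between 0 and 1, and they always sum to the number of fitted coefficients, so the average leverage is (p + 1)/n. Observations well above that average are the ones worth inspecting.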

Provide an interpretation of each coefficient in the model. Be careful: some of the variables in the model are qualitative!

Write out the model in equation form, being careful to handle the qualitative variables properly.

TV and radio are related to sales, but there is no evidence that newspaper is associated with sales in the presence of other predictors.
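The conclusion about TV, radio and newspaper rests on t-statistics for the null hypothesis that each coefficient is zero given the other predictors. A minimal sketch of that computation follows, assuming synthetic data in which newspaper has no true effect; the coefficient values and the `ols_t_stats` helper are hypothetical and do not reproduce the book's table.

```python
import numpy as np


def ols_t_stats(A, y):
    """Least squares coefficients, their standard errors, and t-statistics."""
    n, p = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - p)              # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Synthetic advertising-style data (made-up values, not the book's data).
rng = np.random.default_rng(5)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = rng.uniform(0, 100, n)
sales = 3 + 0.045 * tv + 0.19 * radio + 0.0 * newspaper + rng.normal(0, 1.7, n)

A = np.column_stack([np.ones(n), tv, radio, newspaper])
beta, se, t = ols_t_stats(A, sales)
for name, b, s, tv_stat in zip(["intercept", "TV", "radio", "newspaper"], beta, se, t):
    print(name, round(float(b), 4), round(float(s), 4), round(float(tv_stat), 2))
```

A large absolute t-statistic corresponds to a small p-value and hence to rejecting the null hypothesis for that predictor. Converting t-statistics to exact p-values needs the t distribution, for example from scipy or statsmodels, which is omitted here.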

KNN regression averages the responses of the closest observations to estimate a prediction, whereas the KNN classifier assigns the class held by the majority of the closest observations. If the additional predictors lead to overfitting, the testing RSS could be worse (higher) for the cubic regression fit. The cubic regression fit should produce a better RSS on the training set because it can adjust for the non-linearity. Similar to training RSS, the cubic regression fit should produce a better RSS on the testing set because it can adjust for the non-linearity.
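The comparison of linear and cubic fits can be tried on simulated data. This sketch assumes a truly linear relationship with hypothetical coefficients and fits both models by least squares; `poly_rss` is an illustrative helper, not a function from the original solutions.

```python
import numpy as np


def poly_rss(x_fit, y_fit, x_eval, y_eval, degree):
    """Fit a polynomial of the given degree on (x_fit, y_fit); return RSS on (x_eval, y_eval)."""
    A_fit = np.vander(x_fit, degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(A_fit, y_fit, rcond=None)
    A_eval = np.vander(x_eval, degree + 1, increasing=True)
    return float(np.sum((y_eval - A_eval @ beta) ** 2))


rng = np.random.default_rng(6)
x_train = rng.uniform(-2, 2, 100)
x_test = rng.uniform(-2, 2, 100)
y_train = 1 + 2 * x_train + rng.normal(size=100)   # true relationship is linear
y_test = 1 + 2 * x_test + rng.normal(size=100)

train_lin = poly_rss(x_train, y_train, x_train, y_train, 1)
train_cub = poly_rss(x_train, y_train, x_train, y_train, 3)
test_lin = poly_rss(x_train, y_train, x_test, y_test, 1)
test_cub = poly_rss(x_train, y_train, x_test, y_test, 3)

print(round(train_lin, 3), round(train_cub, 3))
print(round(test_lin, 3), round(test_cub, 3))
```

The cubic training RSS can never exceed the linear training RSS, because the linear model is a special case of the cubic one. The test RSS carries no such guarantee and depends on the particular sample and on how far the truth is from linear.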

Using equation 3. Small std. Same as Part a. The two regression lines should be the same, just with the axes switched, so it makes sense that the t-statistic is the same for both. The Anova test also suggests the polynomial fit is not any better. Decreased variance along the regression line. The fit for the original y was already very good, so the coefficient estimates are about the same for reduced epsilon. Coefficient estimates are farther from the true value, but not by too much. The coefficient for x1 is statistically significant, but the coefficient for x2 is not, given the presence of x1.

Fewer predictors have a statistically significant impact when given the presence of other predictors. Yes, there is a relationship between predictor and response ii. Coefficient is negative: relationship is negative iv.

Both conceptual and applied exercises were solved. An effort was made to detail all the answers and to provide a set of bibliographical references that we found useful. The exercises were solved using Python instead of R. You are welcome to collaborate. The main motivation of this project was learning. Today there are several good books and other resources from which to learn the material we covered, and we spent some time choosing a good learning project.

We chose ISLR because it is an excellent, clear introduction to statistical learning that keeps a nice balance between theory, intuition, mathematical rigour and programming. Our main goal was to use the exercises as an excuse to improve our proficiency using Python's data science stack. We had done other data science projects with Python but, as we imagined, we still had a bit more to learn (and still do!). Since the book was written with R in mind, using Python made for a cool additional challenge.

We are strong advocates of active learning principles, and this project, once more, reinforced them in our minds. If you're starting out in machine learning with Python or R! This project was developed using Python 3. We tried to stay within the standard Python data science stack as much as possible. Accordingly, our main Python packages were numpy, matplotlib, pandas, seaborn, statsmodels and scikit-learn. You should be able to run this with the standard Python setup, and the additional libraries we list below.

If you're just starting out with Python, here's a more complete 'how-to'. We recommend using Anaconda whether you are using Linux, Mac or Windows. Anaconda allows you to easily manage several Python environments. An environment is a collection of installed Python packages. Imagine that you have two projects with different requirements: a recent one with, say, Python 3.

A good environment manager helps you install libraries and allows you to switch between both environments easily, avoiding dependency migraines. You can even work on both at the same time.

An Introduction to Statistical Learning.

This book provides an introduction to statistical learning methods. It is aimed at upper-level undergraduate students, masters students and Ph. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real life settings, and should be a valuable resource for a practicing data scientist. Winner of the Eric Ziegel award from Technometrics.

For a more advanced treatment of these topics: The Elements of Statistical Learning. Slides and video tutorials related to this book by Abass Al Sharif can be downloaded here. Inspired by "The Elements of Statistical Learning" (Hastie, Tibshirani and Friedman), this book provides clear and intuitive guidance on how to implement cutting edge statistical and machine learning methods.

ISL makes modern methods accessible to a wide audience without requiring a background in Statistics or Computer Science. The authors give precise, practical explanations of what methods are available, and when to use them, including explicit R code. Anyone who wants to intelligently analyze complex data should own this book.

As a textbook for an introduction to data science through machine learning, there is much to like about ISLR. As a junior at university, it is by far the most well-written textbook I have ever used, a sentiment mirrored by all my other classmates. One friend, graduating this spring with majors in Math and Data Analytics, cried out in anger that no other textbook had ever come close to the quality of this one. You and your team have turned one of the most technical subjects in my curriculum into an understandable and even enjoyable field to learn about.

Every concept is explained simply, every equation justified, and every figure chosen perfectly to clearly illustrate difficult ideas. This is the only textbook I have ever truly enjoyed reading, and I just wanted to thank you and all other contributors for your time and efforts in its production. Then, if you finish that and want more, read The Elements of Statistical Learning.

## An Introduction to Statistical Learning: with Applications in R... with Python!

Full review here. Linear Regression? I covered that last year. Wake me up when we get to Support Vector Machines! Noah Mackey.
