Question
You are given a data set with p > n, that is, more variables (p) than observations (n). Why is OLS a bad option here? Which techniques would be better to use, and why?

Answers

In such high-dimensional data sets, classical regression techniques break down because their assumptions no longer hold. When p > n, the matrix X'X is singular, so there is no unique least squares coefficient estimate: infinitely many coefficient vectors fit the training data perfectly, and the variance of the estimates is unbounded. OLS is therefore not usable in this setting.
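The non-uniqueness can be seen on a tiny example. The sketch below is only an illustration: the 2 x 3 data set and the helper names (predict, rss) are made up for this purpose and use plain Python lists rather than a numerical library.

```python
# Toy design matrix with n = 2 observations and p = 3 variables (p > n).
# The numbers are invented purely for illustration.
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]


def predict(X, beta):
    """Return the fitted values X * beta using plain Python lists."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]


def rss(X, y, beta):
    """Residual sum of squares for a given coefficient vector."""
    return sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, predict(X, beta)))


# Two different coefficient vectors...
beta_a = [1.0, 2.0, 0.0]
beta_b = [0.0, 1.0, 1.0]

# ...both reproduce y exactly, so least squares cannot choose between them.
rss_a = rss(X, y, beta_a)
rss_b = rss(X, y, beta_b)
print(rss_a, rss_b)
```

Both candidates have zero residual error on the training data, which is why the least squares criterion alone cannot pick a single answer when p > n.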
To handle this, we can use penalized regression methods such as ridge and the lasso (whose solution path can be computed efficiently with LARS, least angle regression). These methods shrink the coefficients, accepting a small bias in exchange for reduced variance. Ridge regression in particular works best when the least squares estimates have high variance.
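As a rough sketch of why the penalty helps, ridge regression solves (X'X + lambda I) beta = X'y, and adding lambda > 0 to the diagonal makes the system solvable even when X'X is singular. The code below is a minimal illustration, not a production implementation: the toy data, the penalty value 0.1, and the helper names (transpose, matmul, solve, ridge) are all assumptions made for this example, and in practice a numerical library would normally be used instead.

```python
def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge(X, y, lam):
    """Ridge estimate: solve (X'X + lam * I) beta = X'y."""
    Xt = transpose(X)
    gram = matmul(Xt, X)
    for j in range(len(gram)):
        gram[j][j] += lam  # the penalty is added to the diagonal
    Xty = [sum(x_i * y_i for x_i, y_i in zip(col, y)) for col in Xt]
    return solve(gram, Xty)


# Toy data with n = 2 observations and p = 3 variables (invented values).
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]

# With no penalty the normal equations are singular, so the solve fails.
try:
    ridge(X, y, 0.0)
    ols_failed = False
except ValueError:
    ols_failed = True

# With a small penalty a unique solution exists.
beta_ridge = ridge(X, y, 0.1)
print(ols_failed, beta_ridge)
```

The penalty value here is arbitrary; in a real analysis it would typically be chosen by cross-validation.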
Other options include subset selection and forward stepwise regression.