## [tech-spec] multicollinearity

• From: tim hesselsweet <tim_hess1@xxxxxxxxx>
• To: tech-spec@xxxxxxxxxxxxx
• Date: Thu, 24 Feb 2005 21:56:39 -0800 (PST)

```
pairwise plots of variables are good for assessing the
relationship between the response and each potential
predictor, as well as pairwise correlations among predictors.
but only expert chart readers (sogi, tufte) can spot
multivariate collinearity.  splus computes everything
you need to assess collinearity problems but there's
no single function one can call.  these two approaches
fit the bill:

VIF (variance inflation factor) regresses each of the
explanatory variables on the others and computes
1/(1 - R^2) from each fit.  VIF values > 4 indicate a
problem.  the argument to the function is the set of
predictors.  you would also have to augment for

> InflVIF <- function(dataf) {
+ # regress each predictor on the other; VIF = 1/(1 - R^2)
+ fit1 <- lm(minh ~ minat, data = dataf)
+ sum1 <- summary(fit1)
+ r1 <- sum1$r.squared
+ vif1 <- 1/(1-r1)
+ fit2 <- lm(minat ~ minh, data = dataf)
+ sum2 <- summary(fit2)
+ r2 <- sum2$r.squared
+ vif2 <- 1/(1-r2)
+ vif <- data.frame(vif1,vif2)
+ print.data.frame(vif)
+ }

> InflVIF(q1.df)
vif1    vif2
1 1.03781 1.03781
vif1    vif2
1 1.03781 1.03781

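a condition number greater than 15 indicates a problem
and greater than 30 a big problem.  the function takes
the ratio of the largest to the smallest eigenvalue of
the predictors' correlation matrix.  the argument to the
function is the set of predictors.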

> InflConditionNum <- function(dataf) {
+ r <- cor(dataf)
+ eigenx <- eigen(r)
+ eigenval <- eigenx$values
+ lemdamax <- max(eigenval)
+ lemdamin <- min(eigenval)
+ conditionnum <- lemdamax/lemdamin
+ return(conditionnum)
+ }

> new.df <- data.frame(minh,minat)
> InflConditionNum(new.df)
[1] 1.471798

tim

```
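
The `InflVIF` function in the post is written for two named predictors. As a minimal sketch of how the same 1/(1 - R^2) calculation might be extended to every column of a data frame, something like the following could work. It assumes R rather than S-PLUS, assumes every column is a numeric predictor with a syntactic name, and needs at least two columns. The name `vif_all` and the synthetic data are hypothetical and do not come from the original post.

```r
# Hypothetical helper (not from the original post): one VIF per column.
vif_all <- function(dataf) {
  dataf <- as.data.frame(dataf)
  sapply(names(dataf), function(v) {
    # Regress predictor v on all of the remaining predictors.
    others <- setdiff(names(dataf), v)
    fit <- lm(reformulate(others, response = v), data = dataf)
    # VIF = 1 / (1 - R^2) of that auxiliary regression.
    1 / (1 - summary(fit)$r.squared)
  })
}

# Illustrative data only: x3 is nearly a linear combination of x1 and x2,
# so all three VIFs should come out large.
set.seed(1)
demo.df <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
demo.df$x3 <- demo.df$x1 + demo.df$x2 + rnorm(50, sd = 0.1)
print(vif_all(demo.df))
```

The result is a named numeric vector, so it can be compared against whatever cutoff is in use (the post suggests 4) with an expression such as `vif_all(demo.df) > 4`.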

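Conventions differ on how the condition number is scaled. `InflConditionNum` returns the raw ratio of the largest to the smallest eigenvalue, while some regression texts define the condition number as the square root of that ratio, and cutoffs in the 15 to 30 range are commonly quoted for the square-root form. The sketch below, again assuming R, returns either version so the value can be matched to the cutoff in use. The name `condition_number` and the `sqrt_scale` argument are hypothetical, not part of the original post.

```r
# Hypothetical helper (not from the original post).
condition_number <- function(dataf, sqrt_scale = TRUE) {
  # Eigenvalues of the predictors' correlation matrix.
  ev <- eigen(cor(dataf), symmetric = TRUE, only.values = TRUE)$values
  ratio <- max(ev) / min(ev)
  # sqrt_scale = FALSE reproduces the raw ratio used by InflConditionNum.
  if (sqrt_scale) sqrt(ratio) else ratio
}

# Illustrative data only: two moderately correlated predictors.
set.seed(2)
x <- rnorm(40)
y <- x + rnorm(40)
demo.df <- data.frame(x, y)
print(condition_number(demo.df, sqrt_scale = FALSE))  # raw eigenvalue ratio
print(condition_number(demo.df))                      # square-root form
```

With exactly two predictors the raw ratio reduces to (1 + |r|)/(1 - |r|), where r is their correlation. The 1.471798 reported in the post therefore corresponds to |r| of roughly 0.19, which agrees with the VIF of 1.03781.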