farah's blog: MULTICOLLINEARITY

Nature of Multicollinearity Problem

Many financial and economic variables, especially time series variables, are closely related to one another.

Example: population and GDP are closely related, i.e. highly correlated.

In multiple regression models, a regression coefficient measures the partial effect of that individual variable on Y when all other X variables in the model are held fixed.

However, when two explanatory variables move closely together, we cannot assume that one is fixed while the other is changing, because when one changes, the other also changes. In such a case it is difficult to isolate the partial effect of a single X variable. This is the problem of multicollinearity.

Definition:

Multicollinearity originally meant the existence of a “perfect,” or exact, linear relationship among some or all explanatory variables of a regression model.

In current usage, multicollinearity occurs when two or more independent variables in a regression model are highly correlated with each other.
The standard error of an OLS parameter estimate will be higher the more highly the corresponding independent variable is correlated with the other independent variables in the model.
As a result, independent variables may show no statistical significance in the basic significance (t) test.
It is not a mistake in the model specification, but a consequence of the nature of the data at hand.
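
The link between correlation and standard errors can be illustrated with a small simulation. This is only a sketch on made-up data: the function names, the coefficients (1, 2, 2), the sample size and the correlation values are all assumptions chosen for illustration, not taken from any real data set.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Return OLS coefficients and standard errors (X already contains the intercept column)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)            # estimated error variance
    cov = sigma2 * np.linalg.inv(X.T @ X)       # estimated covariance of the coefficients
    return beta, np.sqrt(np.diag(cov))

def simulate(rho, n=200, seed=0):
    """Simulate y = 1 + 2*x1 + 2*x2 + e, where x1 and x2 have population correlation rho."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.standard_normal(n)
    X = np.column_stack([np.ones(n), x1, x2])
    return ols_standard_errors(X, y)

_, se_low = simulate(rho=0.0)     # x1 and x2 unrelated
_, se_high = simulate(rho=0.95)   # x1 and x2 highly correlated
print("SE of b1 when rho = 0.00:", se_low[1])
print("SE of b1 when rho = 0.95:", se_high[1])
```

The model and the error term are the same in both runs; only the correlation between x1 and x2 differs, so the larger standard error in the second run comes from the correlation alone.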

PERFECT MULTICOLLINEARITY

Perfect multicollinearity occurs when there is a perfect linear correlation between two or more independent variables.

It also occurs when an independent variable takes a constant value in all observations, because that variable is then an exact multiple of the intercept term.
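
Both cases can be checked numerically by looking at the rank of the matrix of regressors. The sketch below uses made-up data and hypothetical variable names; when the rank is lower than the number of columns, the columns are exactly linearly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.standard_normal(n)
x2 = 3.0 * x1                    # exact linear function of x1
const_var = np.full(n, 5.0)      # takes the same value in every observation

# Regressor matrices with an intercept column of ones.
X_exact = np.column_stack([np.ones(n), x1, x2])
X_const = np.column_stack([np.ones(n), x1, const_var])

rank_exact = np.linalg.matrix_rank(X_exact)
rank_const = np.linalg.matrix_rank(X_const)
print("columns:", X_exact.shape[1], "rank:", rank_exact)
print("columns:", X_const.shape[1], "rank:", rank_const)
```

In both matrices one column carries no information beyond the others, so the three coefficients cannot be separately determined.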

IMPERFECT MULTICOLLINEARITY

Although perfect multicollinearity is theoretically possible, in practice imperfect multicollinearity is what we commonly observe.

Perfect multicollinearity typically arises only when the researcher makes a mistake (including the same variable twice, or forgetting to omit a default category for a series of dummy variables).
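
The dummy-variable mistake can be sketched as follows. The three-category "region" variable is hypothetical and exists only for illustration.

```python
import numpy as np

# Hypothetical categorical variable with three regions, coded 0, 1, 2.
region = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
n = region.size

dummies = np.eye(3)[region]      # one 0/1 column per category
intercept = np.ones((n, 1))

X_trap = np.hstack([intercept, dummies])         # all three dummies kept
X_ok = np.hstack([intercept, dummies[:, 1:]])    # first category omitted as the default

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)
print("all dummies kept:   ", rank_trap, "independent columns out of", X_trap.shape[1])
print("default one omitted:", rank_ok, "independent columns out of", X_ok.shape[1])
```

With every category included, the dummy columns sum to the intercept column in each row, which is an exact linear relationship. Omitting one category removes it.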

SEVERE MULTICOLLINEARITY

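Unlike perfect multicollinearity, under which the OLS method cannot produce parameter estimates at all, severe multicollinearity still allows estimation, but the estimates are imprecise.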

A certain degree of correlation (multicollinearity) between the independent variables is normal and expected in most cases.

Symptoms of Multicollinearity

The symptoms of a multicollinearity problem include:

§ Independent variable(s) considered critical in explaining the model’s dependent variable are not statistically significant according to the t tests

§ High R², highly significant F-test, but few or no statistically significant t tests

§ Parameter estimates drastically change values and become statistically significant when some independent variables are excluded from the regression
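
These symptoms can be reproduced on simulated data. The sketch below is an illustration under stated assumptions: the data are made up, `ols_summary` is a helper written for this example, and the exact numbers depend on the random draw.

```python
import numpy as np

def ols_summary(X, y):
    """OLS with the intercept already in X: returns coefficients, t-ratios, R^2 and the F statistic."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    sst = np.sum((y - y.mean()) ** 2)
    se = np.sqrt(ssr / (n - k) * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1.0 - ssr / sst
    f_stat = (r2 / (k - 1)) / ((1.0 - r2) / (n - k))   # test that all slopes are zero
    return beta, beta / se, r2, f_stat

rng = np.random.default_rng(1)
n = 100
x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)   # x2 moves almost one-for-one with x1
y = 1.0 + x1 + x2 + rng.standard_normal(n)

ones = np.ones(n)
beta_full, t_full, r2_full, f_full = ols_summary(np.column_stack([ones, x1, x2]), y)
beta_drop, t_drop, r2_drop, f_drop = ols_summary(np.column_stack([ones, x1]), y)

print("full model:  R2 =", r2_full, " F =", f_full, " t-ratios (x1, x2) =", t_full[1:])
print("x2 excluded: coefficient on x1 =", beta_drop[1], " t-ratio =", t_drop[1])
```

In the full model the fit is good overall while the individual t-ratios are small in a typical draw. Once x2 is excluded, x1 picks up the effect of both variables and its t-ratio becomes large.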

Detecting Multicollinearity

· Few significant t-ratios despite a high R² and collective (F-test) significance of the variables
· High pairwise correlation between the explanatory variables
· Examination of partial correlations
· Estimate auxiliary regressions
· Estimate variance inflation factor (VIF)
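
As a minimal sketch of the pairwise-correlation check, the example below builds three made-up variables and flags any pair whose correlation exceeds a cutoff. The 0.8 cutoff is an assumed rule of thumb used here for illustration, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.standard_normal(n)
x2 = x1 + 0.3 * rng.standard_normal(n)   # built to track x1 closely
x3 = rng.standard_normal(n)              # unrelated to x1 and x2

X = np.column_stack([x1, x2, x3])
corr = np.corrcoef(X, rowvar=False)      # 3 x 3 matrix of pairwise correlations

threshold = 0.8                          # assumed rule-of-thumb cutoff
names = ["x1", "x2", "x3"]
flagged = [(names[i], names[j], corr[i, j])
           for i in range(3) for j in range(i + 1, 3)
           if abs(corr[i, j]) > threshold]

for a, b, r in flagged:
    print(a, "and", b, "have pairwise correlation", r)
```

Pairwise correlations only look at two variables at a time, which is why the list above also includes auxiliary regressions and the VIF.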

i. A simple test for multicollinearity is to run “artificial” (auxiliary) regressions of each independent variable (as the “dependent” variable) on the remaining independent variables. A high R² in any of these regressions signals multicollinearity

ii. Variance Inflation Factors (VIF_j) are calculated as:

VIF_j = 1 / (1 - R_j²)

where R_j² is the R² of the artificial regression with X_j as the “dependent” variable

iii. VIF_j = 2, for example, means that the variance of the estimate of B_j is twice what it would be if X_j were not affected by multicollinearity

iv. A VIF_j > 10 is commonly taken as clear evidence that the estimation of B_j is being affected by multicollinearity
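
The artificial regressions and the VIF can be sketched together, using the standard definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing X_j on the other independent variables. The function `vif` and the simulated data are assumptions made for this example.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (independent variables only, without an intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]                                   # X_j plays the "dependent" variable
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(4)
n = 200
x1 = rng.standard_normal(n)
x2 = x1 + 0.2 * rng.standard_normal(n)   # strongly related to x1
x3 = rng.standard_normal(n)              # unrelated to the others

vifs = vif(np.column_stack([x1, x2, x3]))
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(name, "VIF =", value)
```

Here x1 and x2 should come out well above the cutoff of 10, while x3 stays close to 1, the value for a variable unrelated to the others.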

Addressing Multicollinearity

1. Although it is useful to be aware of the presence of multicollinearity, it is not easy to remedy severe (non-perfect) multicollinearity.

2. If possible, adding observations or taking a new sample might help lessen multicollinearity.

3. Exclude the independent variables that appear to be causing the problem.

4. Modifying the model specification sometimes helps, for example:

· using real instead of nominal economic data

· using a reciprocal instead of a polynomial specification on a given independent variable
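
Why real data can help is easy to see in a sketch. Below, two made-up real series are unrelated, but their nominal versions are both scaled by the same hypothetical price index growing at an assumed 3% per period, so the nominal series trend together.

```python
import numpy as np

rng = np.random.default_rng(5)
periods = 60
price_index = 1.03 ** np.arange(periods)      # assumed common price level, 3% inflation

# Two hypothetical real series that fluctuate independently around a level of 100.
real_a = 100.0 + 5.0 * rng.standard_normal(periods)
real_b = 100.0 + 5.0 * rng.standard_normal(periods)

# Nominal values share the price trend.
nominal_a = real_a * price_index
nominal_b = real_b * price_index

corr_nominal = np.corrcoef(nominal_a, nominal_b)[0, 1]
corr_real = np.corrcoef(nominal_a / price_index, nominal_b / price_index)[0, 1]
print("correlation of nominal series: ", corr_nominal)
print("correlation of deflated series:", corr_real)
```

Deflating removes the shared price trend, which in this constructed example is the only source of the correlation. With real-world data the improvement is usually less clear-cut.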

Sunday, 2 June 2013
