Multiple Regression
Y = b0 + b1x1 + b2x2 + … + bpxp + e
where Y is the dependent variable, the b's are the regression coefficients for the corresponding x (independent) terms, b0 is a constant or intercept, and e is the error term reflected in the residuals. The parameters of the regression equation are estimated using the ordinary least squares method (OLS).
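As a concrete illustration, here is a minimal sketch that fits this equation by ordinary least squares with NumPy; the sales figures and variable names (living_area, lot_size, sale_price) are hypothetical, invented for the example.

```python
import numpy as np

# Hypothetical sales sample: living area (sq ft), lot size (sq ft), price ($).
living_area = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450])
lot_size    = np.array([5000, 6000, 5500, 7000, 4000, 6500, 9000, 8500])
sale_price  = np.array([245000, 312000, 279000, 308000,
                        199000, 219000, 405000, 324000])

# Design matrix X: a column of 1s for the intercept b0, then one column per x.
X = np.column_stack([np.ones(len(sale_price)), living_area, lot_size])

# Ordinary least squares: choose the b's that minimize the sum of
# squared residuals.
b, *_ = np.linalg.lstsq(X, sale_price, rcond=None)
print("intercept b0:", b[0])
print("coefficients b1 (living area), b2 (lot size):", b[1], b[2])

# Residuals: observed values minus the values predicted by the equation.
residuals = sale_price - X @ b
```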
- Ordinary least squares: This method derives its name from the criterion used to draw the best-fit regression line: the line is chosen so that the sum of the squared vertical distances from all the data points to the line is minimized.
- Intercept: The intercept, b0, is where the regression plane intersects the Y-axis. It is equal to the estimated value of Y when all the independent variables have a value of 0.
- Regression coefficient: Regression coefficients bi
are the slopes of the regression plane in the direction of xi.
Each regression coefficient represents the net effect the ith
variable has on the dependent variable, holding the remaining x's
in the equation constant.
- Beta weights: Beta weights are the regression coefficients for standardized data. Beta is the number of standard deviations by which the dependent variable changes when the independent variable increases by one standard deviation and the other independent variables are held constant. The ratio of the beta weights is the ratio of the predictive importance of the independent variables.
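Beta weights can be obtained by refitting the equation on z-scored data; a minimal sketch (the helper name beta_weights is ours):

```python
import numpy as np

def beta_weights(X, y):
    """Regression coefficients for standardized (z-scored) data, so each
    slope is in standard-deviation units of y per standard deviation of x."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    # Standardized data have mean 0, so no intercept column is needed.
    b, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return b
```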
- Residuals: Residuals are the differences between the observed values and those predicted by the regression equation.
- Dummy variables: Regression assumes interval data, but dichotomies may be considered a special case of interval data. Nominal and ordinal categories can be transformed into sets of dichotomies, called dummy variables. To prevent perfect multicollinearity, one category must be left out.
- Interpretation of b for dummy variables: For b coefficients of dummy variables that have been binary coded (the usual 1 = present, 0 = not present), b is measured relative to the reference category (the category left out), as in the coding sketch below.
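A brief sketch of dummy coding and the reference-category interpretation, using an invented three-category neighborhood variable:

```python
import numpy as np

# Hypothetical categorical variable with three categories: A, B, C.
neighborhood = np.array(["A", "B", "C", "A", "C", "B", "A", "B"])
sale_price = np.array([250000., 310000., 275000., 240000.,
                       280000., 330000., 255000., 315000.])

# Code the three categories as two 0/1 dummies, leaving category "A" out
# as the reference to prevent perfect multicollinearity.
is_B = (neighborhood == "B").astype(float)
is_C = (neighborhood == "C").astype(float)

X = np.column_stack([np.ones(len(sale_price)), is_B, is_C])
b, *_ = np.linalg.lstsq(X, sale_price, rcond=None)

# b[1] and b[2] are the estimated price differences relative to category A.
print("B vs. A:", b[1], "  C vs. A:", b[2])
```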
- Multiple R: The correlation coefficient between the observed and predicted values. It ranges in value from 0 to 1. A small value indicates that there is little or no linear relationship between the dependent variable and the independent variables.
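Multiple R is simply the ordinary correlation between the observed and fitted values, so it can be computed directly; a sketch:

```python
import numpy as np

def multiple_r(y, y_pred):
    """Correlation between observed values and regression predictions."""
    return np.corrcoef(y, y_pred)[0, 1]
```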
- Multiple R²: The percent of the variance in the dependent variable explained by the independent variables. It is also called the coefficient of multiple determination. Mathematically, R² = 1 − (SSE/SST), where
  - SSE = error sum of squares = Σ(Yi − Est Yi)², where Yi is the actual value of Y for the ith case and Est Yi is the regression prediction for the ith case.
  - SST = total sum of squares = Σ(Yi − Mean Y)².
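The R² formula translates directly into code; a minimal sketch (the helper name r_squared is ours):

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of multiple determination: R² = 1 - (SSE/SST)."""
    sse = np.sum((y - y_pred) ** 2)    # error sum of squares
    sst = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - sse / sst
```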
- Adjusted R²: When there are a large number of independent variables, it is possible that R² may become artificially large, simply because some independent variables' chance variations "explain" small parts of the variance of the dependent variable. It is therefore essential to adjust the value of R² as the number of independent variables increases. With a few independent variables, R² and adjusted R² will be close; with a large number of independent variables, adjusted R² may be noticeably lower.
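The page does not state the adjustment formula; the standard one, with n observations and p independent variables, is sketched below:

```python
def adjusted_r_squared(r2, n, p):
    """Standard adjusted R²: penalizes R² for the number of independent
    variables p relative to the number of observations n."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
```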
- Multicollinearity: Multicollinearity is the intercorrelation of the independent variables. Values of r² near 1 violate the assumption of no perfect collinearity, while high r² values increase the standard errors of the regression coefficients and make assessment of the unique role of each independent variable difficult or impossible. While simple correlations tell something about multicollinearity, the preferred method of assessing it is to compute the determinant of the correlation matrix: determinants near zero indicate that some or all of the independent variables are highly correlated.
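The determinant check described above can be computed in a few lines; a sketch (the helper name is ours):

```python
import numpy as np

def correlation_determinant(X):
    """Determinant of the predictors' correlation matrix: values near 1
    indicate little intercorrelation; values near 0 indicate that some
    or all independent variables are highly correlated."""
    corr = np.corrcoef(X, rowvar=False)  # treat columns as variables
    return np.linalg.det(corr)
```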
- Partial correlation: The correlation of two variables while controlling for one or more other variables. For example, r12.34 is the correlation of variables 1 and 2, controlling for variables 3 and 4. If the partial correlation r12.34 equals the uncontrolled correlation r12, the control variables have no effect; if the partial correlation is near 0, the original correlation is spurious.
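For the simplest case of a single control variable, the first-order partial correlation has a standard closed form (not stated on this page); a sketch:

```python
import math

def partial_corr(r12, r13, r23):
    """First-order partial correlation r12.3: correlation of variables
    1 and 2 while controlling for variable 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

# If r12.3 stays close to r12, the control variable has no effect;
# if it falls near 0, the original correlation is spurious.
print(partial_corr(0.60, 0.50, 0.40))
```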