In the previous post we calculated the values of the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$. In this post, we'll look at the assumptions that underlie the method we used.

## Classic Linear Regression Model (CLRM)

There are 7 assumptions that the **Gaussian** or **Classic** linear regression model (CLRM) makes. These are:

### 1. The regression model is linear in parameters

Recall that the population regression function is given by

$$Y_i = \beta_1 + \beta_2 X_i + u_i$$

It is linear in the parameters $\beta_1$ and $\beta_2$; it need not be linear in the variable $X$.

### 2. Values of X are fixed in repeated sampling

In this series of posts, we’ll be looking at **fixed regressor models**^{[1]} where the values of X are considered to be fixed when drawing a random sample. For example, in the previous post the values of X were fixed between 80 and 260. If another sample were to be drawn, the same values of X would be used.
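A minimal simulation of this idea, assuming the simple two-variable PRF $Y_i = \beta_1 + \beta_2 X_i + u_i$ with made-up parameter and noise values (the 80–260 range is borrowed from the example above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed regressor values, reused in every repeated sample.
# The 80-260 range mirrors the example from the previous post.
X = np.arange(80, 261, 20).astype(float)

beta1, beta2 = 17.0, 0.6  # hypothetical population parameters

# Draw two samples: only the disturbance term changes, X stays fixed.
sample1_Y = beta1 + beta2 * X + rng.normal(0, 10, size=X.size)
sample2_Y = beta1 + beta2 * X + rng.normal(0, 10, size=X.size)
```

The Y values differ across the two samples, but both were generated at the same fixed X values.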

### 3. Zero mean value of the error term

This assumption states that for every given value of $X_i$, the mean of the disturbance term $u_i$ is zero. Symbolically, $E(u_i \mid X_i) = 0$.

What this assumption means is that all those variables that were not considered to be a part of the regression model do not systematically affect the mean value of $Y$ i.e. their positive and negative effects cancel out.

This assumption also means that there is no **specification error** or **specification bias**. This happens when we choose the wrong functional form for the regression model, exclude necessary explanatory variables, or include unnecessary ones.
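As a quick sanity check on what $E(u_i \mid X_i) = 0$ looks like in simulation (a sketch with an arbitrary error spread):

```python
import numpy as np

rng = np.random.default_rng(1)

X = np.arange(80, 261, 20).astype(float)

# Disturbances for many repeated samples at each fixed X value.
u = rng.normal(0, 10, size=(100_000, X.size))

# Averaging over repeated samples, the mean disturbance at each X
# should be close to zero: E(u_i | X_i) = 0.
mean_u_at_each_X = u.mean(axis=0)
```

Each entry of `mean_u_at_each_X` hovers near zero, even though individual disturbances are large.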

### 4. Constant variance of $u_i$ (homoscedasticity)

This assumption states that the variation of $Y$ around the regression line is the same for all values of $X$; the spread of $Y$ neither increases nor decreases as $X$ changes.

With this assumption, we assume that the variance of $u_i$ is the same constant $\sigma^2$ for every observation i.e. $\text{var}(u_i \mid X_i) = \sigma^2$.
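A small illustration contrasting homoscedastic and heteroscedastic disturbances (the error spreads below are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)

X = np.arange(80, 261, 20).astype(float)

# Homoscedastic: the same spread sigma at every X.
u_homo = rng.normal(0, 10, size=(50_000, X.size))

# Heteroscedastic: the spread grows with X, violating the assumption.
u_hetero = rng.normal(0, 0.1 * X, size=(50_000, X.size))

var_homo = u_homo.var(axis=0)      # roughly constant (about 100)
var_hetero = u_hetero.var(axis=0)  # increases with X
```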

### 5. No autocorrelation between disturbances

Given any two values $X_i$ and $X_j$ ($i \neq j$), there is no **serial correlation** or **autocorrelation** between the corresponding disturbances $u_i$ and $u_j$ i.e. $\text{cov}(u_i, u_j \mid X_i, X_j) = 0$.

To build an intuition, let’s consider the population regression function $Y_t = \beta_1 + \beta_2 X_t + u_t$, where the subscript $t$ denotes successive time periods.

Now suppose that the disturbances in two successive periods, $u_t$ and $u_{t-1}$, are positively correlated. Then $Y_t$ depends not only on $X_t$ but also on $u_{t-1}$, because $u_{t-1}$ partly determines $u_t$.
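This can be simulated with first-order autoregressive (AR(1)) disturbances, $u_t = \rho u_{t-1} + \varepsilon_t$; the value of $\rho$ below is an arbitrary choice for the sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

n, rho = 5_000, 0.8  # rho: hypothetical autocorrelation coefficient

# AR(1) disturbances: each u_t carries over part of u_{t-1}.
eps = rng.normal(0, 1, size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]

# The correlation between u_t and u_{t-1} comes out close to rho,
# so cov(u_i, u_j) = 0 for i != j clearly fails here.
lag1_corr = np.corrcoef(u[1:], u[:-1])[0, 1]
```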

### 6. The sample size must be greater than the number of parameters to estimate

This is fairly simple to understand. To be able to estimate the two parameters $\beta_1$ and $\beta_2$, we need at least two observations; more generally, the number of observations $n$ must be greater than the number of parameters to be estimated.
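A quick sketch of why $n$ must exceed the number of parameters (the data values here are made up and happen to lie on the line $Y = 17 + 0.6X$):

```python
import numpy as np

# Estimating an intercept and a slope requires at least two observations.
X = np.array([100.0, 150.0, 220.0])
Y = np.array([77.0, 107.0, 149.0])

A = np.column_stack([np.ones_like(X), X])         # design matrix (n=3, k=2)
beta_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)  # OLS works: n > k

# With a single observation the system is underdetermined:
# infinitely many lines pass through one point.
A_single = np.array([[1.0, 100.0]])
rank_single = np.linalg.matrix_rank(A_single)  # rank 1 < 2 parameters
```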

### 7. Nature of X variables

There are a few things we assume about the values of $X$.

If we have all the values of $X$ in a sample equal to one another, then $\sum (X_i - \bar{X})^2 = 0$ and the slope estimator is undefined, since that sum appears in its denominator. The values of $X$ in a given sample must therefore vary.

Furthermore, we assume that there are no outliers in the values of $X$, since a few extreme values could otherwise dominate the regression results.
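To see why the $X$ values must vary, recall the OLS slope estimator $\hat{\beta}_2 = \sum(X_i - \bar{X})(Y_i - \bar{Y}) \,/\, \sum(X_i - \bar{X})^2$. A sketch with made-up data (the helper below is illustrative, not a library function):

```python
import numpy as np

def ols_slope(X, Y):
    """OLS slope: sum((X - mean)(Y - mean)) / sum((X - mean)^2)."""
    x_dev = X - X.mean()
    ssx = np.sum(x_dev ** 2)
    if ssx == 0:
        raise ValueError("X values are all identical; slope is undefined")
    return np.sum(x_dev * (Y - Y.mean())) / ssx

X = np.array([80.0, 140.0, 200.0, 260.0])
Y = np.array([65.0, 101.0, 137.0, 173.0])
slope = ols_slope(X, Y)  # well-defined because X varies

constant_X = np.full(4, 120.0)
try:
    ols_slope(constant_X, Y)
    estimation_failed = False
except ValueError:
    estimation_failed = True  # identical X values break estimation
```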

These are the 7 assumptions underlying OLS. In the next section, I’d like to elaborate on assumptions 3 and 5 a little more.

## More on Assumption 3. and 5.

In assumption 3. we state that $E(u_i \mid X_i) = 0$ i.e. the variables left out of the model do not systematically affect the mean value of $Y$. If we omit a variable that does systematically affect $Y$, its influence ends up in the disturbance term and the assumption no longer holds; this is the specification bias mentioned earlier.

Functionally misspecifying our regression model will also introduce autocorrelation in our model. Consider the two plots shown below.

In the first plot, the line fits the sample points properly. In the second one, it doesn’t: the sample points seem to form a curve of some sort while we try to fit a straight line through them. What we get are runs of negative residuals followed by runs of positive residuals, which indicates autocorrelation.
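The second plot's situation can be reproduced numerically: fit a straight line to data generated from a curved (here quadratic, with made-up coefficients) relationship and look at the signs of the residuals. Same-signed runs mean few sign changes along the $X$ axis:

```python
import numpy as np

rng = np.random.default_rng(4)

X = np.linspace(80, 260, 40)

# True relationship is curved (quadratic), but we fit a straight line.
Y = 5.0 + 0.002 * (X - 170) ** 2 + rng.normal(0, 0.5, size=X.size)

coeffs = np.polyfit(X, Y, deg=1)       # misspecified linear fit
residuals = Y - np.polyval(coeffs, X)

# Count sign changes along X: long runs of same-signed residuals
# produce far fewer changes than independent errors would.
signs = np.sign(residuals)
sign_changes = np.count_nonzero(signs[1:] != signs[:-1])
```

The residuals are positive at both ends and negative in the middle, with only a handful of sign changes; independent errors would change sign roughly every other observation.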

Autocorrelation can be introduced not just by functionally misspecifying the regression model but also by the nature of the data itself. For example, consider time series data representing stock market movement. The value of the index today depends, in part, on its value in the previous trading session, so the disturbances in successive periods tend to be correlated.

When there is autocorrelation, the OLS estimators are no longer the best estimators for predicting the value of $Y$; they remain linear and unbiased, but they no longer have minimum variance.

## Conclusion

In this post we looked at the assumptions underlying OLS. In the coming post we’ll look at the Gauss-Markov theorem and the BLUE property of OLS estimators.

[1] There are also **stochastic regressor models** where the values of X are not fixed but random.