There is no single "the" OLS regression: the same estimator arises from several equivalent derivations.

We start with a very strong assumption:

  • i.i.d. random sample (this does not hold in time-series contexts in finance)

We have data: realisations of the random variables $(y, x)$. For now we assume the sample is i.i.d., which implies that the ordering of the data does not matter.

We want to estimate $\beta$. There are many equivalent ways to derive the optimal estimator $\hat{\beta}$ for the linear model.

  1. Minimizing the sum of squared errors (hence the least squares estimator).
  2. Applying the method of moments to the BLP solution $\beta = (E(xx'))^{-1}E(xy)$.
  3. Applying the method of moments (MM) to the BLP condition $E(xe) = 0$.

Method 1: Minimizing the Sum of Squared Errors

We derived the BLP as $ y_i = x'_i \beta + e_i $

where $\beta = \operatorname{argmin}_{b \in \mathbb{R}^k} S(b)$ is the minimizer of the expected squared error $$ S(\beta) = E(y_i - x'_i \beta)^2 $$

and has the solution $\beta = (E(x_i x'_i))^{-1} E(x_i y_i)$
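To make the omitted step explicit: differentiating $S(\beta)$ and setting the gradient to zero,

$$ \frac{\partial}{\partial \beta} E(y_i - x'_i \beta)^2 = -2\,E\big(x_i (y_i - x'_i \beta)\big) = 0 \;\Rightarrow\; E(x_i x'_i)\,\beta = E(x_i y_i) $$

which gives the stated solution whenever $E(x_i x'_i)$ is invertible.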

Now consider the complete i.i.d. dataset $\{(y_1, x_1), \dots, (y_n, x_n)\}$.

For each data point we can plug in the error $y_i - x'_i \beta$.

The sample analogue of $S(\beta)$ is $$ S_n(\beta) = \frac{1}{n}\sum_{i=1}^{n}(y_i - x'_i \beta)^2 = \frac{1}{n}\,\mathrm{SSE}_n(\beta) $$ where $\mathrm{SSE}_n(\beta) = \sum_{i=1}^{n}(y_i - x'_i \beta)^2$ is the sum of squared errors of the sample.

Its minimizer $\hat{\beta} = \operatorname{argmin}_{b \in \mathbb{R}^k} S_n(b)$ is called the least squares (LS) estimator.

Here $y_i$ is a scalar and $x_i$ is a $k \times 1$ vector.

$$ \hat{\beta} = (\sum_{i=1}^{n} x_i x'_i)^{-1} \sum_{i=1}^{n} x_i y_i$$
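As a sketch of this summation formula (not from the notes; the data is simulated purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 3

# Simulated i.i.d. data (illustrative only): y_i = x_i' beta + e_i
beta_true = np.array([1.0, -0.5, 2.0])
X = rng.normal(size=(n, k))   # row i holds x_i'
e = rng.normal(size=n)
y = X @ beta_true + e

# hat{beta} = (sum_i x_i x_i')^{-1} sum_i x_i y_i, accumulated per observation
Sxx = np.zeros((k, k))
Sxy = np.zeros(k)
for xi, yi in zip(X, y):
    Sxx += np.outer(xi, xi)   # x_i x_i'
    Sxy += xi * yi            # x_i y_i
beta_hat = np.linalg.solve(Sxx, Sxy)
print(beta_hat)               # close to beta_true
```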

Method 2: Moment Estimation of the BLP

Equivalently, the least squares estimator can be written as a moment estimator.

$\beta = Q^{-1}_{xx}Q_{xy} = (E(xx'))^{-1}E(xy)$

$\hat{\beta} = \hat{Q}^{-1}_{xx} \hat{Q}_{xy}$

Moment estimators for $Q_{xx}$ and $Q_{xy}$:

$$\hat{Q}_{xy} = \frac{1}{n} \sum_{i=1}^{n} x_i y_i$$ $$\hat{Q}_{xx} = \frac{1}{n} \sum_{i=1}^{n} x_i x'_i$$

This is exactly the same as Method 1: the factors of $1/n$ cancel, so $\hat{\beta} = \hat{Q}^{-1}_{xx}\hat{Q}_{xy} = (\sum_{i=1}^{n} x_i x'_i)^{-1} \sum_{i=1}^{n} x_i y_i$.
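A minimal numerical check of this equivalence, reusing `X`, `y`, `n`, and `beta_hat` from the sketch above:

```python
# Moment estimates
Qxx_hat = (X.T @ X) / n   # (1/n) sum_i x_i x_i'
Qxy_hat = (X.T @ y) / n   # (1/n) sum_i x_i y_i
beta_mm = np.linalg.solve(Qxx_hat, Qxy_hat)

# The 1/n factors cancel, so this matches Method 1 exactly
assert np.allclose(beta_mm, beta_hat)
```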

Method 3: The BLP Condition $E(xe) = 0$

Define the fitted values and residuals as

$$ \hat{y}_i = x'_i \hat{\beta} $$ $$ \hat{e}_i = y_i - x'_i \hat{\beta} $$

The sample equivalent of $E(xe) = 0$ is:

$$ 0 = \frac{1}{n}\sum_{i=1}^{n} x_i \hat{e}_i = \frac{1}{n}\sum_{i=1}^{n} x_i (y_i - x'_i \hat{\beta}) $$

Solving for $\hat{\beta}$ yields the same estimator:

$$ \hat{\beta} = (\sum_{i=1}^{n} x_i x'_i)^{-1} \sum_{i=1}^{n} x_i y_i$$

Here we applied the method of moments to the condition $E(xe) = 0$ itself; in Method 2 we applied it to the explicit solution for $\beta$.
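Again as a sketch, the sample moment condition holds exactly at $\hat{\beta}$ (up to floating-point error), reusing the simulated data from above:

```python
# Residuals at the LS estimate
e_hat = y - X @ beta_hat

# Sample analogue of E(xe) = 0: (1/n) sum_i x_i e_hat_i
moment = (X.T @ e_hat) / n
print(moment)             # zero up to floating-point error
assert np.allclose(moment, 0.0, atol=1e-10)
```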

For the intercept-only model:

$y_i = \mu + e_i$

$E(e_i) = 0$

So the MM estimator of the population mean (and the ML estimator, under normality) is the sample mean.

And the OLS estimator is also the sample mean! The only regressor is the constant $x_i = 1$.

The sample mean is $\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} y_i$.
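A quick sketch (continuing with the simulated `y` from above) confirming that regressing on a constant reproduces the sample mean:

```python
# Intercept-only model: the lone regressor is a column of ones
ones = np.ones((n, 1))
mu_hat = np.linalg.solve(ones.T @ ones, ones.T @ y)  # (X'X)^{-1} X'y
assert np.allclose(mu_hat, y.mean())
```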

Matrix Form

Stacking the observations, let $X$ be the $n \times k$ matrix with rows $x'_i$ and $y$ the $n \times 1$ vector of outcomes. Then $$\sum_{i=1}^{n} x_i x'_i = X'X$$ $$\sum_{i=1}^{n} x_i y_i = X'y$$

The least squares estimator is $$ \hat{\beta} = (X'X)^{-1}(X'y)$$

The estimated version of the model is $y = X\hat{\beta} + \hat{e}$ and the residual vector is $\hat{e} = y - X\hat{\beta}$

By construction, the residuals are orthogonal to the regressors:

$$ X'\hat{e} = 0$$
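The matrix form in one sketch, again with the simulated data from above; note that in practice one solves the normal equations rather than inverting $X'X$ explicitly:

```python
# Matrix-form least squares: solve the normal equations X'X beta = X'y
beta_ls = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_ls

# Orthogonality X' e_hat = 0 holds by construction (up to rounding)
print(X.T @ resid)
assert np.allclose(X.T @ resid, 0.0, atol=1e-8)
```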