I asked about this on Stackexchange yesterday but since the issue is a bit puzzling one, so I decided to widen the scope.
So the gist is this: The function to fit linear model to data lm()
in R gives me a wrong result compared to Ordinary least squares method to calculate the same, and also wrong compared to Windows (on same machine) which gives the correct one (same as OLS method on SolusOS) with both methods.
So if anyone of you with the R. 3.6.1 could run the following test code (kindly provided by a nice user at Stackoverflow) and see if they match.
The first code:
X <- model.matrix(Sepal.Length ~ Sepal.Width + as.factor(Species), data = iris)
y <- with(iris, Sepal.Length)
R <- t(X) %*% X
solve(R) %*% t(X) %*% y
# [,1]
#(Intercept) 2.2513932
#Sepal.Width 0.8035609
#as.factor(Species)versicolor 1.4587431
#as.factor(Species)virginica 1.9468166
and the second:
coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
#(Intercept) Sepal.Width Speciesversicolor Speciesvirginica
# 2.2513932 0.8035609 1.4587431 1.9468166
So these codes should give the same results for coefficients for the model. However the latter piece of code gives me:
coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
#(Intercept) Sepal.Width Speciesversicolor Speciesvirginica
# -1.1562296 -0.3158123 11.5719475 11.6048354
I know this is a long shot but why not...
Thanks!