Some weird results with R on Solus but not on Windows

Voltti

I asked about this on Stack Exchange yesterday, but since the issue is a puzzling one, I decided to widen the scope.

The gist is this: on Solus, lm(), R's function for fitting a linear model, gives me different (wrong) coefficients compared with calculating the same fit directly by ordinary least squares (OLS). It is also wrong compared with Windows on the same machine, where both methods give the correct result (the same as the OLS method on Solus).
If any of you running R 3.6.1 could run the following test code (kindly provided by a nice user on Stack Overflow) and check whether the two results match, that would help.
The first snippet:

X <- model.matrix(Sepal.Length ~ Sepal.Width + as.factor(Species), data = iris)
y <- with(iris, Sepal.Length)
R <- t(X) %*% X
solve(R) %*% t(X) %*% y
#                                  [,1]
#(Intercept)                  2.2513932
#Sepal.Width                  0.8035609
#as.factor(Species)versicolor 1.4587431
#as.factor(Species)virginica  1.9468166

and the second:

coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
#(Intercept)       Sepal.Width Speciesversicolor  Speciesvirginica
#  2.2513932         0.8035609         1.4587431         1.9468166

These two snippets should give the same coefficients for the model. On Solus, however, the second one gives me:

coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
#(Intercept)       Sepal.Width Speciesversicolor  Speciesvirginica
#  -1.1562296      -0.3158123        11.5719475       11.6048354
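For anyone who wants a single yes/no answer, the two snippets can be combined into one check. This is only a sketch: the names b_ols and b_lm and the tolerance of 1e-8 are my own choices, not from the Stack Overflow answer.

```r
# Sketch: compare lm() against the normal-equations solution on the built-in iris data.
X <- model.matrix(Sepal.Length ~ Sepal.Width + Species, data = iris)
y <- iris$Sepal.Length

# Direct OLS: solve (X'X) b = X'y
b_ols <- drop(solve(crossprod(X), crossprod(X, y)))

# Same model fitted with lm(), which goes through a QR decomposition internally
b_lm <- coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))

# Prints TRUE when the two agree; otherwise prints a description of the difference
print(all.equal(unname(b_ols), unname(b_lm), tolerance = 1e-8))
```

If this prints anything other than TRUE, the installation shows the same mismatch I am seeing.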

I know this is a long shot but why not...

Thanks!

algent

Voltti I don't know how closely this is related, but you can check this task on the developers' page.

adurante

Voltti

Feel free to join the fun https://dev.getsol.us/T8464

The devs are aware of the issue, and a few others and I are trying to replicate it on VMs and find fixes. Any input you can provide at the link above would go a long way.

Voltti

Thank you, I will look into it and see if I can be of any help!

dschinn1001

Voltti Don't forget that results also depend on the health of your mainboard/processor. It often happens that the processor is clocked down to a lower performance level simply because of your current usage. For example, if you play fewer games over a longer period, the processor could be switched to 900 MHz with 2 or 4 cores.

DataDrake

dschinn1001 No they aren't. This is most likely the result of a bug in R or one of its libraries that is hardware-specific and has absolutely nothing to do with the age of the hardware. Processor frequency also has nothing to do with the accuracy of the calculations. As for temperature-related throttling, most coolers will dissipate the bulk of the heat from a gaming session within 20-30 minutes. Cores are never disabled to reduce heat; they are simply not used. All cores should always drop to the lowest possible frequency when idle.

Voltti

dschinn1001
Hard to believe that this would be a factor here, as it would make computation very unreliable. I'm sure it can matter in some non-deterministic calculations/simulations, where the actual results, rather than just the speed of the calculation, depend on the processor's/system's performance. Edit: that is, cases where the system's characteristics/performance are a parameter in the actual calculation (not sure if I can express this clearly; I'm also not an expert in the area).

siddhsoftware

The R version is the same on both platforms.
Here is the output of running the "version" command on both platforms, Linux and Windows.
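
Since the R version matches, it may also be worth comparing which numerical libraries each installation links against. A minimal sketch, assuming R 3.4 or later (where sessionInfo() reports the BLAS and LAPACK paths); whether these libraries are involved here is only a guess on my part:

```r
# Sketch: print the R version and the linear algebra libraries this session uses.
si <- sessionInfo()

cat("R version:     ", R.version.string, "\n")
cat("BLAS library:  ", si$BLAS, "\n")    # path of the BLAS in use (may be empty if unknown)
cat("LAPACK library:", si$LAPACK, "\n")  # path of the LAPACK in use (may be empty if unknown)
cat("LAPACK version:", La_version(), "\n")
```

Running this on both systems and posting the output side by side would show whether the two builds differ below the R level.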

DataDrake

Voltti Most of what you would use R for is deterministic and highly repeatable though.

Voltti

@DataDrake Exactly my point as well.
I think I found the suspect for this issue though. I have provided some info in the dev.getsol.us thread.