This page belongs to EPFL's web archive and is no longer updated.

R becomes crazy when squared


For fun, we computed R^2 in R the way it is given in the slides, namely by computing

(t(yHat) %*% yHat)/(t(y) %*% y).

However, we were surprised that the result (which we won't give here, for obvious reasons) was not the same as the "Multiple R-squared" reported by summary(myLinearModel).

Should we worry? If not, which value should we use for the practical?

Have a nice evening, Philémon

Posted by Philémon Orphée Favrod on Thursday 14 November 2013 at 18:05
This is an excellent question! I also tried the formula from the slides to compute R^2 and got a different answer. It turns out that there are slightly different definitions of R^2, depending on how the constant in the model is taken into account. The formula R uses for computing R^2 is:

R^2 = 1 - ||e||^2 / ||y - 1 yBar||^2 = ||yHat - 1 yBar||^2 / ||y - 1 yBar||^2,

where 1 is an n x 1 vector of ones, yBar is the mean of y, and e = y - yHat is the vector of residuals (the second equality holds when the model includes the constant). This is indeed the definition most commonly found in the literature (see e.g. N.R. Draper, H.S. Smith. Applied regression analysis, 3rd ed., Wiley, 1998). When the model does not include the constant, R uses the same definition as on the lecture slides, ||yHat||^2 / ||y||^2 (but the model you fit in the practical should include the constant!).
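
The difference between the two definitions can be checked numerically. The following is a minimal sketch on simulated data (the data and all variable names are made up for illustration and are not the data set from the practical); it assumes only base R's lm(), fitted(), residuals() and summary().

```r
# Simulated data (hypothetical; NOT the data set from the practical)
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 3 + 2 * x + rnorm(n)

# Model including the constant
fit  <- lm(y ~ x)
yHat <- fitted(fit)
e    <- residuals(fit)

# Definition from the slides (uncentered): ||yHat||^2 / ||y||^2
r2_slides <- sum(yHat^2) / sum(y^2)

# Centered definition, in both of its equivalent forms
r2_centered     <- 1 - sum(e^2) / sum((y - mean(y))^2)
r2_centered_alt <- sum((yHat - mean(y))^2) / sum((y - mean(y))^2)

# Value reported as "Multiple R-squared" by summary()
r2_summary <- summary(fit)$r.squared

# Model without the constant: compare the slides' formula with summary()
fit0        <- lm(y ~ x - 1)
r2_slides0  <- sum(fitted(fit0)^2) / sum(y^2)
r2_summary0 <- summary(fit0)$r.squared

print(c(slides = r2_slides, centered = r2_centered, summary = r2_summary))
print(c(slides_no_const = r2_slides0, summary_no_const = r2_summary0))
```

With the constant included, the centered value should agree with summary() while the slides' value does not; the uncentered value is never smaller than the centered one, because the same non-negative term n * yBar^2 is added to both numerator and denominator. Without the constant, the slides' formula and summary() should agree.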

In the practical, you can use either definition as long as you are explicit about which one you used.
Posted by Mikael Kuusela on Saturday 16 November 2013 at 19:24