The kvr2 package provides functions to calculate nine types of
coefficients of determination ($R^2$).
The coefficient of determination, $R^2$, has several competing definitions whose values can disagree, particularly for:
- Models without an intercept (No-intercept models)
- Power regression models
- Other fits via transformations (e.g., log-log models)
This package is specifically designed for models that can be represented
as lm objects in R. This includes:
- Standard Linear Models (with an intercept)
- No-intercept Models (e.g., lm(y ~ x - 1))
- Power Regression Models (fitted via log-transformation, such as lm(log(y) ~ log(x)))
Note: This package does not support general non-linear least
squares (nls) or other complex non-linear modeling frameworks. It
focuses on the mathematical sensitivity of $R^2$ definitions in models fitted with lm.
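As a minimal sketch of the three supported cases, each of the fits below is an ordinary lm object in base R. The data frame is the Kvalseth (1985) dataset used in this README's examples; the object names (`fit_int`, `fit_no_int`, `fit_power`) are illustrative only and are not part of kvr2.

```r
# Illustrative data: the Kvalseth (1985) dataset used in this README
df <- data.frame(x = 1:6, y = c(15, 37, 52, 59, 83, 92))

# Standard linear model (with an intercept)
fit_int <- lm(y ~ x, data = df)

# No-intercept model
fit_no_int <- lm(y ~ x - 1, data = df)

# Power regression model, fitted via log-transformation
fit_power <- lm(log(y) ~ log(x), data = df)

# All three are lm objects, which is the class this package targets
fits <- list(fit_int, fit_no_int, fit_power)
all_lm <- all(vapply(fits, inherits, logical(1), what = "lm"))
all_lm
```

A model fitted with nls() would not inherit from "lm", which is one quick way to check whether a fit falls inside the supported scope.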
The primary goal of kvr2 is not to provide a definitive “best” $R^2$ for
every scenario, but to serve as an educational and diagnostic resource.
Many users rely on the single $R^2$ value provided by standard software, but
as this package demonstrates, that value is sensitive to the underlying
mathematical definition and the software’s internal defaults.
Through this package, users can:
- Understand Mathematical Sensitivity: Observe firsthand how different algebraic formulas (eight + one definitions) can lead to dramatically different interpretations of the same model fit, especially in no-intercept models.
- Diagnose Negative $R^2$: A negative $R^2$ (which can occur only under some of the definitions) is not a “bug”; it is a diagnostic signal. It indicates that the model predicts the data worse than a simple horizontal line at the mean of the response.
- Evaluate Robustness and Transformations: Explore Kvalseth’s recommendations on which definition to use for consistency and which for robustness against outliers, and see how $R^2$ behaves when models are fitted in transformed spaces (e.g., power regression models fitted on the log scale).
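The negative-value case can be illustrated with base R alone. The sketch below uses hypothetical data (not from the package or the paper) and one common definition, 1 - SS_res / SS_tot; it does not call kvr2 and is not meant to represent the package's internals.

```r
# Hypothetical data: y is roughly constant, so a line forced through the
# origin fits far worse than the plain mean of y
x <- 1:5
y <- c(10, 9, 11, 10, 10)

fit <- lm(y ~ x - 1)

ss_res <- sum(residuals(fit)^2)   # squared error of the fitted model
ss_tot <- sum((y - mean(y))^2)    # squared error of a horizontal line at mean(y)

# One common definition: 1 - SS_res / SS_tot
r2 <- 1 - ss_res / ss_tot
r2 < 0   # TRUE: the model does worse than the mean
```

Because SS_res exceeds SS_tot here, the ratio is greater than 1 and the statistic goes negative, which is exactly the signal described in the list above.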
The package calculates nine indices based on Kvalseth (1985):
- $R^2_1$ to $R^2_8$: A classification of existing and historical formulas used in statistical literature and software.
- $R^2_9$: A robust version of the coefficient of determination based on median absolute deviations, as proposed in the original paper.
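To make the median-based idea concrete, here is a base-R sketch that assumes the form 1 - (median|y - ŷ| / median|y - ȳ|)^2. This form reproduces the $R^2_9$ value printed in this README for the Kvalseth (1985) dataset with an intercept, but it is written by hand here and should not be read as the package's own code.

```r
# Kvalseth (1985) dataset, as used in this README's examples
df1 <- data.frame(x = 1:6, y = c(15, 37, 52, 59, 83, 92))
fit <- lm(y ~ x, data = df1)

abs_res <- abs(residuals(fit))         # |y - y_hat|
abs_dev <- abs(df1$y - mean(df1$y))    # |y - y_bar|

# Assumed robust form: medians replace the sums of squares
r2_robust <- 1 - (median(abs_res) / median(abs_dev))^2
round(r2_robust, 4)
#> [1] 0.9778
```

Using medians instead of sums means a single extreme residual has little influence on the result, which is the sense in which this index is robust.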
You can install the released version of kvr2 from CRAN with:

install.packages("kvr2")

You can install the development version of kvr2 like so:

remotes::install_github("indenkun/kvr2")

kvr2 provides a simple way to observe how different $R^2$ definitions behave on the same data.
In standard linear models with an intercept, most of the definitions return the same value; without an intercept, they diverge.
library(kvr2)
# Dataset from Kvalseth (1985)
df1 <- data.frame(x = 1:6, y = c(15, 37, 52, 59, 83, 92))
# Case A: Linear regression with intercept (Values are consistent)
model_int <- lm(y ~ x, data = df1)
r2(model_int)
#> R2_1 : 0.9808
#> R2_2 : 0.9808
#> R2_3 : 0.9808
#> R2_4 : 0.9808
#> R2_5 : 0.9808
#> R2_6 : 0.9808
#> R2_7 : 0.9966
#> R2_8 : 0.9966
#> R2_9 : 0.9778
#> ---------------------------------
#> (Type: linear, with intercept, n: 6, k: 2)
# Case B: Linear regression without intercept (Values diverge)
model_no_int <- lm(y ~ x - 1, data = df1)
results <- r2(model_no_int)
results
#> R2_1 : 0.9777
#> R2_2 : 1.0836
#> R2_3 : 1.0830
#> R2_4 : 0.9783
#> R2_5 : 0.9808
#> R2_6 : 0.9808
#> R2_7 : 0.9961
#> R2_8 : 0.9961
#> R2_9 : 0.9717
#> ---------------------------------
#> (Type: linear, without intercept, n: 6, k: 1)

Observation: In Case B, notice that $R^2_2$ and $R^2_3$ exceed 1.0, while the other definitions stay below 1.
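One way to see why the definitions split apart is to compute two ratio-style forms by hand in base R. The helper `r2_by_hand()` below is hypothetical (it is not a kvr2 function); on this dataset its two outputs match the printed R2_1 and R2_2 values, which suggests, but does not prove, that these are the forms behind those indices.

```r
df1 <- data.frame(x = 1:6, y = c(15, 37, 52, 59, 83, 92))

# Hypothetical helper: two ratio-style R2 forms for an lm fit
r2_by_hand <- function(fit, y) {
  ss_tot <- sum((y - mean(y))^2)             # total SS around the mean
  ss_res <- sum(residuals(fit)^2)            # residual SS
  ss_fit <- sum((fitted(fit) - mean(y))^2)   # SS of fitted values around the mean
  c(one_minus_ratio = 1 - ss_res / ss_tot,
    explained_ratio = ss_fit / ss_tot)
}

# With intercept: both forms agree (0.9808 and 0.9808)
with_int <- round(r2_by_hand(lm(y ~ x, data = df1), df1$y), 4)

# Without intercept: they diverge (0.9777 and 1.0836)
no_int <- round(r2_by_hand(lm(y ~ x - 1, data = df1), df1$y), 4)

with_int
no_int
```

The two forms coincide only when SS_tot = SS_fit + SS_res, an identity that holds for least-squares fits with an intercept but generally fails once the intercept is removed.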
The r2() function returns a list object. While the output is formatted
for readability, you can easily access individual values for further
analysis or reporting.
# Accessing specific R2 values from the result object
results$r2_1
#> r2_1
#> 0.9776853
results$r2_9
#> r2_9
#> 0.9717156
# You can also use it in your custom functions or data frames
my_val <- results$r2_1

To better understand the divergence between these definitions, the
kvr2 package provides a specialized plotting function. When you apply
plot_kvr2() to your model, it displays both the comparison of $R^2$ definitions and the model's fit relative to the simple average of the observed values.
# Example with the forced no-intercept model
plot_kvr2(model_no_int)

In the resulting side-by-side plot:
- Left Panel: Shows which definitions are most affected by the intercept constraint.
- Right Panel: Reveals if the model (green line) is performing worse than the simple average (red line).
To complement $R^2$, kvr2 provides comp_fit() to evaluate models via
standard error metrics such as RMSE, MAE, and MSE.
comp_fit(model_no_int)
#> RMSE : 3.9008
#> MAE : 3.6520
#> MSE : 18.2593
#> ---------------------------------
#> (Type: linear, without intercept, n: 6, k: 1)

For details, refer to the documentation for each function.
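The same three metrics can be sketched in base R for the no-intercept model. The denominators used below (n for RMSE, n - k for MSE) are an assumption inferred from the printed numbers rather than taken from the package documentation; with them, the hand computation matches the comp_fit() output shown above.

```r
df1 <- data.frame(x = 1:6, y = c(15, 37, 52, 59, 83, 92))
fit <- lm(y ~ x - 1, data = df1)

res <- residuals(fit)
n <- length(res)
k <- length(coef(fit))          # number of estimated coefficients

rmse <- sqrt(mean(res^2))       # divides by n
mae  <- mean(abs(res))          # mean absolute error
mse  <- sum(res^2) / (n - k)    # divides by n - k (assumed from the printed output)

# RMSE 3.9008, MAE 3.6520, MSE 18.2593
round(c(RMSE = rmse, MAE = mae, MSE = mse), 4)
```

Under this assumption RMSE is not simply the square root of MSE, so the two should not be converted into each other without checking the denominators.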
The comp_model() function allows you to instantly see the impact of
the intercept constraint. In the example below, notice how $R^2_2$ and $R^2_3$ exceed 1.0 once the intercept is removed.
comp_model(model_no_int)
#> model | R2_1 | R2_2 | R2_3 | R2_4 | R2_5 | R2_6
#> -----------------------------------------------------------------------
#> with intercept | 0.9808 | 0.9808 | 0.9808 | 0.9808 | 0.9808 | 0.9808
#> without intercept | 0.9777 | 1.0836 | 1.0830 | 0.9783 | 0.9808 | 0.9808
#>
#> model | R2_7 | R2_8 | R2_9 | RMSE | MAE | MSE
#> ------------------------------------------------------------------------
#> with intercept | 0.9966 | 0.9966 | 0.9778 | 3.6165 | 3.5238 | 19.6190
#> without intercept | 0.9961 | 0.9961 | 0.9717 | 3.9008 | 3.6520 | 18.2593
#> ---------------------------------
#>
#> Note: Some R2 values exceed 1.0 or are negative, indicating that these definitions may be inappropriate for the no-intercept model.

Kvalseth, T. O. (1985). Cautionary Note about $R^2$.
