I want to find which distribution fits my dataset best. I used the EnvStats package in R to perform goodness-of-fit tests for various distributions and chose the distribution based on the p-value. However, the package does not support the loglogistic or 3-parameter Weibull distributions. Please suggest a package that can calculate p-values for the loglogistic and 3-parameter Weibull distributions in R.
Hi @srivatsa,
The FAdist package provides the 3-parameter Weibull and loglogistic distributions. In this example, I estimate the parameters with fitdistrplus and then pass the fitted values to the Kolmogorov-Smirnov test for goodness of fit, since the EnvStats package does not support these distributions.
library("FAdist")
library("fitdistrplus")
#> Loading required package: MASS
#> Loading required package: survival
# Generate toy data set
set.seed(123)
rweib_dat <- rweibull3(300, shape = 3, scale = 1, thres = 20)
train_idx <- sample(seq(300), size = 150)
# Split the data so that the data used in determining distribution parameters
# is independent of the data set used to test goodness-of-fit
rweib_train <- rweib_dat[train_idx]
rweib_test <- rweib_dat[-train_idx]
# Estimate parameters by maximum likelihood; you may need to experiment
# with the starting values
weib_fit <- mledist(rweib_train, distr = "weibull3",
                    start = list(shape = 2, scale = 1, thres = 5),
                    optim.method = "SANN")
weib_fit[["estimate"]]
#> shape scale thres
#> 3.671952 1.110161 19.897529
# Goodness-of-fit test on the held-out half, using the fitted parameters
weib_test <- ks.test(rweib_test, "pweibull3",
                     shape = weib_fit[["estimate"]][1],
                     scale = weib_fit[["estimate"]][2],
                     thres = weib_fit[["estimate"]][3])
weib_test
#>
#> One-sample Kolmogorov-Smirnov test
#>
#> data: rweib_test
#> D = 0.052534, p-value = 0.8022
#> alternative hypothesis: two-sided
# Do the same for the loglogistic distribution
llog_fit <- mledist(rweib_train, distr = "llog",
                    start = list(shape = 2, scale = 1.5),
                    optim.method = "SANN")
llog_fit[["estimate"]]
#> shape scale
#> 0.009000218 3.039997358
# Name the parameters explicitly so they cannot be matched in the wrong order
llog_test <- ks.test(x = rweib_test, y = "pllog",
                     shape = llog_fit[["estimate"]][1],
                     scale = llog_fit[["estimate"]][2])
llog_test
#>
#> One-sample Kolmogorov-Smirnov test
#>
#> data: rweib_test
#> D = 0.060747, p-value = 0.6373
#> alternative hypothesis: two-sided
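To choose between the candidates, it can help to put the test results side by side. The following is only a rough sketch in base R, not part of the reprex above: the helper names `pweibull_shifted` and `pllog_sketch` are made up for illustration, and they assume the usual parameterizations (a Weibull shifted by `thres`, and a loglogistic whose log is logistic with location `scale` and scale `shape`). The parameter values are the rounded estimates printed above, and the data are simulated again here, so the numbers will not necessarily match the earlier output.

```r
# Hypothetical helpers (base R only), assuming the standard parameterizations.
# CDF of a Weibull shifted by a threshold (location) parameter.
pweibull_shifted <- function(q, shape, scale, thres) {
  pweibull(q - thres, shape = shape, scale = scale)
}

# CDF of a loglogistic: log(X) is logistic with location `scale`
# and scale `shape` (an assumption about the parameterization).
pllog_sketch <- function(q, shape, scale) {
  plogis(log(q), location = scale, scale = shape)
}

# Simulated data with a threshold of 20; only a holdout half is tested.
set.seed(123)
x <- rweibull(300, shape = 3, scale = 1) + 20
holdout <- x[sample(seq_along(x), size = 150)]

# Parameter values copied (rounded) from the estimates shown in the thread.
candidates <- list(
  weibull3 = ks.test(holdout, pweibull_shifted,
                     shape = 3.672, scale = 1.110, thres = 19.898),
  llog     = ks.test(holdout, pllog_sketch,
                     shape = 0.009, scale = 3.040)
)

# One row per candidate: KS statistic D and its p-value.
summary_tab <- data.frame(
  distribution = names(candidates),
  D = vapply(candidates, function(t) unname(t$statistic), numeric(1)),
  p_value = vapply(candidates, function(t) t$p.value, numeric(1)),
  row.names = NULL
)

# Smaller D means the empirical CDF sits closer to the candidate CDF.
summary_tab[order(summary_tab$D), ]
```

Sorting by `D` rather than by p-value alone is a design choice: with equal sample sizes the two orderings agree, but `D` is the quantity that directly measures the distance between the empirical and the candidate CDF.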