Warnings when estimating profiles in tidyLPA

Hi all!

I'm trying to estimate profiles using tidyLPA, which I have used in the past. Code is below and very straightforward. However, for model_1_class_3 I'm getting the warning "$ operator is invalid for atomic vectors" and for model_1_class_5 I'm getting the warning "subscript out of bounds." Not sure what's causing this given I'm passing three columns of ints into estimate_profiles, but seems like maybe something in how my dataframe is set up? Any advice would be great!

lpa <- data %>%
  select(CNITot,ICUTot,BESTot) %>%
  single_imputation(missForest) %>%
  estimate_profiles(1:5,package = "mclust") %>%
  compare_solutions(statistics = c("AIC","BIC","BLRT_val"))

Hard to say without a reprex. See the FAQ.

Sorry about that! https://1drv.ms/x/s!Ao_MkMiHiSgLgrsyfASABOtAWwNveA?e=KAw49Q is a link to the data and below is the full code:

library(tidyLPA)
library(dplyr)
library(missForest)

##Creating summed variables
data$CNITot <- rowSums(data[,c("CNI1","CNI2","CNI3","CNI4","CNI5","CNI6","CNI7","CNI8","CNI9","CNI10","CNI11","CNI12","CNI13","CNI14","CNI15","CNI16")],na.rm=TRUE)
data$ICUTot <- rowSums(data[,c("ICU1","ICU2","ICU3","ICU4","ICU5","ICU6","ICU7","ICU8","ICU9","ICU10","ICU11","ICU12","ICU13","ICU14","ICU15","ICU16","ICU17","ICU18","ICU19","ICU20","ICU21","ICU22","ICU23","ICU24")],na.rm=TRUE)
data$BESTot <- rowSums(data[,c("BES1","BES2","BES3","BES4","BES5","BES6","BES7","BES8","BES9","BES10","BES11","BES12","BES13","BES14","BES15","BES16","BES17","BES18","BES19","BES20")],na.rm=TRUE)

lpa <- data[290:311,] %>%
  select(CNITot,ICUTot,BESTot) %>%
  single_imputation(missForest) %>%
  estimate_profiles(1:5,package = "mclust") %>%
  compare_solutions(statistics = c("AIC","BIC","BLRT_val"))

I get a different error when I run this. Still puzzling over it.

library(tidyLPA)
library(dplyr)
library(missForest)

# d was read from a cut-and-paste of https://1drv.ms/x/s!Ao_MkMiHiSgLgrsyfASABOtAWwNveA?e=KAw49Q
# "#NULL!" entries in the source file were replaced with ","
# Records with these IDs were removed because too few fields
# were found:
# 28252 717093 717107 11438670 i0118020 i0118087
d <- read.csv("https://gist.githubusercontent.com/technocrat/794ae96d6e47d06e085a3c23f99498d1/raw/2313dfcc84877416adf4650a0e9869df4712bf76/reprex2023-01-19.csv")

# renamed data to d, because there is a built-in data()
# and some operations will operate on that instead of
# the data.frame

##Creating summed variables

d$CNITot <- rowSums(d[,c("CNI1","CNI2","CNI3","CNI4","CNI5","CNI6","CNI7","CNI8","CNI9","CNI10","CNI11","CNI12","CNI13","CNI14","CNI15","CNI16")],na.rm=TRUE)
d$ICUTot <- rowSums(d[,c("ICU1","ICU2","ICU3","ICU4","ICU5","ICU6","ICU7","ICU8","ICU9","ICU10","ICU11","ICU12","ICU13","ICU14","ICU15","ICU16","ICU17","ICU18","ICU19","ICU20","ICU21","ICU22","ICU23","ICU24")],na.rm=TRUE)
d$BESTot <- rowSums(d[,c("BES1","BES2","BES3","BES4","BES5","BES6","BES7","BES8","BES9","BES10","BES11","BES12","BES13","BES14","BES15","BES16","BES17","BES18","BES19","BES20")],na.rm=TRUE)

lpa <- d[290:311,] %>%
  select(CNITot,ICUTot,BESTot) %>%
  single_imputation(missForest) %>%
  estimate_profiles(1:5,package = "mclust") %>%
  compare_solutions(statistics = c("AIC","BIC","BLRT_val"))

Error in switch(method, imputeData = do.call(imputeData, list(data = x, :
EXPR must be a length 1 vector

Got it. single_imputation requires a matrix or data frame and it's being given a function.

Thanks! Unfortunately, I'm still getting the error after taking missForest out and only running as single_imputation(). Although they are only coming up when I run the full dataset, not smaller subsets. Any ideas?

The help for ?single_imputation

single_imputation(x, method = "imputeData")

implies that to use the missForest option you need not only library(missForest) earlier in the code, but also to call
single_imputation("missForest")
That's missForest quoted as a string.
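To illustrate why the quotes matter, here is a minimal base-R sketch. It assumes only what the error message above shows, namely that the method argument ends up in a switch() call. pick_method is a hypothetical stand-in, not tidyLPA's actual code, and mean is used in place of missForest so that no extra packages are needed.

```r
# Hypothetical stand-in for a function that dispatches on `method` via switch().
pick_method <- function(method = "imputeData") {
  switch(method,
         imputeData = "mclust-style imputation",
         missForest = "missForest-style imputation",
         stop("unknown method"))
}

# A quoted string is a length-1 character vector, so switch() can dispatch on it.
ok <- pick_method("missForest")
print(ok)

# An unquoted function object is not a length-1 vector, so switch() errors.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(pick_method(mean), error = function(e) conditionMessage(e))
print(res)
```

In the pipeline from the thread, the data frame arrives as x through the pipe, so the quoted name is what lands in method: single_imputation("missForest"), or more explicitly single_imputation(method = "missForest").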


Even making the missForest change (reflected below) or removing missForest and running solely as single_imputation(), I'm still getting the errors described above.

library(tidyLPA)
library(dplyr)
library(missForest)

d$CNITot <- rowSums(data[,c("CNI1","CNI2","CNI3","CNI4","CNI5","CNI6","CNI7","CNI8","CNI9","CNI10","CNI11","CNI12","CNI13","CNI14","CNI15","CNI16")],na.rm=TRUE)
d$ICUTot <- rowSums(data[,c("ICU1","ICU2","ICU3","ICU4","ICU5","ICU6","ICU7","ICU8","ICU9","ICU10","ICU11","ICU12","ICU13","ICU14","ICU15","ICU16","ICU17","ICU18","ICU19","ICU20","ICU21","ICU22","ICU23","ICU24")],na.rm=TRUE)
d$BESTot <- rowSums(data[,c("BES1","BES2","BES3","BES4","BES5","BES6","BES7","BES8","BES9","BES10","BES11","BES12","BES13","BES14","BES15","BES16","BES17","BES18","BES19","BES20")],na.rm=TRUE)

lpa <- d %>%
  select(CNITot,ICUTot,BESTot) %>%
  single_imputation("missForest") %>%
  estimate_profiles(1:5,package = "mclust") %>%
  compare_solutions(statistics = c("AIC","BIC","BLRT_val"))

Hi Dazen,
Your code is inconsistent about whether your data is in a data frame called d or data; see the rowSums lines.
It should probably be d throughout.
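One way to make the name harder to mix up in those rowSums lines is to build the column names with paste0() and pass the data frame in once. This is only a sketch on toy data: sum_items is a hypothetical helper, and the toy d has two items per scale instead of the real 16, 24 and 20.

```r
# Toy stand-in for the real data: two items per scale.
d <- data.frame(CNI1 = c(1, 2, NA), CNI2 = c(3, NA, 5),
                ICU1 = c(0, 1, 2),  ICU2 = c(2, 2, 2))

# Hypothetical helper: sum the items named <prefix>1 ... <prefix>n.
# The data frame is an explicit argument, so it cannot silently
# refer to a different object such as `data`.
sum_items <- function(df, prefix, n) {
  cols <- paste0(prefix, seq_len(n))
  rowSums(df[, cols, drop = FALSE], na.rm = TRUE)
}

d$CNITot <- sum_items(d, "CNI", 2)
d$ICUTot <- sum_items(d, "ICU", 2)
print(d[, c("CNITot", "ICUTot")])
```

With the real data the calls would be sum_items(d, "CNI", 16), sum_items(d, "ICU", 24) and sum_items(d, "BES", 20).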
If d is the gist provided by technocrat, the code otherwise runs, and the warnings are not the type you originally posted about. They are of the form:

"Less than 1% of cases were assigned to one of the profiles. Interpret this solution with caution and consider other models."