 # Find the regression and predicted values by grouping the categories

I have a data frame `df`. Is there a way to find the predicted values across groups? For example, for the data frame below, I have found the regression equation:

``````
library(lme4) # provides lmList()

df <- structure(list(colA = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), colB = c(48, 34, 56, 34, 56, 78), colC = c(45,
67, 87, 45, 34, 56)), class = "data.frame", row.names = c(NA,
-6L))
fit <- lmList(colB ~ colC | colA, data = df) # to find regression equation
``````

Now, is there a way to find the predicted values and insert them as a new column (Predicted colB), like below?

``````
df
  colA colB colC Predicted colB
1    A   48   45
2    A   34   67
3    A   56   87
4    B   34   45
5    B   56   34
6    B   78   56
``````

Predicted colB should hold the predicted value of colB based on colC. For example, in the first row, when colC is 45, what is colB?

You could use `predict()` to get a vector of the predicted values. For example:

``````
df$predicted_colB <- predict(fit)

# or with dplyr
library(dplyr)

df %>%
  mutate(predicted_colB = predict(fit))
``````

Note that if the data frame were a new one (not the same data the model was fitted on), you would have to use `predict(fit, newdata = df)`.
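To make the idea of predicting on new data concrete, here is a minimal base R sketch that does not use `lmList()` at all: it fits one `lm()` per level of `colA` and then predicts each new row with the model for its group. The data frame `new_df` and its values are made up for illustration, and this is an alternative way to get per-group predictions, not a description of what `lmList()` does internally.

```r
# Example data from the question
df <- data.frame(
  colA = factor(c("A", "A", "A", "B", "B", "B")),
  colB = c(48, 34, 56, 34, 56, 78),
  colC = c(45, 67, 87, 45, 34, 56)
)

# Fit one simple linear regression per level of colA
fits <- lapply(split(df, df$colA), function(d) lm(colB ~ colC, data = d))

# Hypothetical new observations (values invented for this sketch)
new_df <- data.frame(colA = factor(c("A", "B")), colC = c(50, 50))

# Predict each row with the model that belongs to its group
new_df$predicted_colB <- vapply(
  seq_len(nrow(new_df)),
  function(i) {
    model <- fits[[as.character(new_df$colA[i])]]
    unname(predict(model, newdata = new_df[i, , drop = FALSE]))
  },
  numeric(1)
)

new_df
```

Each prediction here is simply the group's intercept plus the group's slope times `colC`, which is a quick way to sanity-check the output by hand.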

Thanks, it is working. But I have a dataset with 290 rows, and when I use your formula I get the error below. Can you guide me?

``````
df$predicted_low <- predict(fit)

Error in names(val) <- rep(namVal, ngrps) :
  'names' attribute must be the same length as the vector
``````

My guess is that there's at least one missing value in either `colB` or `colC` (or both). It looks like `predict()` on lme4 list objects can't handle that.
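A quick way to test that guess is to count the missing values in the two model columns. The sketch below uses a small made-up data frame with `NA`s planted in it; on the real 290-row data, the same two calls would show whether any rows are incomplete.

```r
# Made-up data with one NA in colB and one NA in colC
df <- data.frame(
  colA = factor(c("A", "A", "A", "B", "B", "B")),
  colB = c(48, NA, 56, 34, 56, 78),
  colC = c(45, 67, 87, 45, NA, 56)
)

# Number of missing values per model column
na_counts <- colSums(is.na(df[, c("colB", "colC")]))
na_counts

# Row numbers that have a missing value in either column
incomplete_rows <- which(!complete.cases(df[, c("colB", "colC")]))
incomplete_rows
```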

Try filtering the data before fitting it:

``````
library(dplyr)
library(lme4)

# Drop rows with a missing value in either model column
df_filtered <- df %>%
  filter(!is.na(colB), !is.na(colC))

fit <- lmList(colB ~ colC | colA, data = df_filtered)

df_filtered$predicted_colB <- predict(fit)
``````
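If dropping rows is not desirable, one possible variation is to keep every row and leave the prediction as `NA` wherever the inputs are incomplete. The sketch below does this in base R with one `lm()` per group rather than `lmList()`, so it is an illustration of the idea under that assumption, not the method used above. The extra row with a missing `colB` is invented for the example.

```r
# Question's data plus one invented row with a missing colB
df <- data.frame(
  colA = factor(c("A", "A", "A", "A", "B", "B", "B")),
  colB = c(48, 34, 56, NA, 34, 56, 78),
  colC = c(45, 67, 87, 50, 45, 34, 56)
)

# Rows that are usable for fitting
ok <- complete.cases(df[, c("colB", "colC")])

# Start with NA everywhere, then fill in group by group
df$predicted_colB <- NA_real_
for (g in levels(df$colA)) {
  rows <- ok & df$colA == g
  model <- lm(colB ~ colC, data = df[rows, ])
  # fitted() returns values in the row order of df[rows, ]
  df$predicted_colB[rows] <- fitted(model)
}

df
```

This keeps the original row count, which makes it easier to join the predictions back to other columns later.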

Hi,

What does the vertical line `|` mean here:

