Hi All,
I have two lists of tibbles (let's call them df1 and df2). df1 is the original tibble, while df2 holds an imputation of one column (call it y in df1 and y2 in df2) produced with recipes::step_knnimpute.
As there is information in df1 that I need to keep, my preference would be to replace that column in df1 with the one from df2, or at least add the imputed column to df1. For the first option, there does not appear to be a shortcut for moving the imputed data into df1 after checking the imputation. So I created df2 and checked that the non-imputed values match between y in df1 and y2 in df2. I therefore know the data matches row by row, and when I bind a single tibble with the column of df2 like this: bind_cols(list(df,df2[5]))
everything is fine. However, as these are lists of tibbles I need to map over them, so I use the following code:
df_list <- pmap(list(df1, df2), function(first, second) {
  bind_cols(list(first, second[[1]]))
})
After doing this, y2 is bound to df1 but its name has been changed to "...15".
My first question is: how can I prevent the ...# name from being added by bind_cols, and why does this happen when the column names are different? My second question is: is there a more efficient way of transferring the imputed data into the original data frame, rather than creating df2 and binding the desired column?
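The renaming can be reproduced in a minimal sketch. It assumes dplyr >= 1.0 and purrr, and uses small made-up tibbles (orig_list and imp_list are hypothetical stand-ins for df1 and df2, not the real data). The two calls differ only in whether the imputed column is extracted with `[[1]]` or `[1]`.

```r
library(dplyr)
library(purrr)

# Hypothetical toy stand-ins for the two lists of tibbles.
orig_list <- list(tibble(id = 1:3, y = c(10L, NA, 30L)))
imp_list  <- list(tibble(y2 = c(10L, 20L, 30L)))

# `[[1]]` extracts the column as a bare, unnamed vector, so bind_cols()
# has to invent a positional name for it.
lost <- map2(orig_list, imp_list, function(first, second) {
  bind_cols(first, second[[1]])
})

# `[1]` keeps a one-column tibble, so the column name travels with it.
kept <- map2(orig_list, imp_list, function(first, second) {
  bind_cols(first, second[1])
})

names(lost[[1]])
names(kept[[1]])
```

If this sketch matches the real situation, the single-tibble call works because df2[5] is still a tibble, while second[[1]] in the mapped version is a plain vector with no name.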
df <- structure(list(Item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("ITEM76222", "ITEM78454"), class = "factor"),
Promotion = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("0", "1"), class = "factor"), ds = structure(c(1546300800,
1546387200, 1546473600, 1546560000, 1546646400, 1546732800,
1546819200, 1546905600, 1546992000, 1547078400), tzone = "UTC", class = c("POSIXct",
"POSIXt")), rowname = c("1", "2", "3", "4", "5", "6", "7",
"8", "9", "10"), y = c(374L, NA, 447L, 403L, 409L, 554L,
409L, 469L, 556L, 585L)), row.names = c(NA, -10L), index_quo = ~ds, index_time_zone = "UTC", class = c("tbl_time",
"tbl_df", "tbl", "data.frame"))
df2 <- structure(list(y2 = c(374L, 450L, 447L, 403L, 409L, 554L, 409L,
469L, 556L, 585L)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
This is how I created the imputed data df2:
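# Impute missing values of y for one tibble and return only the imputed column.
immpute_anomalies <- function(ti) {
  ti %>%
    recipes::recipe(y ~ Item + Promotion + season + trend) %>%
    # k-nearest-neighbour imputation of y; in newer recipes releases
    # this step is named step_impute_knn()
    recipes::step_knnimpute(y) %>%
    recipes::prep() %>%
    recipes::juice() %>%
    dplyr::rename(y2 = "y") %>%
    dplyr::select(y2)
}

# Apply the imputation to every tibble in the list.
list_ts_imps <- purrr::map(list_ts_anom, immpute_anomalies)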
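Regarding the second question, one possible alternative to binding can be sketched with purrr::map2 and dplyr::mutate. This is only a sketch: it assumes both lists have the same length and that rows are in the same order within each pair, and the toy tibbles below (orig_list, imp_list) are hypothetical stand-ins for the real data.

```r
library(dplyr)
library(purrr)

# Hypothetical toy stand-ins for the original and imputed lists.
orig_list <- list(tibble(id = 1:3, y = c(10L, NA, 30L)))
imp_list  <- list(tibble(y2 = c(10L, 20L, 30L)))

# Option 1: add the imputed values as a new, explicitly named column.
added <- map2(orig_list, imp_list, function(first, second) {
  mutate(first, y2 = second$y2)
})

# Option 2: keep the observed y and fill only its missing values
# from the imputed column.
filled <- map2(orig_list, imp_list, function(first, second) {
  mutate(first, y = coalesce(first$y, second$y2))
})

added[[1]]
filled[[1]]
```

Because the column is named explicitly inside mutate(), no name repair is involved in either option.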
Thank you for any aid in helping me to understand this, your efforts are truly appreciated.