Hi All,
I was running row-wise two-proportion z-tests and wanted to join the results back onto the tibble they were built from. I left join the two together (in the same pipe) and then relocate the result columns to the other side of the data frame, as in the code below:
results_df %>%
  dplyr::left_join(dat_df, by = c("formula" = "row_number")) %>%
  dplyr::relocate(6:length(results_df), after = last_col())
The problem is that relocate() scrambles the order of the columns (the left join does not; I tested that), even though I only want to move the first six columns and keep the order of the others.
Here is the order of the column names before creating the prop.tests:
[1] "day" "ITEM76222_0" "ITEM76222_1" "ITEM78454_0" "ITEM78454_1" "LOG76222_0"
[7] "LOG76222_1" "LOG78454_0" "LOG78454_1" "can" "canlog" "tot_sales_np"
[13] "tot_sales_pr1" "tot_sales_pr2" "log_tot_sales_np" "log_tot_sales_pr1" "log_tot_sales_pr2" "row_number"
After the prop.tests (following the left join), the order is maintained:
[1] "row_number" "prop_NL_NP1" "prop_NH_NP1" "base_chi" "base_pval1" "day"
[7] "ITEM76222_0" "ITEM76222_1" "ITEM78454_0" "ITEM78454_1" "LOG76222_0" "LOG76222_1"
[13] "LOG78454_0" "LOG78454_1" "can" "canlog" "tot_sales_np" "tot_sales_pr1"
[19] "tot_sales_pr2" "log_tot_sales_np" "log_tot_sales_pr1" "log_tot_sales_pr2"
But in the output below, after relocate(), the order is broken after the column tot_sales_pr1: where tot_sales_pr2 should be, there is the log column log_tot_sales_pr2 instead.
[1] "day" "ITEM76222_0" "ITEM76222_1" "ITEM78454_0" "ITEM78454_1" "LOG76222_0"
[7] "LOG76222_1" "LOG78454_0" "LOG78454_1" "can" "canlog" "tot_sales_np"
[13] "tot_sales_pr1" "log_tot_sales_pr2" "row_number" "prop_NL_NP1" "prop_NH_NP1" "base_chi"
[19] "base_pval1" "tot_sales_pr2" "log_tot_sales_np" "log_tot_sales_pr1"
Why is this happening, and can it be avoided?
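One reading that seems consistent with the three listings above, offered as a sketch rather than a confirmed diagnosis: relocate() documents its placement arguments as `.before` and `.after` (with a leading dot), so `after = last_col()` may be treated as one more column to move, and with no destination given the whole selection goes to the front. The base R snippet below only checks whether that reading reproduces the observed order. The names `joined_cols`, `moved`, and `predicted` are made up for the illustration.

```r
# Column order after the left join, copied from the listing above.
joined_cols <- c(
  "row_number", "prop_NL_NP1", "prop_NH_NP1", "base_chi", "base_pval1", "day",
  "ITEM76222_0", "ITEM76222_1", "ITEM78454_0", "ITEM78454_1", "LOG76222_0", "LOG76222_1",
  "LOG78454_0", "LOG78454_1", "can", "canlog", "tot_sales_np", "tot_sales_pr1",
  "tot_sales_pr2", "log_tot_sales_np", "log_tot_sales_pr1", "log_tot_sales_pr2"
)

# Assumption: positions 6:18 (the pre-join data frame has 18 columns) plus
# the last column are what actually got selected, and they went to the front.
moved <- joined_cols[c(6:18, length(joined_cols))]

# Everything that was not selected keeps its relative order behind them.
predicted <- c(moved, setdiff(joined_cols, moved))
print(predicted)
```

The predicted order matches the post-relocate listing above, including log_tot_sales_pr2 landing right after tot_sales_pr1.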
I would prefer to do this in one pipe, though I suppose that is not strictly necessary. I have added the original code so you can see what I am doing, in case it helps.
df_list2 <- purrr::pmap(list(df_list, NP_conditions, PR__conditions, item_names), function(first, second, third, fourth) {
  # Only run the tests when the conditions match; other list elements return NULL.
  if (second == 2 & third == 2 | fourth[[2]] == fourth[[3]])
    # One two-proportion test per row: x from columns 2 and 4, n from column 12.
    purrr::pmap(first, ~ {prop.test(x = c(..2, ..4), n = c(..12, ..12), correct = FALSE)}) %>%
      purrr::map_df(broom::tidy, .id = 'formula') %>%
      dplyr::mutate(formula = as.integer(formula)) %>%
      dplyr::select(-one_of(c('parameter', 'conf.low', 'conf.high', 'method', 'alternative'))) %>%
      dplyr::rename(prop_NL_NP1 = estimate1, prop_NH_NP1 = estimate2,
                    base_chi = statistic, base_pval1 = p.value) %>%
      # Join the test results back onto the data they were computed from.
      dplyr::left_join(first, by = c("formula" = "row_number")) %>%
      dplyr::rename(row_number = formula) %>%
      dplyr::relocate(6:length(first), after = dplyr::last_col())
})
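Since relocate() documents its destination arguments as `.before` and `.after`, one thing that may be worth trying is to select the result columns by name and pass the destination as `.after`. This is a minimal sketch on made-up data, not a tested fix for the pipe above. The tibble `toy` and the vector `result_cols` are hypothetical stand-ins for the joined data and the result column names.

```r
library(dplyr)

# Hypothetical stand-in for the joined tibble: result columns first,
# then the original data columns.
toy <- tibble(
  row_number    = 1:3,
  prop_NL_NP1   = c(0.1, 0.2, 0.3),
  base_pval1    = c(0.5, 0.6, 0.7),
  day           = 1:3,
  tot_sales_pr1 = c(10, 20, 30),
  tot_sales_pr2 = c(11, 21, 31)
)

# Selecting by name avoids depending on how many columns the join produced.
result_cols <- c("row_number", "prop_NL_NP1", "base_pval1")

# Move only the result columns to the end; the rest keep their order.
reordered <- toy %>%
  relocate(all_of(result_cols), .after = last_col())

print(names(reordered))
```

Naming the columns rather than using a range like 6:length(first) also sidesteps the fact that the joined tibble is wider than `first`, so positions computed from `first` do not line up with it.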
A simple reprex of one data frame in df_list:
library(dplyr)  # for %>% and mutate()
dat <- data.frame(Date = sample(seq(as.Date('2019/01/01'), as.Date('2020/01/01'), by = "day"), 100),
                  item_A1 = rnorm(n = 100, mean = 1000, sd = 2),
                  item_B1 = rnorm(n = 100, mean = 500, sd = 2),
                  item_B2 = rnorm(n = 100, mean = 700, sd = 2)) %>%
  mutate(tot1 = item_A1 + item_B1, tot2 = item_A1 + item_B2)
I hope that is enough information. Please let me know if you need more, or if anything is unclear.