Merge in for loop

I would like to merge two data set using for loop for varilist in r but not happen in
assign(paste0("meg_",i),merge(paste0("bc_",i,sep=""),paste0("mr_",i,sep=""), by="caseid"))
Plese suggest me

bc <- read_dta("bc_r2.dta")
mr<- read_dta("master_r1.dta")

r<-c("c6_gen_minc", "c_num_hh")

for(i in seq_along(r)){
  assign(paste0("bc_",i),select(bc,r[[i]],caseid))
  assign(paste0("mr_",i),select(mr,r[[i]],caseid))
  assign(paste0("meg_",i),merge(paste0("bc_",i,sep=""),paste0("mr_",i,sep=""), by="caseid"))
}

try something like -

library(tidyverse)
new_df <- left_join(bc, mr, by = "caseid")

Or maybe -

new_df <- full_join(bc, mr, by = "caseid")

see also combine tables section in - https://nyu-cdsc.github.io/learningr/assets/data-transformation.pdf

In dataset bc and mr, I would like to merge only specific variables (not the whole dataset) so I save the specific variable in r and then merge them in the loop. It is the sample as it has a lot of specific variables to be merged. Merging is not the issue, but how to merge specific variables ( multiple same work). So I try to merge these specific pairs of variables into for loop.

I'd suggest you try the tidyverse approach that @jasvinderahuja suggested. You could select only the required columns from the bc & mr dataframes.

To get close to the same output as you were attempting, you could something like this:

library(tidyverse)
r<-c("c6_gen_minc", "c_num_hh", "caseid")
bc <- read_dta("bc_r2.dta") %>% 
                    select(r)  %>%
                    setNames(paste0('bc_', names(.))
mr<- read_dta("master_r1.dta") %>% 
                    select(r)  %>%
                    setNames(paste0('mr_', names(.))

meg<-full_join(bc, mr, by = c('bc_caseid' = 'mr_caseid'))  %>%
                     setNames(paste0('meg_', names(.))) %>%
                     rename(caseid = meg_bc_caseid)

It'd be much nicer without all the renaming & if the output were 'tidy' which could be done a bit like this (not checked this, so please excuse any stupid errors):

library(tidyverse)
r<-c("c6_gen_minc", "c_num_hh", "caseid")
mr<- read_dta("master_r1.dta") %>% 
                select(r) %>% 
                mutate(src = 'mr')
bc <- read_dta("bc_r2.dta") %>% 
                select(r) %>% 
                mutate(src = 'bc') %>%
                filter(caseid %in% mr$caseid)

meg<-bind_rows(bc, filter(mr, caseid %in% bc$caseid))

you could then use tidyr::pivot_wider if you needed the wider form.

Hope that helps & I didn't make too many stupid mistakes!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.