I have 9 sheets in single excel workbook ( have 2 work books, first is originally crated other is reviewed) I have to compare these sheets with same reviewed workbook in which my reviewer made changes.
I am using below codes for reading the entire workbook and transferring into the individual data frame as sheet in the workbook
# for reading sheets from the reviewed file
workbook_new <- lapply(ws_name_new, function(x) read_excel(path = path_new, sheet = x, skip = 2, na = ""))
workbook_new <- lapply(workbook_new, function(x) x[-2,])
sheet_list_new <- workbook_new[sapply(workbook_new, function(x) dim(x)[1]) > 0]
# for reading sheets from the originally created file
workbook_old <- lapply(ws_name_old, function(x) read_excel(path = path_old, sheet = x, skip = 2, na = ""))
workbook_old <- lapply(workbook_old, function(x) x[-2,])
sheet_list_old <- workbook_old[sapply(workbook_old, function(x) dim(x)[1]) > 0]
# Below code I am using for the removing the not required sheets
# for new Workbook
ws_name_new <- excel_sheets(path = path_new)
ws_name_new <- ws_name_new[-grep("*_Cal", ws_name_new)]
ws_name_new <- ws_name_new[-grep("*_cal", ws_name_new)]
ws_name_new <- ws_name_new[-gre`enter code here`p("*Summary", ws_name_new)]
ws_name_new <- ws_name_new[-c(1, 9, 16, 17, 18, 19, 20, 21, 22,23,24,25, 26, 27)]
# For Old file
ws_name_old <- excel_sheets(path = path_old)
ws_name_old <- ws_name_old[-grep("*_Cal", ws_name_old)]
ws_name_old <- ws_name_old[-grep("*_cal", ws_name_old)]
ws_name_old <- ws_name_old[-grep("*Summary", ws_name_old)]
ws_name_old <- ws_name_old[-c(1, 9, 16, 17, 18, 19, 20, 21, 22,23,24,25, 26, 27)]
the approach I am following is first I am importing the every single sheet from the excel workbook and creating two data frames. df_1_new from the reviewed file and df_1_old from the originally created. Now I want to compare df_1_new with df_1_old and get the final difference file.
I have tried compare_df() function but its giving me two rows in difference file one the original row other is the row which is modified. I need only those rows which are changed by the reviewer.
I have 9 data frame for each of the work book so total is 18 which I need to compare and get the difference file. consider the given wxaple
Example:
Suppose I have below two data framed. df_1 is which I have created and DF_2 is my reviewer corrected. Now because the file is very huge and I have more then 100 column I can not check them one by one. I looking of the solution if both the data frame can be compare and the final out put I get is the columns and rows in which changes has been made
key <- c(A1, B1, C1, D1)
a <- c(10,20,30,40)
b <- c('book', 'pen', 'textbook', 'pencil_case')
c <- c(TRUE,FALSE,TRUE,FALSE)
d <- c(2.5, 8, 10, 7)
df_1_New <- data.frame(key,a,b,c,d)
key <- c(A1, B1, C1, D1)
a <- c(10,5,40,40)
b <- c('book', 'pen', 'textbook', 'pen_case')
c <- c(TRUE,FALSE,TRUE,TRUE)
d <- c(2.5, 3, 10, 7)
df_1_Old <- data.frame(key, a,b,c,d)
ctable <- compare_df(df_1_New, df_1_Old, c("Key"), keep_unchanged_cols = FALSE)
DF_1_Diff <- ctable$comparison_df
DF_Diff<- DF_1_Diff %>% filter(chng_type == '+') %>% select(-chng_type)
I want to filter out the changes only so I am applying above code and dropping the 'cnge_type' col as well. I also wanted drop last lets say 10 row from the df I can do that with use of dplyr in the same so that I can make me code more efficient.
NOTE : I have to perform same task other 8 also and the name of the data frame is following the as df_2_new, df_2_old, df_2_new, df_3_old and so on and so forth.
Can any one suggest me or guide me the efficient way.