readxl file.xlsx rbind

Hi

I´m trying to read all the files with the below code. Because the two first rows have information that diferenciate the tables i creat a for but It doesn´t work because it does not cumulate the info.

Structure of data frame

Title table
type of data
A B C D
1 1 1 1
2 2 2 2

l<-list.files("./aranceles")

for (i in l) {
  x1<- read_excel(paste0("./aranceles/",l[i]), range = "A1:A3", col_names = TRUE)

  a<- as.character(x1[1,1])
  b<- as.character(x1[2,1])
  
  x2<- read_excel(paste0("./aranceles/",l[i]), skip=3, col_names = TRUE)%>%
    mutate(product= a, import= b)
  
  finaldf<- rbind(X2)
  }

Welcome @walter1 in the RStudio Community.
I am not sure what is exactly the problem you are meeting and therefore the code below is maybe only a starting point for a solution. I hope it will help.

# create subfolder and (two) testfiles
fs::dir_create("./aranceles")
df1= data.frame(
  A=c(1,2,3,4),
  B=c(1,2,3,4),
  C=c(1,2,3,4),
  D=c(1,2,3,4) 
)
df2= data.frame(
  A=c(4,3,1,1),
  B=c(4,3,1,1),
  C=c(4,3,1,1),
  D=c(4,3,1,1)
)
write.csv(df1,'./aranceles/myfile1.txt',row.names = FALSE)
write.csv(df2,'./aranceles/mijnbestand.txt',row.names = FALSE)

# suggested solution for your code
# working with csv files and NOT xls files

l<-list.files("./aranceles")

finaldf = data.frame()

for (fn in l) {
  x1<- read.csv(paste0("./aranceles/",fn))
  finaldf <- rbind(finaldf,x1)
  }

print(finaldf)
  A B C D
1 4 4 4 4
2 3 3 3 3
3 1 1 1 1
4 1 1 1 1
5 1 1 1 1
6 2 2 2 2
7 3 3 3 3
8 4 4 4 4

The main problem is that all the files are in xlsx so I can't use the normal code for csv. But what I´m trying to do is to read all the files and use rbind to bind.

l<-list.files("./aranceles")

for (i in l) {
x<- read_excel(paste0("./aranceles/",l[i]), skip=3, col_names = TRUE)

finaldata<- rbind(x)
}

When you do finaldata <- rbind(x) you are replacing whatever was in finaldata with x. So HanOostdijk is suggesting is to use finaldata <- rbind(finaldata, x1) so that your finaldata grows at each iteration (you append the new lines to the existing object instead of replacing).

You can also get a similar result by replacing your for loop with a call to map_dfr(), which automatically does the binding step:

my_read_function <- function(file_name){
  read_excel(paste0("./aranceles/", file_name), skip=3, col_names = TRUE)
}

l <- list.files("./aranceles")

map_dfr(l, my_read_function)

A third possibility is to temporarily store your files in a list and at the end gather them together with bind_rows():

l <- list.files("./aranceles")
my_files_contents <- list(length = length(l))
for(ff in 1:length(l)){
  my_files_contents[[ff]] <- read_excel(paste0("./aranceles/",l[ff]), skip=3, col_names = TRUE)
}

all_files_contents <- bind_rows(my_files_contents)

In all cases, you must make sure that you're not erasing your previous file whenever you read in a new one.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.