How to bind the files in the filelist

Hi everyone!
I have time series data stored as txt files in a sub-folder under a larger folder. The directory structure is as follows:
C:/.../.../Astation/2018/L1_0101_0601_A.txt
C:/.../.../Astation/2018/L1_0101_0603_A.txt
C:/.../.../Astation/2018/L1_0101_0605_B.txt
.
.
and so on.

First, I prepared the files for the analysis as follows:

library(stringr)
library(readr)
parent.folder <- "C:/.../.../AStation/2018"
sub.folders <- list.dirs(parent.folder, recursive = TRUE)[-1]
r.scripts <- file.path(sub.folders)
AStation_2018 <- list()

for (j in seq_along(r.scripts)) {
  AStation_2018[[j]] <- dir(r.scripts[j], "\\.txt$")
}

Up to here, everything works. Next, I would like to build a data frame with the file name in one column and the values from the txt file in another column.

To do that, I tried to row-bind the files as follows:

for (i in 1:length(AStation_2018)) {
  for (j in 1:length(AStation_2018[[i]])) {
    files_com <- dir(".", pattern = ".txt$")

    AStation_2018_com <- rbindlist(sapply(AStation_2018[i], fread, simplify = FALSE), idcol = "file")
  }
}

OR

for (i in 1:length(AStation_2018)) {
  #print(i)
  #print(AStation_2018[[i]])
  #print(AStation_2018[[i]][1])

  for (j in 1:length(AStation_2018[[i]])){
    #print(length(AStation_2018[[i]]))
    #print(AStation_2018[[i]][j])
    #filename <- paste(str_sub(AStation_2018[[i]][j], 1, 14), j, sep= "_")
    filename <- paste(str_sub(AStation_2018[[i]][j], 1, 14))
    #print(filename)

    complete_file_name = paste(r.scripts[i],'/', AStation_2018[[i]][j], sep="" )
    #print(complete_file_name)

    assign(filename, read_tsv(complete_file_name, skip = 14, col_names = FALSE))
  }
} 

But I couldn't get this code to run. Is there a problem with it?

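The approach below works for me.
I did not check whether your code works once you adjust the pattern:
the dot has a special meaning in regular expressions (it matches any single character), so ".txt$" should be written as "\\.txt$" to match a literal dot.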

library(purrr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

df1 <- data.frame(
  id = letters[1:4],
  f1 = 1:4,
  f2 = (1:4)^2
)
df2 <- data.frame(
  id = letters[3+1:4],
  f1 = 3+1:4,
  f2 = (3+1:4)^2
)

dir.create("./nchan08sub")

write.csv(df1,"./nchan08sub/file1.txt")
write.csv(df2,"./nchan08sub/file2.txt")
# this is comparable to your situation 

myfiles <- list.files(path="./nchan08sub",
                      pattern="\\.txt$",
                      full.names = TRUE) 

purrr::map_dfr(myfiles,
                  function(f) read.csv(f))
#>   X id f1 f2
#> 1 1  a  1  1
#> 2 2  b  2  4
#> 3 3  c  3  9
#> 4 4  d  4 16
#> 5 1  d  4 16
#> 6 2  e  5 25
#> 7 3  f  6 36
#> 8 4  g  7 49
Created on 2023-01-18 with reprex v2.0.2
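
As a side note on the pattern remark above, here is a small sketch of how the unescaped and escaped dot differ. The file names are made up for illustration; only the first one actually ends in ".txt".

```r
# Hypothetical file names; only the first has a real ".txt" extension
file_names <- c("L1_0101_0601_A.txt", "notes_txt", "summarytxt")

# Unescaped dot: "." matches any single character in front of "txt"
loose <- grepl(".txt$", file_names)

# Escaped dot: only names ending in a literal ".txt" match
strict <- grepl("\\.txt$", file_names)

print(data.frame(file_names, loose, strict))
```

With the unescaped pattern all three names match, which is why "\\.txt$" is the safer choice when a folder may contain other files.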

You can read them in directly:

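library(data.table)

# escape the dot so only names ending in a literal ".txt" match
files_com <- dir(".", pattern = "\\.txt$", full.names = TRUE)

# readr (>= 2.0) accepts a vector of paths; id adds a column with the source file
combined <- readr::read_csv(files_com, id = "file_name")
# or using fread like in the original post; naming the list lets idcol record the file
combined <- rbindlist(lapply(setNames(files_com, files_com), fread), idcol = "file_name")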

When I run the steps separately, it works. But inside a loop it does not, and I don't know what is wrong with my code.

I tried your code on its own and it works. But inside a loop it does not work properly.

The code is not supposed to run in a loop. The functions are vectorised.

Try running them as is or adapt them for your file/directory structure.
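
Adapting this to a parent folder with sub-folders might look like the sketch below. It uses base R only and builds a small temporary folder so that it can be run as is. The folder name, the file contents, and the n_skip parameter are assumptions made for illustration; the original post used skip = 14 on tab-separated files without a header row.

```r
# Build a throwaway folder structure (made-up data, for illustration only)
parent_folder <- file.path(tempdir(), "AStation_demo")
dir.create(file.path(parent_folder, "sub1"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(parent_folder, "sub2"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("1\t10", "2\t20"), file.path(parent_folder, "sub1", "L1_0101_0601_A.txt"))
writeLines("3\t30", file.path(parent_folder, "sub2", "L1_0101_0603_A.txt"))

# recursive = TRUE descends into the sub-folders, so no outer loop is needed
txt_files <- list.files(parent_folder, pattern = "\\.txt$",
                        recursive = TRUE, full.names = TRUE)

# Read one file and tag every row with the file it came from.
# n_skip is a hypothetical parameter for the number of header lines to drop.
read_one <- function(path, n_skip = 0) {
  dat <- read.delim(path, header = FALSE, skip = n_skip)
  data.frame(file = basename(path), dat)
}

# lapply applies read_one to every path; rbind stacks the results
combined <- do.call(rbind, lapply(txt_files, read_one))
print(combined)
```

For the real data, parent_folder would point at the actual directory and n_skip = 14 could be passed through lapply(txt_files, read_one, n_skip = 14), assuming every file has the same 14 header lines.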
