Multiple .txt files to one data frame in R

I have a folder with 44 .txt files (.../data). I want to collect the data contained in all those 44 files in just one data frame. This is what I have done:

library(tidyverse)  # provides %>%, set_names(), map_df() and read_table()

list_of_files <- list.files(path = "/Users/setegonz/MEGAsync/ProjetoUFABC-master/data", recursive = TRUE, pattern = "\\.txt$", full.names = TRUE)
df <- list_of_files %>%
  set_names(.) %>%
  map_df(., read_table, .id = "FileName")

This is my output:

I would like to achieve two things:

  1. To separate all the variables that are grouped in the second column into individual columns.
  2. To shorten the identification name to something like "sub_01", "sub_02", and so on.

I would like to achieve this using a tidyverse approach.


Could you please turn this into a self-contained REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help. You can also include some example data using the datapasta package.

I'm having problems trying to use this format. I think that my list is not supported.

What error message do you get when you do datapasta::df_paste(head(df))?

Also, have you tried specifying the separator character?

map_df(., read_table, sep = ",", .id = "FileName")

After using:

map_df(., read_table, sep = ",", .id = "FileName")

The output changed and I managed to use the datapasta approach, so here it goes:

mydata <- tibble::tribble(
  ~FileName,                                                                                                                                                                                               ~X.,
  "/Users/setegonz/MEGAsync/ProjetoUFABC-master/data/sub_01.txt", "Trial,nBack,Valence,Image,imagePresentationTime,NoisePresentationTime,x,StimDuration,NoiseDuration,RT,Accuracy,experiment,Clock_1,Clock_2,dataOneMinPress_1,dataOneMinPress_2,dataOneMinPress_3",
  "/Users/setegonz/MEGAsync/ProjetoUFABC-master/data/sub_01.txt",                                                                             "1,1,0,4530.jpg,0.0141170582301129,0.684714303524743,2.03743156177552,0.67059724529463,1.35271725825078,NaN,1,0,,,,,",
  "/Users/setegonz/MEGAsync/ProjetoUFABC-master/data/sub_01.txt",                                                                     "2,1,0,4530.jpg,2.0492009591087,2.6609536327162,5.26096588873156,0.611752673607498,2.60001225601536,1.3177634643348,1,0,,,,,",
  "/Users/setegonz/MEGAsync/ProjetoUFABC-master/data/sub_01.txt",                                                                               "3,1,0,4000.jpg,5.27273649355379,5.88450426077657,8.88452026000869,0.611767767222773,3.00001599923212,NaN,1,0,,,,,",
  "/Users/setegonz/MEGAsync/ProjetoUFABC-master/data/sub_01.txt",                                                                                 "4,1,0,6314.jpg,8.89628452551256,9.50805833018126,12.50807915937,0.611773804668701,3.00002082918877,NaN,1,0,,,,,"
)

As you can see, all my variables are together in the same column. Then I tried to separate them with:

mydataseparation <- separate(mydata, c(2), ",")

But separate() doesn't split my variables.

separate() works, but it is not an ideal solution; you should get separate variables directly from read_table().

mydata %>% 
    separate(X., into = c('Trial','nBack','Valence','Image','imagePresentationTime','NoisePresentationTime','x','StimDuration','NoiseDuration','RT','Accuracy','experiment','Clock_1','Clock_2','dataOneMinPress_1','dataOneMinPress_2','dataOneMinPress_3'),
             sep = ",")
#> # A tibble: 5 x 18
#>   FileName Trial nBack Valence Image imagePresentati~ NoisePresentati~
#>   <chr>    <chr> <chr> <chr>   <chr> <chr>            <chr>           
#> 1 /Users/~ Trial nBack Valence Image imagePresentati~ NoisePresentati~
#> 2 /Users/~ 1     1     0       4530~ 0.0141170582301~ 0.6847143035247~
#> 3 /Users/~ 2     1     0       4530~ 2.0492009591087  2.6609536327162 
#> 4 /Users/~ 3     1     0       4000~ 5.27273649355379 5.88450426077657
#> 5 /Users/~ 4     1     0       6314~ 8.89628452551256 9.50805833018126
#> # ... with 11 more variables: x <chr>, StimDuration <chr>,
#> #   NoiseDuration <chr>, RT <chr>, Accuracy <chr>, experiment <chr>,
#> #   Clock_1 <chr>, Clock_2 <chr>, dataOneMinPress_1 <chr>,
#> #   dataOneMinPress_2 <chr>, dataOneMinPress_3 <chr>

Created on 2019-02-12 by the reprex package (v0.2.1)
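Note that separate() leaves the header line as the first data row and every column as character. A minimal follow-up sketch, assuming a cut-down three-column version of the data (the shortened file name and the column subset are illustrative only, not the real files):

```r
library(dplyr)
library(tidyr)

# Hypothetical, shortened stand-in for the pasted data above
mydata <- tibble::tribble(
  ~FileName,    ~X.,
  "sub_01.txt", "Trial,nBack,RT",
  "sub_01.txt", "1,1,NaN",
  "sub_01.txt", "2,1,1.3177634643348"
)

cleaned <- mydata %>%
  separate(X., into = c("Trial", "nBack", "RT"), sep = ",") %>%
  # Drop the header line that was read in as a data row
  filter(Trial != "Trial") %>%
  # Convert the character columns to their intended types
  mutate(across(c(Trial, nBack), as.integer),
         RT = as.numeric(RT))
```

This is only a clean-up after the fact; reading the files with the right delimiter avoids the problem in the first place.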

Try read.csv to get separate variables from the start:

df <- list_of_files %>%
    set_names() %>%
    map_df(read.csv, sep = ",", .id = "FileName")
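The shorter identifiers asked for in the question ("sub_01", "sub_02", ...) could be produced by naming the file paths before mapping over them. A minimal sketch, assuming the files are comma-separated with a header row; the temporary folder and its two tiny files are hypothetical stand-ins for the real data, and readr::read_csv is swapped in for read.csv to stay within the tidyverse:

```r
library(purrr)
library(readr)

# Hypothetical stand-in for the real data folder: two small comma-separated files
data_dir <- file.path(tempdir(), "data_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("Trial,nBack,RT", "1,1,0.87", "2,1,1.31"), file.path(data_dir, "sub_01.txt"))
writeLines(c("Trial,nBack,RT", "1,2,0.95"), file.path(data_dir, "sub_02.txt"))

list_of_files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)

# Strip the directory and the extension: ".../sub_01.txt" becomes "sub_01"
short_ids <- tools::file_path_sans_ext(basename(list_of_files))

df <- list_of_files %>%
  set_names(short_ids) %>%                                  # names feed the .id column
  map_df(read_csv, col_types = cols(), .id = "FileName")    # cols() guesses types quietly
```

Because map_df() fills the .id column from the names of its input, whatever is passed to set_names() ends up in FileName.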

Thanks!! read.csv works great!!!
