Write csv for different files

I use lapply to filter all 81 CSV files in a folder (each file has a numeric column, and I filter it to remove duplicated numbers).
Now I want to export the results as 81 different CSV files. How can I do this?

Thanks in advance :slight_smile:

This is definitely doable and sounds like a job for purrr::walk(), but without more specifics about your files, it is hard to suggest code that would work.

I will try this Monday.
What would you like to know specifically about my files?


First, it would help if you told us exactly what you have done so far. You said you use lapply to filter?

So you:

  • read each file
  • modify the file
  • save the modified file

Correct? Do you save the intermediate results (in a list or a data frame), or do you just want to run this in one go? If the first two steps already work, it should be easy to add a write_csv() call.
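As a rough sketch of those three steps (the folder names, file name, and the POS column below are placeholders, since we don't yet know your actual file layout):

```r
library(dplyr)

# Self-contained demo: a temporary folder with one small csv2 file
# ("my_folder", "results", "demo.csv", and POS are placeholder names)
in_dir  <- file.path(tempdir(), "my_folder")
out_dir <- file.path(tempdir(), "results")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
write.csv2(data.frame(POS = c(1, 1, 2)), file.path(in_dir, "demo.csv"),
           row.names = FALSE)

files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)

for (f in files) {
  data <- read.csv2(f)                            # 1. read each file
  data <- distinct(data, POS, .keep_all = TRUE)   # 2. modify (drop duplicate POS)
  write.csv2(data, file.path(out_dir, basename(f)),
             row.names = FALSE)                   # 3. save the modified file
}
```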


My Mac is at work, so I will share exactly what I have done tomorrow.

I'm really a beginner, so there may be a simpler way to do this. I have a folder with 81 csv2 files, all with the same structure; in particular, there is a column called POS. Some rows in these files are duplicated, so I use the POS column as an argument to eliminate the duplicate rows (via lapply and distinct). So that the results are named after the file they come from, I use names and gsub.
I hope that is clear for you

This is what I've done:

library(dplyr)  # for distinct()

# pattern takes a regular expression, so match a literal ".csv" at the end
filenames <- list.files("/Users/my_name/my_folder", pattern = "\\.csv$", full.names = TRUE)
filenames2 <- gsub("/Users/my_name/my_folder/", "", filenames)
filename <- gsub("\\.csv$", "", filenames2)
ldf <- lapply(filenames, read.csv2)

res <- lapply(ldf, distinct, POS, .keep_all = TRUE)
names(res) <- filename

In these sorts of situations where you want to apply the same task iteratively, I find it is usually best to think about how you would do it for one item, create a function for this, and then it becomes much simpler to apply that function iteratively.

For example, we can create a function engine() that does everything you want: it loads the CSV, finds the distinct observations, and then saves the file back out. The function works on a single pair of values: the input CSV filename and the output CSV filename. Then walk2() is the function that applies it to each pair in turn.


library(fs)     # dir_ls(), path_file()
library(dplyr)  # distinct()
library(purrr)  # walk2()

inputs <- dir_ls("/Users/my_name/my_folder", glob = "*.csv")
outputs <- path_file(inputs)  # bare file names, so output goes to the working directory

engine <- function(input, output) {
  data <- read.csv2(input)
  data <- distinct(data, POS, .keep_all = TRUE)
  write.csv(data, output)
}

walk2(inputs, outputs, engine)

Just be careful not to overwrite the original files, unless that is your intent. I generally like to use full paths for both input and output, to be safe.
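For instance, one way to build full output paths with the fs package (the folders and demo file here are hypothetical stand-ins, created under tempdir() so the sketch is self-contained):

```r
library(fs)

# Hypothetical stand-ins for the real input and output folders
in_dir  <- path(tempdir(), "my_folder")
out_dir <- path(tempdir(), "results")
dir_create(c(in_dir, out_dir))
writeLines("POS;x\n1;a", path(in_dir, "demo.csv"))

inputs  <- dir_ls(in_dir, glob = "*.csv")
outputs <- path(out_dir, path_file(inputs))  # same file names, different folder
```

Because outputs now point at a separate folder, the originals cannot be overwritten.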


Thanks so much, it works very well. It writes to the working directory, so I just added a setwd() call to specify the folder the results should go in.
Maybe I should create a new topic, but if you have the solution it would be very helpful: in my CSVs, in the POS column, some values follow each other (for example 1; 4; 8; 9; 34; 39…), and I would like, if possible, to remove the rows whose values are consecutive (in my example, the rows with the values 8 and 9). Is that possible?

Yes, this is possible. Do you want to remove both lines if they are numbered sequentially, or just keep the first one (or the last one)?

Remove both lines...
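One possible sketch of removing both rows, assuming POS is sorted and using dplyr's lag() and lead() (the data frame below is made up to match the example values):

```r
library(dplyr)

# Example data matching the POS values above; 8 and 9 are consecutive
df <- data.frame(POS = c(1, 4, 8, 9, 34, 39))

# Drop a row if its POS is one more than the previous value OR one less
# than the next value, so both members of a consecutive pair are removed
res <- df %>%
  filter(POS - lag(POS, default = -Inf) != 1,
         lead(POS, default = Inf) - POS != 1)
```

On the example, this keeps POS values 1, 4, 34, and 39, dropping both 8 and 9.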

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.