Read and join multiple files


#1

Hi,

I have two kind of files (CSV & Excel). I have 100 csv files (file names start with "Colours") in the path (C:/My data"). These files are not having header, but all data are in same format and same order. So I need to read all these files and add 8-10 headers (i.e column names) and append all of them into 1 single data.

Then I have a excel file (monitered.xlsx) in another path (C:/My data1). I have to read the file and it has only one column, which needs to be vlookup (inner Join) with a column of the above single data (i.e appended csv data)

Then I have another excel file (Assignment.xlsx) in other path (C:/ My data2). I need to put vlookup (inner Join) with the above derived output to this excel file.

Can anyone kindly give me the R coding for the above requirement.


#2

Hopefully, this should get you started :slightly_smiling_face:

# Load libraries
library('tidyverse')
# List all files ending with csv in directory
csv_files = list.files(path = 'path/to/my/files', pattern = "csv$", full.names = TRUE)
# Read each csv file into a list
csv_list = lapply(csv_files, read_csv)
# Concatenate the data in each csv file into one combined data frame
my_data = do.call(rbind, csv_list)

#3

You could also combine the lapply and do.call functions into one using purrr::map_dfr():

my_data <- purrr::map_dfr(csv_files, read_csv)

#4

Thanks leon. BTW, how to add the headers and apply Joins (vlookup) with my next data


#5

you can add column names by using the colnames()<- function. It is implemented like this:

colnames(my_data) <- c("vector", "of", "my", "column", "names")

To do a join, you can use the dplyr package. There is documentation on the different kind of joins here and here. To read in your excel file you can use the readxl::read_excel function.