how to merge different excel sheets in a workbook to a dataframe in R

Hi, I have an excel workbook with 2 sheets, both have same number of columns and datatypes.
I am trying to merge them into a single data frame with the following code.

sheet=excel_sheets("/cloud/project/cyclistic/quartery_trips.xlsx")
sheet
data_frame = lapply(setNames(sheet, sheet), 
                    function(x) read_excel("/cloud/project/cyclistic/quartery_trips.xlsx", sheet=x))

in R studio cloud, but the session always crashes ,have tried different ways to overcome this but it keeps happening.
Kindly suggest a solution

Do you get a crash if you don't use lapply()?

sheet=excel_sheets("/cloud/project/cyclistic/quartery_trips.xlsx")
DF1 <- read_excel("/cloud/project/cyclistic/quartery_trips.xlsx", sheet=sheet[1])
DF2 <- read_excel("/cloud/project/cyclistic/quartery_trips.xlsx", sheet=sheet[2])
data_frame <- rbind(DF1, DF2)

Yes, even without using lapply and by simply running the following code chunk, the session gets terminated automatically.
DF1 <- read_excel("/cloud/project/cyclistic/quartery_trips.xlsx", sheet=sheet[1])

Can you read a csv file from the same location? You can export one of your Excel sheets as a csv and store it there or type up a simple file using a plain text editor. Then test if read.csv() works.

There is no problem while reading separately both as csv or excel.
The problem occurs only when i try to bind or merge it.
I even tried it as CSV using the following code

df <-
list.files(path = "/cloud/project/cyclistic/", pattern = "*.csv") %>%
map_df(~read_csv(.,colnames("ride_id","rideable_type","started_at","trip_start_day",
"ended_at","trip_end_day","start_station_name","start_station_id",
"end_station_name","end_station_id","start_lat","start_lng",
"end_lat","end_lng","member_casual","ride_length"),
col_types = cols(ride_id=col_character(),
rideable_type=col_character(),
started_at=col_character(),
trip_start_day=col_integer(),
ended_at=col_character(),
trip_end_day=col_integer(),
start_station_name=col_character(),
start_station_id=col_integer(),
end_station_name=col_character(),
end_station_id=col_integer(),
start_lat=col_character(),
start_lng=col_character(),
end_lat=col_character(),
end_lng=col_character(),
member_casual=col_character(),
ride_length=col_time()),na=c(' ')))

This gives me the following error
Error in map():
:information_source: In index: 1.
Caused by error in colnames():
! unused arguments ("trip_start_day", "ended_at", "trip_end_day", "start_station_name", "start_station_id", "end_station_name", "end_station_id", "start_lat", "start_lng", "end_lat", "end_lng", "member_casual", "ride_length")

Try a much simpler command. Read a single file and if it doesn't have column names, just set col_names = FALSE. Don't worry about the column types.

yes i have tried to read them one by one like this

tripdata_may <- read_csv("may.csv",na=c(''))
tripdata_june <- read_csv("june.csv",na=c(''))
tripdata_july <- read_csv("july.csv",na=c(''))
tripdata_august <- read_csv("august.csv",na=c(''))

It works perfectly but how to bind them together, they have the same column names and data types but every time the session gets terminated

Normally,

NewDF <- rbind(tripdata_may, tripdata_june, tripdata_july, tripdata_august)

would work. If that crashes the session, check the the size of each object with code like

object.size(tripdata_june)

Do you know how much memory you have available?

yes, I figured R crashes because it needs large amount of RAM because of my file size.
I ran the same code in R studio desktop and it ran smoothly.
Thanks a lot for your help and support !!

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.