How to delete rows with multiple column values

I have a number of txt files, each with the same dimensions and column names. Each file contains more than 10 years of consecutive daily data, and I want to calculate the average value for each calendar day across those years. For leap years, I want to ignore the February 29 value and use only the other 365 days.

I want to put the average daily values (365 values) from each of the txt files into a data frame. Here is a small example of what one file's data looks like.


library(lubridate) # provides year(), month() and day()
samp1 = data.frame(
  time = seq(as.Date('2008-01-01'), as.Date('2010-12-31'), by='day'),
  value = rnorm(1096))
samp1$year = year(samp1$time)
samp1$month = month(samp1$time)
samp1$day = day(samp1$time)

My question is: how do I remove the rows where month == 2 and day == 29 at the same time? Thanks.

Use a loop to read in the files and build a single data frame, then use dplyr to calculate the average. Something like this (I haven't tested it):

library(tidyverse) # readr, dplyr and the %>% pipe

filenames <- list.files(pattern="\\.txt$") # all text files in the working directory
data <- vector("list", length(filenames)) # list to store data
for (i in seq_along(filenames)){
  data[[i]] <- read_csv(filenames[i]) %>% # read as a tibble
                        mutate(filename=filenames[i]) # add filename to data
}
data <- bind_rows(data)
avdata <- data %>%
   filter(!(month==2 & day==29)) %>% # remove Feb 29 rows; & is vectorized, && is not
   group_by(month, day) %>% # average for each month-day combination
   summarise(avflow=mean(flow)) # average flow; replace flow with your value column

Thanks, it works. I found that the filter needs (month==2 & day==29) rather than (month==2 && day==29).

Yes, you are right: & is vectorized and compares element by element, while && is for single (scalar) values.
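
For anyone who wants to check the & versus && point without any files, here is a minimal base R sketch built on the sample data from the question. It is only an illustration: the names is_feb29, no_leap and daily_mean are made up for this example, and format() is used in place of lubridate so that nothing needs to be installed.

```r
# Sample data as in the question: three years of daily values, 2008 being a leap year.
set.seed(1)
samp1 <- data.frame(
  time  = seq(as.Date("2008-01-01"), as.Date("2010-12-31"), by = "day"),
  value = rnorm(1096)
)
samp1$month <- as.integer(format(samp1$time, "%m"))
samp1$day   <- as.integer(format(samp1$time, "%d"))

# `&` works element by element, so this gives one TRUE/FALSE per row.
is_feb29 <- samp1$month == 2 & samp1$day == 29

# Keep every row that is not February 29.
no_leap <- samp1[!is_feb29, ]

# Average value for each month-day combination across the years.
daily_mean <- aggregate(value ~ month + day, data = no_leap, FUN = mean)
daily_mean <- daily_mean[order(daily_mean$month, daily_mean$day), ]

sum(is_feb29)     # number of rows dropped
nrow(daily_mean)  # number of calendar days left
```

With this sample, one row is dropped (2008-02-29) and daily_mean has 365 rows. Swapping & for && in the is_feb29 line does not give a per-row result: recent versions of R raise an error when && is given vectors longer than one, and older versions looked only at the first element.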
