Filter dataframe using a column value

In the following code, how can I remove all rows of df whose year matches a year in df2?

library(dplyr, warn.conflicts = FALSE)
library(readr)
library(lubridate)
id <- "1FYfx8R-oOH7APgwHK7Eh6ZYm2xKXHM85"
df <- readr::read_csv(paste0("https://docs.google.com/uc?id=", id, "&export=download"),
                        col_names = TRUE)

df <- df %>%
  mutate(year = year(datetime), month = month(datetime))
no_year <- length(unique(df$year))

# per year and month, count how many values in each column are > 0
df1 <- df %>%
  select(-datetime) %>%
  group_by(year, month) %>%
  summarise_all(~ sum(. > 0)) %>%
  ungroup()

# years that have data for fewer than 11 months
df2 <- aggregate(month ~ year, data = df1, FUN = length) %>%
  filter(month < 11)

Is this what you are after?

# get data for the years that don't match
df3 <- df %>% 
  anti_join(df2, by = "year")

# just to look at the years
df3 %>% 
  filter(year >= 2000) %>% 
  distinct(year)

# A tibble: 10 x 1
    year
   <dbl>
 1  2001
 2  2002
 3  2003
 4  2005
 5  2006
 6  2007
 7  2010
 8  2011
 9  2012
10  2013
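
If you want to check the idea without downloading the file, here is a minimal sketch of the same anti_join() on made-up data. Note that toy_df and toy_df2 are hypothetical stand-ins for df and df2, and the values are invented purely for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-ins for df and df2 (made-up values, not the real data)
toy_df  <- tibble(year = c(2003, 2004, 2004, 2005, 2008), value = 1:5)
toy_df2 <- tibble(year = c(2004, 2008), month = c(9L, 10L))

# anti_join() keeps only the rows of toy_df whose year has no match in toy_df2
toy_df3 <- toy_df %>%
  anti_join(toy_df2, by = "year")

toy_df3$year
# [1] 2003 2005
```

Only the join column named in `by` is used for matching, so the extra month column in toy_df2 does not affect the result.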

Thanks for the reply. I want to filter df so that every row whose year appears in df2 is removed.

For a single year, this works:

df1 = filter(df, year != 2004)

But the following code does not work:

 g=df2$year
  df1 = filter(df, year!=g )

Like this then?

# the years in a vector
df2_years <- unique(df2$year)

# new dataset
df3 <- df %>% 
  filter(!year %in% df2_years)

This gives the same result as df3 above, but uses filter() instead of anti_join().
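
As for why `year != g` did not do what you wanted: `!=` compares element by element and recycles the shorter vector, so each row is only checked against one of the years in g, whereas `%in%` tests membership in the whole vector. A small sketch on made-up data (toy_df and the values in g are hypothetical, chosen only to show the difference):

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data; g plays the role of df2$year
toy_df <- tibble(year = c(2003, 2004, 2004, 2005, 2008, 2008), value = 1:6)
g <- c(2004, 2008)

# g is recycled to 2004, 2008, 2004, 2008, 2004, 2008 and compared row by row,
# so the 2nd row (2004 vs 2008) and the 5th row (2008 vs 2004) slip through
wrong <- filter(toy_df, year != g)
wrong$year
# [1] 2003 2004 2005 2008

# %in% checks every year against all of g
right <- filter(toy_df, !year %in% g)
right$year
# [1] 2003 2005
```

With a single value such as 2004 the two forms agree, which is why your first filter() call worked.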
