How to cut out data from certain dates in a dataframe?

I have a dataframe with dates and observations. The observations from 01.09.2020 until 27.10.2020 are not usable, and I need to cut these dates and their observations out of the dataframe. How do I do it?

First: Install the caret package:

install.packages("caret")
library(caret)

Second: Create a partition with the percentage you want (in this case 50%, or 0.5) using the createDataPartition() function:

new <- createDataPartition(y = data$column_data, times = 1, p = 0.5, list = FALSE)

Now new holds the row indices for 50% of your data, which you can use to subset it, e.g. data[new, ].

I hope this helps!

OK, for example, to keep 01.01.2020 to 01.09.2020 (assuming date_colum is of class Date):

library(dplyr)
data1 <- data %>%
  filter(between(date_colum, as.Date("2020-01-01"), as.Date("2020-09-01")))

That's all!

And if you need a sample of 500 or 1000 rows, as in this case, you can pipe into the sample_n() function:

data1 <- data %>%
  filter(between(date_colum, as.Date("2020-01-01"), as.Date("2020-09-01"))) %>%
  sample_n(1000)

Sorry, I missed that you wanted to exclude the dates up to 27.10.2020.
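
For excluding a range instead of keeping one, a minimal dplyr sketch could look like the following. The data frame here is made up for illustration, and date_colum is assumed to already be of class Date. between() includes both endpoints, so negating it drops 01.09.2020 and 27.10.2020 as well.

```r
library(dplyr)

# Hypothetical data; date_colum is assumed to be of class Date
data <- data.frame(
  date_colum = seq(as.Date("2020-08-30"), as.Date("2020-10-29"), by = "day")
)

# between() is inclusive on both ends; ! keeps everything outside the range
data2 <- data %>%
  filter(!between(date_colum, as.Date("2020-09-01"), as.Date("2020-10-27")))
```

Rows before 01.09.2020 and after 27.10.2020 are both kept, so data on either side of the gap survives.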

Yes, my problem is that there is data before and after the section I need to exclude. Is there an opposite to the %in% operator?

Are you looking for ! ?
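
To illustrate, here is a small base R sketch of negating %in% with !. The data frame df and its columns are hypothetical, and the date column is assumed to be of class Date.

```r
# Hypothetical example data: one observation per day for 2020
df <- data.frame(
  date = seq(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "day")
)
df$obs <- seq_len(nrow(df))

# All dates in the unusable window, both ends included
bad_dates <- seq(as.Date("2020-09-01"), as.Date("2020-10-27"), by = "day")

# `!` negates the logical vector returned by %in%
df_clean <- df[!(df$date %in% bad_dates), ]
```

The parentheses around the %in% expression are not strictly required, but they make it obvious what is being negated.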

If you're only looking to drop a couple of dates, maybe something like...

mydf <- data.frame("month"=c("jan","feb","mar","apr","jun","july"))

mydf
  month
1   jan
2   feb
3   mar
4   apr
5   jun
6  july

#drop feb and mar
mydf2 <- data.frame("month"=mydf[!mydf$month=="feb" &
                !mydf$month=="mar",])
mydf2
  month
1   jan
2   apr
3   jun
4  july

Assuming your data are dates:


library(lubridate)
city <-   structure(list(name = c("Abilene", "Akron", "Albany", "Albuquerque", 
                        "Alexandria", "Allentown", "Amarillo", "Anaheim", "Anchorage", 
                        "Ann Arbor", "Arden-Arcade", "Arlington", "Arlington", "Arvada", 
                        "Athens-Clarke County", "Atlanta", "Augusta-Richmond County", 
                        "Aurora", "Aurora", "Austin"), pop = c(115930L, 217074L, 93994L, 
                      448607L, 128283L, 106632L, 173627L, 328014L, 260283L, 114024L, 
                     92040L, 332969L, 174838L, 102153L, 101489L, 416474L, 199775L, 
                    276393L, 142990L, 656562L), ddate = structure(c(18262, 18263, 
                     18264, 18265, 18266, 18267, 18268, 18269, 18270, 18271, 18272, 
                     18273, 18274, 18275, 18276, 18277, 18278, 18279, 18280, 18281
                   ), class = "Date")), row.names = c(NA, 20L), class = "data.frame")

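# Rebuild the date column as a daily sequence (same values as in the structure above)
city$ddate <- seq(ymd("2020/01/01"), ymd("2020-01-20"), by = "day")

# Keep only the rows outside 2020-01-15 to 2020-01-18 (both ends excluded)
dat1 <- subset(city, ddate < ymd("2020/01/15") | ddate > ymd("2020/01/18"))

dat1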
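
Applied to the dates in the original question, the same idea might look like the sketch below. It assumes the dates are stored as text in dd.mm.yyyy form; the data frame obs and its values are invented for illustration. Only base R is used here, with as.Date() and an explicit format string in place of lubridate.

```r
# Hypothetical data with dates stored as dd.mm.yyyy text
obs <- data.frame(
  date  = c("31.08.2020", "01.09.2020", "15.10.2020", "27.10.2020", "28.10.2020"),
  value = c(1.2, 3.4, 5.6, 7.8, 9.0)
)

# Convert the text to Date so comparisons work chronologically
obs$date <- as.Date(obs$date, format = "%d.%m.%Y")

start <- as.Date("2020-09-01")
end   <- as.Date("2020-10-27")

# Keep rows strictly before the window or strictly after it
obs_clean <- subset(obs, date < start | date > end)
```

Without the conversion step, comparing dd.mm.yyyy strings would sort them alphabetically rather than by date, so the filter would give wrong results.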
