Odd behavior when taking the min() of an empty date vector.

I came across this within a dplyr pipeline I was running. I am trying to find the minimum date that is after some other date. Frequently this results in taking the minimum of a 0 length vector. I imagine it's just something wrong with the print method of Date? The warning indicates that Inf is returned, as indeed it is, but for some reason it prints as NA. Does anyone know how I might get this to print Inf correctly?

y <- seq(from = as.Date('2018-01-01'), to = as.Date('2018-01-10'), 1)
min_y <- min(y[y > as.Date('2018-01-11')])
#> Warning in min.default(structure(numeric(0), class = "Date"), na.rm =
#> FALSE): no non-missing arguments to min; returning Inf
print(min_y)
#> [1] NA
is.na(min_y)
#> [1] FALSE
cat(min_y)
#> Inf
min_y == Inf
#> [1] TRUE
class(min_y)
#> [1] "Date"

Dates are squirrelly. The thing to keep in mind is that they are stored internally as the number of days since January 1, 1970. So, for a summary function like min to return a real date, there needs to be something to work with and, by definition, a zero-length vector doesn't have anything.
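
A quick way to check how a Date is stored is to strip the class and look at the number underneath. This is only an illustrative sketch; the variable names are made up for the example.

```r
# A Date is a count of days since 1970-01-01 with class "Date" on top.
epoch_plus_nine <- as.Date("1970-01-10")
days_since_epoch <- unclass(epoch_plus_nine)
print(days_since_epoch)
#> [1] 9

# Putting the Date class back on a plain number goes the other way.
rebuilt <- structure(9, class = "Date")
print(rebuilt == epoch_plus_nine)
#> [1] TRUE
```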

Here's a clumsy way to deal with it, though.

# install.packages("devtools")
#devtools::install_github("STATWORX/helfRlein")
library(helfRlein)
y <- seq(from = as.Date('2018-01-01'), to = as.Date('2018-01-10'), 1)
target_date <- as.Date("2018-01-11")
min(y[y %nin% target_date]) # %nin% is the not-in operator from the helfRlein package
#> [1] "2018-01-01"

A bit off topic, but just want to mention that the 'not in' operator is easily created with the function Negate():

'%out%' <- Negate('%in%')
y <- seq(from = as.Date('2018-01-01'), to = as.Date('2018-01-10'), 1)
target_date <- as.Date("2018-01-11")
min(y[y %out% target_date])
#> [1] "2018-01-01"

There is no difference, besides reducing dependencies, which is always good :slight_smile:

Cheers
Fer


I think this is because the numeric Inf you got is converted back to class Date, and an infinite Date is displayed as NA (of class Date).

If you want to keep Inf, you need to store your date as numeric, or as character (not as Date), using a custom function to change the NA of class Date to "Inf" as character (not the numeric Inf). I am not sure an Inf of class Date has any meaning.

I think all of this comes down to how R converts dates between numeric, character, and class Date.

Some code to complement

# Your dates, from 2018-01-01 to 2018-01-10
y <- seq(from = as.Date('2018-01-01'), to = as.Date('2018-01-10'), 1)
# Try to select everything after 2018-01-11.
# There is nothing, so no element of the comparison is TRUE
any(y > as.Date('2018-01-11'))
#> [1] FALSE
# and it subsets to nothing because every index is FALSE
y[y > as.Date('2018-01-11')]
#> Date of length 0
# min() of a zero-length object returns Inf.
min(numeric(0))
#> Warning in min(numeric(0)): no non-missing arguments to min; returning
#> Inf
#> [1] Inf
# and it shows as NA for a Date, because the numeric Inf is given class Date again
as.Date(Inf, origin = "1970-01-01")
#> [1] NA
# It shows as NA because no such date exists. That is why you got this result.
min_y <- min(y[y > as.Date('2018-01-11')])
#> Warning in min.default(structure(numeric(0), class = "Date"), na.rm =
#> FALSE): no non-missing arguments to min; returning Inf
class(min_y)
#> [1] "Date"
min_y
#> [1] NA
cat(min_y)
#> Inf
# same as as.numeric
as.numeric(min_y)
#> [1] Inf
as.character(min_y)
#> [1] NA
# print() uses format()
format(min_y)
#> [1] NA
# So cat seems to convert to numeric

Created on 2019-01-03 by the reprex package (v0.2.1)


Thanks all,

For my purposes I can just convert these Infs to NAs.
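
A rough sketch of one way to do that conversion up front, by checking for an empty result before calling min(). The helper name `min_date_after` is hypothetical (it is not from base R or any package), and returning `as.Date(NA)` for the empty case is an assumption about what is wanted downstream.

```r
# Hypothetical helper: earliest date strictly after `cutoff`, or an NA Date
# (with no warning) when no such date exists.
min_date_after <- function(dates, cutoff) {
  candidates <- dates[!is.na(dates) & dates > cutoff]
  if (length(candidates) == 0L) {
    return(as.Date(NA))
  }
  min(candidates)
}

y <- seq(from = as.Date("2018-01-01"), to = as.Date("2018-01-10"), by = 1)

# Nothing is after 2018-01-11, so this is an NA of class Date.
res_none <- min_date_after(y, as.Date("2018-01-11"))

# Dates after 2018-01-05 exist, so this is the earliest of them.
res_some <- min_date_after(y, as.Date("2018-01-05"))
```

Inside a dplyr pipeline the same function could be called from summarise() or mutate() in place of the bare min().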

I still find this very confusing though.

I tried to run the debugger on print() (learning the hard way to always use debugonce(print) and never debug(print)) and then print(min_y) but at no time is x represented as Inf, it is NA from the start. Which is very odd since cat(min_y) gives the correct value. Running the debugger on cat() is even stranger because it too seems to represent the input as NA from the start.

@Fer '%out%' <- Negate('%in%') is a lifesaver! Thanks.

Thanks, good to know. I always prefer to point someone to CRAN, or at least to a package destined for it, though there are a lot of useful little helpers in helfRlein.


x is Inf only in its numeric form; it shows as NA as soon as it is treated as a Date. When you print min_y it is already a Date, so print(min_y) calls print.Date, which displays x as NA. cat calls .Internal(cat(...)), so I don't know exactly what it does, but it seems to work from the numeric representation of the date. You can look at the source code in r-source somewhere.

print(as.Date(Inf, origin = "1970-01-01"))
#> [1] NA
cat(as.Date(Inf, origin = "1970-01-01"))
#> Inf
as.numeric(as.Date(Inf, origin = "1970-01-01"))
#> [1] Inf

Created on 2019-01-03 by the reprex package (v0.2.1)
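
One way to look at what is stored, as opposed to what is displayed, is to strip the class before inspecting the value. A minimal sketch: `inf_date`, `stored_value` and `is_infinite_date` are names made up for this example, and `structure()` is used here only as a stand-in for the result of min() on an empty Date vector.

```r
# Build a Date whose underlying number is Inf (stand-in for min_y above).
inf_date <- structure(Inf, class = "Date")

# unclass() drops the Date class, leaving the plain stored number.
stored_value <- unclass(inf_date)

# A check that does not go through the Date formatting code at all.
is_infinite_date <- is.infinite(stored_value)
print(is_infinite_date)
```

A test like this could be used to find such values before replacing them with NA.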

Thank you for the question! It helped me see all of this.

