Could not find function "date_breaks"

Hello, I am getting the following error:

> (DT[, ggplot(.SD, mapping = aes(date, psavert))
+   + geom_point()
+   + scale_x_datetime(breaks = date_breaks("10 years"))
+   + labs(y = 'personal saving rate'), ])
Error in date_breaks("10 years") : could not find function "date_breaks"

And I can't find the answer.
Could you help me or guide me?
I am trying to reproduce the examples from the following link:
Manipulating date breaks and date labels
So far I have only managed the following:

library(data.table)
library(ggplot2)
DT <- as.data.table(economics)  # Using the economics dataset
str(DT)
print(DT,topn = 3)
#How does the saving rate vary with time?
(DT[,ggplot(.SD,mapping=aes(date, psavert))
  + geom_point()
  + labs(y='personal saving rate'),])
#Yikes! the calculated breaks are awful, we need to intervene. 
#We do so using the date_breaks and date_format functions from mizani.
#Set breaks every 10 years
(DT[,ggplot(.SD,mapping=aes(date, psavert))
    + geom_point()
    + scale_x_datetime(breaks=date_breaks("10 years"))
    + labs(y='personal saving rate'),])

Is some correction necessary?

You seem to be working from a Python tutorial rather than an R tutorial.
I think the mizani package it mentions is a Python library, not an R library.
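
For what it's worth, the closest R counterpart of mizani is the scales package, which ggplot2 itself uses for breaks and labels. Below is a minimal sketch, assuming a reasonably recent scales (1.1.0 or later, for breaks_width() and label_date()). Note that economics$date is of class Date, so the matching scale is scale_x_date(), not scale_x_datetime().

```r
library(ggplot2)
library(scales)  # R package that Python's mizani is modelled on

# economics$date is a Date column, so use scale_x_date(), not scale_x_datetime()
p <- ggplot(economics, aes(date, psavert)) +
  geom_point() +
  scale_x_date(
    breaks = breaks_width("10 years"),  # one break every 10 years
    labels = label_date("%Y")           # show only the year
  ) +
  labs(y = "personal saving rate")

print(p)
```

scales also still exports date_breaks() and date_format(), so loading it makes the original call find the function, but scale_x_datetime() expects a POSIXct column rather than a Date.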

I've made some progress:

library(data.table)
library(ggplot2)
DT <- as.data.table(economics)  # Using the economics dataset
str(DT)
print(DT,topn = 3)
#How does the saving rate vary with time?
(DT[,ggplot(.SD,mapping=aes(date, psavert))
  + geom_point()
  + labs(y='personal saving rate'),])
#Yikes! the calculated breaks are awful, we need to intervene. 
#We do so using the date_breaks and date_format functions from mizani.
#Set breaks every 10 years
(DT[,ggplot(.SD,mapping=aes(date, psavert))
    + geom_point()
    #+ scale_x_datetime(breaks=date_breaks("10 years"))
    + scale_x_date(date_breaks = "10 years")
    + labs(y='personal saving rate'),])
#That is better. Since all the breaks are at the beginning of the year, 
#we can omit the month and day. Using date_format we override the format string.
#For more on the options for the format string see the strftime behavior.
(DT[,ggplot(.SD,mapping=aes(date, psavert))
    + geom_point()
    #+ scale_x_datetime(breaks=date_breaks("10 years"))
    + scale_x_date(date_breaks = "10 years",date_labels="%Y")
    + labs(y='personal saving rate'),])

#We can achieve the same result with a custom formatting function.

With the help of this other link:
Position scales for date/time data

What could be wrong with this function?

custom_date_format2<-function(x) {
  if (month(x)==1 && mday(x)==1) {
    ## First day of the year
    fmt = "%Y"
  } else if (month(x)%%2 != 0) {
    # Every other month
    fmt ="%b"
  } else {
    fmt =""
  }
  return (format(x, fmt))  
}

Every time, I receive the following warning:

Warning message:
In if (month(x)%%2 != 0) { :
  the condition has length > 1 and only the first element will be used

It's a warning, not an error, which means it's telling you something is unusual but not necessarily wrong, so you should be aware of it in case it was unintentional.

In this case it looks like you've passed a vector of length > 1 as x, which means the condition mentioned in the warning has multiple TRUE and FALSE values, so its value is ambiguous. In this circumstance, the warning tells you that only the first value will be used for the if statement.

This doesn't happen with the first if condition because && can only return a single value.


After transforming the function in this way, there are no warnings:

custom_date_format2<-function(x) {
 fmt <- ifelse(month(x)==1 && mday(x)==1, "%Y",
             ifelse(month(x)%%2 != 0, "%b",""))
  return (format(x, fmt))  
}
DT[40:60,custom_date_format2(date)]
 [1] "1970-10-01" "1970-11-01" "1970-12-01" "1971-01-01" "1971-02-01" "1971-03-01"
 [7] "1971-04-01" "1971-05-01" "1971-06-01" "1971-07-01" "1971-08-01" "1971-09-01"
[13] "1971-10-01" "1971-11-01" "1971-12-01" "1972-01-01" "1972-02-01" "1972-03-01"
[19] "1972-04-01" "1972-05-01" "1972-06-01"
> DT[31:43,custom_date_format2(date)]
 [1] "1970" "1970" "1970" "1970" "1970" "1970" "1970" "1970" "1970" "1970" "1970"
[12] "1970" "1971"
> DT[198:201,custom_date_format2(date)]
[1] "1983-12-01" "1984-01-01" "1984-02-01" "1984-03-01"
> DT[1:12,custom_date_format2(date)]
 [1] "Jul" "Aug" "Sep" "Oct" "Nov" "Dec" "Jan" "Feb" "March" "Apr" "May"

But with the original form, shown above, I do get the warning:

DT[40:60,custom_date_format2(date)]
 [1] "1970-10-01" "1970-11-01" "1970-12-01" "1971-01-01" "1971-02-01" "1971-03-01"
 [7] "1971-04-01" "1971-05-01" "1971-06-01" "1971-07-01" "1971-08-01" "1971-09-01"
[13] "1971-10-01" "1971-11-01" "1971-12-01" "1972-01-01" "1972-02-01" "1972-03-01"
[19] "1972-04-01" "1972-05-01" "1972-06-01"
Warning message:
In if (month(x)%%2 != 0) { :
  the condition has length > 1 and only the first element will be used

I would like to know why this happens!

The relevant difference is that the ifelse function is vectorised: it implicitly loops over the vector passed in, tests each element, and applies the true or false outcome to each.
The if statement is NOT vectorised, and will assume that you want to test only the first element of any vector you pass to it.

As a programmer, it is up to you to determine whether you are working with vectors or single values, and how they should be processed. The warning thrown by if is very literal and descriptive:
the condition (month(x)%%2 != 0) has length > 1 and only the first element will be used.
Given the particulars of your function, this may be a moot point, as your code explicitly picks a single format to apply to all the dates you pass in. If you had been trying to assign a bespoke format to each date based on the condition, then you would see a relevant difference in the output of the if vs ifelse constructions.
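
To make the difference concrete, here is a small sketch in base R (the dates are just made-up sample values). It avoids passing a vector straight to if, since from R 4.2.0 onward a condition of length > 1 is an error rather than a warning.

```r
x <- as.Date(c("1971-01-01", "1971-02-01", "1971-03-01"))
m <- as.integer(format(x, "%m"))  # month number of each date: 1, 2, 3

# ifelse() tests every element and returns one result per element
per_element <- ifelse(m %% 2 != 0, "odd month", "even month")
per_element

# if () needs exactly one TRUE/FALSE, so reduce the vector first ...
if (any(m %% 2 != 0)) message("at least one odd month")

# ... or test a single element explicitly
first_only <- if (m[1] %% 2 != 0) "odd month" else "even month"
first_only
```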


I am interested in evaluating each item.
If I understood correctly, I should use ifelse.
The code is as follows:

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 && mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",**""**))
  return (format(x,fmt))
}

Now I get the following error:

col = ifelse(month(x)==1 && mday(x)==1,"%Y",
+              ifelse(month(x)%%2 != 0,"%b",
+                                          NA))
Error in as.POSIXlt(x) : object 'x' not found

And when I make this other change ("" to NULL), I get a different error:

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 && mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",NULL))
  return (format(x,fmt))
}
>
Error in if (any(f0 <- format == "")) { : 
  missing value where TRUE/FALSE needed
Called from: format.POSIXlt(as.POSIXlt(x), ...)

Everything about representing calendar dates and times is new to me.
I would welcome any suggestions regarding the code,
and a reasonably recent document on date and time handling. My interest is in manipulating this data with the data.table package.

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 && mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",**""**))
  return (format(x,fmt))
}

What code did you run? This code would raise an error when trying to create the function itself.

I think your key problem is the use of the double ampersand '&&' as part of your condition within ifelse.
This form of the and operator only considers the first element of any vector. To check every element, use the single ampersand &, which does an elementwise check:

custom_date_format2<-function(x) {
  fmt <- ifelse(month(x)==1 & mday(x)==1, "%Y",
                ifelse(month(x)%%2 != 0, "%b",""))

  return (format(x, fmt))  
}

I use lubridate to manipulate dates, so I can test the above function like

custom_date_format2(c(lubridate::ymd("2020-01-1"),
                      lubridate::ymd("2020-02-1"),
                      lubridate::ymd("2020-03-1")))

I'm not sure what effect you are hoping to achieve. Do you want to print no value at all for the last condition, rather than the default format? To achieve that, you need to move format around, so that it doesn't happen for every element...

custom_date_format3<-function(x) {
ifelse(month(x)==1 & mday(x)==1, format(x, "%Y"),
                ifelse(month(x)%%2 != 0, format(x, "%b"),""))
}
custom_date_format3(c(lubridate::ymd("2020-01-1"),
                      lubridate::ymd("2020-02-1"),
                      lubridate::ymd("2020-03-1")))

The custom_date_format2 function now complains about the second check:

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",NULL))
  return (format(x,fmt))
}

And I use it inside a ggplot command to format dates with scale_x_date in ggplot2

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))                            # modified
  + geom_point()
  + scale_x_date(date_breaks="1 month",
    labels=custom_date_format2,minor_breaks =NULL)
  + labs(y='personal saving rate'),]
)

After loading library(scales),
the error went away!
But now, when I make this change in the function, I get an error again:

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))                            # modified
+   + geom_point()
+   + scale_x_date(date_breaks="1 month",
+     labels=custom_date_format2,minor_breaks =NULL)
+   + labs(y='personal saving rate'),]
+ )
Error in if (any(f0 <- format == "")) { : 
  missing value where TRUE/FALSE needed
Called from: format.POSIXlt(as.POSIXlt(x), ...)
Browse[1]> 

And the function custom_date_format2 outside the ggplot returns the expected values:

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",""))
  return (format(x,fmt))
}
>DT[31:55,custom_date_format2(date)]
 [1] "1970" "1970-02-01" "March" "1970-04-01" "May" "1970-06-01"
  [7] "Jul" "1970-08-01" "Sep" "1970-10-01" "Nov" "1970-12-01"
[13] "1971" "1971-02-01" "March" "1971-04-01" "May" "1971-06-01"
[19] "Jul" "1971-08-01" "Sep" "1971-10-01" "Nov" "1971-12-01"
[25] "1972"
>
custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b","%b"))
  return (format(x,fmt))
}
> DT[31:55,custom_date_format2(date)]
 [1] "1970" "Feb" "March" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"
[12] "Dec" "1971" "Feb" "March" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct"
[23] "Nov" "Dec" "1972"
> 

Hi @Hermes, it's hard to know how to recreate what you're experiencing without knowing exactly what you did. Here you present two versions of your custom date function, but you don't say which version triggered the warning or what the warning was.

When you post, it would be helpful to know: 1) what data you're using (for example, is it still the same DT from your original post?), 2) what code you're using and where it comes from, and 3) the behavior you don't understand and/or the ideal behavior you'd like to achieve.

In your last post, we have a guess at 1), but 2) and 3) are less clear. Could you clarify those points?

  1. The data I am using is still the same DT from my original post.
  2a) I am using:
custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b","%b")) #originally ""
  return (format(x,fmt))
}

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))                            # modified
  + geom_point()
  + scale_x_date(date_breaks="1 month",
    labels=custom_date_format2,minor_breaks =NULL)
  + labs(y='personal saving rate'),]
)

2b) It came from:
manipulating-date-breaks-and-date-labels

from datetime import date

def custom_date_format2(breaks):
    """
    Function to format the date
    """
    res = []
    for x in breaks:
        # First day of the year
        if x.month == 1 and x.day == 1:
            fmt = '%Y'
        # Every other month
        elif x.month % 2 != 0:
            fmt = '%b'
        else:
            fmt = ''

        res.append(date.strftime(x, fmt))

    return res

(ggplot(economics.loc[40:60, :])                            # modified
 + geom_point(aes('date', 'psavert'))
 + scale_x_datetime(
     breaks=date_breaks('1 months'),
     labels=custom_date_format2,
     minor_breaks=[])
 + labs(y='personal saving rate')
)

So far, I have managed to translate all the code shown on the page before this point to R, inside data.table.
3) I want to reproduce in R the graph that the page displays as the result of running this code:
tutorials_miscellaneous-manipulating-date-breaks-and-date-labels_13_0

Thanks, @Hermes: The problem is caused by the call to ifelse(), which returns a vector of formats that begins and ends with NA. (Run debug(custom_date_format2) before you run your plotting command, and you'll be able to inspect fmt. In debug mode, pressing 'Enter' steps through the code, and you can see object values in the 'Environment' tab in the upper right.)
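
As a rough illustration of how the NA values arise, here is a sketch that calls the same nested ifelse() on a hypothetical break vector with NA at both ends (an assumption meant to mimic what debug() shows; month() and mday() are taken from data.table):

```r
library(data.table)  # for month() and mday()

# Hypothetical break vector with NA at both ends
breaks <- as.Date(c(NA, "1971-01-01", "1971-02-01", "1971-03-01", NA))

# Same logic as custom_date_format2: an NA condition gives an NA format
fmt <- ifelse(month(breaks) == 1 & mday(breaks) == 1, "%Y",
              ifelse(month(breaks) %% 2 != 0, "%b", ""))
fmt
is.na(fmt)  # TRUE at the first and last position
```

Passing a format vector that contains NA on to format() is what triggers "missing value where TRUE/FALSE needed".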

To fix this, you can either define custom_date_format2() like this:

custom_date_format2<-function(x) {
  fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b","%b")) #originally ""
  fmt <- replace_na(fmt, '%b') # requires tidyr or tidyverse package
  return (format(x,fmt))
}

or you can use the tidyverse equivalent, if_else(), which allows you to explicitly deal with missing values:

custom_date_format2<-function(x) {
  # requires dplyr or tidyverse package
  fmt <- if_else(month(x)==1 & mday(x)==1, "%Y",
                 if_else(month(x)%%2 != 0, "%b", "%b", missing = '%b'),
                 missing = '%b')
  return (format(x,fmt))
}

but the one I prefer is:

custom_date_format2<-function(x) {
  # requires dplyr or tidyverse package
  fmt <- 
    case_when(
      month(x) == 1 & mday(x) == 1 ~ "%Y", # first pass
      TRUE ~ "%b"  # last pass
    )
  return (format(x,fmt))
}

since case_when() avoids the need for nested if_else() statements and allows for explicit treatment of any case you need to deal with.
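
If the aim is the tutorial's plot, where even months get no label at all, one possible base-R sketch is shown here. The name custom_date_labels is hypothetical, and month() and mday() are assumed to come from data.table. It formats only the elements that need a label and leaves everything else, including NA breaks, as an empty string:

```r
library(data.table)  # for month() and mday()

custom_date_labels <- function(x) {
  first_of_year <- month(x) == 1 & mday(x) == 1
  odd_month     <- month(x) %% 2 != 0
  out <- rep("", length(x))                # default: no label (also for NA breaks)
  yr <- which(first_of_year)               # which() silently drops NA
  mo <- which(odd_month & !first_of_year)
  out[yr] <- format(x[yr], "%Y")           # year on the first day of the year
  out[mo] <- format(x[mo], "%b")           # month abbreviation follows LC_TIME
  out
}

custom_date_labels(as.Date(c(NA, "1971-01-01", "1971-02-01", "1971-03-01")))
```

It could then be passed as labels = custom_date_labels to scale_x_date().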

I hope this helps.

P.S. Remember to run undebug(custom_date_format2) when you're done, @Hermes, or it will become very tiresome :slight_smile:

A short question:
How can I make the month names in the graph appear in English, and not in Hebrew?

None of the content on this thread relates to Hebrew...
The last sample of output data you provided to this thread showed English...

DT[31:55,custom_date_format2(date)]
[1] "1970" "Feb" "March" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"

Can you identify the code you added which resulted in a different language?
If you did not introduce the effect intentionally, it might be a side effect of locale detection by the libraries you are using.

For example, in the lubridate docs https://cran.r-project.org/web/packages/lubridate/lubridate.pdf
you can see the wday() function, which gives the name (language dependent) of the day of the week of the date passed in. This defaults to the user's locale but can be explicitly overridden.

wday(x, label = FALSE, abbr = TRUE,
week_start = getOption("lubridate.week.start", 7),
locale = Sys.getlocale("LC_TIME"))

locale: locale to use for day names. Defaults to the current locale.

Also:
שלום, שמי ניר ("Hello, my name is Nir")

I actually translated the output myself to show it on the forum. This is what I get on the charts.

If it comes from the format() function, which interprets %b etc., then try:

Sys.setlocale("LC_TIME", "English")
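
Note that the locale name "English" is a Windows convention; on Linux or macOS the name would typically look like "en_US.UTF-8". A minimal sketch that should work on any platform uses the "C" locale, which yields English month names, and restores the previous setting afterwards:

```r
old_locale <- Sys.getlocale("LC_TIME")        # remember the current setting

Sys.setlocale("LC_TIME", "C")                 # "C" locale: English month and day names
label <- format(as.Date("1971-03-01"), "%b")  # month abbreviation under the "C" locale
label

Sys.setlocale("LC_TIME", old_locale)          # restore the previous locale
```

For the plot, the locale would need to be changed before the plot is printed, since the labels are formatted at that point.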