The custom_date_format2 function now gives me an alert on the second check:

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",NULL))
  return (format(x,fmt))
}

I use it inside a ggplot call to format the dates with scale_x_date:

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))                            # modified
  + geom_point()
  + scale_x_date(date_breaks="1 month",
    labels=custom_date_format2,minor_breaks =NULL)
  + labs(y='personal saving rate'),]
)

When I loaded the scales library, the alert went away.
But now, when I make this change in the function, I get an alert again:

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))                            # modified
+   + geom_point()
+   + scale_x_date(date_breaks="1 month",
+     labels=custom_date_format2,minor_breaks =NULL)
+   + labs(y='personal saving rate'),]
+ )
Error in if (any(f0 <- format == "")) { : 
  missing value where TRUE/FALSE needed
Called from: format.POSIXlt(as.POSIXlt(x), ...)
Browse[1]> 

And the function custom_date_format2 outside the ggplot returns the expected values:

custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b",""))
  return (format(x,fmt))
}
>DT[31:55,custom_date_format2(date)]
 [1] "1970" "1970-02-01" "March" "1970-04-01" "May" "1970-06-01"
  [7] "Jul" "1970-08-01" "Sep" "1970-10-01" "Nov" "1970-12-01"
[13] "1971" "1971-02-01" "March" "1971-04-01" "May" "1971-06-01"
[19] "Jul" "1971-08-01" "Sep" "1971-10-01" "Nov" "1971-12-01"
[25] "1972"
>
custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b","%b"))
  return (format(x,fmt))
}
> DT[31:55,custom_date_format2(date)]
 [1] "1970" "Feb" "March" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"
[12] "Dec" "1971" "Feb" "March" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct"
[23] "Nov" "Dec" "1972"
> 

Hi @Hermes, it's hard to know how to recreate what you're experiencing without knowing exactly what you did. Here you present two versions of your custom date function, but you don't say which version triggered the warning or what the warning was.

When you post, it would be helpful to know: 1) what data you're using (for example, is it still the same DT from your original post?), 2) what code you're using and where it comes from, and 3) the behavior you don't understand and/or the ideal behavior you'd like to achieve.

In your last post, we have a guess at 1) but 2) and 3) are less clear. Could you clarify those points?

1) The data I am using is still the same DT from my original post.
2a) I am using
custom_date_format2<-function(x) {
fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b","%b")) #originally ""
  return (format(x,fmt))
}

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))                            # modified
  + geom_point()
  + scale_x_date(date_breaks="1 month",
    labels=custom_date_format2,minor_breaks =NULL)
  + labs(y='personal saving rate'),]
)

2b) it came from
manipulating-date-breaks-and-date-labels

from datetime import date

def custom_date_format2(breaks):
    """
    Function to format the date
    """
    res = []
    for x in breaks:
        # First day of the year
        if x.month == 1 and x.day == 1:
            fmt = '%Y'
        # Every other month
        elif x.month % 2 != 0:
            fmt = '%b'
        else:
            fmt = ''

        res.append(date.strftime(x, fmt))

    return res

(ggplot(economics.loc[40:60, :])                            # modified
 + geom_point(aes('date', 'psavert'))
 + scale_x_datetime(
     breaks=date_breaks('1 months'),
     labels=custom_date_format2,
     minor_breaks=[])
 + labs(y='personal saving rate')
)

So far, I have managed to translate all the code shown on the page before this point to R, inside data.table.
3) I want to get the graph that the page shows as the result of running this code, but in R:
tutorials_miscellaneous-manipulating-date-breaks-and-date-labels_13_0

Thanks, @Hermes: The problem is caused by the call to ifelse(), which returns a vector of formats that begins and ends with NA. (Run debug(custom_date_format2) before you run your plotting command, and you'll be able to inspect fmt. In debug mode, pressing 'Enter' steps through the code, and you can see object values in the 'Environment' tab in the upper right.)
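To see the mechanism in isolation, here is a minimal base-R sketch. It assumes, for illustration only, that the vector handed to the labelling function is padded with NA at both ends; the vector x below is made up, and format(x, "%m") / format(x, "%d") are used as base-R stand-ins for month(x) and mday(x).

```r
# Hypothetical break vector, padded with NA at both ends (an assumption
# about what the scale passes to the labels function).
x <- as.Date(c(NA, "1970-01-01", "1970-02-01", "1970-03-01", NA))

# Base-R stand-ins for month(x) and mday(x)
m <- as.integer(format(x, "%m"))
d <- as.integer(format(x, "%d"))

# Same nested ifelse() as in custom_date_format2 (the "" variant)
fmt <- ifelse(m == 1 & d == 1, "%Y",
              ifelse(m %% 2 != 0, "%b", ""))

print(fmt)         # the first and last entries are NA
print(is.na(fmt))  # ifelse() returns NA wherever its test is NA
```

Because the test is NA for the NA dates, neither the "yes" nor the "no" value is used there, so the format vector itself contains NA.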

To fix this, you can either define custom_date_format2() like this:

custom_date_format2<-function(x) {
  fmt =ifelse(month(x)==1 & mday(x)==1, "%Y",
              ifelse(month(x)%%2 != 0, "%b","%b")) #originally ""
  fmt <- replace_na(fmt, '%b') # requires tidyr or tidyverse package
  return (format(x,fmt))
}

or you can use the tidyverse equivalent, if_else(), which allows you to explicitly deal with missing values:

custom_date_format2<-function(x) {
  # requires dplyr or tidyverse package
  fmt =if_else(month(x)==1 & mday(x)==1, "%Y", missing = '%b',
              if_else(month(x)%%2 != 0, "%b","%b", missing = '%b')) 
  return (format(x,fmt))
}

but the one I prefer is:

custom_date_format2<-function(x) {
  # requires dplyr or tidyverse package
  fmt <- 
    case_when(
      month(x) == 1 & mday(x) == 1 ~ "%Y", # first pass
      TRUE ~ "%b"  # last pass
    )
  return (format(x,fmt))
}

since case_when() avoids the need for nested if_else() statements and allows for explicit treatment of any case you need to deal with.

I hope this helps.

P.S. Remember to run undebug(custom_date_format2) when you're done, @Hermes, or it will become very tiresome :slight_smile:

A short question:
How can I make the month names in the graph appear in English rather than in Hebrew?

None of the content on this thread relates to Hebrew...
The last sample of output data you provided to this thread showed English...

DT[31:55,custom_date_format2(date)]
[1] "1970" "Feb" "March" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"

Can you identify the code you added which resulted in a different language?
If you did not introduce the effect intentionally, it might be a side effect of locale detection by the libraries you are using.

For example, in the lubridate docs (https://cran.r-project.org/web/packages/lubridate/lubridate.pdf)
you can see that the wday() function, which gives the language-dependent name of the day of the week for the date passed in, defaults to the user's locale, but this can be explicitly overridden.

wday(x, label = FALSE, abbr = TRUE,
week_start = getOption("lubridate.week.start", 7),
locale = Sys.getlocale("LC_TIME"))

locale locale to use for day names. Default to current locale.

also:
שלום, שמי ניר (Hello, my name is Nir)

I actually translated the output myself to show it on the forum. This is what I get on the charts:

If it comes from the format() function, which interprets %b and similar codes, then try:

Sys.setlocale("LC_TIME", "English")
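As a hedged sketch of the same idea: locale names are platform-dependent ("English" is a Windows-style name, while other systems typically use names such as "en_US.UTF-8"), so the example below uses the portable "C" locale, which produces English month abbreviations. It also saves and restores the previous setting, which is optional.

```r
# Remember the current time locale so it can be restored afterwards
old_locale <- Sys.getlocale("LC_TIME")

# The "C" locale is available on every platform and uses English names
Sys.setlocale("LC_TIME", "C")

label <- format(as.Date("1970-02-01"), "%b")
print(label)  # English abbreviation for February

# Restore the original locale (skip this to keep English labels in plots)
Sys.setlocale("LC_TIME", old_locale)
```

To affect the axis labels, the locale has to be set before the plot is drawn, since format() reads it at the moment the labels are generated.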

I am trying to reproduce the last graph shown on the site, which uses these functions:

def custom_date_format3(breaks):
    """
    Function to format the date
    """
    res = []
    for x in breaks:
        # First day of the year
        if x.month == 1:
            fmt = '%Y'
        else:
            fmt = '%b'

        res.append(date.strftime(x, fmt))

    return res


def custom_date_breaks(width=None):
    """
    Create a function that calculates date breaks

    It delegates the work to `date_breaks`
    """
    def filter_func(limits):
        breaks = date_breaks(width)(limits)
        # filter
        return [x for x in breaks if x.month % 2]

    return filter_func


(ggplot(economics.loc[40:60, :])
 + geom_point(aes('date', 'psavert'))
 + scale_x_datetime(                                        # modified
     breaks=custom_date_breaks('1 months'),
     labels=custom_date_format3)
 + labs(y='personal saving rate')
)

and did not achieve the expected graph.
lastGph
These are the functions I use:

custom_date_format3<-function(x){
  res<-c()
  fmt<-dplyr::case_when(
    month(x) == 1 ~ "%Y",
    TRUE ~ "%b")# first pass
      # last pass
  res<-c(res,format(x,fmt))
    return(res)
}
custom_date_breaks<-function(x){
 #m<-c(0)
 m = structure(rep(NA_real_, 1 ), class="Date")
 rpt<-dplyr::case_when(month(x) %% 2==0 ~ x)
 m<-c(m,rpt)
  return(m) 
}

(DT[,ggplot(.SD[40:60],mapping=aes(date, psavert))
  + geom_point()
  + scale_x_date(                                       # modified
    breaks=custom_date_breaks,
    labels=custom_date_format3)
  + labs(y='personal saving rate'),]
)

and the graph I get:

How can I obtain the graph without needing these functions, using only the functions of the ggplot2 package?
scale_date {ggplot2}
Position scales for date / time data

Did you check what the output of your custom_date_breaks() function looks like?

The output of the function is as follows:

DT[,custom_date_breaks(date)][1:30]
 [1] NA           NA           "1967-08-01" NA           "1967-10-01" NA           "1967-12-01" NA           "1968-02-01"
[10] NA           "1968-04-01" NA           "1968-06-01" NA           "1968-08-01" NA           "1968-10-01" NA          
[19] "1968-12-01" NA           "1969-02-01" NA           "1969-04-01" NA           "1969-06-01" NA           "1969-08-01"
[28] NA           "1969-10-01" NA          

Is that the output you expected? Could you say a little about what you intended custom_date_breaks() to do?

I am trying to get the graph posted above :point_up:
I have translated the functions from the site into R as best I could; my knowledge of Python comes only from having used other software.
That is why I also asked whether it is possible to obtain the same graph from the data using ggplot2's own functions.

Thanks, @Hermes, that is helpful. I didn't realize you were trying to write the code in python to describe what you intended -- I had thought you were posting actual python code from a package you had mentioned earlier. I might suggest you try pseudocode in R, instead, referring to the documentation specifications you're implementing, in much the way it looks like you tried to develop your python code. That would be helpful, since folks here may not be familiar with python syntax, and may have trouble following you.

I'll come back again when I can, but if you look at the R documentation for scale_x_date() (which I'm sure you did, from your python code), it says breaks should be one of

  • NULL for no breaks
  • waiver() for the breaks specified by date_breaks
  • A Date/POSIXct vector giving positions of breaks
  • A function that takes the limits as input and returns breaks as output

so the last two would apply in your case, and it says limits should be one of:

  • NULL to use the default scale range
  • A numeric vector of length two providing limits of the scale. Use NA to refer to the existing minimum or maximum
  • A function that accepts the existing (automatic) limits and returns new limits

where, again, the last two apply in your case.

From this, it looks like limits are stored in any case as a vector of length two, indicating the min and max, so the function version of breaks should probably take these as input and return the third type of breaks, namely a "Date/POSIXct vector giving positions of breaks".
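A minimal base-R sketch of such a breaks function might look like the following. The name odd_month_breaks is hypothetical, and the sketch assumes the function receives the two limits as Date values; it is an illustration of the shape of the function, not a tested replacement for custom_date_breaks().

```r
# Hypothetical helper: takes the two scale limits and returns a Date
# vector holding the first day of every odd-numbered month in range.
odd_month_breaks <- function(limits) {
  # Snap the lower limit back to the first day of its month
  first <- as.Date(format(limits[1], "%Y-%m-01"))
  # One candidate break per month up to the upper limit
  monthly <- seq(first, limits[2], by = "1 month")
  # Keep January, March, May, ... (odd month numbers)
  monthly[as.integer(format(monthly, "%m")) %% 2 == 1]
}

breaks <- odd_month_breaks(as.Date(c("1970-11-01", "1972-07-01")))
print(breaks)
```

Unlike a case_when() with no fallback branch, this returns only real dates and no NA entries. Under the assumption above, it could be passed as scale_x_date(breaks = odd_month_breaks, ...).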

I hope this helps, and thanks again for the clarification.

With a change in the function custom_date_format2 and these limits:
limits = as.Date(c("1970-11-01","1972-09-01"))
I have come the closest so far to the expected graph.
If I vary the limits, either the year labels or the month-name labels disappear.

library(data.table)
library(ggplot2)
library(scales) 
library(ggeasy)
DT<-setDF(economics)#Using the economics dataset
DT<-setDT(DT)
str(DT)
print(DT,topn = 3)
#How does the saving rate vary with time?
(DT[,ggplot(.SD,mapping=aes(date, psavert))
    + geom_point()
    + labs(y='personal saving rate'),])

custom_date_format2<-function(x) {
  # requires dplyr or tidyverse package
  res<-c()
  fmt <- 
    dplyr::case_when(
      month(x) == 1 & mday(x) == 1 ~ "%Y", # first pass
      TRUE ~ "%b"# last pass
    )
  res<-c(res,format(x,fmt,tz = "GMT"))
  return(res)
}

(DT[,ggplot(.SD[39:60],mapping=aes(date, psavert))                            # modified
    + geom_point()
    + scale_x_date(breaks="2 month",minor_breaks =NULL,
                   labels=custom_date_format2  ,limits = as.Date(c("1970-11-01","1972-09-01")))
    + labs(y='personal saving rate')
    +easy_x_axis_labels_size(7),]
)


How can I limit the date axis to the range between Nov 1971 and Jul 1972?
What could be missing?

Hi @Hermes I think I understand now what you're describing, and here is a possibility:

library(data.table)
library(ggplot2)
library(scales) 
DT<-setDF(economics)#Using the economics dataset
DT<-setDT(DT)
print(DT,topn = 3)
#>            date     pce      pop psavert uempmed unemploy
#>   1: 1967-07-01   506.7 198712.0    12.6     4.5     2944
#>   2: 1967-08-01   509.8 198911.0    12.6     4.7     2945
#>   3: 1967-09-01   515.6 199113.0    11.9     4.6     2958
#>  ---                                                     
#> 572: 2015-02-01 12082.4 320074.5     7.9    12.9     8610
#> 573: 2015-03-01 12158.3 320230.8     7.4    12.0     8504
#> 574: 2015-04-01 12193.8 320402.3     7.6    11.5     8526
#How does the saving rate vary with time?

custom_date_format2<-function(x) {
  # requires dplyr or tidyverse package
  res<-c()
  fmt <- 
    dplyr::case_when(
      month(x) == 1 & mday(x) == 1 ~ "%Y", # first pass
      TRUE ~ "%b"# last pass
    )
  res<-c(res,format(x,fmt,tz = "GMT"))
  return(res)
}

limits = as.Date(c("1970-11-01","1972-07-01"))
(DT[,ggplot(.SD[39:60],mapping=aes(date, psavert))                            # modified
+ geom_point()
+ scale_x_date(
  breaks= 
     seq(limits[1], limits[2], '2 months'),
  minor_breaks =NULL,
  labels=custom_date_format2,
  limits = limits
  )
+ labs(y='personal saving rate')]
)
#> Warning: Removed 2 rows containing missing values (geom_point).

Created on 2020-04-03 by the reprex package (v0.3.0)
