Half year periods based on date

Hi,
I managed to extract months (as numeric), years (as numeric) and year quarters (using zoo package) from my date variable. Now I need to create half year periods with proper labels (such as H1 17, H2 17, H1 18, H2 18).

How can I do that easily?
I would appreciate your help.

Slavek

We don't really have enough info to help you out. Could you ask this with a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

1 Like

I used following codes:

df$year <- as.numeric(format(df$`Interview Date`,'%Y'))
df$month <- as.numeric(format(df$`Interview Date`,'%m'))

library(zoo)
df$yq <- as.yearqtr(df$`Interview Date`, format = "%Y-%m-%d")

and `Interview Date' is in lollowing format:

$ Interview Date: POSIXct, format: "2018-07-02 08:58:00" "2018-07-02 13:37:00" "2018-07-09 05:37:00" "2018-07-12 17:19:00" ...

$ Interview Date: POSIXct, format: "2018-07-02 08:58:00" "2018-07-02 13:37:00" "2018-07-09 05:37:00" "2018-07-12 17:19:00" ...

I thought I could simply use a recode of df$month into new variable called "HY" with 1-6=1 and 7-12=2 to specify half year periods and then combine this new variable with years somehow to get 2016_1, 2016_2...2018_2 etc.

I know it must be possible but I don't know how to do that...

Can you help?

Your code is not actually reproducible, please follow the link I gave you before and turn this into a reproducible example, including sample data.

I am going throug this now as I am new to R.
In the meantime I menaged to recode HY periods using

df$HY <- ifelse(df$month < 7, 1, 2)

If you are new to R, it's better for you if you learn how to properly ask for help first, I'm sure there are much easier ways to do what you are trying to accomplish but we need sample data to give you better quality help.

1 Like
library(readxl)
#> Warning: package 'readxl' was built under R version 3.4.4
df <- read_excel("P:/NL 12 18 Urns in Run (6m).xlsx", sheet = "Sales")
#> New names:
#> * `` -> `..3`
#> * `` -> `..4`
#> * `` -> `..6`

df$year <- as.numeric(format(df```
Interview Date`,'%Y'))
df$month <- as.numeric(format(df```
Interview Date`,'%m'))

library(zoo)
#> Warning: package 'zoo' was built under R version 3.4.4
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
df$yq <- as.yearqtr(df```
Interview Date`, format = "%Y-%m-%d")

df$hy <- ifelse(df$month < 7, 1, 2)

range(df$yq)
#> [1] "2018 Q3" "2018 Q4"
reprex()
#> Error in reprex(): could not find function "reprex"

Created on 2019-01-25 by the reprex package (v0.2.1)

How I need to merge hy with year to get the same thing as yq generated by zoo package

Still not reproducible since, we don't have acces to your file, you can use the datapasta package to include some sample data in your reprex, here is a blog post from mara that explains how to do it.

tibble::tribble(
            ~URN, ~Interview.Date, NA~NA, NA~NA, ~Outlet.Code, NA~NA, ~Staff.Code, ~Branch.ID, ~Dealer.Code, ~Dealer.Name, ~Dealer.Area, ~Dealer.Region,
  "9NL012654879",     "02-Jul-18", NANA, NANA,           NA, NANA,          NA,   3302187L,    "NL11206",        "xxx",     "Area 3",      "Verkoop",
  "9NL012654878",     "02-Jul-18", NANA, NANA,           NA, NANA,          NA,  12310540L,    "NL12417",        "yyy",     "Area 2",      "Verkoop",
  "9NL012654880",     "09-Jul-18", NANA, NANA,           NA, NANA,          NA,  12310535L,    "NL11252",        "zzz",     "Area 1",      "Verkoop",
  "9NL012545887",     "12-Jul-18", NANA, NANA,           NA, NANA,          NA,  12310405L,    "NL11251",        "aaa",     "Area 1",      "Verkoop",
  "9NL012544675",     "12-Jul-18", NANA, NANA,           NA, NANA,          NA,  12310533L,    "NL11250",        "bbb",     "Area 1",      "Verkoop",
  "9NL012544671",     "12-Jul-18", NANA, NANA,           NA, NANA,          NA,  12310407L,    "NL14703",        "ccc",     "Area 1",      "Verkoop"
  )

That data has no POSIXct variable, but this example should work with your data too.

library(dplyr)
library(lubridate)

sample_data <- data.frame(time_stamp = as.POSIXct(c("2018-06-02 08:58:00",
                                                    "2018-06-02 13:37:00",
                                                    "2018-06-09 05:37:00",
                                                    "2018-06-12 17:19:00",
                                                    "2018-07-02 08:58:00",
                                                    "2018-07-02 13:37:00",
                                                    "2018-07-09 05:37:00",
                                                    "2018-07-12 17:19:00")))
sample_data %>% 
    mutate(half = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
#>            time_stamp    half
#> 1 2018-06-02 08:58:00 H1 2018
#> 2 2018-06-02 13:37:00 H1 2018
#> 3 2018-06-09 05:37:00 H1 2018
#> 4 2018-06-12 17:19:00 H1 2018
#> 5 2018-07-02 08:58:00 H2 2018
#> 6 2018-07-02 13:37:00 H2 2018
#> 7 2018-07-09 05:37:00 H2 2018
#> 8 2018-07-12 17:19:00 H2 2018

Created on 2019-01-25 by the reprex package (v0.2.1)

1 Like

Thank you but I cannot specify individual records each time. How can I refer as.POSIXct to the existing variable InterviewDate which I guess is already in the POSIXct format:

 $ InterviewDate  : POSIXct, format: "2018-07-14 16:29:00" "2018-07-19 08:36:00" "2018-07-13 18:32:00" "2018-07-14 13:44:00" ...

I know I cannot use that code:

sample_data <- data.frame(time_stamp = as.POSIXct(InterviewDate))

Can you help?

You cannot use as.POSIXct() as this already requires your date to be in ISO format. The package lubridate comes with all sorts of useful functionts to parse dates, and you should be able to uselubridate::dmy() to parse your Interview.Date

Alternatively you could convert Interview.Date to an ISO Date string with (for example) gsub(), substr(), strsplit(), and paste() or extract the numeric representation of day/month/year and feed it to ISOdate(). That's probably a good exercise for you to learn about string processing, but if you just want to get your task done quickly, use lubridate.

1 Like

I think that andresrcs is offering assolutely brilliant solution. I simply need to apply it to my entire variable rather than specific eight dates given in his code...

You are right @Slavek, you have to apply it to your own data, we usually gave answers with minimal example data and then you have to adapt it to your real data.

Just to clarify, in this part I'm just generation some made up data to use as example

sample_data <- data.frame(time_stamp = as.POSIXct(c("2018-06-02 08:58:00",
                                                    "2018-06-02 13:37:00",
                                                    "2018-06-09 05:37:00",
                                                    "2018-06-12 17:19:00",
                                                    "2018-07-02 08:58:00",
                                                    "2018-07-02 13:37:00",
                                                    "2018-07-09 05:37:00",
                                                    "2018-07-12 17:19:00")))

The actuall solution is on this part

sample_data %>% 
    mutate(half = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))

I fully understand and I'm really obliged for your brilliant solution :+1:.
All I need is a code taking into account all possible dates rather than just a range specilied here:

sample_data <- data.frame(time_stamp = as.POSIXct(c("2018-06-02 08:58:00",
                                                    "2018-06-02 13:37:00",
                                                    "2018-06-09 05:37:00",
                                                    "2018-06-12 17:19:00",
                                                    "2018-07-02 08:58:00",
                                                    "2018-07-02 13:37:00",
                                                    "2018-07-09 05:37:00",
                                                    "2018-07-12 17:19:00")))

Once I've got it, I could hopefully work out everything else myself...

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.