Calculating project days worked per month

Hi all,

I am trying to analyze data where individuals are involved in projects over the course of various months and years, and I need to generate counts of how many days individuals are involved in projects each month. (In my actual project, I will need to do this for more than 2000 individuals over the course of 2 years.)

Take the examples below:

  1. Person A – Works on project X April 27-June 5, 2019.
  2. Person B – Works on project Y Jan 14-Apr 1, 2021.
  3. Person C – Works on project Z Mar 21-July 4, 2020.

What code can I use to quickly generate these counts per month? Below is an example of how I need to determine project days per month using those same individuals I listed above.


library(tidyverse)

library(knitr)

data <- tibble::tribble(

  ~Project_Start, ~Project_End, ~`Apr-19`, ~`May-19`, ~`Jun-19`, ~`Jan-21`, ~`Feb-21`, ~`Mar-21`, ~`Apr-21`, ~`Mar-20`, ~`Apr-20`, ~`May-20`, ~`Jun-20`, ~`Jul-20`,

    "2019-04-27", "2019-06-05",        4L,       31L,        5L,        NA,        NA,        NA,        NA,        NA,        NA,        NA,        NA,        NA,

    "2021-01-14", "2021-04-01",        NA,        NA,        NA,       18L,       28L,       31L,        1L,        NA,        NA,        NA,        NA,        NA,

    "2020-03-21", "2020-07-04",        NA,        NA,        NA,        NA,        NA,        NA,        NA,       11L,       30L,       31L,       30L,        4L

  )

data %>% kable()

Thank you for the help!

Kind regards,

Sam

1 Like

Is this what you want?

library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.1.2
data <- tibble::tribble(
  
  ~Project_Start, ~Project_End, ~`Apr-19`, ~`May-19`, ~`Jun-19`, ~`Jan-21`, ~`Feb-21`, ~`Mar-21`, ~`Apr-21`, ~`Mar-20`, ~`Apr-20`, ~`May-20`, ~`Jun-20`, ~`Jul-20`,
  
  "2019-04-27", "2019-06-05",        4L,       31L,        5L,        NA,        NA,        NA,        NA,        NA,        NA,        NA,        NA,        NA,
  
  "2021-01-14", "2021-04-01",        NA,        NA,        NA,       18L,       28L,       31L,        1L,        NA,        NA,        NA,        NA,        NA,
  
  "2020-03-21", "2020-07-04",        NA,        NA,        NA,        NA,        NA,        NA,        NA,       11L,       30L,       31L,       30L,        4L
  
)

data |> pivot_longer(cols = -c("Project_Start", "Project_End"), 
                     names_to = "Month") |> group_by(Month) |> 
  summarize(Total = sum(value, na.rm = TRUE))
#> # A tibble: 12 x 2
#>    Month  Total
#>    <chr>  <int>
#>  1 Apr-19     4
#>  2 Apr-20    30
#>  3 Apr-21     1
#>  4 Feb-21    28
#>  5 Jan-21    18
#>  6 Jul-20     4
#>  7 Jun-19     5
#>  8 Jun-20    30
#>  9 Mar-20    11
#> 10 Mar-21    31
#> 11 May-19    31
#> 12 May-20    31

Created on 2022-06-28 by the reprex package (v2.0.1)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.