# Count scholarity by year

I have the following data set:

|3475 | PROF1 | 1988 | Undergraduate
|3478 | PROF2 | 1996 | Undergraduate
|3476 | PROF1 | 1998 | Master
|3479 | PROF2 | 2000 | Master
|3477 | PROF1 | 2002 | Doctoral
|3480 | PROF3 | 2004 | Doctoral

I am trying to do a script that accum the type of graduate by year, but, when change the scholing type decrease from another. I need to show something like this:

``````Undergraduate	Master	Doctoral
``````

1988 1 0 0
1989 1 0 0
1990 1 0 0
1991 1 0 0
1992 1 0 0
1993 1 0 0
1994 1 0 0
1995 1 0 0
1996 2 0 0
1997 2 0 0
1998 1 1 0
1999 1 1 0
2000 0 2 0
2001 0 2 0
2002 0 1 1
2003 0 1 1
2004 0 0 2

Explaining: Until 1997 we had 2 undergraduate professors, so in 1998 one of them concluded a master degree, then we need to decrease the number of undergraduates in 1 because now we need to account him as a master and not an undergraduate, and the same for doctoral.

Somebody would help?

You can do something like this

``````library(tidyr)
library(dplyr)

# Sample data on a copy/paste friendly format
sample_df <- data.frame(
stringsAsFactors = FALSE,
id = c(3475, 3478, 3476, 3479, 3477, 3480),
prof = c("PROF1", "PROF2", "PROF1", "PROF2", "PROF1", "PROF3"),
year = c(1988, 1996, 1998, 2000, 2002, 2004),
)

sample_df %>%
count(year, degree) %>%
pivot_wider(names_from = degree, values_from = n, values_fill = 0)
#> # A tibble: 6 x 4
#>   <dbl>         <int>  <int>    <int>
#> 1  1988             1      0        0
#> 2  1996             1      0        0
#> 3  1998             0      1        0
#> 4  2000             0      1        0
#> 5  2002             0      0        1
#> 6  2004             0      0        1
``````

Created on 2020-10-03 by the reprex package (v0.3.0)

If you need more specific help, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

my solution:

``````exdf <- structure(list(
code = c(3475, 3478, 3476, 3479, 3477, 3480),
p = c(
"PROF1", "PROF2", "PROF1", "PROF2", "PROF1",
"PROF3"
), y = c(1988, 1996, 1998, 2000, 2002, 2004), deg = c(
)
), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

library(tidyverse)

maxy <- max(exdf\$y)

code_a_prof <- function(df,maxy){

df2 <- mutate(df,
) %>%
mutate(next_y = ifelse(is.na(next_y), maxy, next_y)) %>%
rowwise() %>%
mutate(
deg_list = list(rep(deg, 1 + next_y - y)),
y_list = list(seq(from = y, to = next_y, by = 1))
)

pname <- pull(df, p) %>% unique()

tibble(
y = unlist(df2\$y_list),
deg = unlist(df2\$deg_list)
) %>% mutate(p = pname)
}

# code_a_prof(exdf%>%filter(p=="PROF1"),maxy)

group_by(exdf,p) %>%
group_map(~code_a_prof(.,maxy),.keep=TRUE) %>%
bind_rows %>%
group_by(y,deg) %>%
summarise(n=n()) %>%
pivot_wider(names_from="deg",
values_from="n")``````

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.