Cumulative values calculations

I am trying to recreate the attached table that I created manually in excel for my thesis in college. The goal is to add up the value for each company, i.e. how many years the value has been with the company. Then if the value is switched, the counting starts again from the beginning. With each new company, the sum starts all over again anyway, even if it has the same value as the previous company. If the value is found several times in a company, but is interrupted by another value, counting also starts over again. I don't know if this is possible to reproduce in rstudio, since I'm still trying to improve myself in R, but this is where I fail. I have attached a screenshot to clarify. Can someone tell me if this is possible to recreate (not manually but by using formulas, so it also works for larger files) and if so, help me on how to do this? Can this be done by for example using the count formula, or is this too simple? Thanks in advance!
https://imgur.com/a/tgVohly

A brief task solved by group_by and summarise, try:

library(dplyr)

df <- data.frame(
  year = rep(1991:2000,3),
  Company = c(rep("A",10),rep("B",10),rep("C",10)),
  value = c(
    sample(1:5,10,replace = T) |> sort(),
    sample(1:5,10,replace = T) |> sort(),
    sample(1:5,10,replace = T) |> sort()
  )
)

df |> head(30)
   year Company value
1  1991       A     1
2  1992       A     1
3  1993       A     2
4  1994       A     2
5  1995       A     3
6  1996       A     4
7  1997       A     4
8  1998       A     5
9  1999       A     5
10 2000       A     5
11 1991       B     1
12 1992       B     1
13 1993       B     1
14 1994       B     2
15 1995       B     2
16 1996       B     3
17 1997       B     4
18 1998       B     4
19 1999       B     5
20 2000       B     5
21 1991       C     1
22 1992       C     1
23 1993       C     1
24 1994       C     1
25 1995       C     2
26 1996       C     4
27 1997       C     5
28 1998       C     5
29 1999       C     5
30 2000       C     5

df1 <- df |> group_by(Company,value) |> summarise(n = n())

df1
# A tibble: 14 x 3
# Groups:   Company [3]
   Company value     n
   <chr>   <int> <int>
 1 A           1     2
 2 A           2     2
 3 A           3     1
 4 A           4     2
 5 A           5     3
 6 B           1     3
 7 B           2     2
 8 B           3     1
 9 B           4     2
10 B           5     2
11 C           1     4
12 C           2     1
13 C           4     1
14 C           5     4
1 Like

Thanks for the response firstly! I wrote my question not clear enough i think, i am trying to create the exact same table as in the image. With my dataset, i already have the year, company and value variable obtained/calculated, and i need to create the "total value cumulative in years" variable. I am trying to recreate it exactly as in the picture, so .i.e. how many times a "value" occurs in succession, but is reset when another "value" joins the company, or when a company is switched in the dataset. Also, if it would be like this, so c(1,1,1,1,1,2,2,1,1,1), the outcome would be like this: 1,2,3,4,5,1,2,1,2,3, but then as a variable To make it clearer, it's about how long an employment lasts between a firm and a partner. The "value" implies the partner, and the total value is the total duration of employment between the firm and partner. If i am understanding it correctly, what you are doing in the script is calculating how often a value occurs in a company.

It's never too hard, the column you want is basically the row number of each group.
do:

df1 <- df |> group_by(Company,value) |> mutate(cum_sum_val = row_number())
df1
# A tibble: 30 x 4
# Groups:   Company, value [14]
    year Company value cum_sum_val
   <int> <chr>   <int>       <int>
 1  1991 A           1           1
 2  1992 A           1           2
 3  1993 A           1           3
 4  1994 A           1           4
 5  1995 A           2           1
 6  1996 A           3           1
 7  1997 A           5           1
 8  1998 A           5           2
 9  1999 A           5           3
10  2000 A           5           4
# ... with 20 more rows

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.