help with list of subsets multiplying with different constant values in each row

Hello,

I am trying to figure out how to multiply a set of constant values to specific subsets of dataframe.

In the below example, I need to be able to multiply percentages from table "perc" to sales from table "vol". But I am only interested in multiplying these values with last 3 years sales per year. In other words, I need subsets for each year with its previous 3 years sales that can be multiplied with percentages from perc table in its same sequence from 80% to 55%.

So, in this example, I am looking for 2 subsets: one for year 2021 with sales values from 2019-2021 and another for year 2020 with sales values from 2018-2020. These subsets need to be multiplied by values from Percentage table. These subsets can be in the form of list also which I can then combine later to get one table as well.

Thanks in advance for the help!

library(tidyverse)
library(lubridate)

perc <- data.frame(
Percentage = c(80, 40, 55)
)

vol <- data.frame(
  Date = c("2018 Jan",
           "2018 Feb","2018 Mar","2018 Apr","2018 May","2018 Jun",
           "2018 Jul","2018 Aug","2018 Sep","2018 Oct","2018 Nov",
           "2018 Dec","2019 Jan",
           "2019 Feb","2019 Mar","2019 Apr","2019 May","2019 Jun",
           "2019 Jul","2019 Aug","2019 Sep","2019 Oct","2019 Nov",
           "2019 Dec","2020 Jan",
           "2020 Feb","2020 Mar","2020 Apr","2020 May","2020 Jun",
           "2020 Jul","2020 Aug","2020 Sep","2020 Oct","2020 Nov",
           "2020 Dec", "2021 Jan",
           "2021 Feb","2021 Mar","2021 Apr","2021 May","2021 Jun",
           "2021 Jul","2021 Aug","2021 Sep","2021 Oct","2021 Nov",
           "2021 Dec"),
  
  Country = c("CA","CA","CA","CA","CA","CA","US","US","US","US","US","US",
              "CA","CA","CA","CA","CA","CA","US","US","US","US","US","US",
              "CA","CA","CA","CA","CA","CA","US","US","US","US","US","US",
              "CA","CA","CA","CA","CA","CA","US","US","US","US","US","US"),
  
  Type = c("A", "B", "C", "D", "A", "B", "C", "D","A", "B", "C", "D",
           "A", "B", "C", "D", "A", "B", "C", "D","A", "B", "C", "D",
           "A", "B", "C", "D", "A", "B", "C", "D","A", "B", "C", "D",
           "A", "B", "C", "D", "A", "B", "C", "D","A", "B", "C", "D"),
  
  Sales = c(100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,
            220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,
            340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,
            460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570)
)

vol$Date <- ym(vol$Date)
vol$year <- year(vol$Date)


time_periods_of_interest <- c(
  2021,
  2020
)
evaluations <- map(time_periods_of_interest,
                   ~ seq(to = ., by = 1, length.out = 3)
                   ) %>% set_names(time_periods_of_interest)


(step_1 <- map(
  evaluations,
  ~ {
    filter(
      vol,
      year %in% .x
    ) %>%
      group_by(year) %>%
      summarise(sumsales = sum(Sales)) %>%
      bind_cols(perc)  })
  )

(step_2 <- map(step_1, ~ {
  mutate(.x, per_sum_sales = sumsales * Percentage / 100)
}))

This is great! Thank you for your help @nirgrahamuk !
I have never used functions the way you have used here with closing brackets end to end. Thank you!

Its a shorthand for print(), saves a lot of typing !

1 Like

@nirgrahamuk , for my learning purpose, if I didn't want to use values of 2021 in first subset and instead it was for previous 3 years (2018-2020) but for year 2021, then how would you modify
~ seq(to = ., by = 1, length.out = 3)

Thanks again!

@nirgrahamuk ! Thanks again for your help.
Your solution worked great on my sample data. But when I am trying to apply it on my actual & bigger data, I get the following error. I think because I have sub-groups within each month of teh year and then bind_cols doesn't have the same size. Any take on this? Any help here would be appreciated. Thanks!

Error: Can'trecycle `..1` (size 274) to match `..2` (size 10).

sorry I can't guess.
at minimum it would help to know at which step the error appears.

Yes, I think I made my sample data too small. I am resending the sample data here to understand better. This is still a smaller version of real data, but I believe its the grouping in Country & Type which is creating the issue as the size of percentage table is smaller than vol1 dataframe.

At the same time, I do want to see the results along with Country and Type here instead of just year.

Thanks again for looking into this issue!

vol1 <- data.frame(
  Date = c("2018 Jan","2018 Jan","2018 Jan","2018 Jan",
           "2018 Feb","2018 Feb","2018 Feb","2018 Feb",
           "2018 Mar","2018 Mar","2018 Mar","2018 Mar",
           "2018 Apr","2018 Apr","2018 Apr","2018 Apr",
           "2018 May","2018 May","2018 May","2018 May",
           "2018 Jun","2018 Jun","2018 Jun","2018 Jun",
           "2018 Jul","2018 Jul","2018 Jul","2018 Jul",
           "2018 Aug","2018 Aug","2018 Aug","2018 Aug",
           "2018 Sep","2018 Sep","2018 Sep","2018 Sep",
           "2018 Oct","2018 Oct","2018 Oct","2018 Oct",
           "2018 Nov","2018 Nov","2018 Nov","2018 Nov",
           "2018 Dec","2018 Dec","2018 Dec","2018 Dec",
           "2019 Jan","2019 Jan","2019 Jan","2019 Jan",
           "2019 Feb","2019 Feb","2019 Feb","2019 Feb",
           "2019 Mar","2019 Mar","2019 Mar","2019 Mar",
           "2019 Apr","2019 Apr","2019 Apr","2019 Apr",
           "2019 May","2019 May","2019 May","2019 May",
           "2019 Jun","2019 Jun","2019 Jun","2019 Jun",
           "2019 Jul","2019 Jul","2019 Jul","2019 Jul",
           "2019 Aug","2019 Aug","2019 Aug","2019 Aug",
           "2019 Sep","2019 Sep","2019 Sep","2019 Sep",
           "2019 Oct","2019 Oct","2019 Oct","2019 Oct",
           "2019 Nov","2019 Nov","2019 Nov","2019 Nov",
           "2019 Dec","2019 Dec","2019 Dec","2019 Dec",
           "2020 Jan","2020 Jan","2020 Jan","2020 Jan",
           "2020 Feb","2020 Feb","2020 Feb","2020 Feb",
           "2020 Mar","2020 Mar","2020 Mar","2020 Mar",
           "2020 Apr","2020 Apr","2020 Apr","2020 Apr",
           "2020 May","2020 May","2020 May","2020 May",
           "2020 Jun","2020 Jun","2020 Jun","2020 Jun",
           "2020 Jul","2020 Jul","2020 Jul","2020 Jul",
           "2020 Aug","2020 Aug","2020 Aug","2020 Aug",
           "2020 Sep","2020 Sep","2020 Sep","2020 Sep",
           "2020 Oct","2020 Oct","2020 Oct","2020 Oct",
           "2020 Nov","2020 Nov","2020 Nov","2020 Nov",
           "2020 Dec", "2020 Dec","2020 Dec", "2020 Dec",
           "2021 Jan","2021 Jan","2021 Jan","2021 Jan",
           "2021 Feb","2021 Feb","2021 Feb","2021 Feb",
           "2021 Mar","2021 Mar","2021 Mar","2021 Mar",
           "2021 Apr","2021 Apr","2021 Apr","2021 Apr",
           "2021 May","2021 May","2021 May","2021 May",
           "2021 Jun","2021 Jun","2021 Jun","2021 Jun",
           "2021 Jul","2021 Jul","2021 Jul","2021 Jul",
           "2021 Aug","2021 Aug","2021 Aug","2021 Aug",
           "2021 Sep","2021 Sep","2021 Sep","2021 Sep",
           "2021 Oct","2021 Oct","2021 Oct","2021 Oct",
           "2021 Nov","2021 Nov","2021 Nov","2021 Nov",
           "2021 Dec","2021 Dec", "2021 Dec","2021 Dec"),
  
  Country = c("CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US"),
  
  Type = c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B"
           ),
  
  Sales = c(100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,
            220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,20, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,
            340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,
            460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 60, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570)
)

vol1$Date <- ym(vol1$Date)

I took your new data.frame. vol1 rather than vol.
I renamed vol to vol1 in my original code, and expected to see an error, but it seemed to run fine.
Therefore its hard to know how to support you further.

Thanks @nirgrahamuk for looking into this again!
It works when we consider only years and sales. But when we try to see the similar results by groups, ie, while considering year, country & type, then the error occurs as the size of percentage table doesn't match the sales table

perhaps something along the following lines

library(tidyverse)
library(lubridate)

vol <- data.frame(
  Date = c("2018 Jan","2018 Jan","2018 Jan","2018 Jan",
           "2018 Feb","2018 Feb","2018 Feb","2018 Feb",
           "2018 Mar","2018 Mar","2018 Mar","2018 Mar",
           "2018 Apr","2018 Apr","2018 Apr","2018 Apr",
           "2018 May","2018 May","2018 May","2018 May",
           "2018 Jun","2018 Jun","2018 Jun","2018 Jun",
           "2018 Jul","2018 Jul","2018 Jul","2018 Jul",
           "2018 Aug","2018 Aug","2018 Aug","2018 Aug",
           "2018 Sep","2018 Sep","2018 Sep","2018 Sep",
           "2018 Oct","2018 Oct","2018 Oct","2018 Oct",
           "2018 Nov","2018 Nov","2018 Nov","2018 Nov",
           "2018 Dec","2018 Dec","2018 Dec","2018 Dec",
           "2019 Jan","2019 Jan","2019 Jan","2019 Jan",
           "2019 Feb","2019 Feb","2019 Feb","2019 Feb",
           "2019 Mar","2019 Mar","2019 Mar","2019 Mar",
           "2019 Apr","2019 Apr","2019 Apr","2019 Apr",
           "2019 May","2019 May","2019 May","2019 May",
           "2019 Jun","2019 Jun","2019 Jun","2019 Jun",
           "2019 Jul","2019 Jul","2019 Jul","2019 Jul",
           "2019 Aug","2019 Aug","2019 Aug","2019 Aug",
           "2019 Sep","2019 Sep","2019 Sep","2019 Sep",
           "2019 Oct","2019 Oct","2019 Oct","2019 Oct",
           "2019 Nov","2019 Nov","2019 Nov","2019 Nov",
           "2019 Dec","2019 Dec","2019 Dec","2019 Dec",
           "2020 Jan","2020 Jan","2020 Jan","2020 Jan",
           "2020 Feb","2020 Feb","2020 Feb","2020 Feb",
           "2020 Mar","2020 Mar","2020 Mar","2020 Mar",
           "2020 Apr","2020 Apr","2020 Apr","2020 Apr",
           "2020 May","2020 May","2020 May","2020 May",
           "2020 Jun","2020 Jun","2020 Jun","2020 Jun",
           "2020 Jul","2020 Jul","2020 Jul","2020 Jul",
           "2020 Aug","2020 Aug","2020 Aug","2020 Aug",
           "2020 Sep","2020 Sep","2020 Sep","2020 Sep",
           "2020 Oct","2020 Oct","2020 Oct","2020 Oct",
           "2020 Nov","2020 Nov","2020 Nov","2020 Nov",
           "2020 Dec", "2020 Dec","2020 Dec", "2020 Dec",
           "2021 Jan","2021 Jan","2021 Jan","2021 Jan",
           "2021 Feb","2021 Feb","2021 Feb","2021 Feb",
           "2021 Mar","2021 Mar","2021 Mar","2021 Mar",
           "2021 Apr","2021 Apr","2021 Apr","2021 Apr",
           "2021 May","2021 May","2021 May","2021 May",
           "2021 Jun","2021 Jun","2021 Jun","2021 Jun",
           "2021 Jul","2021 Jul","2021 Jul","2021 Jul",
           "2021 Aug","2021 Aug","2021 Aug","2021 Aug",
           "2021 Sep","2021 Sep","2021 Sep","2021 Sep",
           "2021 Oct","2021 Oct","2021 Oct","2021 Oct",
           "2021 Nov","2021 Nov","2021 Nov","2021 Nov",
           "2021 Dec","2021 Dec", "2021 Dec","2021 Dec"),
  
  Country = c("CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US",
              "CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US","CA","CA","US","US"),
  
  Type = c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B",
           "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B"
  ),
  
  Sales = c(100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,100,110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210,
            220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,20, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,220, 230,240, 250, 260, 270, 280, 290, 300, 310, 320, 330,
            340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450,
            460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 60, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570)
)

vol$Date <- ym(vol$Date)
vol$year <- year(vol$Date)

perc <- data.frame(
  y = 1:3,
  Percentage = c(80, 40, 55)
)




time_periods_of_interest <- c(
  2021,
  2020
)
evaluations <- map(time_periods_of_interest,
                   ~ seq(to = ., by = 1, length.out = 3)
) %>% set_names(time_periods_of_interest)


(step_1 <- map(
  evaluations,
  ~ {
    filter(
      vol,
      year %in% .x
    ) %>%
      group_by(Country,Type,year) %>%
      summarise(sumsales = sum(Sales)) %>%
      ungroup() %>%
      mutate(y=dense_rank(desc(year))) %>% left_join(perc)  })
)

(step_2 <- map(step_1, ~ {
  mutate(.x, per_sum_sales = sumsales * Percentage / 100)
}))

Yes! This is working great!
It looks like I will be using dense_rank a lot hereafter.
Thank you so much @nirgrahamuk! It was very helpful!

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.