# Count number of times purchased in a given time range

Hello,

I am struggling to develop a code that counts the number of times a given product has been purchased in the last 3 months not including the month it is in.

Here is a sample of the data:
data =
Col # Date Yr/Month Product
1 1/11/2018 2018/01 A
2 1/15/2018 2018/01 B
3 1/17/2018 2018/01 B
4 1/17/2018 2018/01 C
5 2/20/2018 2018/02 A
6 2/20/2018 2018/02 A
7 2/21/2018 2018/02 A
8 2/26/2018 2018/02 A
9 2/27/2018 2018/02 C
10 3/27/2018 2018/03 C
11 5/10/2018 2018/05 C
12 5/21/2018 2018/05 C
13 6/26/2018 2018/06 C
14 7/9/2018 2018/07 B
15 7/20/2018 2018/07 A
16 8/3/2018 2018/08 C
17 8/3/2018 2018/08 C
18 8/3/2018 2018/08 A
19 8/9/2018 2018/08 B
20 8/9/2018 2018/08 B
21 8/29/2018 2018/08 B
22 9/10/2018 2018/09 C
23 10/3/2018 2018/10 A
24 10/3/2018 2018/10 A
25 10/6/2018 2018/10 A
26 10/6/2018 2018/10 B
27 10/12/2018 2018/10 A
28 10/12/2018 2018/10 B
29 10/25/2018 2018/10 C
30 10/29/2018 2018/10 C
31 11/12/2018 2018/11 A
32 12/19/2018 2018/12 B
33 12/19/2018 2018/12 C

For example, looking at column number 16:

Product C was bought in August of 2018 in both column 16 and column 17. Counting the number of times it was purchased in this month for both column 16 and 17 should be equal to 2.

I can code that part using dplyr:
data1 <- data %>%
group_by(Yr/Month,Product) %>%
summarise(Products = n())

This outputs just the year and month and the number of times a given product is purchased within that month.

Now I want to be able to find how many times product C was purchased between May and July (the 3 months prior to August but not including August).

The result for Col 16 should be 3; since it was purchased twice in May (col 11 and 12) and once in June (col 13) and it wasn't purchased at all in July.

Can anyone point me to a package/code that I would need in order to perform this analysis?

Thank you!

A little clarification first, you are talking about columns, but I think you actually mean rows (horizontal).

I think this produces your desired output

``````library(tidyverse)
library(lubridate)

df <- data.frame(stringsAsFactors=FALSE,
Row = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33),
Date = c("1/11/2018", "1/15/2018", "1/17/2018", "1/17/2018",
"2/20/2018", "2/20/2018", "2/21/2018", "2/26/2018",
"2/27/2018", "3/27/2018", "5/10/2018", "5/21/2018", "6/26/2018",
"7/9/2018", "7/20/2018", "8/3/2018", "8/3/2018", "8/3/2018", "8/9/2018",
"8/9/2018", "8/29/2018", "9/10/2018", "10/3/2018", "10/3/2018",
"10/6/2018", "10/6/2018", "10/12/2018", "10/12/2018",
"10/25/2018", "10/29/2018", "11/12/2018", "12/19/2018", "12/19/2018"),
Yr.Month = c("2018/01", "2018/01", "2018/01", "2018/01", "2018/02",
"2018/02", "2018/02", "2018/02", "2018/02", "2018/03",
"2018/05", "2018/05", "2018/06", "2018/07", "2018/07", "2018/08",
"2018/08", "2018/08", "2018/08", "2018/08", "2018/08", "2018/09",
"2018/10", "2018/10", "2018/10", "2018/10", "2018/10", "2018/10",
"2018/10", "2018/10", "2018/11", "2018/12", "2018/12"),
Product = c("A", "B", "B", "C", "A", "A", "A", "A", "C", "C", "C", "C",
"C", "B", "A", "C", "C", "A", "B", "B", "B", "C", "A", "A",
"A", "B", "A", "B", "C", "C", "A", "B", "C")
)

df <- df %>%
mutate(Date = mdy(Date))

df %>%
mutate(prev_purchase = pmap_int(.l = list(floor_date(Date - months(3), unit = "month"),
floor_date(Date, unit = "month"),
Product),
~ df %>%
filter(Date >= ..1, Date < ..2, Product == ..3) %>%
count() %>%
pull()
))
#>    Row       Date Yr.Month Product prev_purchase
#> 1    1 2018-01-11  2018/01       A             0
#> 2    2 2018-01-15  2018/01       B             0
#> 3    3 2018-01-17  2018/01       B             0
#> 4    4 2018-01-17  2018/01       C             0
#> 5    5 2018-02-20  2018/02       A             1
#> 6    6 2018-02-20  2018/02       A             1
#> 7    7 2018-02-21  2018/02       A             1
#> 8    8 2018-02-26  2018/02       A             1
#> 9    9 2018-02-27  2018/02       C             1
#> 10  10 2018-03-27  2018/03       C             2
#> 11  11 2018-05-10  2018/05       C             2
#> 12  12 2018-05-21  2018/05       C             2
#> 13  13 2018-06-26  2018/06       C             3
#> 14  14 2018-07-09  2018/07       B             0
#> 15  15 2018-07-20  2018/07       A             0
#> 16  16 2018-08-03  2018/08       C             3
#> 17  17 2018-08-03  2018/08       C             3
#> 18  18 2018-08-03  2018/08       A             1
#> 19  19 2018-08-09  2018/08       B             1
#> 20  20 2018-08-09  2018/08       B             1
#> 21  21 2018-08-29  2018/08       B             1
#> 22  22 2018-09-10  2018/09       C             3
#> 23  23 2018-10-03  2018/10       A             2
#> 24  24 2018-10-03  2018/10       A             2
#> 25  25 2018-10-06  2018/10       A             2
#> 26  26 2018-10-06  2018/10       B             4
#> 27  27 2018-10-12  2018/10       A             2
#> 28  28 2018-10-12  2018/10       B             4
#> 29  29 2018-10-25  2018/10       C             3
#> 30  30 2018-10-29  2018/10       C             3
#> 31  31 2018-11-12  2018/11       A             5
#> 32  32 2018-12-19  2018/12       B             2
#> 33  33 2018-12-19  2018/12       C             3
``````

Created on 2019-12-13 by the reprex package (v0.3.0.9000)

This correctly gives the result for the sample data provided.

However, when trying to apply this to the raw data I get this error:

message: Result 1 must be a single integer, not an integer vector of length 0
class: `purrr_error_bad_element_vector`

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.