Using map_df with multiple variables

Hi Guys,
I am trying to use map_df to create a data frame rotating through different date ranges returning a value for them. Basically, I want to calculate the sum of the last year in sales with the last year being based on various dates. I am having trouble doing this. Here is what I have so far.

Sale_Date = seq(from = as.Date("2009-01-01"), to = as.Date("2010-03-02"), by= 'month')

blue_chips_market_cap_example <- data.frame(graded_title = "Mickey_Mantle", price = 1:15, Sale_Date = Sale_Date)

market_cap_late_dates <- (seq(from = as.Date("2009-01-01"), to = as.Date("2019-07-09"), by = 'day')) 
market_cap_early_dates <- (seq(from = as.Date("2008-01-01"), to = as.Date("2018-07-08"), by = 'day'))

market_cap_dates <- as.data.frame(market_cap_early_dates) %>%
  mutate(market_cap_late_dates)
#> Error in as.data.frame(market_cap_early_dates) %>% mutate(market_cap_late_dates): could not find function "%>%"
market_cap_dates
#> Error in eval(expr, envir, enclos): object 'market_cap_dates' not found
as.
blue_chips_market_cap_index_df <- map_df(market_cap_dates, function(market_cap_early_dates, market_cap_late_dates){
  blue_chips_market_cap <- blue_chips %>%
    mutate(in_range = ((Sale_Date >= as.Date("market_cap_early_dates")) & (Sale_Date <= "market_cap_late_dates"))) %>%
    mutate(Market_Cap = sum(if_else(in_range == TRUE, price, 0))) %>%
    distinct(.keep_all = TRUE)
    mutate(Total_Market_Cap = sum(Market_Cap)) %>%
    mutate(Index_Date = market_cap_late_dates)
})
#> Error in map_df(market_cap_dates, function(market_cap_early_dates, market_cap_late_dates) {: could not find function "map_df"

This will fix the syntax but I don't understand what are you trying to accomplish with your code.

library(tidyverse)

blue_chips_market_cap_example <- data.frame(graded_title = "Mickey_Mantle",
                                            price = 1:15,
                                            Sale_Date = seq(from = as.Date("2009-01-01"),
                                                            to = as.Date("2010-03-02"), by= 'month')
)

market_cap_late_dates <- seq(from = as.Date("2009-01-01"), to = as.Date("2019-07-09"), by = 'day') 
market_cap_early_dates <- seq(from = as.Date("2008-01-01"), to = as.Date("2018-07-08"), by = 'day')


blue_chips_market_cap_index_df <- map2_dfr(market_cap_early_dates, market_cap_late_dates, ~{
    blue_chips_market_cap_example %>%
        mutate(in_range = ((Sale_Date >= as.Date(.x)) & (Sale_Date <= .y))) %>%
        mutate(Market_Cap = sum(ifelse(in_range == TRUE, price, 0))) %>%
        distinct(.keep_all = TRUE) %>% 
        mutate(Total_Market_Cap = sum(Market_Cap)) %>%
        mutate(Index_Date = .x)
}) %>% 
    as_tibble()

blue_chips_market_cap_index_df
#> # A tibble: 57,630 x 7
#>    graded_title price Sale_Date  in_range Market_Cap Total_Market_Cap
#>    <fct>        <int> <date>     <lgl>         <dbl>            <dbl>
#>  1 Mickey_Mant…     1 2009-01-01 TRUE              1               15
#>  2 Mickey_Mant…     2 2009-02-01 FALSE             1               15
#>  3 Mickey_Mant…     3 2009-03-01 FALSE             1               15
#>  4 Mickey_Mant…     4 2009-04-01 FALSE             1               15
#>  5 Mickey_Mant…     5 2009-05-01 FALSE             1               15
#>  6 Mickey_Mant…     6 2009-06-01 FALSE             1               15
#>  7 Mickey_Mant…     7 2009-07-01 FALSE             1               15
#>  8 Mickey_Mant…     8 2009-08-01 FALSE             1               15
#>  9 Mickey_Mant…     9 2009-09-01 FALSE             1               15
#> 10 Mickey_Mant…    10 2009-10-01 FALSE             1               15
#> # … with 57,620 more rows, and 1 more variable: Index_Date <date>

Created on 2019-08-26 by the reprex package (v0.3.0.9000)

I'm trying to create a data frame which shows based on a daily basis the sum of sales over the last year.

So the data frame I created has two columns the first column has a date and the second column is a year after that column. Then the in_range checks if it is within a year of those dates and the Market Cap sums those true rows and gets the Market Cap. I want a data frame which gives me the market cap based on each date throughout the date data frame I created.

The data set I am actually working with has over 1.4 million sales points

By using mutate() instead of summarise() you are making 3842 (number of dates in your range) copies of your sample dataset, so the result ends up with 57630 rows, is this what you want? maybe a sample of your desired output would be useful to understand your issue.

So once I get the market cap for each card I want to sum all of those market caps to get a total market cap. The total market cap is all I want. I want the total market cap for each date in the market cap dates to create an index tracking the change in total market cap over time.

So at the end I want a data frame with date, and the total market cap over all cards on that date.

Does that make sense?

Not entirely, have in mind that we don't have the same background context and domain specific knowledge as you, this is what I understand from what you have said.

library(tidyverse)

blue_chips_market_cap_example <- data.frame(graded_title = "Mickey_Mantle",
                                            price = 1:15,
                                            Sale_Date = seq(from = as.Date("2009-01-01"),
                                                            to = as.Date("2010-03-02"), by= 'month')
)

market_cap_late_dates <- seq(from = as.Date("2009-01-01"), to = as.Date("2019-07-09"), by = 'day') 
market_cap_early_dates <- seq(from = as.Date("2008-01-01"), to = as.Date("2018-07-08"), by = 'day')


blue_chips_market_cap_index_df <- map2_dfr(market_cap_early_dates,
                                           market_cap_late_dates,
                                           ~{blue_chips_market_cap_example %>%
                                                   filter(Sale_Date >= .x, Sale_Date <= .y) %>%
                                                   mutate(Index_Date = .y) %>%
                                                   group_by(Index_Date) %>%
                                                   summarise(Market_Cap = sum(price))}) %>%
    as_tibble()

blue_chips_market_cap_index_df
#> # A tibble: 791 x 2
#>    Index_Date Market_Cap
#>    <date>          <int>
#>  1 2009-01-01          1
#>  2 2009-01-02          1
#>  3 2009-01-03          1
#>  4 2009-01-04          1
#>  5 2009-01-05          1
#>  6 2009-01-06          1
#>  7 2009-01-07          1
#>  8 2009-01-08          1
#>  9 2009-01-09          1
#> 10 2009-01-10          1
#> # … with 781 more rows
2 Likes

Thank you so much sorry for the confusion.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.