Here's a totally useless response, because it doesn't even work!
But my reason for posting it anyway is to get a clearer sense (for myself) of the logic of what you're trying to do, because I found your description of the desired process a little confusing. Anyway, here's some code that tries to achieve your stated intention that
for a given name, say, "a", I want to take the sum of sales of all other related names (by their ID) and divide by 1-t for the respective names.
library(dplyr)
df <- tibble(
  name = letters[1:6],
  id = rep(1:2, each = 3),
  sales = c(100, 250, 300, 50, 600, 390),
  t = c(0.1, 0.3, 0.4, 0.05, 0.15, 0.2),
  n = rep(1:3, times = 2)
)
add_result <- function(df, name = name) {
  id <- df %>%
    dplyr::filter(name == name) %>%
    pull(id)
  filt <- df %>%
    dplyr::filter(id == id)
  t <- filt %>%
    dplyr::filter(name == name) %>%
    dplyr::pull(t) %>%
    head(1) # or mean() ... ensure that t is a single value
  sales_sum <- filt %>%
    dplyr::filter(!name == name) %>%
    dplyr::pull(sales) %>%
    sum()
  sales_sum / (1 - t)
}
df %>%
  dplyr::mutate(result = add_result(df = df))
#> # A tibble: 6 x 6
#>   name     id sales     t     n result
#>   <chr> <int> <dbl> <dbl> <int>  <dbl>
#> 1 a         1   100  0.1      1      0
#> 2 b         1   250  0.3      2      0
#> 3 c         1   300  0.4      3      0
#> 4 d         2    50  0.05     1      0
#> 5 e         2   600  0.15     2      0
#> 6 f         2   390  0.2      3      0
Created on 2020-09-24 by the reprex package (v0.3.0)
OK, that didn't work, but I'm not sure why. I then looked more carefully at your intentions and tweaked the function a bit:
library(dplyr)
df <- tibble(
  name = letters[1:6],
  id = rep(1:2, each = 3),
  sales = c(100, 250, 300, 50, 600, 390),
  t = c(0.1, 0.3, 0.4, 0.05, 0.15, 0.2),
  n = rep(1:3, times = 2)
)
add_result <- function(df, name = name) {
  id <- df %>%
    dplyr::filter(name == name) %>%
    dplyr::pull(id)
  filt <- df %>%
    dplyr::filter(id == id)
  key_t <- filt %>%
    dplyr::filter(name == name) %>%
    dplyr::pull(t) %>%
    head(1) # or mean() ... ensure that t is a single value
  filt %>%
    dplyr::filter(!name == name) %>%
    dplyr::mutate(result = (sales/(1-t)) * (key_t-t)) %>%
    dplyr::pull(result) %>%
    sum()
}
df %>%
  dplyr::mutate(result = add_result(df = df))
#> # A tibble: 6 x 6
#>   name     id sales     t     n result
#>   <chr> <int> <dbl> <dbl> <int>  <dbl>
#> 1 a         1   100  0.1      1      0
#> 2 b         1   250  0.3      2      0
#> 3 c         1   300  0.4      3      0
#> 4 d         2    50  0.05     1      0
#> 5 e         2   600  0.15     2      0
#> 6 f         2   390  0.2      3      0
Created on 2020-09-24 by the reprex package (v0.3.0)
That still doesn't work. So I tried something different:
library(dplyr)
df <- tibble(
  name = letters[1:6],
  id = rep(1:2, each = 3),
  sales = c(100, 250, 300, 50, 600, 390),
  t = c(0.1, 0.3, 0.4, 0.05, 0.15, 0.2),
  n = rep(1:3, times = 2)
)
df %>%
  dplyr::filter(id == id[1]) %>%
  dplyr::filter(!name == name[1]) %>%
  dplyr::mutate(result = (sales/(1-t)) * (df$t[1]-t)) %>%
  dplyr::pull(result) %>%
  sum()
#> [1] -221.4286
Created on 2020-09-24 by the reprex package (v0.3.0)
This approach gives you the result you're expecting, for row 1 only, but if I then try to generalise it:
library(dplyr)
df <- tibble(
  name = letters[1:6],
  id = rep(1:2, each = 3),
  sales = c(100, 250, 300, 50, 600, 390),
  t = c(0.1, 0.3, 0.4, 0.05, 0.15, 0.2),
  n = rep(1:3, times = 2)
)
calc_result <- function(df, n) {
  df %>%
    dplyr::filter(id == id[n]) %>%
    dplyr::filter(!name == name[n]) %>%
    dplyr::mutate(result = (sales/(1-t)) * (df$t[n]-t)) %>%
    dplyr::pull(result) %>%
    sum()
}
purrr::map_dbl(1:nrow(df), ~ calc_result(df = df, .)) %>%
  dplyr::bind_cols(df, result = .)
#> # A tibble: 6 x 6
#>   name     id sales     t     n result
#>   <chr> <int> <dbl> <dbl> <int>  <dbl>
#> 1 a         1   100  0.1      1      0
#> 2 b         1   250  0.3      2      0
#> 3 c         1   300  0.4      3      0
#> 4 d         2    50  0.05     1      0
#> 5 e         2   600  0.15     2      0
#> 6 f         2   390  0.2      3      0
Created on 2020-09-24 by the reprex package (v0.3.0)
we still get the zeroes in the result column. I'm a bit perplexed.
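For what it's worth, here is a likely explanation, assuming dplyr's usual data-masking rules, where a bare name inside filter() or mutate() is looked up among the columns before the function's own variables. In calc_result(), n therefore refers to the n column rather than to the argument, so id[n] and name[n] pick out several rows at once and the second filter() ends up dropping everything. The same thing seems to bite add_result(): name == name and id == id compare each column with itself, so !name == name keeps no rows and sum() has nothing to add up. A minimal sketch of a fix follows; the argument name i and the key_* helpers are my own choices, and it looks the key row up with df$...[i] so that the index always refers to the full, unfiltered data.

```r
library(dplyr)

df <- tibble(
  name = letters[1:6],
  id = rep(1:2, each = 3),
  sales = c(100, 250, 300, 50, 600, 390),
  t = c(0.1, 0.3, 0.4, 0.05, 0.15, 0.2),
  n = rep(1:3, times = 2)
)

# A column compared with itself is TRUE for every row, so negating it keeps none
n_same <- df %>% filter(name == name) %>% nrow()
n_other <- df %>% filter(!name == name) %>% nrow()

# `i` (hypothetical name) does not clash with any column of df
calc_result <- function(df, i) {
  # Read the key row's values from the full data before any filtering
  key_id <- df$id[i]
  key_name <- df$name[i]
  key_t <- df$t[i]
  df %>%
    filter(id == key_id, name != key_name) %>%
    mutate(result = (sales / (1 - t)) * (key_t - t)) %>%
    pull(result) %>%
    sum()
}

# One value per row; vapply() is used here only to avoid an extra dependency
results <- vapply(seq_len(nrow(df)), function(i) calc_result(df, i), numeric(1))
df %>% mutate(result = results)
```

For row 1 this reproduces the -221.4286 from the single-row attempt above. Note that simply renaming n would not be enough on its own: inside the chained filter() calls, name[i] would index the already-filtered data, which has only three rows.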
Anyway, none of this is much use to you in investigating the efficiency issue, but perhaps there might be something of value in the difficulties I've been experiencing in getting this "simpler" (haha) approach to work.
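On the efficiency side, one option that avoids calling a function once per row is a grouped mutate(). This is only a sketch, and it assumes the formula from the attempts above is really what you're after: for each row, the sum over the other names with the same id of sales / (1 - t) times (this row's t minus their t). Under that formula a row's own term is zero, so the sum over the other rows equals the sum over the whole group, which simplifies to t * sum(w) - sum(w * t) with w = sales / (1 - t). The helper column w and the object name grouped are just illustrative.

```r
library(dplyr)

df <- tibble(
  name = letters[1:6],
  id = rep(1:2, each = 3),
  sales = c(100, 250, 300, 50, 600, 390),
  t = c(0.1, 0.3, 0.4, 0.05, 0.15, 0.2),
  n = rep(1:3, times = 2)
)

grouped <- df %>%
  group_by(id) %>%
  mutate(
    # Per-row weight, computed once per group instead of once per key row
    w = sales / (1 - t),
    # Equals the sum over the group of w_j * (t_i - t_j); the j == i term is 0
    result = t * sum(w) - sum(w * t)
  ) %>%
  ungroup() %>%
  select(-w)

grouped
```

If the intended formula is closer to the first attempt (the other rows' sales summed, then divided by 1 - t of the key row), the same group_by() plus mutate() pattern should still apply, for instance with (sum(sales) - sales) / (1 - t).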