Hello Mara,
I fully agree with the given advice to provide a reprex, but I also like to point you to section 20.4.3 Scalars and recycling rules which explains how R is capable of vectorised operations.
The example given below contains 1.000.000 records and calculates per case the price and percentage difference.
library(tidyverse)
set.seed(1)
df <- tibble(retail_price = (runif(10^6) * 100) %>% round(2),
sales_price = (retail_price * (1 + runif(10))) %>% round(2)
)
df %>% glimpse()
#> Rows: 1,000,000
#> Columns: 2
#> $ retail_price <dbl> 26.55, 37.21, 57.29, 90.82, 20.17, 89.84, 94.47, 66.08...
#> $ sales_price <dbl> 30.27, 63.09, 99.05, 99.14, 21.51, 144.90, 133.03, 113...
df <- df %>%
mutate(diff_price = sales_price - retail_price,
diff_perc = sales_price/retail_price - 1,
)
df
#> # A tibble: 1,000,000 x 4
#> retail_price sales_price diff_price diff_perc
#> <dbl> <dbl> <dbl> <dbl>
#> 1 26.6 30.3 3.72 0.140
#> 2 37.2 63.1 25.9 0.696
#> 3 57.3 99.0 41.8 0.729
#> 4 90.8 99.1 8.32 0.0916
#> 5 20.2 21.5 1.34 0.0664
#> 6 89.8 145. 55.1 0.613
#> 7 94.5 133. 38.6 0.408
#> 8 66.1 113. 46.9 0.710
#> 9 62.9 100. 37.0 0.589
#> 10 6.18 10.2 3.97 0.642
#> # ... with 999,990 more rows
df %>%
## count occurences for each diff_perc
count(diff_perc, sort = TRUE)
#> # A tibble: 64,895 x 2
#> diff_perc n
#> <dbl> <int>
#> 1 0.0667 5389
#> 2 0.643 3001
#> 3 0.696 2699
#> 4 0.643 1813
#> 5 0.613 1702
#> 6 0.408 1604
#> 7 0.613 1472
#> 8 0.696 1048
#> 9 0.729 817
#> 10 0.589 615
#> # ... with 64,885 more rows
Created on 2021-02-26 by the reprex package (v1.0.0)
Hope this helps you further.