This is a follow up on this question. Now I want to do a few more steps:
-
Group by columns ID and order
-
For every val in df_dat, look up the corresponding ratio in the df_lookup table with the following conditions:
o If val < min(df_lookup$val), set new_ratio = min(df_lookup$ratio)
o If val > max(df_lookup$val), set new_ratio = max(df_lookup$ratio)
o If val falls within df_lookup$val range, do a simple linear interpolation
library(dplyr)
df_lookup <- tribble(
~ID, ~order, ~pct, ~val, ~ratio,
"batch1", 1, 1, 1, 0.2,
"batch1", 1, 10, 8, 0.5,
"batch1", 1, 25, 25, 1.2,
"batch2", 2, 1, 2, 0.1,
"batch2", 2, 10, 15, 0.75,
"batch2", 2, 25, 33, 1.5,
"batch2", 2, 50, 55, 3.2,
)
df_dat <- tribble(
~order, ~ID, ~val,
1, "batch1", 0.1,
1, "batch1", 30,
1, "batch1", 2,
1, "batch1", 12,
2, "batch1", 45,
2, "batch2", 1.5,
2, "batch2", 30,
2, "batch2", 13,
2, "batch2", 60,
)
I have a working data.table code. Is it possible to achieve the same goal with tidyverse's verbs?
library(data.table)
setDT(df_lookup)
setDT(df_dat)
df_lookup[, m := (ratio - shift(ratio, -1L)) / (val - shift(val, -1L))]
df_dat[, new_ratio :=
df_lookup[.SD, on=.(order, ID, val), roll=Inf, rollends=c(FALSE, FALSE),
x.m * (i.val - x.val) + x.ratio]
]
df_dat[is.na(new_ratio), new_ratio :=
df_lookup[copy(.SD), on=.(order, ID, val), roll=Inf, x.ratio]]
df_dat[is.na(new_ratio), new_ratio :=
df_lookup[copy(.SD), on=.(order, ID, val), roll=-Inf, x.ratio]]
order ID val new_ratio
1 1 batch1 0.1 0.2
2 1 batch1 30 1.2
3 1 batch1 2 0.243
4 1 batch1 12 0.643
5 2 batch1 45 NA
6 2 batch2 1.5 0.1
7 2 batch2 30 1.38
8 2 batch2 13 0.65
9 2 batch2 60 3.2
Example:
- For
order = 1, ID = batch2 and val = 30, new_ratio = 1.2 (max ratio value).
- For
order = 1, ID = batch1 and val = 2, new_ratio = 0.243 which is the interpolated ratio value between 0.2 and 0.5.
- For
order = 2 and ID = batch1, new_ratio = NA as those conditions aren’t in the lookup table.
Any help appreciated!