How can I use R (base or tidyverse) to flag each patient ID's last non-missing screening record as baseline?

First of all, please include a reproducible example with your question; it makes it much easier for us to help you.

But, is this what you are after?

library(tidyverse)

df <- tibble::tribble(
  ~subjid,       ~Date, ~SBP, ~Visit_Type,
        1, "15-Jan-19",  125, "Screening",
        1, "16-Jan-19",  130, "Screening",
        1, "17-Jan-19",  127,          NA,
        1, "18-Jan-19",  120,          NA,
        2,  "9-Jan-19",  145, "Screening",
        2, "10-Jan-19",  130, "Screening",
        2, "11-Jan-19",  140,          NA,
        2, "12-Jan-19",  120,          NA,
        3, "10-Feb-19",  145, "Screening",
        3, "12-Feb-19",   NA, "Screening",
        3, "13-Feb-19",  140,          NA,
        3, "15-Feb-19",  120,          NA
  )

df

df2 <- df %>% 
  group_by(subjid) %>% 
  # keep only screening records that have a non-missing SBP value
  filter(Visit_Type == "Screening", !is.na(SBP)) %>% 
  # take the last remaining record per patient
  # (rows are assumed to already be in date order within each patient)
  slice(n()) %>% 
  ungroup() %>% 
  mutate(Baseline_flag = TRUE) %>% 
  select(subjid, Date, Baseline_flag)


# join the flag back onto the full data; unflagged rows get NA
df3 <- left_join(df, df2, by = c("subjid", "Date"))

> df3
# A tibble: 12 x 5
   subjid Date        SBP Visit_Type Baseline_flag
    <dbl> <chr>     <dbl> <chr>      <lgl>        
 1      1 15-Jan-19   125 Screening  NA           
 2      1 16-Jan-19   130 Screening  TRUE         
 3      1 17-Jan-19   127 NA         NA           
 4      1 18-Jan-19   120 NA         NA           
 5      2 9-Jan-19    145 Screening  NA           
 6      2 10-Jan-19   130 Screening  TRUE         
 7      2 11-Jan-19   140 NA         NA           
 8      2 12-Jan-19   120 NA         NA           
 9      3 10-Feb-19   145 Screening  TRUE         
10      3 12-Feb-19    NA Screening  NA           
11      3 13-Feb-19   140 NA         NA           
12      3 15-Feb-19   120 NA         NA 
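
Since the question also asks about base R, here is a minimal sketch of one way to do it without the tidyverse. It rebuilds the same example data as a plain data.frame and assumes two things that are not stated in the question: that "non-missing" refers to SBP, and that the rows are already in date order within each patient. The name `eligible` is just an illustrative helper.

```r
# Same example data as above, as a base data.frame
df <- data.frame(
  subjid     = rep(1:3, each = 4),
  Date       = c("15-Jan-19", "16-Jan-19", "17-Jan-19", "18-Jan-19",
                 "9-Jan-19",  "10-Jan-19", "11-Jan-19", "12-Jan-19",
                 "10-Feb-19", "12-Feb-19", "13-Feb-19", "15-Feb-19"),
  SBP        = c(125, 130, 127, 120, 145, 130, 140, 120, 145, NA, 140, 120),
  Visit_Type = rep(c("Screening", "Screening", NA, NA), times = 3),
  stringsAsFactors = FALSE
)

# Rows that can be a baseline: screening visit with a non-missing SBP
# (assumption: "non-missing" means SBP is not NA)
eligible <- !is.na(df$Visit_Type) & df$Visit_Type == "Screening" & !is.na(df$SBP)

# Highest row index among the eligible rows of each patient
# (assumption: rows are already sorted by date within patient)
last_rows <- as.integer(tapply(which(eligible), df$subjid[eligible], max))

# Flag those rows; every other row stays NA, matching the join-based result
df$Baseline_flag <- NA
df$Baseline_flag[last_rows] <- TRUE

df
```

If the rows are not already sorted, you would need to convert `Date` to a real date and order by `subjid` and date first.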
