Drop duplicate in formula

Hello,
I'm stuck on a simple problem.
I need to run something like this:

refreshg$ff=ifelse(refreshg$hh %in% listhh,0,1)

So, if a value in refreshg$hh appears in listhh, refreshg$ff will be 0. Anything else must be equal to 1.

However, I need to modify this so that it applies only to the first occurrence.
I mean, I detected some duplicates of hh in refreshg, so when I count the cases I see some hh values that are repeated, and some ff values with a count of 2 or more.
I just can't work out how to specify that the code above should run over the first, and only the first, occurrence.

I tried to perform this using dplyr, and I failed.

library(tidyverse)

oo=c(1,2,3,4,5,6,7,8,9,10,11,12,13)
hh=c(1,2,3,4,10,6,4,8,9,10,1,6,6)

data=data.frame(oo,hh)

listhh=c(1,4,6,8,10)

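# ff flags every row whose hh is in listhh, including repeats
data$ff=ifelse(data$hh %in% listhh,1,0)
data
# desired: 1 only on the first occurrence of an hh that is in listhh
data$desired=c(1,0,0,1,1,1,0,1,0,0,0,0,0)
data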

Thanks for your work and time.

I think I solved it. Can you confirm this? I'm not sure.

data$ff=ifelse(data$hh %in% listhh,0,1)
data$ff=ifelse(duplicated(data$hh)==1,0,1)
data

I think that duplicated() treats the first occurrence as unique and flags only the remaining repeats.
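
A quick way to check the attempt above is to rebuild the example vectors and compare the result with the desired column. This is only a sketch under the assumption that the data are exactly the example values from this thread; `dup` is a name introduced here for illustration. It suggests that duplicated() does behave as described, but that the second assignment replaces the first one entirely, so listhh no longer affects ff.

```r
# Rebuild the example vectors so this block runs on its own
hh      <- c(1, 2, 3, 4, 10, 6, 4, 8, 9, 10, 1, 6, 6)
listhh  <- c(1, 4, 6, 8, 10)
desired <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# duplicated() is FALSE on the first occurrence of a value and TRUE on repeats
dup <- duplicated(hh)   # 'dup' is a hypothetical helper name
dup

# First assignment: depends on listhh
ff <- ifelse(hh %in% listhh, 0, 1)
# Second assignment: overwrites ff, so only duplicated() matters now
ff <- ifelse(duplicated(hh) == 1, 0, 1)

# Rows where the attempt differs from the desired column
which(ff != desired)
```

The rows reported by which() are the ones where hh is seen for the first time but is not in listhh, which is why the two conditions probably need to be combined in a single test rather than applied one after the other.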

data2 <- group_by(data, hh) %>%
  mutate(rn = row_number(),
         xx = ifelse(hh %in% listhh & rn == 1, 1, 0))

all.equal(data2$xx,data2$desired)
# [1] TRUE
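
For comparison, the same rule can probably be written in base R without grouping, by testing both conditions at once: the value is in listhh and it has not appeared in an earlier row. This is a minimal sketch that rebuilds the example data from this thread; the column name `xx_base` is hypothetical, and `oo` is written as 1:13 here for brevity.

```r
# Rebuild the example data so this block runs on its own
data <- data.frame(
  oo = 1:13,
  hh = c(1, 2, 3, 4, 10, 6, 4, 8, 9, 10, 1, 6, 6)
)
listhh <- c(1, 4, 6, 8, 10)
data$desired <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# 1 only when hh is in listhh AND this row is the first time that hh appears
# ('xx_base' is a hypothetical column name)
data$xx_base <- as.numeric(data$hh %in% listhh & !duplicated(data$hh))

all.equal(data$xx_base, data$desired)
# [1] TRUE
```

Both this and the grouped version depend on row order, since "first occurrence" means the first row in which a given hh appears, so the data should be sorted as intended before running either.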
