# Drop duplicate in formula

Hello,
I'm stuck on a simple problem.
I need to run something like this:

```r
refreshg$ff = ifelse(refreshg$hh %in% listhh, 0, 1)
```

So, if a value of `refreshg$hh` exists in `listhh`, `refreshg$ff` will be 0. Anything else must be equal to 1.

However, I need to modify this so that it applies only to the first occurrence.
I detected some duplicates of `hh` in `refreshg`: when I count the cases, some `hh` values are repeated, so some flagged `ff` values have a count of 2 or more.
I just can't work out how to apply the rule above to the first, and only the first, occurrence of each value.

I tried to perform this using dplyr, and I failed.

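```r
library(tidyverse)

oo = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
hh = c(1, 2, 3, 4, 10, 6, 4, 8, 9, 10, 1, 6, 6)

data = data.frame(oo, hh)

listhh = c(1, 4, 6, 8, 10)

# Current result: every row whose hh is in listhh gets 1, repeats included
data$ff = ifelse(data$hh %in% listhh, 1, 0)
data

# Desired result: only the first occurrence of each listed hh gets 1
data$desired = c(1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0)
data
```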

Thanks for your time and help.

I think I solved it. Can you confirm this? I'm not sure.

```r
data$ff = ifelse(data$hh %in% listhh, 0, 1)
data$ff = ifelse(duplicated(data$hh) == 1, 0, 1)
data
```

I think that `duplicated()` treats the first occurrence of each value as unique and flags only the later repeats.
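A note on the attempt above: the second assignment overwrites the first, so `ff` ends up depending only on `duplicated()`, and first occurrences of values outside `listhh` (2, 3 and 9 in the example data) would also get 1. A minimal base R sketch that combines both conditions in a single step might look like the following. It reuses the example vectors from the question; `first_occurrence` is a helper name introduced here for illustration only.

```r
hh <- c(1, 2, 3, 4, 10, 6, 4, 8, 9, 10, 1, 6, 6)
listhh <- c(1, 4, 6, 8, 10)

# duplicated() is FALSE for the first occurrence of a value and TRUE for
# every later repeat, so negating it marks the first occurrences
first_occurrence <- !duplicated(hh)

# Flag a row only when hh is in listhh AND it is the first occurrence
ff <- as.integer(hh %in% listhh & first_occurrence)
ff
# [1] 1 0 0 1 1 1 0 1 0 0 0 0 0
```

The result matches the `desired` column from the question.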

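```r
# Number the rows within each hh group, then flag only the first row
# of a group whose hh is in listhh
data2 <- data %>%
  group_by(hh) %>%
  mutate(
    rn = row_number(),
    xx = ifelse(hh %in% listhh & rn == 1, 1, 0)
  )

all.equal(data2$xx, data2$desired)
# [1] TRUE
```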
