Removing duplicates with multiple constraints

Juliavv · April 4, 2023, 2:05pm

Hi all,

I'm trying to remove duplicates in my dataset, but it's not working. I'm using the following code:

CapitalIQ <-
  CapitalIQ %>%
  group_by(companyname, interestratehighvalue, fyear) %>%
  # Eliminate firm years with duplicate
  filter(n() == 1)

What I want is, for each fyear and gvkey, to remove the duplicates based on the variable interestratehighvalue.

But when I run this code, it removes a lot more rows than I wanted.

Could anyone help me or suggest some tweaks to my code?

jrkrideau · April 4, 2023, 5:52pm

I think we need to see some sample data.

A handy way to supply some sample data is the dput() function. Just do dput(mydata), where mydata is your data, then copy the output and paste it here. In the case of a large dataset, something like dput(head(mydata, 100)) should supply the data we need.
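As a rough illustration of the dput() approach, here is a minimal sketch. The data frame below is a made-up stand-in (its values are hypothetical, and only the column names are borrowed from the question); the point is just that the printed expression can be pasted into a post and evaluated by someone else to rebuild the same data.

```r
# Hypothetical stand-in for the real data; values are invented for illustration.
mydata <- data.frame(
  gvkey = c("001", "001", "002"),
  fyear = c(2020L, 2020L, 2021L),
  interestratehighvalue = c(5.0, 5.0, 3.5),
  stringsAsFactors = FALSE
)

# Print an R expression that recreates (at most) the first 100 rows.
dput(head(mydata, 100))

# Anyone can rebuild the data by evaluating that printed expression.
# Here the round trip is simulated by capturing the output as text.
dput_text <- capture.output(dput(head(mydata, 100)))
rebuilt <- eval(parse(text = dput_text))

# Check that the rebuilt data matches the original sample.
all.equal(rebuilt, head(mydata, 100))
```

Because the output is plain R code, it keeps column types (character, integer, numeric) that would be lost when pasting a printed table.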

Juliavv · April 6, 2023, 6:22pm

Hi,

Thanks for helping, but I figured it out. Thanks!
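The thread does not record what the fix was. One plausible explanation, offered as a sketch and not as the poster's actual solution: filter(n() == 1) keeps only groups that contain exactly one row, so every copy of a duplicated row is dropped, including the first. If the goal is to keep one row per combination, dplyr's distinct() does that. The toy data is hypothetical, and the sketch assumes gvkey is the firm identifier (the question mentions gvkey, while the posted code groups by companyname).

```r
library(dplyr)

# Hypothetical toy data; only the column names come from the question.
CapitalIQ <- data.frame(
  gvkey = c("001", "001", "001", "002"),
  fyear = c(2020, 2020, 2020, 2020),
  interestratehighvalue = c(5.0, 5.0, 6.0, 3.5)
)

# The pattern from the question: keeps only groups of size one,
# so both copies of the duplicated row (001, 2020, 5.0) disappear.
only_unique <- CapitalIQ %>%
  group_by(gvkey, fyear, interestratehighvalue) %>%
  filter(n() == 1) %>%
  ungroup()

# Alternative: keep the first row of each combination and drop the repeats.
deduplicated <- CapitalIQ %>%
  distinct(gvkey, fyear, interestratehighvalue, .keep_all = TRUE)

nrow(only_unique)   # 2
nrow(deduplicated)  # 3
```

With .keep_all = TRUE, distinct() retains the other columns of the first matching row, which matters when the real dataset has more variables than the three used for matching.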
