Compare lines and sort them by compatibility


I'm trying to figure out how to sort my rows by their similarity (compatibility) to a reference row, where the reference row can change.

My data looks like this: (sorry, can't make a reprex)

Nationality / Gender / Race / Child

for example :

R1 : French / Male / White / 2
R2 : German / Male / Black / 0
R3 : French / Male / Black / 0

I'm trying to reorder my data by comparison with a chosen line, here R2 for example.
I tried the semi_join() function, but you have to specify the elements to compare on. Rather, I'm looking to compare my data against a selected or manually filled-in row.

Thx for reading :slight_smile:

I think you forgot to say what the criteria for comparison are.

Well, they're all criteria. I want to compare each variable with the same variable in another row. Here, for example, I take R2 as the reference and sort my data by similarity to R2.

I have no idea what you want to do, but good luck!

If anybody is going to be able to help, we will need some idea of what you are trying to do. At the very least we will need a reproducible example (reprex) with a decent amount of sample data. A handy way to supply sample data is to use the dput() function. See ?dput. If you have a very large data set, then something like dput(head(myfile, 100)) will likely supply enough data for us to work with.
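In case it helps, here is one possible reading of the question, sketched in base R: count, for each row, how many variables have the same value as in the reference row, then sort by that count. This is only a guess at the intent. The data frame is typed in from the three example rows above, and the helper name rank_by_similarity and the matches column are made up for illustration. It also assumes there are no missing values and that every variable counts equally.

```r
# Sample data typed in from the example rows in the question
df <- data.frame(
  Nationality = c("French", "German", "French"),
  Gender      = c("Male", "Male", "Male"),
  Race        = c("White", "Black", "Black"),
  Child       = c(2, 0, 0),
  row.names   = c("R1", "R2", "R3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 'ref' is a one-row data frame with the same columns as 'data'.
# For each column, compare every value to the reference value, then add up the
# TRUE/FALSE results across columns to get a per-row match count.
rank_by_similarity <- function(data, ref) {
  matches <- Reduce(`+`, Map(function(col, val) col == val, data, ref))
  out <- cbind(data, matches = matches)
  # Most similar rows first
  out[order(-out$matches), ]
}

# Use R2 as the reference row; any other row (or a hand-built one) works too
result <- rank_by_similarity(df, df["R2", ])
print(result)
# Row order: R2 (4 matches), R3 (3 matches), R1 (1 match)
```

To compare against a filled-in row instead of an existing one, build a one-row data.frame with the same column names and pass it as ref.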
