I have a large data frame with entries in 3 columns. For simplicity, and to keep the reprex minimal, I'll relate it to colors:
Let's say I asked 4 people to combine their 2 favorite colors. Some people couldn't just pick one combination so they answered twice. How can I find out the total number of unique combinations?
"red+blue" should be the same as "blue+red".
In the reprex, there should be 3 combinations (blue_green is duplicated, and yellow_red should be the same as red_yellow).
> mydf
  person color1 color2   combined
1    bob   blue  green blue_green
2  alice    red yellow red_yellow
3  peter  green   pink green_pink
4   paul yellow    red yellow_red
5   paul   blue  green blue_green
# And count the unique combined strings (this gives 4, because red_yellow and yellow_red are counted separately)
length(unique(mydf$combined))
I know a solution has been found, but I wanted to address a very common mistake that I make myself and that is easy to run into with this kind of question. When applied to a data frame, length() returns the number of columns, because a data frame is a list of vectors. (On a plain vector such as unique(mydf$combined), length() is fine.) To get the number of rows of a data frame, use nrow() or NROW().
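To make the difference concrete, here is a minimal sketch on the reprex data. It is not necessarily the accepted solution: it assumes the colors are plain character columns (not factors) and uses pmin()/pmax() to put each pair in a consistent order, so that "yellow + red" and "red + yellow" become the same row. The name `pairs` is just an illustrative choice.

```r
# Rebuild the reprex data frame (character columns, not factors)
mydf <- data.frame(
  person = c("bob", "alice", "peter", "paul", "paul"),
  color1 = c("blue", "red", "green", "yellow", "blue"),
  color2 = c("green", "yellow", "pink", "red", "green"),
  stringsAsFactors = FALSE
)
mydf$combined <- paste(mydf$color1, mydf$color2, sep = "_")

# Order each pair so the "smaller" string always comes first,
# then keep only the distinct rows
pairs <- unique(data.frame(
  first  = pmin(mydf$color1, mydf$color2),
  second = pmax(mydf$color1, mydf$color2),
  stringsAsFactors = FALSE
))

length(pairs)  # number of columns of the data frame: 2
nrow(pairs)    # number of unique unordered combinations: 3
NROW(pairs)    # same as nrow() here; also works on plain vectors
```

Note that length(pairs) silently returns 2 here, which is the number of columns, not the answer to the question; nrow(pairs) gives the 3 combinations expected in the reprex.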