Dear RStudio Community,
I am trying to count the unique values across multiple columns (IPC1, IPC2, ...) without grouping them. For now I just need the total number of distinct values. I used the following code to count them per year, but when I remove the group_by() call it doesn't work. (I tried to make a reprex of the code but keep getting errors.)
library(dplyr)
library(tidyr)
IPCS <- data.frame(
  Dataset_Steam_Engine_Complete %>%
    # Stack the 33 IPC columns into one long "Value" column
    gather(key = "IPC", value = "Value",
           IPC1, IPC2, IPC3, IPC4, IPC5, IPC6, IPC7, IPC8, IPC9, IPC10,
           IPC11, IPC12, IPC13, IPC14, IPC15, IPC16, IPC17, IPC18, IPC19, IPC20,
           IPC21, IPC22, IPC23, IPC24, IPC25, IPC26, IPC27, IPC28, IPC29, IPC30,
           IPC31, IPC32, IPC33) %>%
    # Count the distinct codes within each year, ignoring missing values
    group_by(Year) %>%
    summarise(distinct_IPC_count = n_distinct(Value, na.rm = TRUE))
)
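To make the ungrouped goal concrete, here is a minimal sketch on toy data. The data frame `toy` and its values are hypothetical stand-ins for Dataset_Steam_Engine_Complete, and it uses pivot_longer(), tidyr's newer replacement for gather(), with starts_with("IPC") on the assumption that only the code columns have names beginning with "IPC".

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for Dataset_Steam_Engine_Complete
toy <- data.frame(
  Year = c(1900, 1900, 1901),
  IPC1 = c("H01B11/11", "F01K3/00", "H01B11/11"),
  IPC2 = c("F01K3/00", NA, "B60L1/00"),
  stringsAsFactors = FALSE
)

total_distinct <- toy %>%
  # Stack all IPC columns into a single "Value" column
  pivot_longer(starts_with("IPC"), names_to = "IPC", values_to = "Value") %>%
  # No group_by(): one overall count of distinct codes, ignoring NA
  summarise(distinct_IPC_count = n_distinct(Value, na.rm = TRUE))

print(total_distinct)
```

Without group_by(), summarise() returns a single row, so the result is the overall number of distinct codes across all stacked columns.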
Each IPC code in these columns is formatted like "H01B11/11". The next step of my research requires the number of unique codes across all of these columns, grouped by the first letter of the IPC code; in this example, the number of unique IPC codes beginning with "H".
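As an illustration of the per-letter count described here, the following is a rough sketch on toy data, not a tested solution for the real dataset. The data frame `toy` and the column name `Section` are hypothetical, and it assumes the first character of every code is the letter of interest.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; the real dataset has columns IPC1 to IPC33
toy <- data.frame(
  IPC1 = c("H01B11/11", "F01K3/00", "H01B11/11"),
  IPC2 = c("H02K7/18", NA, "F01K3/00"),
  stringsAsFactors = FALSE
)

by_section <- toy %>%
  # Stack all IPC columns into a single "Value" column
  pivot_longer(starts_with("IPC"), names_to = "IPC", values_to = "Value") %>%
  # Drop empty cells so NA does not become its own group
  filter(!is.na(Value)) %>%
  # Take the first character of each code as the grouping key
  mutate(Section = substr(Value, 1, 1)) %>%
  group_by(Section) %>%
  summarise(distinct_IPC_count = n_distinct(Value))

print(by_section)
```

Filtering the result with `filter(Section == "H")` would then give the count for codes beginning with "H" only.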
I've played around with the code above but didn't get any results. I would really appreciate it if someone could help me out!
Kind regards,
Luuk