making new variables off 89 columns with rowwise()

I have a very long data frame that looks like this

owner cat dog ........... monkey elephant rhino giraffe
James  1   2               44     22323    NA     13

for each owner(unique row), i am interested in seeing which of the 88 animals they have by creating a string. if they have even one animal in a given column, I want it to count. I want the output to look like this for every row in a new variable called "animal_list":

owner animal_list
James cat, dog, monkey, elephant, giraffe

I also want another column that tells me the number of occurrences ordered from most common column to least:

owner animal_list_number
James elephant (22323),  monkey (44),  giraffe (13), dog (2),  cat (1)

is this possible to do using tidy verse principle. thank you !

I believe this does what you're asking.

require(tidyverse)


df1 <- tibble(owner = 'James',
              cat = 1,
              dog = 2,
              monkey = 44,
              elephant = 22323,
              rhino = NA_integer_,
              giraffe = 13)

## Turn many columns into many rows to get data in a format usable by the following functions
## paste0 concatenates strings
## collapse argument takes all elements in a vector and squishes them into one element

df2 <- df1 %>% 
  pivot_longer(., cols = -c('owner'), names_to = 'species',
               values_to = 'num_animals') %>% 
  group_by(owner) %>% 
  filter(!is.na(num_animals)) %>% 
  arrange(desc(num_animals)) %>% 
  mutate(animal_list_number = paste0(species, ' (', num_animals, ')', collapse = ', '),
    animal_list = paste0(species, collapse = ', ')) %>% 
  distinct(owner, animal_list, animal_list_number) %>% 
  ungroup()
1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.