Data manipulation using dplyr

Suppose I have two vectors, which I combine into a data frame like so:

a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1,0,1,1,1,0,1)
x <- data.frame(a,b)

Vector b holds a 1 for presence and a 0 for absence of the corresponding animal in vector a of data frame x.
Using group_by and summarise, I found how often each of "dog", "cat", and "camel" is marked present in vector b: the frequency of camel is 1, of cat is 0, and of dog is 4. Now I want to create a new vector c in data frame x with the values 4, 0, 4, 1, 4, 0, 4, so that each row of c holds the total frequency of presence for the animal named in the same row of vector a.

The relevant code, with comments, is in the following reprex. Please help if you can.

http://rpubs.com/blue_eyes/523835

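You can use mutate() instead of summarise(). After group_by(), mutate() computes the sum within each group and repeats it on every row of that group, rather than collapsing each group to a single row.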

library(dplyr)

a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1,0,1,1,1,0,1)
x <- data.frame(a,b)

x %>% 
  group_by(a) %>% 
  mutate(c = sum(b))
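
To see why mutate() fits here, this is a small sketch contrasting the two verbs on the same data. The names per_group and per_row are only illustrative and do not come from the original reprex.

```r
library(dplyr)

a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1, 0, 1, 1, 1, 0, 1)
x <- data.frame(a, b)

# summarise() collapses each group to a single row: one total per animal
per_group <- x %>%
  group_by(a) %>%
  summarise(c = sum(b))

# mutate() keeps every row and repeats the group total on each of them
per_row <- x %>%
  group_by(a) %>%
  mutate(c = sum(b))

nrow(per_group)  # 3 rows, one per animal
per_row$c        # 4 0 4 1 4 0 4
```

The summarised result would need to be joined back onto x to get one value per original row, which is the step mutate() saves.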

@nutterb's answer is totally correct. I'm just adding this minor caveat, since it may matter if you keep using the data frame in a pipeline, and it can easily trip you up.

If you don't want to continue operating on the tibble in a grouped fashion, you have to remember to ungroup() at the end. If you do want those groups for subsequent steps, then you'd keep the code as-is in nutterb's reply.

library(dplyr)

a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1,0,1,1,1,0,1)
x <- data.frame(a,b)

# grouped tibble output
x %>% 
  group_by(a) %>% 
  mutate(c = sum(b))
#> # A tibble: 7 x 3
#> # Groups:   a [3]
#>   a         b     c
#>   <fct> <dbl> <dbl>
#> 1 dog       1     4
#> 2 cat       0     0
#> 3 dog       1     4
#> 4 camel     1     1
#> 5 dog       1     4
#> 6 cat       0     0
#> 7 dog       1     4

# ungrouped tibble output
x %>% 
  group_by(a) %>% 
  mutate(c = sum(b)) %>%
  ungroup()
#> # A tibble: 7 x 3
#>   a         b     c
#>   <fct> <dbl> <dbl>
#> 1 dog       1     4
#> 2 cat       0     0
#> 3 dog       1     4
#> 4 camel     1     1
#> 5 dog       1     4
#> 6 cat       0     0
#> 7 dog       1     4

Created on 2019-09-02 by the reprex package (v0.3.0)

Also, just as a plug for using reprex, it's nice to be able to see the output, too! (Note how the header of the grouped tibble says how many groups are present.)
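
As a rough illustration of how a leftover grouping can trip you up, here is a sketch where the same follow-up step gives different answers with and without ungroup(). The names grouped, still_grouped, ungrouped, and n_rows are made up for this example.

```r
library(dplyr)

a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1, 0, 1, 1, 1, 0, 1)
x <- data.frame(a, b)

grouped <- x %>%
  group_by(a) %>%
  mutate(c = sum(b))

# Still grouped by a: n() counts the rows within each animal
still_grouped <- grouped %>%
  mutate(n_rows = n())

# After ungroup(): n() counts all rows of the data frame
ungrouped <- grouped %>%
  ungroup() %>%
  mutate(n_rows = n())

still_grouped$n_rows  # 4 2 4 1 4 2 4
ungrouped$n_rows      # 7 7 7 7 7 7 7
```

Neither result is wrong in itself. Which one you want depends on whether the later steps are meant to work per animal or on the whole table.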
