Suppose I have two vectors which I combine into a data frame like so:
a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1,0,1,1,1,0,1)
x <- data.frame(a,b)
Vector b holds a 1 or 0 for the presence or absence of the corresponding animal in vector a of data frame x.
Using group_by and summarise, I found the frequency of occurrence (presence) of "dog", "cat", and "camel" from vector b: camel is 1, cat is 0, and dog is 4. Now I want to create a new vector c in data frame x holding the values 4, 0, 4, 1, 4, 0, 4. That is, for each row, vector c contains the total frequency of presence of the animal named in vector a.
I am giving the relevant code with associated comments in the following reprex; please help if you can.
@nutterb's answer is totally correct. I'm just adding this minor caveat, since it may matter if you continue to use the data frame in a pipeline, and it can easily trip you up.
If you don't want to continue operating on the tibble in a grouped fashion, you have to remember to ungroup() at the end. If you do want those groups for subsequent steps, then you'd keep the code as-is in nutterb's reply.
library(dplyr)
a <- c("dog", "cat", "dog", "camel", "dog", "cat", "dog")
b <- c(1,0,1,1,1,0,1)
x <- data.frame(a,b)
# grouped tibble output
x %>%
  group_by(a) %>%
  mutate(c = sum(b))
#> # A tibble: 7 x 3
#> # Groups: a [3]
#> a b c
#> <fct> <dbl> <dbl>
#> 1 dog 1 4
#> 2 cat 0 0
#> 3 dog 1 4
#> 4 camel 1 1
#> 5 dog 1 4
#> 6 cat 0 0
#> 7 dog 1 4
# ungrouped tibble output
x %>%
  group_by(a) %>%
  mutate(c = sum(b)) %>%
  ungroup()
#> # A tibble: 7 x 3
#> a b c
#> <fct> <dbl> <dbl>
#> 1 dog 1 4
#> 2 cat 0 0
#> 3 dog 1 4
#> 4 camel 1 1
#> 5 dog 1 4
#> 6 cat 0 0
#> 7 dog 1 4
Also, just as a plug for using reprex, it's nice to be able to see the output, too! (Note how the header of the grouped tibble says how many groups are present.)
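To make the caveat concrete, here is a minimal sketch (my own illustration, not part of nutterb's reply) of how a later step can behave differently depending on whether the groups are still attached. The names `grouped`, `ungrouped`, and the `n_rows` column are made up for this example; it assumes dplyr is installed.

```r
library(dplyr)

x <- data.frame(
  a = c("dog", "cat", "dog", "camel", "dog", "cat", "dog"),
  b = c(1, 0, 1, 1, 1, 0, 1)
)

# Still grouped by `a`: n() is evaluated within each group,
# so every row gets the number of rows for its own animal.
grouped <- x %>%
  group_by(a) %>%
  mutate(c = sum(b)) %>%
  mutate(n_rows = n())

# Groups dropped first: n() is evaluated over the whole data frame.
ungrouped <- x %>%
  group_by(a) %>%
  mutate(c = sum(b)) %>%
  ungroup() %>%
  mutate(n_rows = n())

grouped$n_rows
#> [1] 4 2 4 1 4 2 4
ungrouped$n_rows
#> [1] 7 7 7 7 7 7 7
```

The same `mutate(n_rows = n())` call gives per-animal counts in the first pipeline and the total row count in the second, which is the kind of surprise a forgotten ungroup() tends to cause.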