To find the mean population of all of the counties in each state, use the
data %>%
  group_by(group) %>%
  summarize(mean_ctypop = mean(county_pop))
construction, as in the toy example below, which has only 2 states and 5 counties:
library(tidyverse) #load library
# now build vectors for the toy example
cty <- c("Autauga", "Baldwin", "Baldwin", "Washtenaw", "Wayne")
state <- c(rep("Alabama", 3), rep("Michigan", 2))
pop <- c(48612, 162586, 28414, 1753893, 370963)
# now assemble into a data frame from vectors
df <- data.frame(cty, state, pop)
df # show the dataframe df
# now run the pipe
df %>%                              # take the data frame, then
  group_by(state) %>%               # group by state, then
  summarize(mean_cty = mean(pop))   # summarize with a new variable, mean_cty
#> # A tibble: 2 x 2
#>   state    mean_cty
#>   <fct>       <dbl>
#> 1 Alabama    79871.
#> 2 Michigan 1062428
You will find that this construction of
data %>%
  group_by(group) %>%
  summarize(make a new variable)
comes in handy often.
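As one illustration of how the same construction extends, here is a minimal sketch that computes several summaries per state in a single summarize() call. It rebuilds the toy data frame from above; the names state_summary, n_cty and total_pop are made up for this sketch and are not part of the original example.

```r
library(dplyr)  # provides %>%, group_by(), summarize(), n()

# same toy data as in the example above
df <- data.frame(
  cty   = c("Autauga", "Baldwin", "Baldwin", "Washtenaw", "Wayne"),
  state = c(rep("Alabama", 3), rep("Michigan", 2)),
  pop   = c(48612, 162586, 28414, 1753893, 370963)
)

state_summary <- df %>%     # take the data frame, then
  group_by(state) %>%       # group by state, then
  summarize(
    n_cty     = n(),        # number of county rows in each state
    total_pop = sum(pop),   # total population of those counties
    mean_cty  = mean(pop)   # mean county population, as before
  )

state_summary  # one row per state, one column per summary
```

Each argument to summarize() becomes one column of the result, so further summaries (median(), max(), and so on) can be added in the same way.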
Created on 2019-09-05 by the reprex package (v0.3.0)