I'm an R novice, currently learning to use R as part of an online statistics course I'm taking. I have a question pertaining to the data above.
The table I have pasted is the output, grouped by US state, of a statistic I created using the summarize() function. I am using RStudio to create an RMarkdown document and have loaded the dplyr package.
Given that the figures you see are in fact summary statistics of an underlying dataset, I'd like to know how to convert this table into a data frame so that I can compute further summary statistics (median, SD, etc.) on these summary statistics.
Hope that is clear enough. Any help would be greatly appreciated.
The value returned by summarize() is already a data frame (strictly speaking a tibble, but the difference does not matter here). Just store the result of summarize() in a variable and continue with your calculations. Have you encountered a problem with that?
Expanding on this just a bit: you need to bind the result of summarize() to a name so that you can use it again. The result itself will be the same, but you'll be able to access it to continue your work.
I have the same code twice, below. The first time, it prints, but I didn't actually give it a name, so I don't have a way to access it again. The second time is exactly the same as the first, except I give it a name, mt_summary. The result doesn't print like it does the first time. If I want to see it, I just call it by its name again.
library(dplyr)

# not bound to a name
mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(disp), n = n())
#> # A tibble: 3 x 3
#>     cyl  mean     n
#>   <dbl> <dbl> <int>
#> 1     4  105.    11
#> 2     6  183.     7
#> 3     8  353.    14

# bound to a name
mt_summary <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(disp), n = n())

mt_summary
#> # A tibble: 3 x 3
#>     cyl  mean     n
#>   <dbl> <dbl> <int>
#> 1     4  105.    11
#> 2     6  183.     7
#> 3     8  353.    14
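
To get at the original question (statistics on the summary statistics), here is a rough sketch of how you might carry on once the summary has a name. It reuses the mtcars example rather than your state data, and the names median_of_means, sd_of_means and mt_df are just ones I made up for illustration.

```r
library(dplyr)

# Store the grouped summary under a name, as in the example above
mt_summary <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(disp), n = n())

# A tibble is also a data frame, so no conversion is needed
is.data.frame(mt_summary)
#> [1] TRUE

# Summarise the summary: statistics across the three group means
# (median_of_means and sd_of_means are illustrative column names)
mt_summary %>%
  summarise(
    median_of_means = median(mean),
    sd_of_means     = sd(mean)
  )

# Base R works as well, because each column is an ordinary vector
median(mt_summary$mean)
sd(mt_summary$mean)

# Only if some function insists on a plain data.frame:
# mt_df is a hypothetical name for the converted copy
mt_df <- as.data.frame(mt_summary)
```

In your case you would swap mtcars and cyl for your own dataset and the state column. The as.data.frame() step at the end is usually unnecessary, since most functions accept a tibble directly.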