calculating proportions using dplyr

Hello!

I am very new to R. How do I calculate the proportion of a response within a certain subset of a data set? For example, of those who are college graduates, what proportion are STEM majors? So far I have something like this:

df %>%
  filter(education == "college_grad") %>%
  filter(major == "stem")

and I am not sure what to do after this.

Help will be appreciated,

xbechtel

See the gtsummary package


suppressPackageStartupMessages({library(dplyr)
                                library(gtsummary)})

trial %>% select(trt, age, grade) %>% tbl_summary()

Characteristic              N = 200 [1]
Chemotherapy Treatment
    Drug A                  98 (49%)
    Drug B                  102 (51%)
Age                         47 (38, 57)
    Unknown                 11
Grade
    I                       68 (34%)
    II                      68 (34%)
    III                     64 (32%)

[1] Statistics presented: n (%); Median (IQR)

Created on 2020-09-29 by the reprex package (v0.3.0.9001)


Hello!

Is there a way to do this using just the tidyverse? Essentially, I was asked to use inline R code to produce this proportion when describing the data set.

Begin with the usual analysis in terms of f(x) = y.

x is the data at hand, y is the subset desired, and f is the function, or composite function, to turn the one into the other.

The choice of f will depend on whether the count or a proportion is required.
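As a rough sketch of the two choices of f, here is a toy example. The data frame `dat` and its values are made up for illustration. Only the column names `education` and `major` come from the original post.

```r
library(dplyr)

# Hypothetical toy data; column names follow the OP's post, values are invented
dat <- tibble(
  education = c("college_grad", "college_grad", "college_grad", "hs", "hs"),
  major     = c("stem", "arts", "stem", NA, NA)
)

# x restricted to the subset of interest: all college graduates
grads <- dat %>% filter(education == "college_grad")

# f for a count: how many graduates are STEM majors
n_stem <- grads %>% filter(major == "stem") %>% nrow()

# f for a proportion: the same count divided by the size of the subset
p_stem <- n_stem / nrow(grads)
```

The proportion needs `nrow(grads)` as its denominator, so the subset has to be kept around rather than filtered away.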

The OP code is off to a false start by filtering df down to only college graduates in STEM majors. That discards the denominator (all college graduates), so no proportion can be computed afterwards. If only a count is needed, all that's required is

suppressPackageStartupMessages({library(dplyr)})

# prefer dat or my_df over df, which is also the name of a built-in function (stats::df)
# this avoids confusing errors where `df` is read as a closure

mtcars %>%
  filter(cyl == 4 & carb == 2) %>%
  summarise(n = n())  # count the rows that match both conditions
#>   n
#> 1 6

Created on 2020-09-29 by the reprex package (v0.3.0.9001)
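If a proportion is needed instead, one possible approach (a sketch, not the only way) is to filter only down to the subset that forms the denominator and then take the mean of a logical condition. Here mtcars again stands in for the OP's data, and the names `res` and `prop_pct` are chosen just for this example.

```r
library(dplyr)

# Keep the whole subset (4-cylinder cars) so the denominator survives,
# then summarise: the mean of a logical is the share of TRUE values.
res <- mtcars %>%
  filter(cyl == 4) %>%
  summarise(
    n_match = sum(carb == 2),   # numerator: 4-cylinder cars with 2 carburetors
    n_total = n(),              # denominator: all 4-cylinder cars
    prop    = mean(carb == 2)   # same as n_match / n_total
  )

# Format the proportion as a percentage string for use in running text
prop_pct <- sprintf("%.1f%%", 100 * res$prop)

# In an R Markdown document this could then be used inline, for example:
#   Of the 4-cylinder cars, `r prop_pct` have two carburetors.
```

With the OP's data, the analogous pipeline would presumably filter on `education == "college_grad"` and summarise with `mean(major == "stem")`. If `major` contains missing values, `na.rm = TRUE` may be needed.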

