I need help recreating an Excel summary table

I am pretty new at R and I think I've bitten off more than I can chew. I have done my tables and figures in Excel, but would like it to look more professional by coding it with R.

Here is the table I am trying to duplicate right now:

My data set is here: https://docs.google.com/spreadsheets/d/e/2PACX-1vSyQRiTb9gUVNHE-WwDGgDMi8v5GAP_phBymLrH5s46Pl4uWYsfXbdlh8WvOlROyA18aufGfN9bXMOR/pub?gid=1732065428&single=true&output=csv

I've attempted this so far but have gotten stuck:

library(data.table)
fish <- fread('https://docs.google.com/spreadsheets/d/e/2PACX-1vSyQRiTb9gUVNHE-WwDGgDMi8v5GAP_phBymLrH5s46Pl4uWYsfXbdlh8WvOlROyA18aufGfN9bXMOR/pub?gid=1732065428&single=true&output=csv')
head(fish)

#I want to split summarize the table into performers, sustained swimming, and prolonged/burst swimmers by percentage."

library( dplyr )
fish.percent<- select(fish, Absolute.Velocity, TtF)
fish.percent%>%
group_by(Absolute.Velocity)%>%
summarize(n())

The SM count and percentage would come from TtF >60, and the PBM data is 0<PBM<200.

Should I just stick with Excel for this? Any advice is welcome.

Where do these come from?

Performers would be anything with TtF>0. prolonged/burst swimming (PBM) is TtF between 0 and 200, and sustained swimming (SM) is TtF = 200.

Something like this then?

library(tidyverse)

df1 <- fish %>% 
  mutate(Group = case_when(TtF >0 & TtF < 200 ~ "PBM",
                           TtF == 200 ~ "SM",
                           TRUE ~ NA_character_)) %>% 
  group_by(Absolute.Velocity) %>% 
  count(Group) %>% 
  pivot_wider(Absolute.Velocity, names_from = Group, values_from = n) %>% 
  replace(is.na(.), 0) %>% 
  mutate(Total = PBM + SM + `NA`) %>% 
  select(- `NA`)

df2 <-  fish %>% 
  mutate(Performer = if_else(TtF > 0, 1, 0)) %>% 
  group_by(Absolute.Velocity) %>% 
  summarise(Performer = sum(Performer))

df3 <- left_join(df1, df2, by = "Absolute.Velocity") %>% 
  mutate(Non_Performer = Total - Performer) %>% 
  mutate(prop_PBM = PBM/Total, prop_SM = SM/Total, prop_Perf = Performer/Total, prop_NP = Non_Performer/Total)


> df3
# A tibble: 8 x 10
# Groups:   Absolute.Velocity [8]
  Absolute.Velocity   PBM    SM Total Performer Non_Performer prop_PBM prop_SM prop_Perf prop_NP
              <dbl> <int> <int> <int>     <dbl>         <dbl>    <dbl>   <dbl>     <dbl>   <dbl>
1                30     8     5    13        13             0    0.615  0.385      1       0    
2                35     3     7    10        10             0    0.3    0.7        1       0    
3                40     8     7    15        15             0    0.533  0.467      1       0    
4                45    10     2    14        12             2    0.714  0.143      0.857   0.143
5                50     8     3    15        11             4    0.533  0.2        0.733   0.267
6                55     8     1    13         9             4    0.615  0.0769     0.692   0.308
7                60     8     1    14         9             5    0.571  0.0714     0.643   0.357
8                65     0     0    10         0            10    0      0          0       1    

Thank you so much for helping! I really appreciate your effort.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.