Matching columns and rows

dplyr
#1

I have two data frames, one data frame has columns of four letter codes "ABCD, etc." that correspond to specific species codes, with the input value being how many of these species were present in a given plot observation.

The other data frame has these exact codes but as rows. However, one of the columns specifies a property (big, small, medium) of the species code.

My problem is I need to fig
ure out how many four letter codes that are "big" are in the first matrix.
So I need to somehow match the columns of one data frame with the rows of another, so I can know on average how many "big species" each plot had.

(see screenshot for a simplified example; the two data frames are much larger and have other columns as well)


I've been battling at this for awhile, trying to use merge, trying to use the pipe operator, etc. but can't seem to figure it out. There is probably some really elegant way, but I'm new to this. Any advice is much appreciated.

0 Likes

#2

For future reference this is the correct way of sharing sample data (pasting screen captures is not a good thing to do here).

I think this is what you want

library(dplyr)
library(tidyr)

table_1 <- data.frame(stringsAsFactors = FALSE,
                      Plot = c('Plot1', 'Plot2', 'Plot3'),
                      ABCD = c(1, 3, 2),
                      EFGH = c(2, 3, 0),
                      IJKL = c(2, 3, 1),
                      MNOP = c(3, 2, 1)
)
table_2 <- data.frame(stringsAsFactors = FALSE,
                      Code = c('ABCD', 'EFGH', 'IJKL', 'MNOP'),
                      Size = c('Big', 'Small', 'Medium', 'Big')
)

table_1 %>% 
    gather(Code, Count, ABCD:MNOP) %>% 
    left_join(table_2, by = 'Code') %>% 
    group_by(Plot, Size) %>% 
    summarise(n = sum(Count))
#> # A tibble: 9 x 3
#> # Groups:   Plot [?]
#>   Plot  Size       n
#>   <chr> <chr>  <dbl>
#> 1 Plot1 Big        4
#> 2 Plot1 Medium     2
#> 3 Plot1 Small      2
#> 4 Plot2 Big        5
#> 5 Plot2 Medium     3
#> 6 Plot2 Small      3
#> 7 Plot3 Big        3
#> 8 Plot3 Medium     1
#> 9 Plot3 Small      0

Created on 2019-01-15 by the reprex package (v0.2.1)

4 Likes

#3

thanks for your help and for the comment about best practices; I'll make sure to do that next time. I'll try your code.

0 Likes

#4

I understand your thinking here but I'm hoping to streamline it a little bit if possible. I have 67 Plots, and 100s of species, so manually entering everything would be very hard.

0 Likes

#5

What exactly you would be entering manually? I don't understand

0 Likes

#6
table_1 <- data.frame(stringsAsFactors = FALSE,
                      Plot = c('Plot1', 'Plot2', 'Plot3'),
                      ABCD = c(1, 3, 2),
                      EFGH = c(2, 3, 0),
                      IJKL = c(2, 3, 1),
                      MNOP = c(3, 2, 1) 

The values for the sizes

0 Likes

#7

That is just the data you gave as example, the code below that, is what you have to apply to your real data

table_1 %>% 
    gather(Code, Count, -Plot) %>% 
    left_join(table_2, by = 'Code') %>% 
    group_by(Plot, Size) %>% 
    summarise(n = sum(Count))
2 Likes

#8

Ah, duh. Thanks ; I didn't know about the gather() function.

1 Like

#9

If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems. Here’s how to do it:

0 Likes

closed #10

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

0 Likes