Rbind and Joining Data Frames

I have to join two data frames:

NewGads
# A tibble: 1,740 x 7
# Groups:   Date, Time [1,740]
   Date       Time  Gadus.morhua Merlangius.merl… Trisopterus.lus…
   <date>     <tim>        <dbl>            <dbl>            <dbl>
 1 1988-10-10 12:33           84              240              532
 2 1988-10-10 14:36            4                0                0
 3 1988-10-10 16:37          100               16              808
 4 1988-10-11 08:03           13               22              257
 5 1988-10-11 09:03           36              104              620
 6 1988-10-11 10:20            0              928              456
 7 1988-10-11 12:43            8              312                0
 8 1988-10-11 13:22           32             1051                0
 9 1988-10-11 14:58            3                6                0
10 1988-10-12 09:28            2                2                0
# … with 1,730 more rows, and 2 more variables: Trisopterus.minutus <dbl>,
#   Pollachius.pollachius <dbl>
Rays2_wide
# A tibble: 941 x 3
# Groups:   Date, Time [941]
   Date       Time   Raja.clavata
   <date>     <time>        <dbl>
 1 1988-10-10 12:33             4
 2 1988-10-10 16:37            24
 3 1988-10-11 08:03             6
 4 1988-10-11 09:03             4
 5 1988-10-11 10:20             4
 6 1988-10-11 12:43             8
 7 1988-10-13 07:22             8
 8 1988-10-16 17:32            12
 9 1988-10-17 07:59             2
10 1988-10-17 13:35             6
# … with 931 more rows

The dates and times are the same in both, but Raja.clavata is missing for some of them, and I want those values to be NA. However, the two data frames are not joining together properly with rbind.

Instead of grouping matching dates together, it repeats the list of dates and times at the end of the first data frame. Raja.clavata is NA for all of the original rows, and every other species is NA for all of the repeated rows.

Why are the same dates and times not grouping together? I tried group_by as well and that didn't change anything.

OhBoy <- rbind(NewGads, Rays2_wide) 

OhBoy <- OhBoy %>%
  group_by(Date, Time)

You probably want dplyr::full_join(). rbind() stacks rows on top of each other; it does not match them on Date and Time.
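
To make the difference concrete, here is a minimal sketch. The two small tibbles `gads` and `rays` are hypothetical stand-ins for NewGads and Rays2_wide, using a few values from the printout above. Times are stored as plain strings here, which is an assumption; the real data uses a time column. `bind_rows()` is used only to illustrate row stacking like that described in the question.

```r
library(dplyr)

# Hypothetical miniature versions of the two tables (a few rows from the post).
# Assumption: Time is a character column here, not a <time> column.
gads <- tibble(
  Date = as.Date(c("1988-10-10", "1988-10-10", "1988-10-10")),
  Time = c("12:33", "14:36", "16:37"),
  Gadus.morhua = c(84, 4, 100)
)
rays <- tibble(
  Date = as.Date(c("1988-10-10", "1988-10-10")),
  Time = c("12:33", "16:37"),
  Raja.clavata = c(4, 24)
)

# Stacking: 3 + 2 = 5 rows, with NA in whichever column a row did not have
stacked <- bind_rows(gads, rays)

# Joining: rows are matched on the key columns, so 3 rows come back and
# Raja.clavata is NA only for the 14:36 haul, which has no row in rays
joined <- full_join(gads, rays, by = c("Date", "Time"))

print(stacked)
print(joined)
```

Naming the keys with `by = c("Date", "Time")` is optional, since full_join() will otherwise use all shared column names, but it makes the matching explicit.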


Ok that mostly worked. Thanks!

However, there are a few dates that are only listed for Raja.clavata and not for the other species. These still get added to the end of the joined data frame and won't sort chronologically with all the other values. Any idea how to fix it?

Won't sort? arrange() should handle that:

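library(dplyr)  # provides tibble(), full_join(), arrange() and %>%

# All 31 days of January 2020
dates_in_january2020 <- lubridate::ymd(paste0("202001",1:31))
set.seed(42)

# Two data frames that share only some of their dates
df1 <- tibble(
  dates = sample(dates_in_january2020,
                 size = 10),
  facts1 = letters[1:10]
)

df2 <- tibble(
  dates = sample(dates_in_january2020,
                 size = 10),
  facts2 = LETTERS[1:10]
)

# Join on the shared column, then sort the result chronologically
full_join(df1, df2) %>% arrange(dates)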
# Joining, by = "dates"
# # A tibble: 15 x 3
# dates      facts1 facts2
# <date>     <chr>  <chr> 
# 1  2020-01-01 c      NA    
# 2  2020-01-03 NA     I     
# 3  2020-01-04 f      A     
# 4  2020-01-05 b      C     
# 5  2020-01-07 j      NA    
# 6  2020-01-09 NA     J     
# 7  2020-01-10 e      NA    
# 8  2020-01-14 NA     D     
# 9  2020-01-15 i      H     
# 10 2020-01-17 a      NA    
# 11 2020-01-18 g      G     
# 12 2020-01-20 NA     E     
# 13 2020-01-25 d      B     
# 14 2020-01-26 NA     F     
# 15 2020-01-31 h      NA    

Can we use this example to discuss any further changes?
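
The same idea carries over to a two-column key. The sketch below uses made-up rows, not the real NewGads and Rays2_wide data, and assumes zero-padded "HH:MM" strings for Time so that they sort correctly as text. The ungroup() call is included only because the printed tibbles are grouped by Date and Time; it is harmless on ungrouped data.

```r
library(dplyr)

# Illustrative values only: the 1988-10-11 haul exists in rays but not in gads
gads <- tibble(
  Date = as.Date(c("1988-10-10", "1988-10-12")),
  Time = c("12:33", "09:28"),
  Gadus.morhua = c(84, 2)
)
rays <- tibble(
  Date = as.Date(c("1988-10-10", "1988-10-11")),
  Time = c("12:33", "10:20"),
  Raja.clavata = c(4, 4)
)

# full_join() keeps the rows of gads first and appends rays-only rows at the end
joined <- full_join(gads, rays, by = c("Date", "Time"))

# Sorting on both key columns puts the rays-only row back in chronological order
sorted <- joined %>%
  ungroup() %>%
  arrange(Date, Time)

print(sorted)
```

Note that the sorted result has to be assigned back to a variable (or to the same name); arrange() returns a new data frame and does not modify the original in place.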
