How would I go about merging these two data.frames?

I tried looking it up on my own, but couldn't quite find exactly what I was looking for. The "city" names are an exact match to their counterparts within the "contest" names, so each "contest" has only one "city" that perfectly fits into it.

I was (somehow) having issues with reprex, but I think this looks pretty presentable. I put the tribbles in there for readability, but, since I'm still getting started with R, I'm taking it one step at a time and working with data.frames for now.

tibble::tribble(
              ~contest,      ~name,
  "city alpha, member",  "olivier",
  "city alpha, member",   "darius",
  "city alpha, member",   "edward",
  "city gamma, member",    "maria",
  "city gamma, member",     "jean",
  "city gamma, member",     "alex",
   "town beta, member",      "roy",
   "town beta, member", "alphonse"
  )

tibble::tribble(
    ~city, ~seats,
  "alpha",      1,
   "beta",      4,
  "gamma",      3
  )

#Desired final result
tibble::tribble(
              ~contest,      ~name, ~seats,
  "city alpha, member",  "olivier",      1,
  "city alpha, member",   "darius",      1,
  "city alpha, member",   "edward",      1,
  "city gamma, member",    "maria",      3,
  "city gamma, member",     "jean",      3,
  "city gamma, member",     "alex",      3,
   "town beta, member",      "roy",      4,
   "town beta, member", "alphonse",      4
  )

#As for the format I'm working with

mockframe<-as.data.frame(
              tibble(
                contest=c("city alpha, member","city alpha, member","city alpha, member","city gamma, member","city gamma, member","city gamma, member","town beta, member","town beta, member"),
                name=c("olivier","darius","edward","maria","jean","alex","roy","alphonse")
              )
           )

mockframe2<-as.data.frame(
              tibble(
                city=c("alpha","beta","gamma"),
                seats=c(1,4,3)
              )
            )

#Desired final result
mockframefinal<-as.data.frame(
                  tibble(
                    contest=c("city alpha, member","city alpha, member","city alpha, member","city gamma, member","city gamma, member","city gamma, member","town beta, member","town beta, member"),
                    name=c("olivier","darius","edward","maria","jean","alex","roy","alphonse"),
                    seats=c(1,1,1,3,3,3,4,4)
                  )
                )

I used a regular expression to extract the city name from the contest column. It looks for the second word in the contest column, limiting the first word to no more than 20 characters. I suspect there is a better method but I can't come up with it at the moment.

library(tibble)
mockframe<-as.data.frame(
  tibble(
    contest=c("city alpha, member","city alpha, member","city alpha, member","city gamma, member","city gamma, member","city gamma, member","town beta, member","town beta, member"),
    name=c("olivier","darius","edward","maria","jean","alex","roy","alphonse")
  )
)

mockframe2<-as.data.frame(
  tibble(
    city=c("alpha","beta","gamma"),
    seats=c(1,4,3)
  )
)

#Desired final result
mockframefinal<-as.data.frame(
  tibble(
    contest=c("city alpha, member","city alpha, member","city alpha, member","city gamma, member","city gamma, member","city gamma, member","town beta, member","town beta, member"),
    name=c("olivier","darius","edward","maria","jean","alex","roy","alphonse"),
    seats=c(1,1,1,3,3,3,4,4)
  )
)
library(stringr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
Final <- mutate(mockframe, city = str_extract(contest, "(?<=\\w{1,20} )\\w+")) %>% 
  inner_join(mockframe2, by = "city") %>% 
  select(-city)
Final
#>              contest     name seats
#> 1 city alpha, member  olivier     1
#> 2 city alpha, member   darius     1
#> 3 city alpha, member   edward     1
#> 4 city gamma, member    maria     3
#> 5 city gamma, member     jean     3
#> 6 city gamma, member     alex     3
#> 7  town beta, member      roy     4
#> 8  town beta, member alphonse     4
mockframefinal
#>              contest     name seats
#> 1 city alpha, member  olivier     1
#> 2 city alpha, member   darius     1
#> 3 city alpha, member   edward     1
#> 4 city gamma, member    maria     3
#> 5 city gamma, member     jean     3
#> 6 city gamma, member     alex     3
#> 7  town beta, member      roy     4
#> 8  town beta, member alphonse     4

Created on 2020-02-06 by the reprex package (v0.3.0)

Hi, and welcome.

To combine the two data frames, a key column is needed that appears in both. Consider restructuring your data as follows:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)
tib1 <- tibble(citytype = c("city","city","city","city","city", "city", "town", "town"), greek = c("alpha","alpha","alpha","gamma","gamma","gamma", "beta", "beta"), name=c("olivier","darius","edward","maria","jean","alex","roy","alphonse"))
tib2 <- tibble(greek=c("alpha","beta","gamma"),seats=c(1,4,3))
tib1 %>% group_by(greek) %>% inner_join(.,tib2, by = "greek")
#> # A tibble: 8 x 4
#> # Groups:   greek [3]
#>   citytype greek name     seats
#>   <chr>    <chr> <chr>    <dbl>
#> 1 city     alpha olivier      1
#> 2 city     alpha darius       1
#> 3 city     alpha edward       1
#> 4 city     gamma maria        3
#> 5 city     gamma jean         3
#> 6 city     gamma alex         3
#> 7 town     beta  roy          4
#> 8 town     beta  alphonse     4

Created on 2020-02-06 by the reprex package (v0.3.0)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.