Difficulty with join

fcas80 · February 13, 2021, 10:16pm

Hi. I am having trouble visualizing how to do the join I want.

I have a table t that looks like this:

Person Commission
====== ==========
Art 10%
Bill 15%
Ed 17%
Sue 20%

I have a data frame df that looks like this:

word1 word2 word3 word4
===== ===== ===== =====
Art sold a book
Sue sold a painting
Bill and Sue sold

I would like to join t and df and get 10% in row 1, 20% in row 2, and it is possible to get both 15% and 20% in row3?

t <- data.frame(Person=c("Art","Bill","Ed","Sue"), Commission=c(10%, 15%, 17%, 20%))
df <- data.frame(word1=c("Art","Sue","Bill"), word2=c("sold","sold","and"), word3=c("a","a","Sue"), word4=c("book","painting","sold"))

andresrcs · February 13, 2021, 11:14pm

Can you provide a sample of the output as well?, it is not clear what kind of output you expect, specially for row3.

CALUM_POLWART · February 14, 2021, 10:57am

A solution for the example is quite simple, bit not sure how scalable it would be

df %>%
  left_join( t, by = c("word1" = "Person") %>%
  left_join(t, by = c("word2" = "Person")

fcas80 · February 14, 2021, 3:15pm

Calum, your code gives me an error "unexpected end of input".

Also, I am hoping the joining can be with the rows of df, not with its columns.

fcas80 · February 14, 2021, 3:19pm

Andresrcs

row1: Art .10 in two columns
row2: Sue .20 in two columns
row3: Bill .15 Sue .20 in four columns?

andresrcs · February 14, 2021, 3:52pm

A proper dataframe in code would be easier to grasp, it is always better to make technical questions in a reproducible manner.

CALUM_POLWART · February 14, 2021, 4:57pm

My bad:

df %>%
  left_join( t, by = c("word1" = "Person")) %>%
  left_join(t, by = c("word2" = "Person"))

CALUM_POLWART · February 14, 2021, 5:15pm

And based on what you replied:

t <- data.frame(Person=c("Art","Bill","Ed","Sue"), 
                Commission=c("10%", "15%", "17%", "20%"))
df <- data.frame(word1=c("Art","Sue","Bill"), 
                 word2=c("sold","sold","and"), 
                 word3=c("a","a","Sue"), 
                 word4=c("book","painting","sold"))

require(dplyr)

df %>%
    left_join( t, by = c("word1" = "Person")) %>%
    left_join(t, by = c("word3" = "Person")) %>%
    select(word1, word3, Commission.x, Commission.y)

word1 word3 Commission.x Commission.y
1 Art a 10%
2 Sue a 20%
3 Bill Sue 15% 20%

nirgrahamuk · February 14, 2021, 7:25pm

I amused myself going a little overboard on this...

library(tidyverse)
t <- tibble(
  Person = c("Art", "Bill", "Ed", "Sue"),
  Commission = c("10%", "15%", "17%", "20%")
)

df <- tibble(
  word1 = c("Art", "Sue", "Bill"),
  word2 = c("sold", "sold", "and"),
  word3 = c("a", "a", "Sue"),
  word4 = c("book", "painting", "sold")
)


(wf1 <- map(
  names(df),
  ~ {select(df, .x) %>%
      rename("Person" = .x)}
))

(wf2 <- map(
  wf1,
  ~ {
    if (any(.$Person %in% t$Person)) {
      .
    } else {
      NULL
    }}
))
(wf3 <- wf2[lengths(wf2)])


(comframes <- map(
  wf3,  ~ left_join(., t)
))

bind_cols(comframes) %>%
  select(
    starts_with("Person"),
    starts_with("Comm")
  ) %>%
  select_if(~ !all(is.na(.)))

# A tibble: 3 x 4
  Person...1 Person...3 Commission...2 Commission...4
  <chr>      <chr>      <chr>          <chr>         
1 Art        Art        10%            10%           
2 Sue        Sue        20%            20%           
3 Bill       Bill       15%            15%

system · March 7, 2021, 7:26pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.