Adding new rows based on second data either grouped or by join...?

Being honest, I guess there's a simple way to do this but I really can't figure out what to do. Here is the problem:

Suppose we have this data:

    Name = c("James", "James", "James", "Ana", "Ana"),
    Fruits = c("Apple", "Orange", "Watermelon", "Grapes", "Apple")


#> A tibble: 5 x 2
#> Name  Fruits    
#>  <chr> <chr>     
#>1 James Apple     
#>2 James Orange    
#>3 James Watermelon
#>4 Ana   Grapes    
#>5 Ana   Apple  

And this is the second data:

    Market_Foods = c("Apple", "Orange", "Watermelon", "Grapes", "Peach", "Avocado")


# A tibble: 6 x 1
#>  Market_Foods
#>  <chr>       
#>1 Apple       
#>2 Orange      
#>3 Watermelon  
#>4 Grapes      
#>5 Peach       
#>6 Avocado   

What I want to do is to do some kind of join where I can list whether James or Ana have any fruits from the second data.....Something like this:

#> A tibble: 12 x 3
#>   Name  Has        Does_not_have
#>  <chr> <chr>      <chr>        
#> 1 James Apple      NA           
#> 2 James Orange     NA           
#> 3 James Watermelon NA           
#> 4 James NA         Grapes       
#> 5 James NA         Peach        
#> 6 James NA         Avocado      
#> 7 Ana   Grapes     NA           
#> 8 Ana   Apple      NA           
#> 9 Ana   NA         Orange       
#>10 Ana   NA         Watermelon   
#>11 Ana   NA         Peach        
#>12 Ana   NA         Avocado   

Or maybe it is better to have it like this:

#> A tibble: 6 x 3
#> Name  Has_or_Not Fruits
#>  <chr> <chr>      <chr> 
#>1 James Has        Apple 
#>2 James Has        Orange
#>3 James Not        Grapes
#>4 Ana   Not        Orange
#>5 Ana   Has        Grapes
#>6 Ana   Not        Apple 
#>.....        .....            ......
#>.....        .....            ......

I did some inner joins with transmute but no luck :frowning:

Do you have any idea of how to solve this?

Thanks in advance!

I have always been helped by everyone in the Rstudio community.
I would like to contribute as well and will send you the code.


tbl_1<- tibble(
  Name = c("James", "James", "James", "Ana", "Ana"),
  Fruits = c("Apple", "Orange", "Watermelon", "Grapes", "Apple")

tbl_2<- tibble(
  Market_Foods = c("Apple", "Orange", "Watermelon", "Grapes", "Peach", "Avocado")

left_join(tbl_1,tbl_2,by=c("Fruits"="Market_Foods")) %>% 
  mutate(has_or_not_flag = 1) %>% 
  complete(Name,Fruits,fill = list(has_or_not_flag=0)) %>% 
  mutate(has_or_not=ifelse(has_or_not_flag == 1, "has","not"))

I think that you can use the join_* function to join tables.

complete() is a function that allows you to create patterns that you don't have.
If you don't specify fill, it becomes NA, but if you specify 0 in fill, it becomes a flag that can be used in conditional statements.

  Name  Fruits     has_or_not_flag has_or_not
  <chr> <chr>                <dbl> <chr>     
1 Ana   Apple                    1 has       
2 Ana   Grapes                   1 has       
3 Ana   Orange                   0 not       
4 Ana   Watermelon               0 not       
5 James Apple                    1 has       
6 James Grapes                   0 not       
7 James Orange                   1 has       
8 James Watermelon               1 has 

Is that what you wanted?

1 Like

Thank you!!!!! This is exactly what I wanted, the complete function is totally new for me :slight_smile:

P.S. that's why I love R ... because the community is amazing!

This community is wonderful.
I hope you enjoy R.
And Let's help together those in need.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.