How to use join of dplyr in r?

I need help regarding this database European Soccer Database | Kaggle I need a table of the date , the hometeam name and the goals the hometeam scored , for a random game ( I chose Match.match_api_id = 492476). I want to get this table in two ways: by an SQL statement and by r using the dplyr package. The SQL statement is the following :

SELECT Match.date ,Team.team_long_name ,Match.home_team_goal FROM Team JOIN Match ON Match.home_team_api_id = Team.team_api_id WHERE Match.match_api_id = 492476;

I am struggling with dplyr though. Question: In order to use join in r , to join two lazy querys , they must both have a column with the same name, so the join is done by that column? And also should one of theese two columns be the foreign key of this query and correspond to the column of the other query with the same name?

I run this code

`con <- DBI::dbConnect(RSQLite::SQLite(), "data/database.sqlite")
country<-tbl(con,"Country")
league<-tbl(con,"League")
match<-tbl(con,"Match")
team<-tbl(con,"Team")
team_attributes<-tbl(con,"Team_Attributes")
player_attributes<-tbl(con,"Player_Attributes")
player<-tbl(con,"Player")

table_1<-match %>%
  filter(match_api_id=492476) %>%
  select(date,home_team_goal,home_team_api_id) %>%
  left_join(team)`


and get this error

! `by` required, because the data sources have no common variables. Run `rlang::last_error()` to see where the error occurred.

* Run `rlang::last_error()` to see where the error occurred.

The two data frames do not have to have columns with the same name but in that case you have to tell left_join which column to match. Try something like

table_1<-match %>%
  filter(match_api_id=492476) %>%
  select(date,home_team_goal,home_team_api_id) %>%
  left_join(team, by = c(home_team_api_id = "team_api_id")

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.