SQL Joins Across connections

Is there a way to do a join across different database connections - I can't really do a true reprex - due to the need to connect to a database, but I can give an Idea of what I do within my rmarkdown documents.

first I connect to database 1 to pull a list of items.

```{sql connection=con_desc, output.var = "Item_Desc"}

/*pull item features*/

SELECT 
   I.ITEM_CODE AS Item_Code                                                               
 , Itm.UOM_ID AS Quant_Measure                                                          
  FROM #Database1 I                                                  
    ;

Then pull sales from another database (on a different server/with a different connection)


SELECT
SID.ITEM_ID AS Item_Code,
SUM(SID.SALES_VOLUME) AS Sales_Volume
FROM 
Database2 AS SID 
WHERE 
GROUP BY
SID.ITEM_ID

I'll then merge them together with a simple join in an r chunk

Item_Desc %>% 
  inner_join(Item_Sales, by = "Item_Code")

This works - but I'd really be able to join the tables together directly. Without having to do the intermediate steps of saving the results from each connection to a variable, and then merging in an r chunk.

While I generally just call SQL directly from rmarkdown documents, it looks like you would have the same issues in dbplyr since you specify the connection up front. Is this just how it is, or is there a possible work around?

I would say that this is just how it is. I'm not sure there is anything anyone can do to magically join two things that are not connected in any way.

What you can do is get your first result (or whichever is smaller) and then copy it to a second connection (there is dplyr::copy_to function) and do joins there. Not sure if that helps in any way since you still will have to export and then import data, so it might be faster to import two datasets in memory and then join, as you are doing now. But you can test to see if there is any difference.

1 Like

Another option would be to use Foreign Data Wrappers in one of your databases (This is supported on postgresql, not sure about other sql servers) and just query against that server.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.