Pass overwrite=TRUE to dplyr::semi_join()

As I'm doing some analysis in R, I need to be able to do a ````semi_join()``` where I get all values in an Oracle database table where there's a matching row in a local R tibble.

Normally, I'd do that like this:

    con  <- DBI::dbConnect(odbc::odbc(), "THEDATABASE")
    db_tbl  <- tbl(con, in_schema("OTHER_USER", "table_I_care_about")

    local_tbl  <- tibble(x = 1:5, y = 5:10)

    new_tbl_from_db  <- db_tbl %>%
        semi_join(local_tbl, by = 'x', copy =T)

Sometimes this works. But often, I get this error:
Error: Table "dbplyr_001" exists in database, and both overwrite and append are FALSE (Of course, it's not always dbplyr_001--it could be just about any number).

If I look in my schema in the database, I see that a table named dbplyr_001 exists, which I'm not surprised about. I don't mind dbplyr writing little temporary tables. I've tried manually inserting values into this table, which works fine too.

What I can't figure out how to do is to specify overwrite=TRUE. I've tried passing that as a further argument to semi_join(), but still get the same error. Is there an easy way to pass overwrite=TRUE or append=TRUE via dplyr::semi_join()?

PS. I asked about this on StackOverflow and got some decent workaround advice, particularly the workaround I've been using: pass local_tbl$x in just a variable and do something like:

     db_tbl %>%
           filter(x %in% local_tbl$x)

This is fine advice and a decent workaround. But if I want to do a left_join(), it's no help. And of course, it's still a workaround--I'd like to figure out a way to do it properly.

 db_tbl  <- tbl(con, in_schema("OTHER_USER", "table_I_care_about",
                          overwrite = TRUE)

I for sure don't 'know' but I'd be tempted to try

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.

Hm. This is a really interesting idea---I tried it a couple times and it seemed to work (despite the fact that I don't have (or SHOULDN'T HAVE!) write access to table_I_care_about).

I suppose I need to get in and check the dplyr-generated tables in my schema, see if they being changed, to confirm if this is working.