Recently I have a task of getting data from web API and write them to database. The data are updating everyday. There is a unique identifier column called serial_no and a column called updated_time. So I need to update the database table with data got from web API based on the serial_no column.
It's just like a MERGE statement in SQL Server. I'm wondering how to do this task in R and what's the best practice?
This is more of a database best practice than R best practice
You did not mention what database backend you are using, and it is certain that the optimal solution will vary somewhat. The code below is using Postgres dialect.
In general I would suggest using two tables - let us call them a staging layer and core layer.
dbSendQuery(con, "insert into api_results
select * from stg_api_results
on conflict (serial_no) do update
set updated_time = current_timestamp")
Also worth considering (though likely not in your use case) is on conflict do nothing, variant which is helpful in enforcing unique values (only new rows are inserted).
I am assuming an unique constraint on the field serial_no.
Send a sql statement to database to insert and update table as need
I rarely do SQL DML operation with the help of R. And it seems R is not very suitable to do this task. Maybe I shoud use Python to do the similar task next time after knowing more about it.
Thanks for your reply. And I gained more confidence of R.
Thanks for your kind comments; the process I suggested is in fact language independent (it is about data architecture & executing SQL commands, not some fancy dplyr pipes). Both R and python will give you comparable results, so use whichever you have the most confidence in.
If you are using MS SQL Server my suggestion would be to set up the two tables - stage for current results and historized one with unique constraint for the end result - via Visual Studio. It is well suited for this kind of work. As is often the case with Microsoft products their tools are well integrated (and somewhat of a pain to access from outside the walled garden).