I am working with the R programming language. I have the following SQL query:
select b.var1 as var1, b.var2 from
(select *, rank() over( partition by var1 order by var3) as rank1
from my_table)b;
The goal of this SQL code is to:
- Find groups of records containing duplicate values of the same "var1"
- For each of these groups, sort these records based on their values of "var3"
- For each of these groups of sorted duplicate records, only keep the the record with the largest value of "var3"
- Note: Records containing non-duplicate values of "var1" are left untouched
My Question: Does anyone know if it is possible to run this same code in R? For example:
library(RODBC)
library(dbi)
library(odbc)
library(sqldf)
library(SQLite)
dbWriteTable(my_db_connection, SQL(" select b.var1 as var1, b.var2 from
(select *, rank() over( partition by var1 order by var3) as rank1
from my_table)b " ), results_of_this_query)
Does anyone know if it is possible to do this in R? Does R recognize SQL commands like "rank()", "over" and "partition"?
Thanks!