In dplyr, I can easily rename columns in a data.frame within my chaining operations by doing things like
%>% rename("newname" = "oldname")
I was wondering how I can do that in data.table. If I use the setnames function, it seems that I need to break my chaining operation and start a new block of code. Maybe something like this:
[...][, rename("newname" = "oldname")]
I believe it is just
[...][, .(newname = oldname)]
and the above is chained.
https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html
Also, if you ever run into a command that won't chain (which is very unusual for data.table, as it is designed to chain), you can use a dot notation (with the wrapr dot-arrow pipe) to try and get out of the corner.
library("wrapr")
library("data.table")
iris %.>% setorderv(., c("Sepal.Length", "Sepal.Width"))[] %.>% head(.)
One can use tricks like %.>% -> .[...] to move in and out of the form.
But honestly I think everything has a chaining version in data.table.
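As a minimal sketch of staying in the chain without any pipe package: setnames() returns its input invisibly, so a wrapped call can be followed directly by another [...] step. The object names dt and res and the summary column mean_www are made up for illustration.

```r
library(data.table)

dt <- as.data.table(iris)

# dt[Species == "setosa"] builds a new data.table, so renaming it by
# reference leaves dt itself untouched. setnames() hands the renamed
# table back, and the next [...] step continues the chain.
res <- setnames(dt[Species == "setosa"], "Petal.Width", "WWW")[
  , .(mean_www = mean(WWW)), by = Species]

names(res)
```

The cost is that the call reads inside-out rather than left-to-right, which is probably why the pipe-based forms feel more natural to people coming from dplyr.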
Will the oldname column be replaced, or does it just create an identical column with a new name?
I am not a data.table expert (yet).
It looks like you have to name every column you want to keep.
The following has two columns.
library("data.table")
as.data.table(iris)[, .(WWW = Petal.Width, Petal.Length)]
One could try this, which keeps the uninvolved columns and replaces the name "Petal.Width" with "WWW":
library("wrapr")
library("data.table")
as.data.table(iris) %.>% setnames(., old = "Petal.Width", new = "WWW")[]
(I have a note here as to why the following does not work.
library("magrittr")
library("data.table")
as.data.table(iris) %>% setnames(., old = "Petal.Width", new = "WWW")[]
The following does work.
library("magrittr")
library("data.table")
as.data.table(iris) %>% {setnames(., old = "Petal.Width", new = "WWW")[]}
)
By doing that, I think the other columns are dropped, which is different from the rename function in dplyr.
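A small sketch comparing the two approaches discussed above, to make the difference in surviving columns concrete. The variable names selected and renamed are illustrative only.

```r
library(data.table)

# Selecting in j: only the columns listed inside .() survive.
selected <- as.data.table(iris)[, .(WWW = Petal.Width, Petal.Length)]

# Renaming with setnames(): every column survives and one name changes.
renamed <- setnames(as.data.table(iris), old = "Petal.Width", new = "WWW")[]

names(selected)
names(renamed)
```

So the .() form behaves more like a select with renaming, while setnames is the closer match to what rename does in dplyr.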
cderv
September 19, 2018, 6:31am
6
Can I ask why you absolutely want it chained "like a pipe"?
Because data.table works by reference, no assignment is needed, so operations are effectively chained by default without any pipe-like operator. Pipe operators are just a way to chain operations without making intermediate assignments.
library(data.table)
iris_dt <- as.data.table(iris)
iris_dt[, is.SETOSA := Species == "setosa"]
setnames(iris_dt, "is.SETOSA", "is_setosa")
iris_dt
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species is_setosa
#> 1: 5.1 3.5 1.4 0.2 setosa TRUE
#> 2: 4.9 3.0 1.4 0.2 setosa TRUE
#> 3: 4.7 3.2 1.3 0.2 setosa TRUE
#> 4: 4.6 3.1 1.5 0.2 setosa TRUE
#> 5: 5.0 3.6 1.4 0.2 setosa TRUE
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica FALSE
#> 147: 6.3 2.5 5.0 1.9 virginica FALSE
#> 148: 6.5 3.0 5.2 2.0 virginica FALSE
#> 149: 6.2 3.4 5.4 2.3 virginica FALSE
#> 150: 5.9 3.0 5.1 1.8 virginica FALSE
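One way to check the by-reference behaviour for yourself is to compare memory addresses before and after the rename, using data.table's address() helper. This is only a sketch; the variable names are made up, and the expectation is that both comparisons come out TRUE because setnames only touches the names attribute.

```r
library(data.table)

iris_dt <- as.data.table(iris)

# Record where the table and one of its columns live before renaming.
table_before  <- address(iris_dt)
column_before <- address(iris_dt[["Petal.Width"]])

# No assignment needed: the rename happens in place.
setnames(iris_dt, "Petal.Width", "petal_width")

# Same table object and same column vector, only the name changed.
same_table  <- identical(table_before, address(iris_dt))
same_column <- identical(column_before, address(iris_dt[["petal_width"]]))
c(same_table, same_column)
```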
You can "pipe" it with the := operator, using one of its multiple-column syntaxes to create a new column with the new name and then delete the old one.
library(data.table)
iris_dt <- as.data.table(iris)
iris_dt[, is.SETOSA := Species == "setosa"][
, `:=`(is_setosa = is.SETOSA, is.SETOSA = NULL)]
names(iris_dt)
#> [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#> [5] "Species" "is_setosa"
Otherwise, as mentioned in a previous post, the %>% pipe should work with setnames, but you may have difficulty continuing the chain without %>%.
library(data.table)
library(magrittr)
iris_dt <- as.data.table(iris)
iris_dt[, is.SETOSA := Species == "setosa"] %>%
setnames(., "is.SETOSA", "is_setosa")
names(iris_dt)
#> [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#> [5] "Species" "is_setosa"
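If you do want to keep going after setnames, one option (a sketch, assuming magrittr is installed) is to stay with %>% and use the dot as the table in the next step. The name counts is made up for illustration.

```r
library(data.table)
library(magrittr)

counts <- as.data.table(iris)[, is.SETOSA := Species == "setosa"] %>%
  setnames(., "is.SETOSA", "is_setosa") %>%
  # The dot stands for the renamed table, so .[...] is an ordinary
  # data.table query on it.
  .[, .N, by = is_setosa]

counts
```

Because the dot is the first argument of the [ call, magrittr uses it in place rather than inserting the left-hand side a second time.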
Indeed, it is "][" that is the correct pipe-operator for data.table (though method chaining may be the term of choice).
dww
January 19, 2019, 7:41pm
9
The best way to take advantage of data.table's pass-by-reference for this is using .SD
as.data.table(iris)[
, is.SETOSA := Species == "setosa"
][, setnames(.SD, "is.SETOSA", "is_setosa")]
dww
January 19, 2019, 7:56pm
10
The := approach will copy the data to a new column, then delete the old column. That is less efficient than setnames. Better to use .SD in a chain.