You don't need to do it in the same data frame. Think of it as a dim table in a typical relational database.
Keep your original table, perform friendly-naming on the side, join back to original table... Boom!
Wow, this may be the most elegant solution of them all!
For example:
df <- cbind(df, name = paste0("host",xtfrm(df$host_name)))
I learned about xtfrm() because it powers dplyr::desc() ;). There are quite a few useful base R functions with weird names. rle() is another gem.
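A quick sketch of what xtfrm() does with a character vector may help here. As far as I know, the default method falls back to rank(x, ties.method = "min"), so duplicates share an ID but the numbering can have gaps. The host IDs below are made up for illustration, and the match() variant is just one possible way to get consecutive numbers, not something proposed in this thread.

```r
# Hypothetical host IDs, made up for illustration.
hosts <- c("b9f", "a1c", "b9f", "zz7")

# For a character vector, xtfrm() ranks the values with ties.method = "min",
# so repeated values share an ID but the numbering can skip values.
ranks <- xtfrm(hosts)

# Matching against the sorted unique values gives consecutive IDs instead.
dense <- match(hosts, sort(unique(hosts)))

nice_ranked <- paste0("host_", ranks)
nice_dense <- paste0("host_", dense)
print(data.frame(hosts, nice_ranked, nice_dense))
```

Either numbering is consistent within one vector, which is all the friendly names need; the dense version only matters if you want host_1 through host_n without holes.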
The left join doesn't work. I have tried joining using host_name.
Just do:
x$nice_name <- paste0("host_", xtfrm(x$long_weird_name))
# or, with dplyr loaded so the pipe is available
library(dplyr)
x %>%
  mutate(nice_name = paste0("host_", xtfrm(long_weird_name)))
Many thanks for everyone's help with this.
If I had two tables with host names, each showing different columns of data, how can I ensure that the host names will match after converting them in each table separately? The order of the host names differs in each data frame.
Make an inner_join() before converting the names:
library(dplyr)
table_x %>%
  inner_join(table_y, by = "host_name") %>%
  mutate(host_name = paste0("host_", xtfrm(host_name)))
Hmm, for that you need one of the join-based solutions. Make a new data frame with a single column containing all host names (I am assuming not all hosts are in all tables), assign the nice names as described above, and then join this lookup table with your other data frames on the long host name. You can use merge() or, if you prefer, dplyr::left_join() for that.
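The lookup-table idea could look roughly like this in base R. This is only a sketch: table_x, table_y, their columns, and the short host IDs are hypothetical stand-ins, and seq_along() is used for the numbering here instead of xtfrm() simply because the lookup already holds unique names.

```r
# Hypothetical example tables; names and values are made up for illustration.
table_x <- data.frame(host_name = c("b9f", "a1c", "b9f"),
                      cpu = c(10, 20, 30),
                      stringsAsFactors = FALSE)
table_y <- data.frame(host_name = c("zz7", "a1c"),
                      mem = c(1, 2),
                      stringsAsFactors = FALSE)

# One lookup table covering every host seen in either table.
all_hosts <- sort(unique(c(table_x$host_name, table_y$host_name)))
lookup <- data.frame(host_name = all_hosts,
                     nice_name = paste0("host_", seq_along(all_hosts)),
                     stringsAsFactors = FALSE)

# Join the lookup back onto each table by the long host name.
# all.x = TRUE keeps every row of the original table (a left join).
table_x <- merge(table_x, lookup, by = "host_name", all.x = TRUE)
table_y <- merge(table_y, lookup, by = "host_name", all.x = TRUE)

print(table_x)
print(table_y)
```

Because both tables get their nice names from the same lookup, a given host ends up with the same label everywhere, regardless of row order or which hosts are missing from which table.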
At the risk of adding yet another way ...
library(tidyverse)
df <- tibble(host_name = c(
"95b4ae6d890e4c46986d91d7ac4bf08200000W",
"95b4ae6d890e4c46986d91d7ac4bf08200000W",
"95b4ae6d890e4c46986d91d7ac4bf08200000V",
"95b4ae6d890e4c46986d91d7ac4bf08200000V",
"95b4ae6d890e4c46986d91d7ac4bf08200000Z",
"95b4ae6d890e4c46986d91d7ac4bf08200000Z",
"95b4ae6d890e4c46986d91d7ac4bf082000011",
"95b4ae6d890e4c46986d91d7ac4bf082000011",
"95b4ae6d890e4c46986d91d7ac4bf082000011",
"95b4ae6d890e4c46986d91d7ac4bf082000011",
"95b4ae6d890e4c46986d91d7ac4bf08200000H",
"95b4ae6d890e4c46986d91d7ac4bf08200000H"))
df %>% mutate(group = group_indices(., host_name))
#> # A tibble: 12 x 2
#> host_name group
#> <chr> <int>
#> 1 95b4ae6d890e4c46986d91d7ac4bf08200000W 3
#> 2 95b4ae6d890e4c46986d91d7ac4bf08200000W 3
#> 3 95b4ae6d890e4c46986d91d7ac4bf08200000V 2
#> 4 95b4ae6d890e4c46986d91d7ac4bf08200000V 2
#> 5 95b4ae6d890e4c46986d91d7ac4bf08200000Z 4
#> 6 95b4ae6d890e4c46986d91d7ac4bf08200000Z 4
#> 7 95b4ae6d890e4c46986d91d7ac4bf082000011 5
#> 8 95b4ae6d890e4c46986d91d7ac4bf082000011 5
#> 9 95b4ae6d890e4c46986d91d7ac4bf082000011 5
#> 10 95b4ae6d890e4c46986d91d7ac4bf082000011 5
#> 11 95b4ae6d890e4c46986d91d7ac4bf08200000H 1
#> 12 95b4ae6d890e4c46986d91d7ac4bf08200000H 1
Created on 2019-01-10 by the reprex package (v0.2.1)