R converting into more friendly names

You don't need to do it in the same data frame. Think of it as a dim table in a typical relational database.
Keep your original table, perform friendly-naming on the side, join back to original table... Boom!
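A minimal base R sketch of that lookup-table idea, assuming a data frame with a `host_name` column. The `cpu` column and the names `lookup` and `df_named` are made up for illustration:

```r
# Hypothetical example data; the cpu column is invented for illustration.
prefix <- "95b4ae6d890e4c46986d91d7ac4bf082"
df <- data.frame(
  host_name = paste0(prefix, c("00000W", "00000W", "00000V", "00000V")),
  cpu = c(0.2, 0.4, 0.6, 0.8),
  stringsAsFactors = FALSE
)

# "Dim table": one row per distinct host, with a friendly name next to it.
lookup <- data.frame(
  host_name = sort(unique(df$host_name)),
  stringsAsFactors = FALSE
)
lookup$nice_name <- paste0("host_", seq_len(nrow(lookup)))

# Join the friendly names back onto the untouched original table.
# Note: merge() sorts the result by the key column by default.
df_named <- merge(df, lookup, by = "host_name")
```

The original `df` is never modified, so the long identifiers stay available if you need to trace a friendly name back to its source.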

Wow, this may be the most elegant solution of them all!

For example:

df <- cbind(df, name = paste0("host",xtfrm(df$host_name)))

I learned about xtfrm() because it powers dplyr::desc() ;). There are quite a few useful base R functions with weird names. rle() is another gem.
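For anyone curious what these two actually return, here is a small sketch on a toy vector (the values are made up). As far as I know, xtfrm() on a character vector falls back to rank() with ties set to "min", so the numbers are unique per value but can have gaps:

```r
x <- c("b", "b", "a", "c")

# xtfrm() gives a numeric vector that sorts in the same order as x.
# Tied values share the lowest rank, so the next rank is skipped.
ranks <- xtfrm(x)                       # 2 2 1 4
nice  <- paste0("host_", ranks)         # "host_2" "host_2" "host_1" "host_4"

# If you want consecutive numbers instead, one alternative is match()
# against the sorted unique values.
dense <- match(x, sort(unique(x)))      # 2 2 1 3

# rle() collapses consecutive runs of equal values.
runs <- rle(c("a", "a", "b", "a"))
runs$lengths                            # 2 1 1
runs$values                             # "a" "b" "a"
```

Either numbering works as a label; the gaps only matter if you expect the suffixes to run from 1 to the number of hosts.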


The left join doesn't work. I have tried joining on host_name.

Just do:

x$nice_name <- paste0("host_", xtfrm(x$long_weird_name))
# or
x %>% 
  dplyr::mutate(nice_name = paste0("host_", xtfrm(long_weird_name)))

Many thanks for everyone's help with this.

If I had two tables with host names, each showing different columns of data, how can I ensure that the host names will still match after converting them in each table separately? The order of the host names differs in each data frame.

Make an inner_join() before converting the names

library(dplyr)
table_x %>% 
    inner_join(table_y, by = 'host_name') %>% 
    mutate(host_name = paste0("host_", xtfrm(host_name)))

Hmm, for that you need one of the join-based solutions. Make a new data frame with a single column containing all host names (I am assuming not all hosts are in all tables), assign the nice names as described above, and then join this lookup table with your other data frames on the long host name. You can use merge() or, if you prefer, dplyr::left_join() for that.
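Roughly, those steps could look like this in base R. This is only a sketch: `table_x`, `table_y`, and their `cpu` and `mem` columns are hypothetical stand-ins for your two data frames:

```r
# Hypothetical tables: different columns, different host order,
# and not every host appears in both.
prefix <- "95b4ae6d890e4c46986d91d7ac4bf082"
table_x <- data.frame(
  host_name = paste0(prefix, c("00000W", "00000V", "00000Z")),
  cpu = c(0.2, 0.5, 0.9),
  stringsAsFactors = FALSE
)
table_y <- data.frame(
  host_name = paste0(prefix, c("00000Z", "00000H", "00000W")),
  mem = c(8, 16, 32),
  stringsAsFactors = FALSE
)

# One shared lookup table covering every host seen in either table.
all_hosts <- sort(unique(c(table_x$host_name, table_y$host_name)))
lookup <- data.frame(
  host_name = all_hosts,
  nice_name = paste0("host_", xtfrm(all_hosts)),
  stringsAsFactors = FALSE
)

# Left-join the lookup onto each table via the long host name.
table_x_named <- merge(table_x, lookup, by = "host_name", all.x = TRUE)
table_y_named <- merge(table_y, lookup, by = "host_name", all.x = TRUE)
```

Because both tables take their nice names from the same lookup, a given host gets the same label everywhere, regardless of row order or which hosts are missing from which table.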

At the risk of adding yet another way :)...

library(tidyverse)
df <- tibble(host_name = c(
  "95b4ae6d890e4c46986d91d7ac4bf08200000W",
  "95b4ae6d890e4c46986d91d7ac4bf08200000W",
  "95b4ae6d890e4c46986d91d7ac4bf08200000V",
  "95b4ae6d890e4c46986d91d7ac4bf08200000V",
  "95b4ae6d890e4c46986d91d7ac4bf08200000Z",
  "95b4ae6d890e4c46986d91d7ac4bf08200000Z",
  "95b4ae6d890e4c46986d91d7ac4bf082000011",
  "95b4ae6d890e4c46986d91d7ac4bf082000011",
  "95b4ae6d890e4c46986d91d7ac4bf082000011",
  "95b4ae6d890e4c46986d91d7ac4bf082000011",
  "95b4ae6d890e4c46986d91d7ac4bf08200000H",
  "95b4ae6d890e4c46986d91d7ac4bf08200000H"))

df %>% mutate(group = group_indices(., host_name))
#> # A tibble: 12 x 2
#>    host_name                              group
#>    <chr>                                  <int>
#>  1 95b4ae6d890e4c46986d91d7ac4bf08200000W     3
#>  2 95b4ae6d890e4c46986d91d7ac4bf08200000W     3
#>  3 95b4ae6d890e4c46986d91d7ac4bf08200000V     2
#>  4 95b4ae6d890e4c46986d91d7ac4bf08200000V     2
#>  5 95b4ae6d890e4c46986d91d7ac4bf08200000Z     4
#>  6 95b4ae6d890e4c46986d91d7ac4bf08200000Z     4
#>  7 95b4ae6d890e4c46986d91d7ac4bf082000011     5
#>  8 95b4ae6d890e4c46986d91d7ac4bf082000011     5
#>  9 95b4ae6d890e4c46986d91d7ac4bf082000011     5
#> 10 95b4ae6d890e4c46986d91d7ac4bf082000011     5
#> 11 95b4ae6d890e4c46986d91d7ac4bf08200000H     1
#> 12 95b4ae6d890e4c46986d91d7ac4bf08200000H     1

Created on 2019-01-10 by the reprex package (v0.2.1)
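A side note in case group_indices() complains on your dplyr version: as far as I know, newer releases deprecate calling it with arguments like this and suggest cur_group_id() inside a grouped mutate() instead. A base R sketch that should give the same consecutive group numbers, assuming you are happy with groups numbered in sorted order of the host name:

```r
prefix <- "95b4ae6d890e4c46986d91d7ac4bf082"
host_name <- paste0(prefix, c("00000W", "00000W", "00000V", "00000V",
                              "00000Z", "00000Z", "000011", "000011",
                              "00000H"))

# factor() sorts the distinct values into levels; the integer codes
# are then consecutive group numbers.
group <- as.integer(factor(host_name))
group
#> [1] 3 3 2 2 4 4 5 5 1
```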
