Create character strings with a certain length from a data frame


#1

Hey,
I want to create character strings with a certain maximum length, let's say nchar = 25, from a data frame.

For instance, I have this data frame:

# A tibble: 8,025 x 1
   type     
   <chr>     
 1 G8920      
 2 G7769         
 3 G7720      
 4 G7722     
 5 G8920      
 6 G5434     
 7 G7652
 8 G7547
 9 G7644
10 G5535
# ... with 8,015 more rows

From this data frame I want to produce character strings with a maximum length of 25 characters, as follows:

"G8920 G7769 G7720 G7722"
"G8920 G5434 G7652 G7547"
"G7644 G5535 ... "

And so on, until every row entry is used up.

Any help is appreciated, thanks!


#2

Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

install.packages("reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ.


#3

Hello,

As Mara pointed out, it's important to turn your question into a self-contained reprex.

If I understood correctly, this piece of code might point you in the right direction:

library(dplyr)

x <- tibble(
  col1 = c("G8920","G7769","G7720","G7722","G8920","G5434","G7652","G7547","G7644","G5535")
  )

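x %>%
  # Join each code with the next three; lead() pads with NA past the last row
  mutate(col2 = paste(col1, lead(col1, 1), lead(col1, 2), lead(col1, 3))) %>%
  # Keep every fourth row so that each code is used exactly once
  slice(seq(1, n(), 4)) %>%
  select(col2)
#> # A tibble: 3 x 1
#>   col2                   
#>   <chr>                  
#> 1 G8920 G7769 G7720 G7722
#> 2 G8920 G5434 G7652 G7547
#> 3 G7644 G5535 NA NA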

You're going to have problems with the last rows, because there are no further strings to concatenate, so the final string gets padded with NA. Also, I'm fixing the length at 4 strings or fewer per row instead of checking the 25-character limit.
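If the codes do not all have the same width, or if the trailing NA values are a problem, a greedy approach might work better: keep appending codes to the current string while the result stays within the limit, and start a new string otherwise. Below is a minimal sketch in base R. The helper name pack_strings and its max_width argument are made up for illustration and do not come from any package. It assumes no single code is longer than max_width (a longer one would simply end up on a line of its own) and that there are no empty strings in the input.

```r
# Hypothetical helper (not from any package): greedily packs words into
# space-separated strings of at most `max_width` characters each.
pack_strings <- function(words, max_width = 25) {
  words <- as.character(words)
  lines <- character(0)
  current <- ""
  for (w in words) {
    if (!nzchar(current)) {
      # First word of a new string
      current <- w
    } else if (nchar(current) + 1L + nchar(w) <= max_width) {
      # The word plus a separating space still fits
      current <- paste(current, w)
    } else {
      # The word does not fit: store the finished string and start a new one
      lines <- c(lines, current)
      current <- w
    }
  }
  # Store the last, possibly shorter, string
  if (nzchar(current)) lines <- c(lines, current)
  lines
}

codes <- c("G8920", "G7769", "G7720", "G7722", "G8920",
           "G5434", "G7652", "G7547", "G7644", "G5535")
result <- pack_strings(codes)
print(result)
#> [1] "G8920 G7769 G7720 G7722" "G8920 G5434 G7652 G7547"
#> [3] "G7644 G5535"
```

With the data frame from above you would call it as pack_strings(x$col1). Growing the result vector inside a loop is not the fastest option, but for a few thousand rows it should be fine.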

Hope it helps.

Regards,