Convert Rows Into Strings

Hi All,

I have some integer-based survey data: N = 10,000 observations and X = 6 variables. I would like to encode each of the 10,000 observations as a single string built from its 6 variables.

Could someone please show me how to encode this in R, such that rows with duplicate strings take on the same assignment?

Would be greatly appreciated :slight_smile:

# Example data: 12 respondents, five items scored 1-3, plus Sex
dat <- data.frame(
  matrix(
    sample(1:3, 5 * 12, replace = TRUE), 12, 5,
    dimnames = list(1:12, c("X1", "X2", "X3", "X4", "X5"))
  ),
  Sex = rep(c("Male", "Female"), times = 6)
)

Do you mean something like this?

dat <- data.frame(
  matrix(
    sample(1:3, 5 * 12, replace = TRUE), 12, 5,
    dimnames = list(1:12, c("X1", "X2", "X3", "X4", "X5"))
  ),
  Sex = rep(c("Male", "Female"), times = 6)
)
# Collapse each row into one comma-separated string
combo <- apply(dat, MARGIN = 1, function(Vec) paste(Vec, collapse = ","))

cbind(dat, combo)
#>    X1 X2 X3 X4 X5    Sex            combo
#> 1   2  1  3  1  2   Male   2,1,3,1,2,Male
#> 2   2  1  1  2  1 Female 2,1,1,2,1,Female
#> 3   1  2  3  1  1   Male   1,2,3,1,1,Male
#> 4   1  2  1  1  2 Female 1,2,1,1,2,Female
#> 5   3  2  1  3  2   Male   3,2,1,3,2,Male
#> 6   2  3  2  2  1 Female 2,3,2,2,1,Female
#> 7   3  2  3  3  2   Male   3,2,3,3,2,Male
#> 8   2  3  1  3  3 Female 2,3,1,3,3,Female
#> 9   3  1  1  2  3   Male   3,1,1,2,3,Male
#> 10  2  1  1  2  2 Female 2,1,1,2,2,Female
#> 11  3  3  1  2  1   Male   3,3,1,2,1,Male
#> 12  2  3  2  3  1 Female 2,3,2,3,1,Female

Created on 2022-11-14 with reprex v2.0.2
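
The original question also asked that rows with duplicate strings get the same assignment. A minimal sketch of one way to do that, assuming an integer group number per distinct string is what is wanted: `dat_small` and `group` are hypothetical names, and the values are made up so that two rows repeat. It builds the string with `do.call(paste, ...)` rather than `apply()`, because `apply()` first converts the data frame to a character matrix and, as far as I know, that conversion can pad numbers to a common width (2 may become " 2" when the same column also holds 10).

```r
# Hypothetical small data set with one repeated row (rows 1 and 3)
dat_small <- data.frame(
  X1  = c(2, 1, 2, 10),
  X2  = c(1, 3, 1, 1),
  Sex = c("Male", "Female", "Male", "Female")
)

# Paste the columns element-wise; each column is converted on its own
combo <- do.call(paste, c(dat_small, sep = ","))

# Identical strings get the same integer, numbered by first appearance
group <- match(combo, unique(combo))

cbind(dat_small, combo, group)
```

Here rows 1 and 3 both become "2,1,Male" and share group 1. `as.integer(factor(combo))` would also give one number per distinct string, but numbered in sorted order rather than by first appearance.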

Thanks @FJCC.

That's really helpful. Ideally, I am looking to transform the data into string form for string-based clustering using the GrpString package: https://journal.r-project.org/archive/2018/RJ-2018-002/RJ-2018-002.pdf.

But tbh, I am struggling to do this with my own little example. I shall keep trying, however.

Appreciate your help as always :slight_smile:

I know nothing about such data processing, so excuse the dumb question. Do you want all of the answers concatenated with no separator? If so, use

combo <- apply(dat, MARGIN = 1, function(Vec) paste(Vec, collapse = ""))
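
One caveat with dropping the separator, sketched below with made-up values: the result is only unambiguous when every answer is a single character wide. `row_a`, `row_b`, `answers` and `one_char` are hypothetical names used for illustration.

```r
# Two different rows of answers (hypothetical values wider than one digit)
row_a <- c(1, 12)
row_b <- c(11, 2)

# Without a separator both rows collapse to the same string, "112"
identical(paste(row_a, collapse = ""), paste(row_b, collapse = ""))

# A simple guard: only drop the separator if every answer is one character wide
answers  <- data.frame(X1 = c(2, 1, 3), X2 = c(1, 3, 2))
one_char <- all(nchar(as.character(unlist(answers))) == 1)
if (one_char) {
  combo <- do.call(paste, c(answers, sep = ""))
}
```

With answers limited to 1, 2 and 3, as in `dat`, the guard passes and the concatenation is safe.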

Hi @FJCC.

Not dumb at all. My question wasn't well expressed in the first instance, given some unknown unknowns on my part.

However, I think in the example below, they have what looks like data in long form and then use the EveString function to transform the long data into sets of strings?

I'll try to mimic that with the `dat` example provided above.

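install.packages("GrpString")
library(GrpString)

# Key that pairs each event name with a single character
data(eventChar.df)
# Example file of event names that ships with the package
event1d <- file.path(path.package("GrpString"), "extdata", "eve1d.txt")
# Convert the event names in the file to strings using the key
EveString(event1d, eventChar.df$event, eventChar.df$char)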
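
The call above is given a key made of an `event` column and a `char` column. The same kind of lookup can be sketched in base R for wide data shaped like `dat`. This is only an illustration of the idea, not what `EveString` does internally: `key.df`, `answers` and the letters A to C are hypothetical choices.

```r
# Hypothetical key: one character per answer code
key.df <- data.frame(event = 1:3, char = c("A", "B", "C"))

# Hypothetical wide answers, one row per respondent
answers <- data.frame(X1 = c(2, 1, 2), X2 = c(1, 3, 1), X3 = c(3, 3, 3))

# For each row, look up the character for every answer and join them
strings <- apply(
  as.matrix(answers), 1,
  function(v) paste(key.df$char[match(v, key.df$event)], collapse = "")
)

strings
```

Rows 1 and 3 have the same answers, so they produce the same string ("BAC"), while row 2 gives "ACC".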
