Arrangement of columns in a data frame

kuttan98 · March 24, 2022, 6:09am

I have a survey data where the scoring columns have been created next to the response column. But there is a column for the total score (here in the below data it is l1c1_total). But after I create the scoring columns the position of this column changes and it comes to the front of the data. Can I do something so that the l1c_total is does not come to the front?

library(tidyverse)
data1<-tibble::tribble(
  ~Student_ID, ~l1c1_identify_pic1, ~l1c1_identify_pic2, ~l1c1_identify_pic3, ~l1c1_identify_pic4, ~l1c1_total,
  "S001",                  1L,                  1L,                  1L,                  1L,          6L,
  "S002",                  1L,                  0L,                  1L,                  1L,          4L,
  "S003",                  1L,                  0L,                  1L,                  0L,          6L,
  "S004",                  1L,                  0L,                  1L,                  0L,          4L,
  "S005",                  0L,                  1L,                  1L,                  0L,          2L,
  "S006",                  0L,                  0L,                  1L,                  0L,          4L,
  "S007",                  1L,                  1L,                  1L,                  0L,          6L,
  "S008",                  0L,                  1L,                  1L,                  0L,          6L,
  "S009",                  1L,                  1L,                  0L,                -99L,          4L,
  "S010",                  0L,                  0L,                -99L,                  0L,          4L,
  "S011",                  1L,                  0L,                 99L,                  0L,          4L,
  "S012",                  0L,                  1L,                -99L,                  0L,          6L,
  "S013",                  0L,                  0L,                  1L,                  0L,          6L,
  "S014",                  0L,                  0L,                  1L,                  0L,          6L,
  "S015",                  1L,                  1L,                  1L,                  1L,          6L
)
#Greate groups of columns with different scoring rules
scoring1 = c("l1c1_identify_pic1","l1c1_identify_pic2")
scoring2 = c("l1c1_identify_pic3","l1c1_identify_pic4")

#Create the scoring columns
scoringCols = c(scoring1, scoring2)

scores = data1 %>% select(all_of(scoringCols)) %>% 
  #Set the rules for each scoring group
  mutate(across(all_of(scoring1), function(x){ifelse(x==1, 2, 0)}),
         across(all_of(scoring2), function(x){ifelse(x == 1, 1, 0)}))

#Edit the new col names
colnames(scores) = paste(colnames(scores), "score", sep = "_")

#Bind the data together and sort the scoring columns next to the original
data2 = bind_cols(data1, scores)
data2 = data2 %>% 
  select(-all_of(c(scoringCols, colnames(scores))), sort(colnames(data2)))

data2
#> # A tibble: 15 x 10
#>    Student_ID l1c1_total l1c1_identify_pic1 l1c1_identify_pic1~ l1c1_identify_p~
#>    <chr>           <int>              <int>               <dbl>            <int>
#>  1 S001                6                  1                   2                1
#>  2 S002                4                  1                   2                0
#>  3 S003                6                  1                   2                0
#>  4 S004                4                  1                   2                0
#>  5 S005                2                  0                   0                1
#>  6 S006                4                  0                   0                0
#>  7 S007                6                  1                   2                1
#>  8 S008                6                  0                   0                1
#>  9 S009                4                  1                   2                1
#> 10 S010                4                  0                   0                0
#> 11 S011                4                  1                   2                0
#> 12 S012                6                  0                   0                1
#> 13 S013                6                  0                   0                0
#> 14 S014                6                  0                   0                0
#> 15 S015                6                  1                   2                1
#> # ... with 5 more variables: l1c1_identify_pic2_score <dbl>,
#> #   l1c1_identify_pic3 <int>, l1c1_identify_pic3_score <dbl>,
#> #   l1c1_identify_pic4 <int>, l1c1_identify_pic4_score <dbl>

^{Created on 2022-03-24 by the reprex package (v2.0.1)}

duringju211 · March 24, 2022, 11:48am

Hi!

You could change the ordering in your select statement. Or you could have a look at dplyr::relocate.

library(tidyverse)

relocate(diamonds, carat, .after = last_col())
#> # A tibble: 53,940 × 10
#>    cut       color clarity depth table price     x     y     z carat
#>    <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#>  1 Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43  0.23
#>  2 Premium   E     SI1      59.8    61   326  3.89  3.84  2.31  0.21
#>  3 Good      E     VS1      56.9    65   327  4.05  4.07  2.31  0.23
#>  4 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63  0.29
#>  5 Good      J     SI2      63.3    58   335  4.34  4.35  2.75  0.31
#>  6 Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48  0.24
#>  7 Very Good I     VVS1     62.3    57   336  3.95  3.98  2.47  0.24
#>  8 Very Good H     SI1      61.9    55   337  4.07  4.11  2.53  0.26
#>  9 Fair      E     VS2      65.1    61   337  3.87  3.78  2.49  0.22
#> 10 Very Good H     VS1      59.4    61   338  4     4.05  2.39  0.23
#> # … with 53,930 more rows

Created on 2022-03-24 by the reprex package (v2.0.1)

kuttan98 · March 26, 2022, 5:00am

Thanks a lot. I will try this out.

Regards,
NP

kuttan98 · March 26, 2022, 10:03am

Just an additional question here. Can I use this to relocate multiple columns of a data frame?
Like

diamonds %>%
relocate(carat,.after=last_col(),
                   z,,after=x)

duringju211 · March 28, 2022, 6:46am

Yes you can. Try this for example:

library(tidyverse)

diamonds %>%
  relocate(c(carat, cut), .after = last_col())
#> # A tibble: 53,940 x 10
#>    color clarity depth table price     x     y     z carat cut      
#>    <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <ord>    
#>  1 E     SI2      61.5    55   326  3.95  3.98  2.43  0.23 Ideal    
#>  2 E     SI1      59.8    61   326  3.89  3.84  2.31  0.21 Premium  
#>  3 E     VS1      56.9    65   327  4.05  4.07  2.31  0.23 Good     
#>  4 I     VS2      62.4    58   334  4.2   4.23  2.63  0.29 Premium  
#>  5 J     SI2      63.3    58   335  4.34  4.35  2.75  0.31 Good     
#>  6 J     VVS2     62.8    57   336  3.94  3.96  2.48  0.24 Very Good
#>  7 I     VVS1     62.3    57   336  3.95  3.98  2.47  0.24 Very Good
#>  8 H     SI1      61.9    55   337  4.07  4.11  2.53  0.26 Very Good
#>  9 E     VS2      65.1    61   337  3.87  3.78  2.49  0.22 Fair     
#> 10 H     VS1      59.4    61   338  4     4.05  2.39  0.23 Very Good
#> # ... with 53,930 more rows

^{Created on 2022-03-28 by the reprex package (v2.0.1)}

kuttan98 · March 28, 2022, 8:56am

A follow-up question, what if I need to locate variable "carat" after "z" and the variable "cut" after "x"? Is that possible?

duringju211 · March 28, 2022, 9:17am

There are basically no limitations on what you can do:

library(tidyverse)

diamonds %>%
  relocate(carat, .after = z) %>%
  relocate(cut, .after = x)
#> # A tibble: 53,940 x 10
#>    color clarity depth table price     x cut           y     z carat
#>    <ord> <ord>   <dbl> <dbl> <int> <dbl> <ord>     <dbl> <dbl> <dbl>
#>  1 E     SI2      61.5    55   326  3.95 Ideal      3.98  2.43  0.23
#>  2 E     SI1      59.8    61   326  3.89 Premium    3.84  2.31  0.21
#>  3 E     VS1      56.9    65   327  4.05 Good       4.07  2.31  0.23
#>  4 I     VS2      62.4    58   334  4.2  Premium    4.23  2.63  0.29
#>  5 J     SI2      63.3    58   335  4.34 Good       4.35  2.75  0.31
#>  6 J     VVS2     62.8    57   336  3.94 Very Good  3.96  2.48  0.24
#>  7 I     VVS1     62.3    57   336  3.95 Very Good  3.98  2.47  0.24
#>  8 H     SI1      61.9    55   337  4.07 Very Good  4.11  2.53  0.26
#>  9 E     VS2      65.1    61   337  3.87 Fair       3.78  2.49  0.22
#> 10 H     VS1      59.4    61   338  4    Very Good  4.05  2.39  0.23
#> # ... with 53,930 more rows

^{Created on 2022-03-28 by the reprex package (v2.0.1)}

kuttan98 · March 28, 2022, 10:09am

This was ok. i just wondered whether it could be done with a single function as I have many columns to transfer. But ok repetition is fine.

Regards,
NP

system · April 4, 2022, 10:10am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.