Order multiple date columns in R

I have a data set with many date columns which have index. How to easily order them in the dataframe. Appreciate the help. Thanks

data <- data.frame(x1 = 1:10,
                    date3 = c("2020-01-25", "2021-02-10","2021-03-15","2020-04-09","2021-05-11","2020-06-07","2021-08-08", NA,"2020-10-18", "2021-11-11"),
                    date1 = c("2020-02-25", "2021-04-10","2021-04-15","2020-05-09","2021-06-11","2020-07-07","2021-09-08", NA,"2020-11-18", "2021-12-11"),
                    date4 = c("2020-01-25", "2021-02-10","2021-03-15","2020-04-09","2021-05-11",NA,"2021-09-08", NA,"2020-05-18", "2021-10-11"),
                    date2 = c("2020-01-25", "2021-02-10","2021-03-15","2020-04-09","2021-05-11","2020-06-07","2021-08-08", NA,"2020-10-18", "2021-11-11"),
                    x4 = factor(c("B", "B", "A", "F", "A", "B", "A", "B","A", "B")),
                    stringsAsFactors = FALSE)
 # one way is this, but is there any other easy way?
 data<-data.table::setcolorder(data, c("x1","date1","date2","date3","date4"


data %>% 
  select(.,sort(names(.))) %>% 
1 Like

Hi nigrahamuk. Thank you very much for your response and help . It worked with adding dplyr for this example (see below), but it did not work with my actual data which has over 20 transposed date columns. Thank you again.

    data %>% 
     dplyr::select(.,sort(names(.))) %>% 

it doesn't work means that
a) it throws an error ?
--- please share the error
b) it gave incorrect results
--- please describe how the results are incorrect

even better may be to make a reprex of your issue

Without "dplyr" it gives the below error for this example. For my actual data, even with adding dplyr does not change the order of date (n=20) columns in the data. Thanks

Error in select(., x1, starts_with("date")) : 
  unused arguments (x1, starts_with("date"))



to see what name collision you are having with select such that it doesnt work unless you add dplyr:: first.
anyhow, given that you can prefix with dplyr::
it seems you will need to add additional code to account for the fact that the way r sorts strings will put '19' before '2' etc. the classic way to solve is to pad with left zeros. here is code that shows that.

d1 <- data.frame(
  stringsAsFactors = FALSE,
  x1 = c(1L, 2L, 3L),
  date3 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date1 = c("2020-02-25", "2021-04-10", "2021-04-15"),
  date4 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date2 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date9 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date10 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date11 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date19 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  date12 = c("2020-01-25", "2021-02-10", "2021-03-15"),
  x4 = as.factor(c("B", "B", "A"))


  .fn = ~ {
    paste0("date", str_pad(parse_number(.x), 2, pad = "0"))
  .cols = starts_with("date")
) %>%
  dplyr::select(., sort(names(.))) %>%
  dplyr::select(x1, starts_with("date"))
1 Like

Thank you very much. It worked.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.