Convert a tibble of list columns to a data frame

I need to reshape a data frame from long to wide. After reshaping, the result is a tibble [1 x 7] in which each column is a list. I want to convert this tibble to a data frame. With the code below, only the first 3 columns (lists) can be converted; the rest fail because the lengths of these columns (lists) are different. Is there a way to convert such a tibble to a data frame?

library(tidyverse)
test <- data.frame(
  stringsAsFactors = FALSE,
  cat = c("15918-OP-18d","15918-OP-18d",
          "15918-OP-18d","15918-OP-18d","15918-OP-18d",
          "15918-OP-18d","15918-OP-18d","15918-OP-18d","15918-OP-18d",
          "15918-OP-18d","15918-OP-18d","15918-OP-18d",
          "15918-OP-18d","15918-OP-18d","15918-OP-18d","15918-OP-18d",
          "15918-OP-18d","15918-OP-18d","15918-OP-18d",
          "15918-OP-18d","15923-OP-23","15923-OP-23","15923-OP-23",
          "15923-OP-23","15923-OP-23","15923-OP-23","15923-OP-23",
          "15923-OP-23","15923-OP-23","15923-OP-23","15923-OP-23",
          "15923-OP-23","15923-OP-23","15923-OP-23","15923-OP-23",
          "15923-OP-23","15923-OP-23","15923-OP-23",
          "15923-OP-23","15923-OP-23","15976-VE-6","15976-VE-6",
          "15976-VE-6","15976-VE-6","15976-VE-6","15976-VE-6",
          "15976-VE-6","15976-VE-6","15976-VE-6","15976-VE-6",
          "16055-PC-05","16055-PC-05","16055-PC-05","16055-PC-05",
          "16055-PC-05","16055-PC-05","16055-PC-05",
          "16055-PC-05","16055-PC-05","16055-PC-05","16055-PC-05",
          "16055-PC-05","16055-PC-05","16055-PC-05","16055-PC-05",
          "16055-PC-05","16055-PC-05","16055-PC-05","16055-PC-05",
          "16055-PC-05","14854-HB-5e","14854-HB-5e",
          "14854-HB-5e","14854-HB-5e","14854-HB-5e",
          "14854-HB-5e","14854-HB-5e","14854-HB-5e",
          "14854-HB-5e","14854-HB-5e","14854-HB-5e","14854-HB-5e",
          "14854-HB-5e","14854-HB-5e","14854-HB-5e",
          "16215-EE-1a","16215-EE-1a","16215-EE-1a","16215-EE-1a",
          "16215-EE-1a","16215-EE-1a","16215-EE-1a",
          "16215-EE-1a","16215-EE-1a","16215-EE-1a","16215-EE-1a",
          "16215-EE-1a","16215-EE-1a","16215-EE-1a","16215-EE-1a",
          "16580-eC-3","16580-eC-3","16580-eC-3","16580-eC-3",
          "16580-eC-3","16580-eC-3","16580-eC-3",
          "16580-eC-3","16580-eC-3","16580-eC-3","16580-eC-3",
          "16580-eC-3","16580-eC-3","16580-eC-3",
          "16580-eC-3"),
  score = c(250.667,408,226.118,192,
           364.833,371.167,290.6,398.857,437.6,467.6,423.8,
           402.875,264.667,302.833,137,251.265,299.421,370,578.5,
           343.833,0,59.1,100,92.3,40,100,0,50,87.5,38.5,
           84,84.2,66.7,40,85.7,50,70.6,100,100,83.3,0,0,
           9.1,0,7.1,12.5,7.7,11.1,0,16.7,69.8,57.3,55.6,
           50.9,42.2,87.4,84.9,31.8,73.6,69.7,75.3,80,66,
           53,66,70.7,69.3,61,45.5,22.3,0,50,50,0,80,0,
           75,100,67.7,66.7,50,100,50,0,100,710.902,
           352.768,266.309,375.199,352.045,346.25,387.726,298.459,
           252.964,288.243,395.552,329.736,279.374,394.579,
           322.825,100,0,50,40,57.8947368421053,93.3333333333333,
           66.6666666666667,21.0526315789474,100,69.2307692307692,
           95.6521739130435,16.6666666666667,76,
           79.4871794871795,100)
)
wkdat <- test %>% pivot_wider(names_from = cat, values_from = score)
str(wkdat)
datok = data.frame(op18 = wkdat[[1]],
                   op23 = wkdat[[2]],
                   ve6  = wkdat[[3]])
                  #  pc5=wkdat[[4]],
                  #  hb5e=wkdat[[5]],
                  # ee1a=wkdat[[6]],
                  # ec3=wkdat[[7]]
                  #  )

names(datok)[1] <- "op18"
names(datok)[2] <- "op23"
names(datok)[3] <- "ve6"
# names(datok)[4] <- "pc5"
# names(datok)[5] <- "hb5e"
# names(datok)[6] <- "ee1a"
# names(datok)[7] <- "ec3"
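
A note on why the first three columns seem to work while the others fail: as far as I can tell, data.frame() silently recycles a shorter column when the longest length is an exact multiple of it (20 and 10 here), and only errors otherwise (20 and 15). The sketch below uses made-up toy lists in place of the pivot_wider() output; the names a, b and c3 are hypothetical.

```r
# Toy stand-ins for the list columns; the values are made up.
a  <- list(c(1, 2, 3, 4))   # length 4, like a 20-score category
b  <- list(c(5, 6))         # length 2: 4 is a multiple of 2, so it recycles
c3 <- list(c(7, 8, 9))      # length 3: 4 is not a multiple of 3

ok <- data.frame(a = a, b = b)   # works, but b is silently repeated
names(ok) <- c("a", "b")
ok$b                             # 5 6 5 6

# The mismatched lengths raise an error, captured here with try()
bad <- try(data.frame(a = a, c3 = c3), silent = TRUE)
inherits(bad, "try-error")       # TRUE
```

So the third column (ve6) in datok probably holds each of its 10 scores twice, which is unlikely to be what was intended.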



Have a look at this: converting efficiently between data.table, data.frame and tibble

The point of data frames is to hold related data.
Objects share a row because they relate to the same thing,
like facts about a person, where each row is a person.
It's not clear how your cats and scores relate to each other, if they even do...
If you just want to smash them all together, with NAs hanging off the end, you could construct that like so. But this assumes that the data is somewhat arbitrary, as it does not respect any relationships...


# one data frame per category
(my_groups <- group_split(test, cat))

# number of rows in the largest group
(largest_group_length <- max(purrr::map_int(my_groups, nrow)))

# pad every shorter group with NA scores up to that length
(extend_groups <- purrr::map(my_groups, ~{
  rows_to_add <- largest_group_length - nrow(.x)
  if (rows_to_add > 0) {
    cat_to_use <- unique(.x$cat)
    added_rows <- data.frame(
      cat = cat_to_use,
      score = rep(NA_real_, rows_to_add))
    return(bind_rows(.x, added_rows))
  }
  return(.x)
}))

# smash together (duplicate column names are repaired with a message)
dplyr::bind_cols(extend_groups)
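
If only the scores are needed, one column per category, the same padding idea can be sketched more compactly in base R. This is just an alternative under the same assumption that rows in different columns are unrelated; toy is a small made-up stand-in for test, and by_cat, n_max, padded and wide are hypothetical names.

```r
# Small made-up stand-in for `test` (columns `cat` and `score`)
toy <- data.frame(cat   = c("a", "a", "a", "b", "b", "c"),
                  score = c(1, 2, 3, 4, 5, 6))

by_cat <- split(toy$score, toy$cat)   # named list, one numeric vector per category
n_max  <- max(lengths(by_cat))        # length of the longest vector

# `length<-` extends a vector with NA up to the requested length
padded <- lapply(by_cat, `length<-`, n_max)

# check.names = FALSE keeps names such as "15918-OP-18d" unchanged
wide <- data.frame(padded, check.names = FALSE)
wide
```

Replacing toy with test should give a 20-row data frame with seven columns, the shorter categories padded with NA.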

Thanks for your answer regarding “call a function multiple times”. All I am trying to do is prepare data to generate a histogram for each category (cat). To use the approach described in “Automating exploratory plots with ggplot2 and purrr”, I want to build the named vector expl from the test data frame and then use map(). I don't know how to prepare the data to meet that requirement. The way R handles loops seems very weird to me. Thank you very much.

draw_hist <- function(x) {
   ggplot(test,aes(x=.data[[x]], y=..count..)) + 
    geom_histogram(bins=50, fill="steelblue", color="white") 
}
# don't know how to get expl from test???
expl = set_names(expl)
hist_all <- map(expl, ~ draw_hist (.x) )
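
One possible way to get there without reshaping at all, sketched under the assumption that test stays in long format: let expl be a named vector of the category labels, and have the helper filter the data to one category before plotting. Here toy is a made-up stand-in for test, and this draw_hist() is a hypothetical variant of the function above, not the blog post's version.

```r
library(ggplot2)
library(purrr)

# Made-up stand-in for `test` (columns `cat` and `score`)
toy <- data.frame(cat   = rep(c("a", "b"), each = 5),
                  score = c(1, 2, 2, 3, 5, 10, 20, 20, 30, 50))

# Named character vector of category labels; plays the role of `expl`
expl <- set_names(unique(toy$cat))

# Hypothetical helper: histogram of `score` for a single category
draw_hist <- function(one_cat, data = toy) {
  ggplot(data[data$cat == one_cat, ], aes(x = score)) +
    geom_histogram(bins = 50, fill = "steelblue", color = "white") +
    ggtitle(one_cat)
}

# Named list of ggplot objects, one per category
hist_all <- map(expl, draw_hist)
hist_all[["a"]]   # print the plot for one category
```

Because expl is named, the resulting list can be indexed by category label. If all histograms may share one figure, adding facet_wrap(~cat) to a single ggplot() call on the long data would avoid the loop entirely.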

I come from the SAS and C world and am a new R user. I still have a big gap in understanding each line of your code. The different data types (data frame, tibble, list) are very confusing, and the way R handles loops is strange to me. Thank you very much for your code; I'll try it out.
