Creating a New Column for Each Unique Factor

I am working with R.

I have the following data:


v1 <- c("2010-01","2010-02", "2010-03", "2010-04", "2010-05") 
v2 <- c("A", "B", "C", "D", "E")


dates <- as.factor(sample(v1, 1000, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

types <- as.factor(sample(v2,1000, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

var = rnorm(1000,10,10)

problem_data = data.frame(var,dates, types)

> head(problem_data)

        var   dates types
1 -6.772497 2010-01     A
2  6.769367 2010-01     D
3 18.914358 2010-02     C
4  6.517997 2010-02     E
5 19.616047 2010-01     B
6  5.129928 2010-01     B

I am trying to make a final data set that contains a new column for each unique "group" within the "types" column. I know how to do this manually:


library(dplyr)

graph_data = data.frame(problem_data %>% group_by(dates, types) %>% summarise(count = n()))

col_A <- graph_data[which(graph_data$types == "A"), ]
col_B <- graph_data[which(graph_data$types == "B"), ]
col_C <- graph_data[which(graph_data$types == "C"), ]
col_D <- graph_data[which(graph_data$types == "D"), ]
col_E <- graph_data[which(graph_data$types == "E"), ]


final_data = data.frame(col_A$dates, col_A$count, col_B$count, col_C$count, col_D$count, col_E$count)

  col_A.dates col_A.count col_B.count col_C.count col_D.count col_E.count
1     2010-01         189         130          57          58          53
2     2010-02          63          62          25          18          30
3     2010-03          46          24          12          12          11
4     2010-04          45          17           8          16          15
5     2010-05          42          26          13          12          16

Is there a more direct way to do this in R?

Thanks!

Hi,

You can easily do this with the pivot_wider() function from tidyr:

library(tidyverse)

#Generating data
v1 <- c("2010-01","2010-02", "2010-03", "2010-04", "2010-05") 
v2 <- c("A", "B", "C", "D", "E")
dates <- as.factor(sample(v1, 1000, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))
types <- as.factor(sample(v2,1000, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))
var = rnorm(1000,10,10)
problem_data = data.frame(var,dates, types)

#Summarising
graph_data = data.frame(problem_data %>% group_by(dates, types) %>% 
                          summarise(count = n(), .groups = "drop"))

#Use pivot_wider()
graph_data %>% pivot_wider(id_cols = dates, names_from = types, values_from = count)
#> # A tibble: 5 x 6
#>   dates       A     B     C     D     E
#>   <fct>   <int> <int> <int> <int> <int>
#> 1 2010-01   194   125    68    66    75
#> 2 2010-02    81    50    30    23    15
#> 3 2010-03    32    21    12     6     7
#> 4 2010-04    31    22     9     6    13
#> 5 2010-05    41    27    23    11    12

Created on 2022-01-18 by the reprex package (v2.0.1)
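As a possible shortcut, the summarising step can be folded into the pipeline with dplyr's count(). This is only a sketch under a few assumptions: the seed is arbitrary (the thread's data is unseeded, which is why the counts differ between posts), `wide` is an illustrative name, and values_fill = 0L is added so that any dates/types combination that never occurs shows up as 0 rather than NA.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data; the seed is an assumption added for reproducibility
set.seed(123)
v1 <- c("2010-01", "2010-02", "2010-03", "2010-04", "2010-05")
v2 <- c("A", "B", "C", "D", "E")
problem_data <- data.frame(
  var   = rnorm(1000, 10, 10),
  dates = factor(sample(v1, 1000, replace = TRUE, prob = c(0.5, 0.2, 0.1, 0.1, 0.1))),
  types = factor(sample(v2, 1000, replace = TRUE, prob = c(0.3, 0.2, 0.1, 0.1, 0.1)))
)

# count() replaces group_by() + summarise(n()); its count column is named "n"
wide <- problem_data %>%
  count(dates, types) %>%
  pivot_wider(id_cols = dates, names_from = types,
              values_from = n, values_fill = 0L)

print(wide)
```

Naming id_cols explicitly should also avoid the warning that newer tidyr versions give when it is passed by position.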


@pieterjanvc: Thank you so much! Do you know if this can be done in base R? Thanks for all your help!

Hi,

You can use the base R reshape() function if you don't want to use pivot_wider():
https://rstudio-pubs-static.s3.amazonaws.com/610313_0328bc102fc1468ba34a9e2ca4c295be.html

Hope this helps,
PJ
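For reference, here is a rough sketch of what the reshape() route might look like on this data. It is not taken from the linked page. The seed, the use of table() to build the long counts, and the name `wide` are my own assumptions for illustration.

```r
# Rebuild the example data; the seed is an assumption added for reproducibility
set.seed(123)
v1 <- c("2010-01", "2010-02", "2010-03", "2010-04", "2010-05")
v2 <- c("A", "B", "C", "D", "E")
problem_data <- data.frame(
  var   = rnorm(1000, 10, 10),
  dates = factor(sample(v1, 1000, replace = TRUE, prob = c(0.5, 0.2, 0.1, 0.1, 0.1))),
  types = factor(sample(v2, 1000, replace = TRUE, prob = c(0.3, 0.2, 0.1, 0.1, 0.1)))
)

# Long format: one row per dates/types combination with its count
graph_data <- as.data.frame(
  table(dates = problem_data$dates, types = problem_data$types),
  responseName = "count"
)

# Wide format: one row per date, one column per type (count.A, count.B, ...)
wide <- reshape(graph_data, idvar = "dates", timevar = "types", direction = "wide")

# Tidy up: drop the "count." prefix and reset the row names
names(wide) <- sub("^count\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

reshape() prefixes each new column with the name of the value column, which is why the sub() call is there. Skip it if you prefer names like count.A.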


You could do it using xtabs() in base R.

graph_data$types <- factor(graph_data$types, levels = c("A", "B", "C", "D", "E"))

xtabs(count ~ dates + types, graph_data)

         types
dates       A   B   C   D   E
  2010-01 190 120  68  53  71
  2010-02  76  61  20  26  26
  2010-03  37  28  16  11  15
  2010-04  38  20   7  13  12
  2010-05  36  20  13   8  15

@williaml : thank you for this answer! I really like it!

But is there a way to do this if you don't want to manually list the "levels"?

Thank you so much!

I suppose they are already in the right order, but you might be able to do something like this:

types_levels <- sort(unique(graph_data$types)) 

graph_data$types <- factor(graph_data$types, levels = types_levels)

xtabs(count ~ dates + types, graph_data)
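Since types is already a factor in the example data, it may be possible to skip both the level handling and the summarising step. With an empty left-hand side, xtabs() counts rows directly from the raw data and uses the existing factor levels. A minimal sketch, where the seed and the names `tab` and `wide` are assumptions for illustration:

```r
# Rebuild the example data; the seed is an assumption added for reproducibility
set.seed(123)
v1 <- c("2010-01", "2010-02", "2010-03", "2010-04", "2010-05")
v2 <- c("A", "B", "C", "D", "E")
problem_data <- data.frame(
  var   = rnorm(1000, 10, 10),
  dates = factor(sample(v1, 1000, replace = TRUE, prob = c(0.5, 0.2, 0.1, 0.1, 0.1))),
  types = factor(sample(v2, 1000, replace = TRUE, prob = c(0.3, 0.2, 0.1, 0.1, 0.1)))
)

# No left-hand side: xtabs() counts the rows for each dates/types pair
tab <- xtabs(~ dates + types, data = problem_data)

# Convert the contingency table to a plain data frame with a dates column
wide <- as.data.frame.matrix(tab)
wide <- cbind(dates = rownames(wide), wide)
rownames(wide) <- NULL

print(wide)
```

as.data.frame.matrix() keeps the wide layout, whereas a plain as.data.frame() on a table returns the long format with a Freq column.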
