Tranpose data frame better aggregation

Hello everybody,

I would like to transform a table with the following data:

  ##data frame input:

Document<-c(1,2,3,4)
Day<-c(5,6,8,10)
Month<-c(25,12,8,6)
FORM_Day<-c(1,1,1,1)
CO_Day<-c(0,0,1,1)
FORM_Month<-c(0,0,1,1)
CO_Month<-c(0,0,1,0)

test_data<-data.frame(Document,Day,Month,FORM_Day,CO_Day,FORM_Month,CO_Month)
test_data

In this data frame we have information about document, day, and month. "FORM" and "CO" variables are 2 types of data validation.

For example, in the first document we have month_value=25. It is impossible bacause we only have 12 months, so in the FORM_Month variable we have a "1" in order to identify a data error.

My target is to restructure the table in order to show the information in a easy way.
Specifically I would like to get the following structure:

Validation_type<-c("FORM", "CO",   "FORM", "CO"  , "FORM", "CO",   "FORM", "CO",   "FORM" ,"CO",   "FORM", "CO"   ,"FORM", "CO",   "FORM", "CO")
camp<-c("Day", "Day",  "Month", "Month" ,"Day"  , "Day"  , "Month", "Month",  "Day" ,  "Day"  , "Month" ,"Month", "Day"  , "Day"  , "Month", "Month" ) 
value<-c( "5",  "5" ,  "25",   "25",    "6"    ,"6"   ,"12",   "12",  "8",    "8", "8", "8","10",   "10",    "6",    "6")
document<-c("1", "1", "1", "1", "2", "2", "2", "2",  "3", "3", "3", "3", "4", "4", "4", "4")
result<-c("0", "0", "1", "0", "0", "0", "0","0", "1", "1", "1", "1", "1", "1", "0", "0")

df_example<-data.frame(Validation_type,campo,value, document,result) 
df_example

I tried some transpose funcitons but I can't find out how to get this agregation level.

Could someone help me ,please?

Thanks in advance!

Robert

I do not think your two data frames test_data and df_example are consistent in labeling with 0 and 1. In any case, the following code gets you the shape your want. There is probably a neater way to do this with the pivot_longer function of tidyr but I have not learned that yet.

Document<-c(1,2,3,4)
Day<-c(5,6,8,10)
Month<-c(25,12,8,6)
FORM_Day<-c(1,1,1,1)
CO_Day<-c(0,0,1,1)
FORM_Month<-c(0,0,1,1)
CO_Month<-c(0,0,1,0)

test_data<-data.frame(Document,Day,Month,FORM_Day,CO_Day,FORM_Month,CO_Month)

library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
Step1 <- test_data %>% gather(key = "campo", value = "value", Day, Month) 
Step1
#>   Document FORM_Day CO_Day FORM_Month CO_Month campo value
#> 1        1        1      0          0        0   Day     5
#> 2        2        1      0          0        0   Day     6
#> 3        3        1      1          1        1   Day     8
#> 4        4        1      1          1        0   Day    10
#> 5        1        1      0          0        0 Month    25
#> 6        2        1      0          0        0 Month    12
#> 7        3        1      1          1        1 Month     8
#> 8        4        1      1          1        0 Month     6
Day <-  Step1 %>%  filter(campo == "Day") %>% 
  select(-ends_with("Month")) %>% 
  gather(key = "Validation_type", value = "result", FORM_Day, CO_Day) %>% 
  mutate(Validation_type = sub("([^_]+).+$", "\\1", Validation_type))
Day
#>   Document campo value Validation_type result
#> 1        1   Day     5            FORM      1
#> 2        2   Day     6            FORM      1
#> 3        3   Day     8            FORM      1
#> 4        4   Day    10            FORM      1
#> 5        1   Day     5              CO      0
#> 6        2   Day     6              CO      0
#> 7        3   Day     8              CO      1
#> 8        4   Day    10              CO      1
Month <-  Step1 %>%  filter(campo == "Month") %>% 
  select(-ends_with("Day")) %>% 
  gather(key = "Validation_type", value = "result", FORM_Month, CO_Month) %>% 
  mutate(Validation_type = sub("([^_]+).+$", "\\1", Validation_type))
Month
#>   Document campo value Validation_type result
#> 1        1 Month    25            FORM      0
#> 2        2 Month    12            FORM      0
#> 3        3 Month     8            FORM      1
#> 4        4 Month     6            FORM      1
#> 5        1 Month    25              CO      0
#> 6        2 Month    12              CO      0
#> 7        3 Month     8              CO      1
#> 8        4 Month     6              CO      0
df_final <- rbind(Day,Month) %>% select(Validation_type, campo, value, Document, result) %>% 
  arrange(Document, campo)
df_final
#>    Validation_type campo value Document result
#> 1             FORM   Day     5        1      1
#> 2               CO   Day     5        1      0
#> 3             FORM Month    25        1      0
#> 4               CO Month    25        1      0
#> 5             FORM   Day     6        2      1
#> 6               CO   Day     6        2      0
#> 7             FORM Month    12        2      0
#> 8               CO Month    12        2      0
#> 9             FORM   Day     8        3      1
#> 10              CO   Day     8        3      1
#> 11            FORM Month     8        3      1
#> 12              CO Month     8        3      1
#> 13            FORM   Day    10        4      1
#> 14              CO   Day    10        4      1
#> 15            FORM Month     6        4      1
#> 16              CO Month     6        4      0

Created on 2019-12-09 by the reprex package (v0.2.1)

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.