Cases to Vars SPSS command replacement in R

I have a sample dataset (dataset 1) with the columns CASEID (345, 355, 365, 345, 355, 365) and CONDITION (ankylosis, chf, arrhythmia, arrhythmia, HTN, diabetes).

I want to reshape the data into a wide dataset (dataset 2), which I have been able to do in SPSS using the CASESTOVARS command.

I am having difficulty using the reshape command to do this in R. Please help. Thanks!

Hi!

I don't know anything about the behavior of casestovars (or any other command in SPSS for that matter), but to get your desired output, you could add a ConditionID column to your data and then use tidyr's pivot_wider().

suppressPackageStartupMessages({
  library(tidyr)
  library(dplyr)  
  })

dataLong <- tibble(
  CASEID    = c(345, 355, 365, 345, 355, 365),
  CONDITION = c("ankylosis", "chf", "arrhythmia", "arrhythmia", "HTN", "diabetes")
)

# Add ConditionID column (hard-coded for this sample:
# rows 1-3 are the first condition, rows 4-6 the second)
dataLong <- dataLong %>% mutate(ConditionID = rep(1:2, each = 3))

# Pivot into wide format
dataLong %>%
  pivot_wider(names_from = ConditionID, values_from = CONDITION, names_prefix = "Condition")
#> # A tibble: 3 x 3
#>   CASEID Condition1 Condition2
#>    <dbl> <chr>      <chr>     
#> 1    345 ankylosis  arrhythmia
#> 2    355 chf        HTN       
#> 3    365 arrhythmia diabetes

Created on 2020-12-28 by the reprex package (v0.3.0)
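
The rep(1:2, each = 3) call above only fits data sorted exactly like the sample. A minimal sketch of a more general numbering, assuming the rows of each CASEID should simply be numbered in the order they appear, uses dplyr's row_number() within each group. The extra "asthma" row is made up for illustration and is not part of the original data; it only shows that unequal group sizes are padded with NA.

```r
library(dplyr)
library(tidyr)

# Sample data from the thread plus one HYPOTHETICAL extra row (asthma)
# so that CASEID 345 has three conditions and the others have two.
dataLong <- tibble(
  CASEID    = c(345, 355, 365, 345, 355, 365, 345),
  CONDITION = c("ankylosis", "chf", "arrhythmia",
                "arrhythmia", "HTN", "diabetes", "asthma")
)

dataWide <- dataLong %>%
  group_by(CASEID) %>%
  mutate(ConditionID = row_number()) %>%  # 1, 2, ... within each CASEID
  ungroup() %>%
  pivot_wider(names_from = ConditionID,
              values_from = CONDITION,
              names_prefix = "Condition")

dataWide
```

The numbering is computed per group, so nothing has to be entered by hand regardless of how many rows there are.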

Thank you so much for the reply; I am really grateful for the answer you provided. However, I see that you create the ConditionID column before you pivot the table. Unfortunately, the actual data I am working with has millions of observations, so numbering the ConditionID column by hand would not be feasible. Is there a way in R to reshape the conditions from long to wide (numbered Condition 1, Condition 2, and so forth) based on CASEID alone, without having to add a column?
Thanks again for the help.

In this case you could try grouping your data by CASEID, creating a nested column, and using tidyr's unnest_wider() to bring it into wide format. You will get NAs if the CASEIDs do not all occur the same number of times, but this is unavoidable with the wide format.

suppressPackageStartupMessages({
  library(tidyr)
  library(dplyr)  
})

dataLong<-tibble(CASEID=c(345,355,365,345,355,365),
                 CONDITION=c("ankylosis","chf", "arrhythmia","arrhythmia","HTN","diabetes"))

# cur_data() requires dplyr >= 1.0.0
dataLong %>%
  group_by(CASEID) %>%
  summarise(Condition = c(cur_data())) %>%
  unnest_wider(Condition, names_sep = ".")

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 3 x 3
#>   CASEID Condition.1 Condition.2
#>    <dbl> <chr>       <chr>      
#> 1    345 ankylosis   arrhythmia 
#> 2    355 chf         HTN        
#> 3    365 arrhythmia  diabetes

Created on 2020-12-28 by the reprex package (v0.3.0)

Thank you again for an excellent answer. Unfortunately, I have to use another platform that has R embedded in it (the data can't be downloaded due to privacy concerns). It may have old versions of the libraries, and I can't seem to update them. The cur_data() function cannot be run on this platform, even after loading the libraries mentioned in the post. Is there a replacement for cur_data()?

P.S. Your code works for me in a regular R installation.

I see, you're not making this easy :smile:

 summarise(Condition=list(CONDITION)) 

should work in your case in the same way as summarise(Condition=c(cur_data())).

Does unnest_wider() work on that system? I think it was added in tidyr 1.0.0 (September 2019).
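
For completeness, here is a sketch of the whole pipeline with list(CONDITION) in place of cur_data(), assuming dplyr is available and tidyr is at least version 1.0.0 so that unnest_wider() exists. The exact names of the widened columns can differ between tidyr versions, so none are asserted here.

```r
library(dplyr)
library(tidyr)

dataLong <- tibble(
  CASEID    = c(345, 355, 365, 345, 355, 365),
  CONDITION = c("ankylosis", "chf", "arrhythmia",
                "arrhythmia", "HTN", "diabetes")
)

dataWide <- dataLong %>%
  group_by(CASEID) %>%
  # Collect each CASEID's conditions into one list element
  summarise(Condition = list(CONDITION)) %>%
  # Spread each list element across columns, one per condition
  unnest_wider(Condition, names_sep = ".")

dataWide
```

Because list(CONDITION) refers to the column directly, it does not depend on the cur_data() helper, which only exists in dplyr 1.0.0 and later.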

Hey,
Thanks, man, for the help. You are a lifesaver. I have been working on this for the whole of last week, and I have now been able to do it. Yes, unnest_wider() worked on that system. Looks like the system is not that old.

Again, appreciate the help!
