Need help summarizing multiple columns into 1 column, but not in a normal way.

I am still a beginner R user. To simplify my problem: I have a data frame very similar to the one created by the code below. I have three separate variables that contain 1 if the variable is true and NA if it is false. These variables should really all be in one column. How do I do that? Each color is a separate column, but I want a single color column with the different colors as its values. I tried using if / else if / else statements, but got error messages that I could not understand.

Thank you so much for any assistance you are able to give!

# This is basically what my data is like.
df <- data.frame(ID = c(1, 2, 3, 4, 5),
                 Blue = c(1, NA, NA, NA, 1),
                 Green = c(NA, NA, 1, NA, NA),
                 Red = c(NA, 1, NA, 1, NA))
# I want to put the above data into this new column.
df$color <- NA

#This is a simplified example of what I tried, but got errors that I can't figure out
if (df$Blue == 1){
  df$color == "Blue"
} else if (df$Green == 1){
  df$color == "Green"
} else {
  df$color == "Red"
}

I would pivot the data to a longer format and filter out the NA rows.

df <- data.frame(ID = c(1, 2, 3, 4, 5),
                 Blue = c(1, NA, NA, NA, 1),
                 Green = c(NA, NA, 1, NA, NA),
                 Red = c(NA, 1, NA, 1, NA))
df
#>   ID Blue Green Red
#> 1  1    1    NA  NA
#> 2  2   NA    NA   1
#> 3  3   NA     1  NA
#> 4  4   NA    NA   1
#> 5  5    1    NA  NA
library(tidyr)
library(dplyr)

df_long <- df |> pivot_longer(cols = -ID, names_to = "color", values_to = "Value") |> 
  filter(!is.na(Value))
df_long
#> # A tibble: 5 × 3
#>      ID color Value
#>   <dbl> <chr> <dbl>
#> 1     1 Blue      1
#> 2     2 Red       1
#> 3     3 Green     1
#> 4     4 Red       1
#> 5     5 Blue      1
df_long <- df_long |> select(-Value)
df_long
#> # A tibble: 5 × 2
#>      ID color
#>   <dbl> <chr>
#> 1     1 Blue 
#> 2     2 Red  
#> 3     3 Green
#> 4     4 Red  
#> 5     5 Blue

Created on 2023-05-08 with reprex v2.0.2
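As a side note on the original if / else attempt: `if` expects a single TRUE or FALSE, but `df$Blue == 1` is a whole vector (with NAs in it), which is the likely source of the errors. In recent R versions (4.2.0 and later) a condition of length greater than one is an error; older versions only warned and used the first element. The body also uses `==` (comparison) where `<-` (assignment) was intended. A minimal base R sketch of the vectorized equivalent, assuming each row has at most one non-NA color column:

```r
df <- data.frame(ID = c(1, 2, 3, 4, 5),
                 Blue = c(1, NA, NA, NA, 1),
                 Green = c(NA, NA, 1, NA, NA),
                 Red = c(NA, 1, NA, 1, NA))

# ifelse() works element by element, so it handles the whole column at once.
# Testing with !is.na() avoids NA conditions; rows with no color stay NA.
df$color <- ifelse(!is.na(df$Blue), "Blue",
            ifelse(!is.na(df$Green), "Green",
            ifelse(!is.na(df$Red), "Red", NA_character_)))
df$color
```

This adds the color column in place and leaves every other column of the data frame untouched.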

Thank you for your response! I actually have a lot more columns than just the "color" variable, and it seemed to get in the way, so I wasn't able to use your example with my real data without errors.
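One possible cause, without having seen the real data: `cols = -ID` pivots every column except ID, so any additional columns get stacked together with the colors. A sketch of the same approach that names only the color columns, so the remaining columns are carried along unchanged. The `Age` and `Site` columns here are hypothetical stand-ins for the extra columns:

```r
library(tidyr)
library(dplyr)

df2 <- data.frame(ID = c(1, 2, 3, 4, 5),
                  Age = c(34, 51, 27, 45, 62),        # hypothetical extra column
                  Site = c("A", "B", "A", "C", "B"),  # hypothetical extra column
                  Blue = c(1, NA, NA, NA, 1),
                  Green = c(NA, NA, 1, NA, NA),
                  Red = c(NA, 1, NA, 1, NA))

# Pivot only the color columns; values_drop_na = TRUE drops the NA rows,
# replacing the separate filter(!is.na(Value)) step.
df2_long <- df2 |>
  pivot_longer(cols = c(Blue, Green, Red),
               names_to = "color",
               values_to = "Value",
               values_drop_na = TRUE) |>
  select(-Value)
df2_long
```

Note that a row with NA in all three color columns would be dropped entirely by this approach, so it assumes every row has exactly one color set.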

If you can post code to make a data frame that demonstrates the error you are getting, I'll try to find a solution.
