Mutate and replace question

Hi, I am using mutate and replace to change values in a data frame. The preview in the notepad shows the value replaced with the right thing, but when I "View" the data frame, or export it and open it in Excel, it is not actually replaced. The original value is still there.

Here is the code I am using to convert the file to a data frame (I imported it as a .txt file):

# Convert to data frame and include _df at end of file name.

Merge.CD8.Phenotype.Part.2_cell_seg_data_df <- data.frame(Merge.CD8.Phenotype.Part.2_cell_seg_data)

# Mutate CD8 to CD8+

Merge.CD8.Phenotype.Part.2_cell_seg_data_df %>% 
  mutate(Phenotype = as.character(Phenotype)) %>% 
  mutate(Phenotype = replace(Phenotype, Phenotype == 'CD8', 'CD8+'))

Phenotype Cell.ID Total.Cells
Other          91          NA
CD8+           92          NA
Other          93          NA

The column header is "Phenotype" and I want to change all the CD8 values to CD8+. As I mentioned above, the preview window in the notepad shows CD8+, but when I "View" it or export it, it still says CD8.

Can anyone help me figure out what I am doing wrong? Let me know if you need more information. Sorry, I am new to R.

Thanks for your help,

Try this:

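library(dplyr)  # provides mutate() and %>%

# Assign the result back to the data frame so the change is kept
Merge.CD8.Phenotype.Part.2_cell_seg_data_df <- 
  Merge.CD8.Phenotype.Part.2_cell_seg_data_df %>% 
  mutate(Phenotype = as.character(Phenotype)) %>% 
  mutate(Phenotype = ifelse(Phenotype == 'CD8', 'CD8+', Phenotype))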

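mutate() returns a modified copy and leaves the original data frame untouched, so you need to assign the result to a new variable or back to the same one (as in this example) if you want to use it further in your analysis. Without the assignment the result is only printed, which is why the preview looked right but View() and the export still showed CD8.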
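To see the difference the assignment makes, here is a minimal sketch on made-up data. The data frame `cells` and its values are hypothetical stand-ins for the real cell segmentation table, and the variables `unchanged` and `changed` exist only to capture the column before and after.

```r
library(dplyr)

# Hypothetical toy data standing in for the real table
cells <- data.frame(
  Phenotype = c("Other", "CD8", "Other"),
  Cell.ID = 91:93,
  stringsAsFactors = FALSE
)

# No assignment: the modified copy is printed and then discarded
cells %>%
  mutate(Phenotype = ifelse(Phenotype == "CD8", "CD8+", Phenotype))

# The original data frame still holds the old value
unchanged <- cells$Phenotype

# With assignment: the modified copy replaces the original
cells <- cells %>%
  mutate(Phenotype = ifelse(Phenotype == "CD8", "CD8+", Phenotype))

changed <- cells$Phenotype
```

After the first call `cells` still contains CD8; only after the second, assigned call does it contain CD8+.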

Awesome! That did it.
Can you explain to me how the "ifelse" statement works there to fix the issue and why you had to include the additional Phenotype after the CD8+? I just want a better understanding of how this works to help me learn R.

Thanks again, I appreciate your help,

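ifelse() tests the condition Phenotype == 'CD8' element by element: where it's TRUE it returns 'CD8+', and where it's FALSE it returns the existing value of Phenotype. That third argument is why Phenotype appears again after 'CD8+': it keeps every row that is not CD8 unchanged. What made the change stick, though, is assigning the result back to the data frame, not ifelse() itself.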
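A small sketch of that element-wise behaviour, using a made-up character vector rather than the real column (the names `phenotype`, `test` and `result` are just for illustration):

```r
# Hypothetical values standing in for the Phenotype column
phenotype <- c("Other", "CD8", "Other", "CD8")

# The condition is evaluated for every element, giving a logical vector
test <- phenotype == "CD8"

# Where test is TRUE take "CD8+", where it is FALSE keep the original value
result <- ifelse(test, "CD8+", phenotype)
```

Here `test` is FALSE, TRUE, FALSE, TRUE, so `result` is "Other", "CD8+", "Other", "CD8+".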

In base R you could do the same replacement by subsetting the column with the condition and assigning to the matching elements, but the above is the most straightforward way within dplyr::mutate().
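For reference, the base R approach might look like the sketch below. The data frame `cells` is a hypothetical stand-in for the real table, and the column is assumed to be character already (on a factor column, assigning a value that is not an existing level produces NA with a warning, hence the as.character() step in the dplyr version).

```r
# Hypothetical toy data; Phenotype is a character column
cells <- data.frame(
  Phenotype = c("Other", "CD8", "Other"),
  Cell.ID = 91:93,
  stringsAsFactors = FALSE
)

# Select the elements where the condition holds and assign to them.
# This modifies `cells` directly, so no separate re-assignment is needed.
cells$Phenotype[cells$Phenotype == "CD8"] <- "CD8+"
```

Unlike the piped mutate() call, the subset assignment changes `cells` in place.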

Just a short note to add to @martin.R's explanation: Your original code

mutate(Phenotype = replace(Phenotype, Phenotype == 'CD8', 'CD8+'))

uses replace(), whose second argument is a vector of indices marking where the replacement should occur, and passes it a logical vector, the kind of condition that ifelse() (or if_else(), dplyr's stricter version) expects to indicate when replacement should occur. R also accepts a logical vector as an index, so that call already produces the right values once the result is assigned. With this in mind, the following variant of your replace() command makes the positions explicit with which():

mutate(Phenotype = replace(Phenotype, which(Phenotype == 'CD8'), 'CD8+'))
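As a quick check on made-up data, the sketch below shows what which() contributes: it turns the logical vector into the positions of its TRUE elements. For comparison it also runs replace() with the logical vector directly. The vector `phenotype` and the other names are hypothetical.

```r
# Hypothetical values standing in for the Phenotype column
phenotype <- c("Other", "CD8", "Other", "CD8")

# Logical vector: one TRUE/FALSE per element
is_cd8 <- phenotype == "CD8"

# which() converts it to the positions of the TRUE elements
positions <- which(is_cd8)

# replace() with explicit positions
by_position <- replace(phenotype, positions, "CD8+")

# replace() with the logical vector used as the index
by_logical <- replace(phenotype, is_cd8, "CD8+")

# Compare the two results
same <- identical(by_position, by_logical)
```

On this toy vector `positions` is 2 and 4, and both calls give the same result, so `same` is TRUE.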

Thanks, that is very helpful for my understanding!

And thanks to you too, dromano, that also helps me think about the logic behind the code better!
