I have a dataframe with particular values for each variable I want to change. How would I filter a variable by a specific value and then change each value to something else? For example if I had a variable height and wanted to change all values at 20cm to NA, how would I do that?
df[df$height == 20, "height"] <- NA
You could also use
df <- df %>% mutate(height = replace(height, height == 20, NA))
Although note that you may want to leave your original data and add a new variable, rather than change values.
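A quick sketch of the "keep the original, add a new variable" idea, using base R on a made-up data frame. The column name `height_clean` and the values are only for illustration:

```r
# Toy data; `height_clean` is a hypothetical column name for this example.
df <- data.frame(id = 1:5, height = c(12, 20, 35, 20, 18))

# Leave `height` untouched and store the cleaned values in a new column.
# which() keeps only the positions where the comparison is TRUE, so any
# NAs already present in `height` are simply left as they are.
df$height_clean <- replace(df$height, which(df$height == 20), NA)

print(df)
```

The same thing should work inside mutate() by assigning to a new name instead of overwriting height.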
Ignoring specific variables this time, if I just do
df[df == 20] <- NA
does this replace all values at 20 in the whole data set to NA?
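For what it's worth, a small check on a made-up two-column data frame suggests the answer is yes: df == 20 gives a logical matrix covering every column, and the assignment sets every matching cell to NA. The data below is invented for illustration:

```r
# Toy data frame with the value 20 in more than one column.
df <- data.frame(height = c(10, 20, 30), width = c(20, 5, 20))

# df == 20 is a logical matrix with the same shape as df;
# every cell where it is TRUE is replaced by NA.
df[df == 20] <- NA

print(df)
colSums(is.na(df))  # number of replaced cells per column
```

One thing to keep in mind: the comparison runs on every column, so as far as I know a character column holding the text "20" would be matched as well.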
I think all the other answers work, but usually I do:
df <- df %>% mutate(height = ifelse(height == 20, NA, height))
Be careful with ifelse: it strips attributes and is often slow. if_else is generally a better idea if you are already working in tidyverse-land.
Thanks nick. Yes, I knew about if_else and that it is more strict/safe, but I don’t understand why. I also don’t understand what the “it strips attributes” part of your reply means. Thanks again.
Attributes are essentially “metadata” about variables that can be stored and retrieved, and are used by several common systems. The most common is probably factors, where the attributes store what the levels of the factor are. The if_else documentation has a good example of the factor attributes getting stripped by ifelse:
# Unlike ifelse, if_else preserves types
x <- factor(sample(letters[1:5], 10, replace = TRUE))
ifelse(x %in% c("a", "b", "c"), x, factor(NA))
#>  [1]  2  3  1 NA NA NA  3 NA  3 NA
if_else(x %in% c("a", "b", "c"), x, factor(NA))
#>  [1] b    c    a    <NA> <NA> <NA> c    <NA> c    <NA>
#> Levels: a b c d e
ifelse also lets you mix types in the output, which can happen inadvertently in some cases. This can lead to unstable results:
ifelse(c(TRUE, FALSE, TRUE), c(1, 2, 3), c("a", "b", "c"))
#> [1] "1" "b" "3"
ifelse(c(TRUE, TRUE, TRUE), c(1, 2, 3), c("a", "b", "c"))
#> [1] 1 2 3
dplyr::if_else(c(TRUE, TRUE, TRUE), c(1, 2, 3), c("a", "b", "c"))
#> Error: `false` must be type double, not character
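To make the "attributes" part a bit more concrete, here is a minimal base R sketch that looks at what a factor actually carries around and what is left after it goes through ifelse. The variable names are made up for illustration:

```r
x <- factor(c("b", "a", "b"))

# A factor is a vector of integer codes plus attributes (levels and class).
attributes(x)
unclass(x)

# ifelse() builds its result from the logical test vector, so the factor's
# attributes are not carried over; only the integer codes remain.
stripped <- ifelse(x == "a", x, x)
attributes(stripped)
```

Without the levels attribute there is nothing left to say that code 1 means "a" and code 2 means "b", which is why the ifelse result prints as bare numbers.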
OK. Understood… I think. At least I will always remember to use if_else(). Thanks nick.
Basically, the type that comes out of ifelse isn’t always the type that you intended to receive, even if you’re doing something apparently simple, like replacing a few values with NA.
@nick’s examples there demonstrate that it’s hard to predict what’ll happen when you mix types. But I can’t tell you how many times I’ve had bugs crop up lines down the road because ifelse had spat out a numeric full of garbage instead of the type I’d put into it.
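A small sketch of that kind of surprise, assuming dplyr is installed. The vectors and the cutoff date are invented for the example. Dates are a common case: base ifelse hands back plain numbers, while if_else keeps the Date class as long as both branches have the same type.

```r
library(dplyr)

# Replacing a value with NA in a numeric vector.
# NA_real_ is the double-typed NA, so both branches are doubles.
height <- c(12, 20, 35)
cleaned <- if_else(height == 20, NA_real_, height)
cleaned

# Same idea with dates (toy values).
dates <- as.Date(c("2020-01-01", "2020-06-15"))
cutoff <- as.Date("2020-03-01")

base_result <- ifelse(dates > cutoff, NA, dates)            # base R
tidy_result <- if_else(dates > cutoff, as.Date(NA), dates)  # dplyr

class(base_result)  # "numeric": the Date class has been dropped
class(tidy_result)  # "Date"
```

Using a typed NA such as NA_real_ or as.Date(NA) keeps both branches the same type, which is the strictness if_else is enforcing.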