Rewriting a new column based on condition from multiple columns

Hi All,

I have included a data frame with 6 columns below.
In the observed_cleaned column, I'm trying to replace negative values with the value from the mean.outliers column, but only where the anomaly column is "Yes" and observed_cleaned is < 0.
I have tried using mutate() and case_when(), but it doesn't seem to work correctly.
Could anyone please help me with this?
Thank you

library(tidyverse)
library(anomalize)
library(lubridate)
library(openxlsx)
library(tibble)
library(norm)
library(ggplot2)
test.df <-
structure(list(Product.ID = c("A", "A", "A", "A", "A", "A", "A"
), Location.ID = c("A1", "A1", "A1", "A1", "A1", "A1", "A1"), 
    Time.Period = c("Sep 2017", "Oct 2017", "Nov 2017", "Dec 2017", 
    "Jan 2018", "Feb 2018", "Mar 2018"), anomaly = c("No", "No", 
    "No", "Yes", "No", "No", "No"), observed_cleaned = c(5.399993, 
    9, 10, -3, 5, 1, 3), mean.outliers = c(6.34999825, 8.25, 
    6.75, 6.5, 3.5, 4, 1.66666666666667)), row.names = c(NA, 
7L), class = "data.frame")
test.df<-test.df %>% 
  mutate(observed_cleaned = case_when(
    anomaly == "Yes" && observed_cleaned <0 ~ mean.outliers,
    observed_cleaned
  ))

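Two issues: & is required rather than &&, because the comparison has to be made row by row and && is not vectorised; and every case in case_when() must be a two-sided formula (condition ~ value), so the fallback has to be written as TRUE ~ observed_cleaned.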

test.df <- test.df %>% 
  mutate(observed_cleaned = case_when(
    anomaly == "Yes" & observed_cleaned <0 ~ mean.outliers,
    TRUE ~ observed_cleaned
  ))
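
To see why & matters here, a minimal sketch follows. It assumes dplyr is installed; the three-row data frame df and the column name fixed are made up for illustration, reusing a few values from the question. & compares element by element and returns one logical per row, whereas && is meant for a single TRUE/FALSE (recent R versions raise an error if it is given longer vectors). With only one condition plus a fallback, if_else() is a possible alternative to case_when().

```r
library(dplyr)

# Illustrative subset of the data from the question (three rows only)
df <- data.frame(
  anomaly          = c("No", "Yes", "No"),
  observed_cleaned = c(10, -3, 5),
  mean.outliers    = c(6.75, 6.5, 3.5)
)

# `&` is vectorised: it yields one TRUE/FALSE per row
flag <- df$anomaly == "Yes" & df$observed_cleaned < 0
flag
# FALSE, TRUE, FALSE

# Single-condition alternative to case_when(): if_else(condition, yes, no)
df <- df %>%
  mutate(fixed = if_else(anomaly == "Yes" & observed_cleaned < 0,
                         mean.outliers, observed_cleaned))
df$fixed
# 10, 6.5, 5
```

Only the second row meets both conditions, so only that value is replaced by its mean.outliers entry; the other rows keep their original observed_cleaned value.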

Thank you. It worked.
