Multiple Column Selection in Function and Base R sd function

Hi Friends,

Below is the r code that works , Please suggest inputs for some user inputs

  1. want to make the column selection to be user defined like from m1:m3 or from m1:m12. when we call the function. please suggest ways to do it. for the function
    2.The standard deviation function in R ,sd gives results equivalent to stdev.s in excel, is there a func equivalent to stdev.p in excel or is it suggested to use only the sd function equivalent to stdev.s
x <- rep(c('a', 'b'), 5)
m1 <- rnorm(10, 100, 20)
m2 <- rnorm(10, 120, 30)
m3 <- rnorm(10, 140, 40)

df <- data.frame(x = x,
                 m1 = m1,
                 m2 = m2,
                 m3 = m3)

d1 <- df %>%
  group_nest(x, keep = TRUE)

ABC <- function(dt)
{
  dt %>%
    mutate(Sales_3M = rowSums(across(m1:m3))) %>%
    arrange(desc(Sales_3M)) %>%
    mutate(
      Cum_3M = cumsum(Sales_3M),
      Cum_Pct_3M = Cum_3M / sum(Sales_3M),
      ABC_3M = if_else(
        Cum_Pct_3M > 0.95,
        "C",
        if_else(Cum_Pct_3M > 0.80 & Cum_Pct_3M < 0.95, "B", "A")
      )
    )
}

t1 <- map(d1$data, ABC) %>% bind_rows()

XYZ <- function(dt) {
  dt %>%
    rowwise() %>%
    mutate(
      SD_3M = sd(across(m1:m3)),
      Mean_3M = rowMeans(across(m1:m3)),
      COV_3M = SD_3M / Mean_3M,
      XYZ_3M = if_else(COV_3M > 1, "Z", if_else(COV_3M > 0.5 &
                                                  COV_3M < 1, "Y", "X"))
    )
  
}

t2 <- XYZ(t1)

  1. you can do this sort of thing
    https://tidyeval.tidyverse.org/
library(tidyverse)
library(rlang)
library(glue)
df <- data.frame(x = rep(c('a', 'b'), 5),
                 m1 = rnorm(10, 100, 20),
                 m2 = rnorm(10, 120, 30),
                 m3 = rnorm(10, 140, 40))

d1 <- df %>%
  group_nest(x, keep = TRUE)

ABC <- function(dt,first_m,last_m)
{

  nc <- dt %>% filter(FALSE) %>% select({{first_m}}:{{last_m}}) %>% ncol
  print(nc)
  sales<- glue('Sales_{nc}M')
  cum <- glue('Cum_{nc}M')
  cump <- glue('Cum_Pct_{nc}M')
  abc<-glue('ABC_{nc}M')
  # browser()

  dt %>%
    mutate({{sales}} := rowSums(across({{first_m}}:{{last_m}}))) %>%
    arrange(desc({{sales}})) %>%
    mutate(
     {{cum}} := cumsum(!!sym(sales)),
       {{cump}}:= !!sym(cum) / sum(!!sym(sales)),
      {{abc}} := if_else(
        !!sym(cump) > 0.95,
        "C",
        if_else(!!sym(cump) > 0.80 & !!sym(cump) < 0.95, "B", "A")
      )
    )
}

map(d1$data, ~ABC(dt=.,first_m=m1,last_m=m3)) 

In STDEV.P function, the squared deviation is divided by the total number of arguments, mostly represented as N .
In STDEV.S or STDEV, the squared deviation is divided by the total number of sample -1. It is represented as N-1 .
source

Therefore by multiplication and division you can transform one to the other if you need to.

1 Like

Hi Sir,

Thanks for help and time.
Please help to understand.. what this step does dt %>% filter (False) does..

it empties the dataframe of data and keeps only the column headings

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.