Calculate descriptive parameters across database

I am working with a huge database and I am trying several operations that I have done before with small databases and few variable types.

What I am trying to do is something related to the tapply function or similar, but with conditions:

  • only numeric values
  • calculate quartiles (specifically the 1st and 3rd quartiles)
  • specify the range of columns to operate on, e.g. [, 1130:1200]

The parameters I need are mean, sd, IQR and quartiles.

Thanks in advance

I assume you mean data.frame rather than database... please clarify if not.
I recommend that you learn and make use of the tidyverse, as it is very flexible for the sorts of things you want.
A good free resource is https://r4ds.had.co.nz/

I have been looking for something in the purrr package, and I have seen the map functions, in which you can specify what you want.
But I need to tell these functions which columns to include, such as 2000:2077. I also need to specify na.rm = TRUE, and operate only on numeric (double) columns.

df_xyz %>%
  summarise(
    across(
      .cols  = everything(),   # wanted: only columns 2000:2077
      .fns   = mean,
      na.rm  = TRUE,           # wanted: also restrict to numeric columns
      .names = "{.col}_mean"
    )
  )

I guess it is not that hard to do, but I have not found threads that operate on just a range of columns. I need that because I have too many columns to run this on everything, but not that many to customize.
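
For what it is worth, one way this could be approached is sketched below. It assumes dplyr 1.0 or later, where across() accepts tidyselect expressions, so a positional range can be combined with where(is.numeric) using &. The toy df_xyz and the 2:5 range are made up for illustration and stand in for the real data and for 2000:2077; the suffixes mean, sd, q1, q3 and iqr are just names chosen here.

```r
library(dplyr)

# Hypothetical toy data standing in for the real df_xyz
df_xyz <- data.frame(
  id    = c("a", "b", "c", "d"),
  x     = c(1, 2, 3, NA),
  y     = c(10, 20, 30, 40),
  label = c("p", "q", "r", "s"),
  z     = c(5, 5, 5, 5)
)

# Stand-in for the real range, e.g. 2000:2077
col_range <- 2:5

res <- df_xyz %>%
  summarise(
    across(
      # keep only columns that are inside the range AND numeric
      .cols = all_of(col_range) & where(is.numeric),
      # named list of functions; each one drops NAs explicitly
      .fns = list(
        mean = ~ mean(.x, na.rm = TRUE),
        sd   = ~ sd(.x, na.rm = TRUE),
        q1   = ~ quantile(.x, 0.25, na.rm = TRUE, names = FALSE),
        q3   = ~ quantile(.x, 0.75, na.rm = TRUE, names = FALSE),
        iqr  = ~ IQR(.x, na.rm = TRUE)
      ),
      # one output column per input column and function
      .names = "{.col}_{.fn}"
    )
  )

print(res)
```

The result is a single row with one column per combination of input column and statistic, so the non-numeric label column inside the range is skipped. If a long format is easier to read, the output could be reshaped afterwards, for example with tidyr::pivot_longer().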
