How do I define a function?

I want to clean my data and I have to repeat some code many times. How can I define a function so that both steps are in one script?

I have to use these two lines:
table(df2$xxxx)
df2$xxxx[df2$xxxx == "0"] <- NA

I think you need the source() function to reuse your code across scripts.
It is good practice to keep sets of related functions in separate R scripts. You can then load those functions by calling source() at the top of your script, much as you would load an R package with library(). R reads the function definitions into memory and you can then use them.
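A minimal sketch of that workflow, assuming a helper file that holds one function. The file is written to a temporary location here only so the example is self-contained; the function name zero_to_na is hypothetical, and in practice the file would be something you save next to your analysis script.

```r
# Write a small helper script to a temporary file.
# (Stand-in for a real file such as one you keep in your project folder.)
helper_file <- tempfile(fileext = ".R")
writeLines(c(
  "zero_to_na <- function(x) {",
  "  x[x == 0] <- NA  # replace zeros with NA",
  "  x",
  "}"
), helper_file)

# Load the function definitions from the helper script into this session.
source(helper_file)

# The function defined in the helper file is now available.
cleaned <- zero_to_na(c(1, 0, 3))
cleaned
```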
Resources:

Please let me know if this helps. Best wishes.


Yes, I understood the idea from these links, but I can't apply it to my data.
For example, I don't know how to write this so that it works for all variables:
df2$xxxx[df2$xxxx == "0"] <- NA

Hi @fary,
The following code shows you how to build a simple user-defined function to do what you requested. This is only one of many possible ways to achieve the same result.
Note that the table() function does not "change" anything; it simply reports a frequency table across the vector.

df <- data.frame(a = c(1,2,2,0,0,4,5,2,7,8),
                 b = c(11,14,0,0,15,18,18,0,19,20))
df
#>    a  b
#> 1  1 11
#> 2  2 14
#> 3  2  0
#> 4  0  0
#> 5  0 15
#> 6  4 18
#> 7  5 18
#> 8  2  0
#> 9  7 19
#> 10 8 20
df_old <- df  # Save original for later use.

# Check whether this code gives us what we need.
lapply(df, table)  # This works because a 'dataframe' is also a 'list' of columns.
#> $a
#> 
#> 0 1 2 4 5 7 8 
#> 2 1 3 1 1 1 1 
#> 
#> $b
#> 
#>  0 11 14 15 18 19 20 
#>  3  1  1  1  2  1  1
df[df == 0] <- NA  # This overwrites the original dataframe; works across all columns.
df
#>     a  b
#> 1   1 11
#> 2   2 14
#> 3   2 NA
#> 4  NA NA
#> 5  NA 15
#> 6   4 18
#> 7   5 18
#> 8   2 NA
#> 9   7 19
#> 10  8 20

# OK, now put those commands in a user-defined function.
my_fun <- function(df) {
  lst <- lapply(df, table)
  print(lst)
  df[df == 0] <- NA
  return(df)
}

# Run the function with a dataframe name as the input to the first argument.
# In this case we need to run this using the original df specification.
my_fun(df_old)
#> $a
#> 
#> 0 1 2 4 5 7 8 
#> 2 1 3 1 1 1 1 
#> 
#> $b
#> 
#>  0 11 14 15 18 19 20 
#>  3  1  1  1  2  1  1
#>     a  b
#> 1   1 11
#> 2   2 14
#> 3   2 NA
#> 4  NA NA
#> 5  NA 15
#> 6   4 18
#> 7   5 18
#> 8   2 NA
#> 9   7 19
#> 10  8 20

Created on 2021-07-04 by the reprex package (v2.0.0)
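One point that may be worth spelling out: a function like my_fun() works on a copy of its input, so the data frame you pass in is not changed. To keep the cleaned version, assign the returned value to a name. A small sketch with made-up data (df_clean is a hypothetical name):

```r
# Small made-up data frame for illustration.
df_old <- data.frame(a = c(1, 0, 2), b = c(0, 5, 0))

# Same idea as the function above: print the tables, then return a cleaned copy.
my_fun <- function(df) {
  print(lapply(df, table))
  df[df == 0] <- NA
  df
}

# The tables are printed and the cleaned copy is stored in df_clean.
df_clean <- my_fun(df_old)

anyNA(df_old)    # the original data frame still contains its zeros
anyNA(df_clean)  # the returned copy has NA where the zeros were
```

To overwrite the original instead, assign the result back to the same name, for example df_old <- my_fun(df_old).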


Hi Davoww,
Yes, I can get a table for all my variables with the lapply function, but what should I do if I want to change the value 0 to NA?
I want a table for each variable and, if there is a 0 value, I want to change it to NA at the same time.
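One possible way to do both steps per variable is sketched below. The function name table_then_na, its arguments, and the example data are all hypothetical; it prints a table for each chosen column and then replaces the given value with NA in that column. Because R coerces the comparison, value = 0 also matches the text "0" in a character column.

```r
# Hypothetical helper: for each chosen column, print a frequency table,
# then replace every occurrence of `value` in that column with NA.
table_then_na <- function(df, cols = names(df), value = 0) {
  for (col in cols) {
    cat("Column:", col, "\n")
    print(table(df[[col]], useNA = "ifany"))
    # Skip entries that are already NA so the comparison is always TRUE/FALSE.
    is_match <- !is.na(df[[col]]) & df[[col]] == value
    df[[col]][is_match] <- NA
  }
  df
}

# Made-up example data: one numeric column, one character column, one left alone.
df2 <- data.frame(x = c(1, 0, 2, 0),
                  y = c("0", "a", "b", "0"),
                  z = c(0, 1, 1, 1),
                  stringsAsFactors = FALSE)

# Clean only columns x and y, and assign the result back to keep the changes.
df2 <- table_then_na(df2, cols = c("x", "y"))
df2
```

Leaving out the cols argument applies the same steps to every column of the data frame.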
