How to create a mean column from two different variables and account for NA data points


Hi, and welcome.

You can do this with the mutate function from the dplyr package, which lets you create new columns (variables) or modify existing ones.

To use this, you should check out the documentation for dplyr and tibble if you're not already familiar with them.

my_df
# A tibble: 1 x 5
  id    Plot  LAI.1 LAI.2 Stems.ha
  <chr> <chr> <chr> <chr> <chr>   
1 5     33    3.3   <NA>  2500    
my_df %>% mutate(LAI.2 = ifelse(is.na(LAI.2), LAI.1, LAI.2))
# A tibble: 1 x 5
  id    Plot  LAI.1 LAI.2 Stems.ha
  <chr> <chr> <chr> <chr> <chr>   
1 5     33    3.3   3.3   2500   

I created a single-row tibble from the first record that contains an NA. Make sure it is an actual missing value (NA), not the character string "NA".
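A quick way to tell the two cases apart is to run is.na on the column. The sketch below uses a made-up vector `x` (not your data) and assumes dplyr is installed; na_if is one way to turn the text "NA" into a real missing value.

```r
library(dplyr)

# Hypothetical values: a number stored as text, the string "NA", and a true NA
x <- c("3.3", "NA", NA)

# Only the true missing value is detected
is.na(x)
# FALSE FALSE  TRUE

# Convert the string "NA" into a real NA, then check again
x_clean <- na_if(x, "NA")
is.na(x_clean)
# FALSE  TRUE  TRUE
```

If is.na returns FALSE for a cell that prints as NA, the column probably holds the text rather than a missing value.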

I then piped the tibble into the mutate function with %>%. To keep the result, assign it to a new object:

new_df <- my_df %>% mutate(LAI.2 = ifelse(is.na(LAI.2), LAI.1, LAI.2))

The argument to mutate is the LAI.2 variable (column), which is assigned the result of the ifelse function. That function performs a test: is LAI.2 an NA? If so, it returns LAI.1; if not, it keeps the existing value of LAI.2.
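If what you want in the end is a separate mean column, here is a minimal sketch of one way to get there. It assumes the LAI columns are stored as character, as in the printout above, so they are converted to numeric first. The second row and the column name `LAI.mean` are made up for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical data: the row shown above plus one invented complete row
my_df <- tibble(
  id       = c("5", "6"),
  Plot     = c("33", "34"),
  LAI.1    = c("3.3", "2.0"),
  LAI.2    = c(NA, "4.0"),
  Stems.ha = c("2500", "1800")
)

new_df <- my_df %>%
  mutate(
    # The columns are character, so convert them before doing arithmetic
    LAI.1 = as.numeric(LAI.1),
    LAI.2 = as.numeric(LAI.2),
    # Row-wise mean of the two columns, ignoring NA values
    LAI.mean = rowMeans(cbind(LAI.1, LAI.2), na.rm = TRUE)
  )

new_df$LAI.mean
```

With na.rm = TRUE, a row that has only one measurement gets that measurement as its mean, which matches what the ifelse fill would give. A row where both values are missing comes out as NaN, so you may want to handle that case separately.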