 # Creating new column with new values calculated from other columns

I have a large dataset with demographic data, two columns of which are Weight and Height. Using the Weight and Height data, I want to create a "BMI" column. With my numbers, the BMI equation would be: `Weight / (Height/100)^2`.

Here is my sample data. Of note, where "NA" is, my data says "NULL"; I could not put NULL and create a reprex, so NA it is:

```
DF <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F"),
  Weight = c(15, 56.3, 56.8, 16, 56.2, 14.5),
  Height = c(103, NA, NA, 99, 185.4, 88)) %>%
  replace(.=="NULL", "0")
#> Error in data.frame(ID = c("A", "B", "C", "D", "E", "F"), Weight = c(15, : could not find function "%>%"
```

For my data with NULL, I tried to make it numerical with this coding below, and added a column with just the first step in my calculation:

    DF2 <- DF %>% replace(.=="NULL", "0")
    DF2$BMI <- with(DF2, Height/100)

What I get is this error message: "Error in Height/100: non-numeric argument to binary operator"

I think my original dataset must not be numeric. Help!

I think you are right here. But you can use `str(DF)` to check the type of each of your columns.

If you are sure that the Height column only contains values that can be converted to numeric, try using `DF2$Height <- as.numeric(DF2$Height)`.
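To make that concrete, here is a minimal sketch in base R. It assumes that Height was read in as text with the literal string "NULL" marking missing values, which is a guess at what your real data looks like; the sample values are copied from your post. It marks the "NULL" entries as `NA` rather than "0", converts the column, and then computes BMI.

```r
# Assumed shape of the real data: Height stored as text, "NULL" = missing
DF <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F"),
  Weight = c(15, 56.3, 56.8, 16, 56.2, 14.5),
  Height = c("103", "NULL", "NULL", "99", "185.4", "88"),
  stringsAsFactors = FALSE
)

str(DF)  # Height is listed as chr, which is why Height/100 fails

DF2 <- DF
DF2$Height[DF2$Height == "NULL"] <- NA        # mark missing values as NA
DF2$Height <- as.numeric(DF2$Height)          # convert text to numbers
DF2$BMI <- DF2$Weight / (DF2$Height / 100)^2  # BMI is NA where Height is NA

str(DF2)  # Height and BMI are now num
```

Replacing "NULL" with the string "0" would also leave Height as a character column, so the conversion step is needed either way.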

By the way, are you sure that `NA` wouldn't be a better way to represent missing values in your data? `NA` always shows up as missing, while `0` is treated as a real value in most operations. This could, for example, skew the result if you compute the mean of some of your data. `NA`s, on the other hand, can simply be excluded, so that you still get valid results for the rest of your data.
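A small illustration of the difference, using the Height values from the sample data (the vector names are made up for this example):

```r
# The same heights, with missing values coded two different ways
height_na   <- c(103, NA, NA, 99, 185.4, 88)
height_zero <- c(103, 0, 0, 99, 185.4, 88)

mean(height_na)                 # NA: R flags that values are missing
mean(height_na, na.rm = TRUE)   # about 118.85, the mean of the four known values
mean(height_zero)               # about 79.23, dragged down by the two zeros

# A zero height also breaks the BMI formula: dividing by 0 gives Inf
56.3 / (0 / 100)^2
```

With `NA`, you have to opt in to dropping missing values (`na.rm = TRUE`), so they are hard to overlook; zeros are silently treated as real measurements.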

