Hello everyone, I'm new to programming and I would like some help with a (basic) problem in RStudio.
I have 3 vectors from a data frame and I would like to create a new one of the same length (6000) based on the other 3. This new vector should flag all the rows that meet my requirements.
How can I find all the rows where the value is above one (>1) in the first column, below one (<1) in the second column, and above one or Inf (>1 or Inf) in the third column, and create a new vector that holds TRUE or FALSE depending on whether the row meets these requirements?
You do not really need to write any loops; in fact, loops would be highly undesirable for the task you describe in an R context. R's comparison and logical operators are vectorized, so they are applied to every element behind the scenes.
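As a minimal sketch of what vectorization means here, using made-up toy vectors (the names a, b and c3 and their values are only for illustration and do not come from the question): each comparison is applied to every element at once, and & combines the results element by element. Note that Inf > 1 is TRUE, so Inf needs no special case.

```r
# Toy vectors, hypothetical values chosen only for illustration
a  <- c(0.5, 1.5, 2.0, 3.0)
b  <- c(0.2, 0.5, 1.5, 0.1)
c3 <- c(Inf, Inf, 3.0, 0.5)

# Each comparison returns a logical vector of the same length;
# & combines them element by element, no loop needed
meets <- a > 1 & b < 1 & c3 > 1

print(meets)    # FALSE  TRUE FALSE FALSE
print(Inf > 1)  # TRUE
```

Only the second position passes all three conditions, and it does so with Inf in the third vector.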
Consider this reproducible example.
What it does is:
create a random data.frame with 6000 rows and three columns of values between zero and two
create a new logical (i.e. either TRUE or FALSE) column called "test" based on the formula you describe (first column above one, second below one, and third above one; I am omitting the Inf as it is always more than one)
create a new data frame that contains only those rows from the first that comply with your criteria (i.e. for which the value of the test variable is TRUE)
library(dplyr)
set.seed(42) # to get replicable results
# some random data, with range from zero to two
animals <- data.frame(cats = runif(6000, min = 0, max = 2),
                      dogs = runif(6000, min = 0, max = 2),
                      pigs = runif(6000, min = 0, max = 2))
# calculate the logical column
animals <- animals %>%
  mutate(test = cats > 1 & dogs < 1 & pigs > 1) # the comparison already returns TRUE or FALSE
# create a new data frame by filtering the animals
selection <- animals %>%
  filter(test) # keep only the rows where test is TRUE
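If you would rather not load dplyr, the same result can be sketched in base R. This is only an illustration under the same assumptions as the example above (the column names cats, dogs and pigs are invented placeholders for your three columns); the standalone vector test is what the question asks for, a logical vector of length 6000.

```r
set.seed(42) # to get replicable results

# same kind of random data as above, with range from zero to two
animals <- data.frame(cats = runif(6000, min = 0, max = 2),
                      dogs = runif(6000, min = 0, max = 2),
                      pigs = runif(6000, min = 0, max = 2))

# logical vector with one TRUE/FALSE per row
test <- animals$cats > 1 & animals$dogs < 1 & animals$pigs > 1

length(test) # 6000, same length as the original columns
sum(test)    # number of rows that meet all three conditions

# keep only the rows where test is TRUE;
# which() drops any NA so missing values do not create NA rows
selection <- animals[which(test), ]
```

With real data that may contain missing values, a row with an NA in a condition that decides the outcome gets NA in test rather than TRUE or FALSE. Both filter() and the which() call above leave such rows out of the selection.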