I have a table of counties, states and their minimum and maximum temperatures. I need to select only the counties that have temperatures in a range such as -15 degrees to 40 degrees. What function would I use?
Hi, could you turn your question into a reprex (a minimal reproducible example)?
That would make it much easier for everyone here to help you.
One possible approach is to group by county and state, then summarize each group with the minimum and maximum of its temperatures. You can then use that summary to keep only the counties that fall within the range, and join the result back to your original table by county and state.
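library(dplyr)
# climate.data is assumed to be your original table (name taken from later in this thread)
climate.data %>%
  group_by(County, State) %>%
  summarise(temp_min = min(temp_min),
            temp_max = max(temp_max))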
I did summarize it and that is all I have. Now I need to filter out the ones that do not fit in my desired range. How do I do that?
You can use temp_min and temp_max in your new dataset to create a new variable with mutate (something like mutate(include = temp_min >= -15 & temp_max <= 40)). Then filter to keep only the rows where include is TRUE, and use dplyr::semi_join on your original data to keep just the rows for those counties.
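Putting the steps above together, here is a minimal sketch. The data frame below is a made-up stand-in (the county names and temperatures are hypothetical), and the object names county_range, in_range and result are only for illustration; the column names follow the ones used in this thread.

```r
library(dplyr)

# Hypothetical stand-in for the original table (several rows per county)
climate.data <- data.frame(
  County   = c("Adams", "Adams", "Baker", "Clark", "Dane"),
  State    = c("CO",    "CO",    "OR",    "NV",    "WI"),
  temp_min = c(-10,     -12,     -20,     -5,      -15),
  temp_max = c(35,      38,      30,      45,      40)
)

# One row per county with its overall minimum and maximum
county_range <- climate.data %>%
  group_by(County, State) %>%
  summarise(temp_min = min(temp_min),
            temp_max = max(temp_max),
            .groups = "drop")

# Flag counties whose whole range lies within [-15, 40], then keep only those
in_range <- county_range %>%
  mutate(include = temp_min >= -15 & temp_max <= 40) %>%
  filter(include)

# Keep the original rows that belong to the flagged counties
result <- semi_join(climate.data, in_range, by = c("County", "State"))
```

semi_join only filters the original table; it does not add the summary columns. If you want those as well, inner_join would be the alternative.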
If you are comfortable with SQL, you could also try the data.table package, which is typically among the fastest options in R for grouped aggregations like this.
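library(data.table)
setDT(climate.data)  # convert to a data.table by reference (no pipe needed)
# Aggregate per county, then filter the aggregated rows in a chained [ ].
# The bounds here are strict; use >= and <= to include -15 and 40 themselves.
climate.data[, .(temp_min = min(temp_min), temp_max = max(temp_max)),
             by = .(County, State)][temp_min > -15 & temp_max < 40]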
data.table's general form, DT[i, j, by], maps onto SQL roughly like this:
from[where, select, group by]
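To make that mapping concrete, here is a small sketch on made-up data (the table climate_dt and its values are hypothetical, as are the result names). It shows each slot in use, plus a chained second [ ] acting like SQL's HAVING on the aggregated result.

```r
library(data.table)

# Hypothetical toy table, only for illustration
climate_dt <- data.table(
  County   = c("Adams", "Adams", "Baker", "Clark"),
  State    = c("CO",    "CO",    "OR",    "NV"),
  temp_min = c(-10,     -12,     -20,     -5),
  temp_max = c(35,      38,      30,      45)
)

# i = where, j = select, by = group by:
# SELECT County, max(temp_max) FROM climate_dt WHERE State = 'CO' GROUP BY County
co_only <- climate_dt[State == "CO", .(temp_max = max(temp_max)), by = County]

# Empty i means "all rows"; aggregate every county
by_county <- climate_dt[, .(temp_min = min(temp_min), temp_max = max(temp_max)),
                        by = .(County, State)]

# A second [ ] filters the aggregated rows, similar to HAVING
in_range <- by_county[temp_min > -15 & temp_max < 40]
```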