Hi everyone, I have a table with 20 columns. One of the columns has a mixture of cells containing only numbers and cells containing both numbers and letters, like this:
23 23 243q 435 426yy5756 78443 2445 78789 6dfg546 kj45 cvb 3456tt
I want to get a subset of my table based on this column that keeps only the cells containing numbers only, i.e. deleting any row whose cell in this column contains a letter. The column is currently stored as a factor. Any help is greatly appreciated.
Can you put your question into reproducible example form (reprex)? It'll help everyone to answer your question most efficiently.
The approach I'd take is to use a regex to find the values that contain anything other than numbers (letters, in your example). It'll look something like this:
    vec <- c("78443", "2445", "78789", "6dfg546")
    grepl(pattern = "[a-zA-Z]", x = vec)
    #> [1] FALSE FALSE FALSE TRUE
You can use the result to filter out any rows that contain letters.
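A minimal sketch of that filtering step, assuming a data frame called `df` whose mixed column is called `id` (both names, and the `value` column, are made up for illustration). It swaps in the stricter pattern `^[0-9]+$`, which keeps only values made up entirely of digits, so it would also drop values containing spaces or punctuation, not just letters.

```r
# Hypothetical data: `df`, `id` and `value` are assumptions for illustration.
df <- data.frame(
  id = factor(c("23", "243q", "435", "6dfg546", "cvb", "2445")),
  value = 1:6
)

# TRUE where the value consists only of digits.
# as.character() makes the factor-to-string conversion explicit.
digits_only <- grepl("^[0-9]+$", as.character(df$id))

# Keep only the rows whose `id` is purely numeric.
numeric_rows <- df[digits_only, ]
numeric_rows
```

The subset still carries the unused factor levels; `droplevels(numeric_rows)` removes them if that matters for later steps.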
@mishabalyasin is pointing you in the right direction. Although I see in your comments that your data is stored as factors, not strings. You'll want to convert those to strings, most likely:
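A small sketch of why the conversion matters, using values from the question (the variable names are only for illustration). Calling `as.numeric()` directly on a factor returns its internal level codes, so going through `as.character()` first is the usual way to get the real values back.

```r
# Example factor built from values in the question.
f <- factor(c("435", "23", "78443"))

# Converting the factor directly gives the internal level codes,
# not the numbers shown in the labels.
codes <- as.numeric(f)

# Converting to character first recovers the actual values.
values <- as.numeric(as.character(f))
```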
Here's a good discussion of how to do that:
Like so?
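    library(dplyr)

    tibble(x = factor(c('23', '23', '243q', '435', '426yy5756', '78443', '2445', '78789', '6dfg546', 'kj45', 'cvb', '3456tt'))) %>%
      # factor -> character -> numeric; values with letters become NA (with a coercion warning)
      mutate(x = x %>% as.character %>% as.numeric) %>%
      # drop the rows that could not be converted
      filter(!is.na(x))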
Thank you for the suggested solution. I will also use a reprex in the future.