Subsetting dataset to include numerical values in a column

Hi everyone,
I have a table containing 20 columns. One of the columns has a mixture of cells containing only numbers and cells containing both numbers and letters, like this:
23
23
243q
435
426yy5756
78443
2445
78789
6dfg546
kj45
cvb
3456tt

I want to get a subset of my table based on the above column, keeping only the rows where the cell contains numbers only, i.e. deleting any row whose cell in that column contains a letter. The column is currently stored as a factor.
Any help is greatly appreciated.

Can you put your question into reproducible example form (reprex)? It'll help everyone to answer your question most efficiently.

The approach I'd take is to use a regex to find values that contain anything other than numbers. For letters, it'll look something like this:

vec <- c("78443", "2445", "78789", "6dfg546")
grepl(pattern = "[a-zA-Z]", x = vec)
[1] FALSE FALSE FALSE  TRUE

You can use the result to filter out any rows that contain letters.
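
To make that concrete, here's a minimal sketch in base R. The data frame `df` and the column names `id` and `col` are made up for illustration, so swap in your own. It assumes you want a stricter pattern than the one above (`^[0-9]+$`, digits only), so values with spaces or punctuation would be dropped too.

```r
# Hypothetical data frame standing in for the 20-column table
df <- data.frame(
  id  = 1:12,
  col = factor(c("23", "23", "243q", "435", "426yy5756", "78443",
                 "2445", "78789", "6dfg546", "kj45", "cvb", "3456tt"))
)

# TRUE where the value consists of digits only
digits_only <- grepl("^[0-9]+$", as.character(df$col))

# Keep only the rows that passed the check
df_num <- df[digits_only, ]

# Optionally turn the remaining values into real numbers
df_num$col <- as.numeric(as.character(df_num$col))
```

If you'd rather keep the original letters-only check, negate it instead: `df[!grepl("[a-zA-Z]", df$col), ]`.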


@mishabalyasin is pointing you in the right direction. I see in your post, though, that your data is stored as a factor, not as strings. You'll most likely want to convert it to strings first:
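
A quick sketch of why the conversion matters, using a small made-up factor `f`: calling `as.numeric()` directly on a factor gives you the internal level codes rather than the values, so you need to go through `as.character()` first.

```r
# Made-up factor for illustration
f <- factor(c("78443", "2445", "6dfg546"))

# Directly converting a factor returns its integer level codes
codes <- as.numeric(f)

# Converting to character first gives the labels as strings
labels <- as.character(f)

# Character -> numeric gives the values; non-numeric strings become NA
# (R warns about the coercion, which is silenced here)
values <- suppressWarnings(as.numeric(labels))
```

Here `codes` holds level positions (1, 2, 3 in some order) and not 78443 or 2445, which is the usual gotcha with factors.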

Here's a good discussion of how to do that:

Like so?

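library(dplyr)  # also provides tibble() and %>%

tibble(x = factor(c('23','23','243q','435','426yy5756','78443','2445','78789','6dfg546','kj45','cvb','3456tt'))) %>%
  # factor -> character -> numeric; non-numeric strings become NA (with a warning)
  mutate(x = as.numeric(as.character(x))) %>%
  filter(!is.na(x))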

Thank you for the suggested solution. I will also use reprex in the future.
