Extract first digit of one column into another column

Hello everyone,

I've come upon something I'm a bit unsure about:

Basically, I have a dataset where one column is a series of three-digit codes. I only really need the first digit of those codes for my analysis, though. For this reason, I want to create a new column containing only the first digit of that column, so I can build some dummy variables based on it. For example, if A is the original column, I want to create column B, as below:

A B
123 1
999 9
222 2

Anyone got an idea how to do this? I can't find anything online. Alternatively, if it's possible to build dummy variables from only the first digit of A, without creating B, that would also work.

Thanks!

Hi there,

This can easily be done with a simple bit of RegEx:

myData = data.frame(A = c("123", "999", "222"))
myData$B = stringr::str_extract(myData$A, "\\d")

The pattern \d matches a single digit, and str_extract() returns only the first match in each string, so you get the first digit. Since the pattern is written as an R string, you need to escape the "\", hence "\\d".
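One thing worth noting: an unanchored \d picks up the first digit anywhere in the string, not necessarily the first character. Here is a minimal base R sketch of the difference, assuming some codes might start with a non-digit. The "A45" value and the variable names are made up for illustration and are not part of the original data.

```r
# Hypothetical codes; "A45" is added only to show the difference.
codes <- c("123", "999", "222", "A45")

# "\\d" matches the first digit found anywhere in each string.
first_digit_anywhere <- regmatches(codes, regexpr("\\d", codes))

# "^\\d" only matches when the string starts with a digit.
starts_with_digit <- grepl("^\\d", codes)

# Keep the leading character where it is a digit, otherwise NA.
leading_digit <- ifelse(starts_with_digit, substr(codes, 1, 1), NA_character_)
```

If every code is guaranteed to be three digits, both approaches give the same result.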

If you'd like to learn more about regular expressions, here is a great online site to get started:
https://regexone.com/

Hope this helps,
PJ

I think RegEx might be overkill for this simple task. You can get away with base R's substr():

sample_df <- data.frame(A = c("123", "999", "222"))
sample_df$B <- substr(sample_df$A, 1, 1)
sample_df
#>     A B
#> 1 123 1
#> 2 999 9
#> 3 222 2

Created on 2022-02-06 by the reprex package (v2.0.1)
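
On the second part of the question (dummy variables without creating B), here is a rough sketch of one way it could be done. It assumes the codes are stored as numbers rather than strings, which the original post does not say, and the names `dummies` and `dummies_direct` are just illustrative. Integer division by 100 keeps the hundreds digit of a three-digit number, and model.matrix() expands a factor into 0/1 indicator columns.

```r
# Assumption: A holds numeric three-digit codes.
sample_df <- data.frame(A = c(123, 999, 222))

# Integer division by 100 keeps the first digit of a three-digit number.
sample_df$B <- sample_df$A %/% 100

# One 0/1 column per distinct first digit; "- 1" drops the intercept
# so that every level gets its own column.
dummies <- model.matrix(~ factor(B) - 1, data = sample_df)

# Same idea without storing B: compute the first digit inside the formula.
dummies_direct <- model.matrix(~ factor(A %/% 100) - 1, data = sample_df)
```

In a model formula you can usually skip the explicit dummies altogether: wrapping the first digit in factor() lets R build the indicator columns for you.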
