I am working in RStudio on an online sales dataset in which some columns mix character and numeric values.

Some invoice numbers have "C" as a prefix and some stock numbers have "c" as a suffix. In the Invoice column, for instance, I see C54009, 554098, ..., and likewise in the Stock column 342789c, 342980, 345820, .... The str() and class() functions report character (chr) for both columns. How do I fix this? Thanks.

In the {readr} package, parse_number() converts a character vector you already have, and col_number() lets you read your file with the correct column specification:

library(readr)

parse_number(c("C54009", "554098"))
#> [1]  54009 554098
parse_number(c("342789c", "342980", "345820"))
#> [1] 342789 342980 345820

read_csv("a,b
         C54009,342789c
         554098,342980",
         col_types = cols(a = col_number(),
                          b = col_number()))
#> # A tibble: 2 × 2
#>        a      b
#>    <dbl>  <dbl>
#> 1  54009 342789
#> 2 554098 342980

Created on 2022-06-30 by the reprex package (v2.0.1)
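
If the data is already loaded as a data frame, the same parse_number() call can be applied to the columns in place. The following is only a minimal sketch: the data frame `sales`, the column names `Invoice` and `Stock`, and the two flag columns are assumptions made for illustration, not names taken from the actual dataset. Since parse_number() discards the letters, the sketch first records which rows had them, in case the "C"/"c" carries meaning you want to keep.

```r
library(readr)

# Hypothetical stand-in for the sales data; column names are assumed
sales <- data.frame(
  Invoice = c("C54009", "554098"),
  Stock   = c("342789c", "342980"),
  stringsAsFactors = FALSE
)

# Record which rows carried a letter before it is stripped (assumed flag columns)
sales$invoice_has_c <- grepl("^C", sales$Invoice)
sales$stock_has_c   <- grepl("c$", sales$Stock)

# Convert the character columns to numeric, dropping the non-numeric characters
sales$Invoice <- parse_number(sales$Invoice)
sales$Stock   <- parse_number(sales$Stock)

# Both columns should now be reported as num instead of chr
str(sales)
```

If the letters are not needed, the two grepl() lines can simply be dropped.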
