Changing values in a column, when column label contains a number

I'm trying to change the string values of "n.a" to 0 in my dataset. However, when the column label has a number in it and I run the code below, I get an error message. Can someone please help?

Code:
Current_Directors_excluding_previous_year$Turnover GBP 2019[Current_Directors_excluding_previous_year$Turnover GBP 2019 == "n.a"] <- 0

Error:
Unknown or uninitialised column: 'Turnover
GBP 2019'.

Hi @Toz, welcome to RStudio Community.

It's hard for us to help you without seeing your data. It would be great if you could create a reprex (reproducible example).

However, from the error message, it seems that you've referenced a variable that does not exist in your data frame. Are you sure that your variable is called Turnover GBP 2019 and not Turnover GBP?

Awkward names with spaces should be avoided when possible. They have to be wrapped in backticks to be interpreted as one continuous name, like this:

Current_Directors_excluding_previous_year$`Turnover GBP 2019`

Hi @siddharthprabhu, the dataset is quite large, so I've put the first 10 rows below:

Turnover\r\nGBP 2019 Turnover\r\nGBP 2018 Turnover\r\nGBP 2017

1 63911000000 57491000000 55917000000
2 29007000000 28456000000 26224000000
3 17735000000 17262000000 16317000000
4 15824000000 15574000000 15357000000
5 NA 15729000000 16307000000
6 NA 10060000000 8007000000
7 4763100000 4874800000 4650200000
8 4167400000 4055500000 4097300000
9 NA 3843000000 3524000000
10 NA 3737600000 3422300000

@nirgrahamuk I still encounter the same problem when wrapping the name in backticks.

Thank you for providing sample data. I see two issues with your code. First, your data frame subsetting call has incorrect syntax: a column name that contains spaces has to be wrapped in backticks after the $. Secondly, your data contains NA values, which are not the same thing as "n.a" strings. A comparison such as == "n.a" never matches NA, so missing values should be detected with a dedicated function; in this case, is.na().

library(tibble)

df <- tribble(~`Turnover GBP 2019`, ~`Turnover GBP 2018`, ~`Turnover GBP 2017`, 
              63911000000, 57491000000, 55917000000,
              29007000000, 28456000000, 26224000000,
              17735000000, 17262000000, 16317000000,
              15824000000, 15574000000, 15357000000,
              NA, 15729000000, 16307000000,
              NA, 10060000000, 8007000000,
              4763100000, 4874800000, 4650200000,
              4167400000, 4055500000, 4097300000,
              NA, 3843000000, 3524000000,
              NA, 3737600000, 3422300000)

df$`Turnover GBP 2019`[is.na(df$`Turnover GBP 2019`)] <- 0

print(df)
#> # A tibble: 10 x 3
#>    `Turnover GBP 2019` `Turnover GBP 2018` `Turnover GBP 2017`
#>                  <dbl>               <dbl>               <dbl>
#>  1         63911000000         57491000000         55917000000
#>  2         29007000000         28456000000         26224000000
#>  3         17735000000         17262000000         16317000000
#>  4         15824000000         15574000000         15357000000
#>  5                   0         15729000000         16307000000
#>  6                   0         10060000000          8007000000
#>  7          4763100000          4874800000          4650200000
#>  8          4167400000          4055500000          4097300000
#>  9                   0          3843000000          3524000000
#> 10                   0          3737600000          3422300000

Created on 2020-03-05 by the reprex package (v0.3.0)
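
One further thing that may be worth checking: the header posted above prints as Turnover\r\nGBP 2019, and the error message breaks across two lines, which suggests the real column names contain a carriage return and line feed rather than a space. If so, a backticked name typed with a space would still not match. The following is only a sketch under that assumption; the small data frame is a hypothetical stand-in for the real data, and the cleanup uses base R only.

```r
# Hypothetical stand-in for the real data: the names contain "\r\n", not a space
df <- data.frame(c(63911000000, NA), c(57491000000, 15729000000))
names(df) <- c("Turnover\r\nGBP 2019", "Turnover\r\nGBP 2018")

# Collapse any run of carriage returns / line feeds into a single space
names(df) <- gsub("[\r\n]+", " ", names(df))

# The backticked name with a space now matches the cleaned column name
df$`Turnover GBP 2019`[is.na(df$`Turnover GBP 2019`)] <- 0

print(names(df))
```

Running names() on the real data frame first would confirm whether the names actually contain these characters before anything is renamed.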

If you're open to using the tidyverse, there is a dedicated function in tidyr called replace_na() for exactly this task.
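
As a minimal sketch of that approach, assuming the column names contain plain spaces as in the reprex above: replace_na() accepts a data frame plus a named list giving the replacement value for each column. The two-row tibble here is a shortened, illustrative version of the sample data.

```r
library(tibble)
library(tidyr)

# Shortened, illustrative version of the sample data
df <- tribble(~`Turnover GBP 2019`, ~`Turnover GBP 2018`,
              63911000000, 57491000000,
              NA, 15729000000)

# Replace NA with 0 in the named column only; other columns are left untouched
df <- replace_na(df, list(`Turnover GBP 2019` = 0))

print(df)
```

Adding more entries to the list handles several columns in one call.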
