Hello, I have a trouble converting the missing values marked as dots (.)


#1

Hello, I have a trouble converting the missing values. In my data set, the missing values were marked as . (dot). I tried many ways to convert them to NA but it doesn't work. Please help me.


#2

Could you show us an example of what you tried?

```{r}
# Your code goes here :-)
```

#3

What did you try?

It is quite easy to change using ifelse:

First we create some fake data with the dots

a <- c(3,4,5,'.',6,7,6,'.',4,5,'.')
a
[1] "3" "4" "5" "." "6" "7" "6" "." "4" "5" "."

We change dots with NA using ifelse, and make the output numeric:

 a <- as.numeric(ifelse(a =='.', NA, a))
 a
[1]  3  4  5 NA  6  7  6 NA  4  5 NA

But we don't know how your data looks and the exact error/problem you are facing :slight_smile:


#4

If you're using read.table, read.csv, or read.delim to read the data from a file, you can set the na.strings argument to specify that dots are missing data.

read.csv("myfile.csv", na.strings = ".")

Reading functions from other packages (e.g., readr, data.table::fread) often have similar arguments.


#5

Thank you for your help. I'm really new to Rstudio. Sorry for the unclear sentence. I tried a lot of function in excel trying to replace the data. The example of my data set is above.
But there are much more data than you can see from the picture (~600 cells).
As you can see, some missing data were written as "." and some just left as blank cells.


#8

You can import your data from excel directly with readxl::read_xlsx (or readxl::read_xls depending on your file extension). This function will consider blank cells to be NA by default and you can specify what other strings should also be considered as NA.

So, after you install the package if you haven't already (note that it is part of the tidyverse so you might have installed it without knowing it), run:

readxl::read_xlsx("/path/to/your/file.xlsx", na = ".")

and both your blank cells and your cells with . will be converted to NA.

If your data is not on the first sheet or if you want to customize the function further, you should read the help file for it (?readxl::read_xlsx)


#9

I never use Excel, but there should be a "search and replace" that should allow to replace all your . by blanks if you really want to do that (maybe Ctrl + H?).

But it is much better to leave your data file alone and to do this in R instead, for instance in the import process as I suggested above.


#10

Thank you so much for your suggestion.