Remove rows in a SpatialPointsDataFrame depending on the rows in a matrix

Hello everybody!

I have two objects that contain a lot of values. For this example they have been greatly shortened.

The first one is a SpatialPointsDataFrame (spatial1) with 6 observations of one variable (class):

class(spatial1)
# [1] "SpatialPointsDataFrame"
# attr(,"package")
# [1] "sp"

spatial1
#      class
# 1    9     
# 2    9     
# 3    8     
# 4    8     
# 5    8     
# 6    9 

The second one is a matrix (matrix1) with six rows and three columns and with some NA values:

class(matrix1)
# [1] "matrix" "array" 

matrix1
#      S1     S2      S5
# [1,] 35     45      51                         
# [2,] 34     32      62                          
# [3,] 61     58      69                             
# [4,] 27     51      NA                       
# [5,] 37     72      25                        
# [6,] NA     NA      NA    

I would like to identify the rows in matrix1 that contain ANY NA values, and then remove those rows from matrix1 as well as the corresponding rows from spatial1. "Corresponding" here means "same index/position".

Because rows 4 and 6 of matrix1 contain NA, the result should look like this:

spatial2
#      class
# 1    9     
# 2    9     
# 3    8     
# 5    8     

matrix2
#      S1     S2      S5
# [1,] 35     45      51                         
# [2,] 34     32      62                          
# [3,] 61     58      69                                            
# [5,] 37     72      25                        

I managed to remove the rows in matrix1 with this code:

matrix2  <- na.omit(matrix1)

But I found it difficult to remove the corresponding rows in spatial1, because I first have to identify which rows in matrix1 contain NA values and then use that knowledge to remove the same rows from spatial1.

To identify which rows of matrix1 contain any NA values I tried this:

matrix1_truefalse <- rowSums(is.na(matrix1)) > 0

matrix1_truefalse
#  [1]  FALSE FALSE FALSE TRUE  FALSE  TRUE    

It correctly tells me that rows 4 and 6 contain NA. But how can I use this knowledge to remove the rows in spatial1?

spatial2 <- spatial1[??????matrix1_truefalse??????,  ]

Please note that my real data is very large, so I can't just type spatial2 <- spatial1[c(1,2,3,5),].

library(tidyverse)

spatial1 <- data.frame("class" = rep(9,9))
                       
matrix1 <- as.matrix(data.frame("S1"=c(35,34,61,27,37,NA),
                      "S2"=c(45,32,58,51,72, NA),
                      "S5"=c(51,62,69,NA,25,NA)))


> spatial1
  class
1     9
2     9
3     9
4     9
5     9
6     9
7     9
8     9
9     9

> matrix1
     S1 S2 S5
[1,] 35 45 51
[2,] 34 32 62
[3,] 61 58 69
[4,] 27 51 NA
[5,] 37 72 25
[6,] NA NA NA

#Which rows in matrix1 have NA?
#Find out and store them in 'drop'
drop <- which(is.na(as.data.frame(matrix1)[[3]]))

#Make a filtered matrix dropping the rows we have established contain NAs
filtered_matrix1 <- matrix1[-drop,]

#Drop these same rows from spatial1
filtered_spatial1 <- data.frame("class"=spatial1[-drop,])

#Rows 4 and 6 are dropped from matrix1
> filtered_matrix1
     S1 S2 S5
[1,] 35 45 51
[2,] 34 32 62
[3,] 61 58 69
[4,] 37 72 25

#Rows 4 and 6 are dropped from spatial1
> filtered_spatial1
  class
1     9
2     9
3     9
4     9
5     9
6     9
7     9

Unfortunately, this solution doesn't work for me, because it checks only column 3 of matrix1, whereas I need to consider ALL columns of matrix1.

For example, if I had this matrix:

matrix1 <- as.matrix(data.frame("S1"=c(35,34,61,27,37,NA),
                      "S2"=c(45,32,58,51,72, NA),
                      "S5"=c(51,62,69,NA,25,50)))

matrix1
#      S1 S2 S5
# [1,] 35 45 51
# [2,] 34 32 62
# [3,] 61 58 69
# [4,] 27 51 NA
# [5,] 37 72 25
# [6,] NA NA 50

Your solution would then drop only row 4, not row 6:

drop <- which(is.na(as.data.frame(matrix1)[[3]]))
drop

# [1]  4

However, I would like to drop rows 4 AND 6 in my new example.

I am looking for a solution that drops ALL rows containing NA in ANY(!) column, not only in a certain one. (I have a big matrix with 244 columns.)

na.omit is elegant in that it records the row numbers it removed. You can retrieve them with na.action and use them to exclude the same rows from the other object.

library(tidyverse)
library(sp)

set.seed(42)

xy <- SpatialPoints(cbind( xc = round(runif(6), 2),
                           yc = round(runif(6), 2)))


matrix1 <- as.matrix(data.frame("S1"=c(35,34,61,27,37,NA),
                                "S2"=c(45,32,58,51,72, NA),
                                "S5"=c(51,62,69,NA,25,NA)))

matrix2  <- na.omit(matrix1)
exclude_rows <- na.action(matrix2)

xy[-exclude_rows,]
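
As a possible alternative to negative row numbers, the same rows can be selected with a logical "keep" mask, which is the piece the question was missing in spatial1[??????, ]. The sketch below is only an illustration in base R: it uses the matrix from the follow-up question, and spatial1 is a plain data.frame standing in for the SpatialPointsDataFrame (an assumption made so the example runs without sp). The names keep, matrix2 and spatial2 are chosen for this example.

```r
# Matrix from the follow-up question: row 4 has one NA, row 6 has two.
matrix1 <- cbind(S1 = c(35, 34, 61, 27, 37, NA),
                 S2 = c(45, 32, 58, 51, 72, NA),
                 S5 = c(51, 62, 69, NA, 25, 50))

# Assumption: a plain data.frame stands in for the SpatialPointsDataFrame.
spatial1 <- data.frame(class = c(9, 9, 8, 8, 8, 9))

# TRUE for every row that has no NA in ANY column.
keep <- complete.cases(matrix1)

# Apply the same mask to both objects so they stay aligned by position.
matrix2  <- matrix1[keep, , drop = FALSE]
spatial2 <- spatial1[keep, , drop = FALSE]

# Row numbers that were removed, for checking.
which(!keep)
```

With a real SpatialPointsDataFrame the row selection should look the same, spatial1[keep, ]. A logical mask also behaves sensibly when no row contains an NA, a case where negative indexing with an empty set of row numbers is a known R pitfall.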