Remove multiple rows from a dataset

I have a data frame like this (sample dataset):

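library(dplyr)
# 100 numeric ids plus junk strings, an empty string and NA (107 values in total)
sample<-data.frame("id"=c(1:100,"sdasd","adsd","cxzc","erfdf","","fdgdf",NA))
# shuffle the rows
sample<-sample_n(sample,107)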

records_to_rm=data.frame("records"=c(levels(factor(sample$id))[102:106],""))
records_to_rm=as.character(records_to_rm$records)

I need to remove the rows that contain character strings or NA from the dataset.
This is what I have tried so far (Method 1)

b=subset(sample,id!=records_to_rm)

This raises an error:

Error in id == records_to_rm : 
  comparison of these types is not implemented
In addition: Warning message:
In eval(e, x, parent.frame()) :
  Incompatible methods ("Ops.factor", "Ops.data.frame") for "=="

(Method 2)

b=sample$id[sample$id!=records_to_rm]

This runs, but with warnings:

 Warning messages:
1: In `==.default`(sample$id, records_to_rm) :
  longer object length is not a multiple of shorter object length
2: In is.na(e1) | is.na(e2) :
  longer object length is not a multiple of shorter object length

b=data.frame(b)
Output: no change from the original dataset.

(Method 3)

for(i in length(records_to_rm)){
    b=subset(sample,id!=records_to_rm[i])
}

Output:
       id
31     31
47     47
18     18
69     69
93     93
66     66
3       3
94     94
99     99
84     84
6       6
2       2
32     32
20     20
83     83
1       1
49     49
81     81
106 fdgdf   # to remove
50     50
102  adsd  # to remove
26     26
76     76
10     10
75     75
73     73
97     97
90     90
74     74
56     56
77     77
33     33
7       7
23     23
40     40
34     34
8       8
19     19
28     28
98     98
46     46
96     96
22     22
12     12
60     60
55     55
41     41
67     67
29     29
61     61
62     62
65     65
42     42
38     38
45     45
43     43
9       9
5       5
36     36
16     16
68     68
103  cxzc  # to remove
14     14
92     92
15     15
54     54
104 erfdf  # to remove
59     59
17     17
35     35
82     82
100   100
39     39
13     13
30     30
52     52
71     71
21     21
80     80
78     78
63     63
57     57
25     25
48     48
70     70
101 sdasd  # to remove
51     51
58     58
85     85
91     91
89     89
72     72
44     44
88     88
11     11
64     64
37     37
95     95
87     87
53     53
4       4
24     24
27     27
86     86
79     79

The character records are still present.

There are several character strings and NAs scattered randomly through the dataset.

How can I remove them all at once? Subsetting on each string individually would be cumbersome.

I tried to make the example reproducible, but it may contain errors.
Could you please help me understand what I was doing wrong?

Thank you for your time and concern.

If I understand you correctly, you want to remove the rows from sample where sample$id is an element of the vector records_to_rm? If so, use the %in% operator:

b=subset(sample,!(id %in% records_to_rm))
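
To show what goes wrong with != and what %in% does differently, here is a minimal sketch on a small made-up data frame (df and its values are my own stand-ins, not your sample). As far as I can tell, %in% on its own leaves the NA row in place, because NA is not an element of records_to_rm, so the sketch also filters with is.na().

```r
# Toy data (hypothetical): numeric-looking ids, junk strings, "" and NA.
# stringsAsFactors = FALSE keeps id as character on any R version.
df <- data.frame(id = c(1:5, "sdasd", "adsd", "", NA),
                 stringsAsFactors = FALSE)
records_to_rm <- c("sdasd", "adsd", "")

# `!=` compares element by element and recycles the shorter vector,
# so each id is tested against only ONE entry of records_to_rm.
# `%in%` tests each id against the WHOLE vector instead.
in_rm <- df$id %in% records_to_rm

# NA %in% records_to_rm is FALSE, so the NA row would be kept
# unless it is excluded explicitly with is.na().
b <- subset(df, !(id %in% records_to_rm) & !is.na(id))

# Only the five numeric-looking ids remain.
print(b$id)
```

The loop in Method 3 has two separate problems, if I read it correctly: `for(i in length(records_to_rm))` runs only once (with i equal to the length, not 1 up to the length), and each pass subsets the original sample again instead of the previous result b, so only the last value is ever removed.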
