Hi,
I have this df:
source <- data.frame(
  stringsAsFactors = FALSE,
  Resp = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13),
  Neareststation = c("Abbey Road",
                     "Abbey Wood", "Abbey Wood", "Abbey Wood", "Abbey Wood",
                     "Aber", "Abercynon",
                     "Aberdare", "Aberdare", "Aberdare",
                     "Aberdare", "Aberdare", "Aberdare"),
  Postcodearea = c("E", "DA", "SE", "DA", "SE", "XY", "CF",
                   "CF", "CG", "CF", "SA", "CF", "SA")
)
Respondents specify their Postcodearea, and some of the entries are incorrect.
Is it possible to drop the less frequent responses for each Neareststation and get something like this?
result <- data.frame(
  stringsAsFactors = FALSE,
  Neareststation = c("Abbey Road", "Abbey Wood",
                     "Aber", "Abercynon", "Aberdare"),
  Postcodearea = c("E", "DA", "XY", "CF", "CF")
)
The rule is: for each station, we keep the most frequent response. If there is a tie (Abbey Wood has two DAs and two SEs), we can select either of the two.
Is it easy to do?
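For what it's worth, here is a minimal base R sketch of the rule as I understand it. The helper name most_frequent is made up for illustration, and breaking ties by taking the value that appears first is just one arbitrary choice (any of the tied values would satisfy the rule).

```r
# Sample data from the question.
source <- data.frame(
  stringsAsFactors = FALSE,
  Resp = 1:13,
  Neareststation = c("Abbey Road",
                     "Abbey Wood", "Abbey Wood", "Abbey Wood", "Abbey Wood",
                     "Aber", "Abercynon",
                     "Aberdare", "Aberdare", "Aberdare",
                     "Aberdare", "Aberdare", "Aberdare"),
  Postcodearea = c("E", "DA", "SE", "DA", "SE", "XY", "CF",
                   "CF", "CG", "CF", "SA", "CF", "SA")
)

# Hypothetical helper: return the most frequent value of x.
# Levels are kept in order of first appearance, and which.max()
# returns the first maximum, so a tie goes to the value seen first.
most_frequent <- function(x) {
  counts <- table(factor(x, levels = unique(x)))
  names(counts)[which.max(counts)]
}

# Split the postcode areas by station and pick one per station.
picked <- vapply(
  split(source$Postcodearea, source$Neareststation),
  most_frequent,
  character(1)
)

result <- data.frame(
  Neareststation = names(picked),
  Postcodearea = unname(picked),
  stringsAsFactors = FALSE
)

print(result)
```

With the sample data, Abbey Wood comes out as DA because DA is listed before SE, and Aberdare comes out as CF (three CFs against two SAs and one CG). Note that split() orders the stations alphabetically, not in their original order.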