I created a dictionary of unigrams, bigrams, trigrams, and 4-grams. The dictionary looks like this:
word1 word2 frequency
will 5153
like 5081
get 4821
let know 627
right now 575
look like 559
next week 478
social media 465
let_us know 194
let_know think 173
cinco_de mayo 172
manufacturer_custom built 171
custom_built painted 171
I have a function, nxtword1, that looks up words in the dictionary (named dat).
The following code looks up a text string in the dictionary and prints a message to the console saying whether the phrase was found.
predict<-nxtword1("new_manufacturer_custom_built")
if (row.names(predict)>0) {print("match found")
} else {print("no match found")}
predict
I knew that the phrase "new_manufacturer_custom_built" did not appear in the dictionary, so I was hoping that the else branch would be executed. Instead, an error message appears:
Error in if (row.names(predict) > 0) { : argument is of length zero
> else {print("no match found")}
> predict}
Apparently R does not like my conditional statement. How can I fix this?
Here is a sample from the dictionary:
word1 word2 frequency
let know 627
right now 575
look like 559
next week 478
social media 465
let_us know 194
let_know think 173
cinco_de mayo 172
manufacturer_custom built 171
custom_built painted 171
Thank you for your code. It worked :) I tried extending it so that it calls functions I had written before:
checkdic <-function(word) {ifelse(word %in% dat$word1, nxtword1(word), nxtword1(less1gram(word)))
}
When the word appeared in the dictionary, all was well.
You really do have to pay attention to data types in R: data frames, lists, vectors, logicals, etc.
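row.names(predict) is a vector of strings (possibly of length 0), so comparing it with > 0 does not give the single TRUE or FALSE that if needs. When there are no rows, the comparison has length zero, which is exactly what the error message says. nrow(predict) is probably what you want, as mentioned above.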
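To make the difference concrete, here is a minimal sketch. The small dat and the body of nxtword1 are assumptions for illustration only (the real function was not posted); all that matters is that a failed lookup returns a data frame with zero rows.

```r
# Toy stand-in for the dictionary; the real dat is much larger.
dat <- data.frame(
  word1 = c("let", "right", "let_us"),
  word2 = c("know", "now", "know"),
  frequency = c(627, 575, 194),
  stringsAsFactors = FALSE
)

# Hypothetical version of nxtword1: the rows whose word1 equals the phrase.
nxtword1 <- function(phrase) dat[dat$word1 == phrase, ]

predict <- nxtword1("new_manufacturer_custom_built")

row.names(predict)       # character(0): no rows, so no row names
row.names(predict) > 0   # logical(0): nothing for if() to test
nrow(predict)            # 0

# nrow() always returns a single number, so the condition is well defined.
if (nrow(predict) > 0) {
  print("match found")
} else {
  print("no match found")
}
```

With nrow the else branch runs for a phrase that is missing, instead of if stopping with "argument is of length zero".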
The double square bracket in your output shows that what came back is a list, and its first element [[1]] is a vector of strings rather than a data frame.
However, the less1gram function fails for words that are not in dat$word1 (which is a vector). I suspect you need to fix it to use %in%, like you did with nxtword1.
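A possible reason for the list is that ifelse is vectorised over its test, so it shapes the result like the test instead of handing back the data frame. A plain if / else returns the chosen branch unchanged. The sketch below assumes a toy dat and hypothetical versions of nxtword1 and less1gram (here less1gram simply drops the first word of an underscore-joined phrase); your real functions may behave differently.

```r
# Toy stand-in for the dictionary; the real dat is much larger.
dat <- data.frame(
  word1 = c("let", "custom_built"),
  word2 = c("know", "painted"),
  frequency = c(627, 171),
  stringsAsFactors = FALSE
)

# Hypothetical helpers; the real nxtword1 and less1gram may differ.
nxtword1 <- function(phrase) dat[dat$word1 == phrase, ]
less1gram <- function(phrase) sub("^[^_]*_", "", phrase)  # drop the first word

# ifelse() builds a result shaped like its test (length one here),
# so the data frame does not come back intact.
bad <- ifelse("let" %in% dat$word1, nxtword1("let"), nxtword1("other"))
is.data.frame(bad)

# if / else evaluates one branch and returns its value unchanged.
checkdic <- function(word) {
  if (word %in% dat$word1) {
    nxtword1(word)
  } else {
    nxtword1(less1gram(word))
  }
}

checkdic("let")               # found directly
checkdic("new_custom_built")  # falls back to "custom_built"
```

Because the test is a single TRUE or FALSE, if / else is the natural fit here; ifelse is meant for element-wise choices over vectors.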
Thank you for your attention. After reading your example, I fixed my code. This time it worked!
following_word <- function(phrase) {
  match <- nxtword1(phrase)
  if (nrow(match) > 0) {
    print(match)
  } else {
    nxtword1(less1gram(phrase))
  }
}