Hi,
I want to use this code to identify spam, but I get an error when I create a document-term matrix from the combined spam and easy_ham corpora.
Sys.setenv(LANG = "en")
require(tm)
#> Loading required package: tm
#> Loading required package: NLP
suppressWarnings(require(RTextTools))
#> Loading required package: RTextTools
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
setwd("C:/Users/Maciek/Desktop/spamassasin")
spam <- Corpus(DirSource("spam"), readerControl = list(language="lat"))
easy_ham <- Corpus(DirSource("easy_ham"), readerControl = list(language="lat"))
if (file.exists("easy_ham/cmds")) file.remove("easy_ham/cmds")
if (file.exists("spam/cmds")) file.remove("spam/cmds")
meta(spam, tag = "type") <- "spam"
meta(easy_ham, tag = "type") <- "easy_ham"
combinedcorpusEasy <- c(spam, easy_ham, recursive = TRUE)
combinedResampledCorpusEasy <- sample(combinedcorpusEasy, 750)
spamTDMEasy <- DocumentTermMatrix(combinedResampledCorpusEasy)
#> Error in UseMethod("TermDocumentMatrix", x): no applicable method for 'TermDocumentMatrix' applied to an object of class "character"
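A possible lead on the error above, as a minimal base R sketch rather than a confirmed diagnosis: the message says the object passed to `DocumentTermMatrix` was a plain character vector, and one way that can arise is `c(..., recursive = TRUE)` flattening list-like objects when no class-specific `c()` method applies. The objects `spam_like` and `ham_like` below are hypothetical stand-ins (ordinary lists), not tm corpora, so this only illustrates the base R behaviour.

```r
# Hypothetical toy objects standing in for the two corpora.
# They are plain lists, NOT real tm corpora.
spam_like <- list(content = c("buy now", "free offer"))
ham_like  <- list(content = c("meeting at noon"))

# With recursive = TRUE, c() flattens list arguments into an atomic vector.
flattened <- c(spam_like, ham_like, recursive = TRUE)
print(class(flattened))

# Without recursive = TRUE, the combined object is still a list.
kept <- c(spam_like, ham_like)
print(class(kept))

# sample() on a character vector returns another character vector,
# so the class problem carries through to the next step.
resampled <- sample(flattened, 2)
print(class(resampled))
```

Running `class(combinedcorpusEasy)` and `class(combinedResampledCorpusEasy)` on the real objects would show whether they are still corpora at that point. Whether tm behaves like the toy lists here may depend on the corpus class and the tm version, which is not verified above.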
I don't know how to fix that. Can someone help me?
Here is the database I use (at the bottom).