Error in UseMethod(TermDocumentMatrix, x), Mail Base

Hi,

I want to use this code to identify spam, but I have a problem when I create a document term matrix for spam and easy_ham.

Sys.setenv(LANG = "en")
require(tm)
#> Loading required package: tm
#> Loading required package: NLP
suppressWarnings(require(RTextTools))
#> Loading required package: RTextTools
#> Loading required package: SparseM
#> 
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#> 
#>     backsolve

setwd("C:/Users/Maciek/Desktop/spamassasin")

spam <- Corpus(DirSource("spam"), readerControl = list(language="lat"))
easy_ham <- Corpus(DirSource("easy_ham"), readerControl = list(language="lat"))
if (file.exists("easy_ham/cmds")) file.remove("easy_ham/cmds")
if (file.exists("spam/cmds")) file.remove("spam/cmds")
meta(spam, tag = "type") <- "spam"
meta(easy_ham, tag = "type") <- "easy_ham"
combinedcorpusEasy <- c(spam,easy_ham, recursive=T)
combinedResampledCorpusEasy <- sample(combinedcorpusEasy, 750)
spamTDMEasy<- DocumentTermMatrix(combinedResampledCorpusEasy)
#> Error in UseMethod("TermDocumentMatrix", x): no applicable method for 'TermDocumentMatrix' applied to an object of class "character"

I don't know how to fix that. Can someone help me?

Here is the dataset I use (at the bottom)

Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

For pointers specific to the community site, check out the reprex FAQ, linked to below.


Your Corpus object has been turned into a character vector. If you supply data and a reproducible example, I can help you out.

Here's a simple recreation using the crude corpus from the tm package:

library(tm)
library(RTextTools)

data(crude)

test1 <- crude[1:10]
test2 <- crude[11:20]
meta(test1, tag = "type") <- "test1"
meta(test2, tag = "type") <- "test2"

combinedcorpusEasy <- c(test1, test2, recursive=T)
combinedResampledCorpusEasy <- sample(combinedcorpusEasy, 10)
spamTDMEasy <- DocumentTermMatrix(combinedResampledCorpusEasy)

# This works...
spamTDMEasy
#> <<DocumentTermMatrix (documents: 10, terms: 654)>>
#> Non-/sparse entries: 1060/5480
#> Sparsity           : 84%
#> Maximal term length: 15
#> Weighting          : term frequency (tf)

# Recreating your error with a character vector...
DocumentTermMatrix("text")
#> Error in UseMethod("TermDocumentMatrix", x): no applicable method for 'TermDocumentMatrix' applied to an object of class "character"
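
To find where the corpus loses its class, it may help to print class() after each step and to resample by indexing rather than calling sample() on the object directly. The following is only a sketch on the crude corpus, not on your data; the names idx, resampled and dtm are made up for illustration. Depending on your tm version, Corpus(DirSource(...)) may return a SimpleCorpus rather than a VCorpus, so trying VCorpus(DirSource(...)) in your own code could be worth a test, but that is an assumption I have not checked against your files.

```r
library(tm)

data(crude)  # 20-document VCorpus shipped with tm

# Print the class after each step to see where a corpus stops being a corpus.
class(crude)

# Resample by indexing with sampled positions; `[` on a VCorpus returns a VCorpus.
set.seed(1)
idx <- sample(length(crude), 10)
resampled <- crude[idx]
class(resampled)

# Guard before building the matrix, so a failure is reported at the step that caused it.
stopifnot(inherits(resampled, "Corpus"))
dtm <- DocumentTermMatrix(resampled)
dim(dtm)
```

If the stopifnot() line fails on your own objects, the step just before it is the one that dropped the Corpus class.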

I use that data.
I have working code from the book Machine Learning for Hackers (here is the code); it is a Naive Bayes classifier. I also want to classify with a Support Vector Machine, but I don't know how, which is why I'm trying to use the code from the website that I posted in the first post.