Error in as.character(x$content):

Yousha_Adib · December 27, 2022, 9:29pm

Hi, I'm a beginner in RStudio. I get the following error when I knit an R Markdown document in RStudio Cloud.

Error in as.character(x$content) :
cannot coerce type 'closure' to vector of type 'character'
Calls: ... eval_with_user_handlers -> eval -> eval -> Corpus -> SimpleCorpus
Execution halted

I am running the following code in RStudio Cloud:

Setup and Data Loading

# Install Packages
install.packages("tm")  # for text mining
install.packages("SnowballC") # for text stemming
install.packages("wordcloud") # word-cloud generator
install.packages("RColorBrewer") # color palettes
install.packages("syuzhet") # for sentiment analysis
install.packages("ggplot2") # for plotting graphs

# Load Libraries
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")
library("syuzhet")
library("ggplot2")

# Dataset Import
text <- readLines(file.choose())

# Load Data as Corpus
TextDoc <- Corpus(VectorSource(text))

Data Cleanup

As this is web-scraped data, it needs some basic cleanup. I'll remove some punctuation, symbols and English stop words from the dataset, convert everything to lowercase and remove extra white space. Finally, I'll convert the words to their root form by stemming.

toSpace <- content_transformer(function (x , pattern ) gsub(pattern, " ", x))
TextDoc <- tm_map(TextDoc, toSpace, "/") # Replace "/" with Space 
TextDoc <- tm_map(TextDoc, toSpace, "@") # Replace "@" with Space
TextDoc <- tm_map(TextDoc, toSpace, "\\|") # Replace "\\|" with Space
TextDoc <- tm_map(TextDoc, toSpace, "-") # Replace "-" with Space
TextDoc <- tm_map(TextDoc, removeNumbers) # Remove Numbers 
TextDoc <- tm_map(TextDoc, removePunctuation) # Remove Punctuation
TextDoc <- tm_map(TextDoc, content_transformer(tolower)) # Convert the text to lower case
TextDoc <- tm_map(TextDoc, stripWhitespace) # Remove White space
TextDoc <- tm_map(TextDoc, removeWords, stopwords("english")) # Remove common stop word
TextDoc <- tm_map(TextDoc, stemDocument) # Converting to Root Format

Preliminary Results

Now I'll view the preliminary results to see if there are any unwanted words that need to be removed. I'll use the TermDocumentMatrix() function from the tm package to build a table of word frequencies. The results are sorted in descending order and the top 400 words are shown.

TextDoc_dtm <- TermDocumentMatrix(TextDoc)
dtm_m <- as.matrix(TextDoc_dtm)
dtm_v <- sort(rowSums(dtm_m),decreasing=TRUE)
dtm_d <- data.frame(word = names(dtm_v),freq=dtm_v)
head(dtm_d, 400)

Removing Irrelevant Words

After an initial inspection of the results, I'll remove 61 words from the dataset, then run TermDocumentMatrix() again and recount the word frequencies.

TextDoc <- tm_map(TextDoc, removeWords, c(
  "data", "experi", "work", "analyt", "manag", "skill", "abil", "strong",
  "knowledg", "use", "understand", "includ", "process", "develop", "servic",
  "client", "technic", "build", "stakehold", "requir", "learn", "effect",
  "system", "good", "degre", "year", "intern", "analysi", "technolog",
  "within", "level", "account", "project", "engin", "financ", "will", "abl",
  "perform", "activ", "relev", "industri", "visualis", "solut", "etc",
  "creat", "relationship", "present", "solv", "comfort", "coach", "relat",
  "provid", "previous", "written", "organis", "program", "profici", "queri",
  "deliv", "time", "oper"))
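
Removing words from the corpus does not change a frequency table that was built earlier, so dtm_d keeps the old counts until the matrix is rebuilt. A minimal sketch of that rebuild, using a two-document toy corpus and a two-word removal list as stand-ins for the real data (both are assumptions for illustration only):

```r
library(tm)

# Toy stand-in for the cleaned corpus (assumption, not the real dataset)
docs <- c("data analyst sql python", "data engin sql excel")
TextDoc <- Corpus(VectorSource(docs))

# Remove the unwanted words, as in the step above (shortened list)
TextDoc <- tm_map(TextDoc, removeWords, c("data", "engin"))

# Rebuild the term-document matrix and the frequency table from the
# updated corpus, so the removed words no longer appear in dtm_d
TextDoc_dtm <- TermDocumentMatrix(TextDoc)
dtm_m <- as.matrix(TextDoc_dtm)
dtm_v <- sort(rowSums(dtm_m), decreasing = TRUE)
dtm_d <- data.frame(word = names(dtm_v), freq = dtm_v)
head(dtm_d)
```

With the real corpus, the same four rebuild lines would go right after the removeWords call and before the word cloud.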

Generating Word Cloud

Now that I'm satisfied with the result, I'll visualize the output as a word cloud with a minimum frequency of 5 and a maximum of 250 words, plotted in descending order of frequency.

#generate word cloud
set.seed(1234)
wordcloud(words = dtm_d$word, freq = dtm_d$freq, min.freq = 5,
          max.words=250, random.order=FALSE, rot.per=0.40, 
          colors=brewer.pal(8, "Dark2"))

Skills Association

From the previous results, I know which skills are in demand for a Data Analyst. Now I need to find which words are most associated with the following skills, with a minimum correlation of 0.25.

findAssocs(TextDoc_dtm, terms = c("sql","python","alteryx","insight","excel", "azur", "tableau", "powerbi", "model","communic", "team"), corlimit = 0.25)

technocrat · December 27, 2022, 10:50pm

I'd say you are past beginner.

Somewhere, an operation on the object x$content is finding a function object where it expects data. This can happen with user-created objects such as data frames named df, which is also the name of a built-in function. However, whether that happens depends on context; sometimes it's all good and sometimes it's an illegal operation on a closure (a bare function).
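
To illustrate the kind of failure described above, here is a minimal base-R sketch. It assumes a fresh session in which nothing has been assigned to the name text, so that name resolves to the plotting function from the graphics package. Whether this is what happens in the knitted document is a guess, not something the error message confirms.

```r
# Assumption: no user object called `text` exists in this session,
# so the name resolves to the graphics::text() function.
is_fn <- is.function(text)

# Coercing a function (a closure) to character fails; capture the
# error message instead of letting it halt the script.
msg <- tryCatch(
  as.character(text),
  error = function(e) conditionMessage(e)
)

print(is_fn)
print(msg)
```

If an assignment such as text <- readLines(...) never runs successfully, later code that uses text would receive the function instead of a character vector.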

The difficulty is finding the problematic operation. I suspect it's coming from an attempted output to PDF format, in which case it has nothing to do with your data or what you've created from it. To test that, create a new R Markdown document with the default options to output PDF. If that works, then it may have something to do with either non-default PDF output options or some code in a chunk. You can test chunks by running them one by one or by isolating any that use as.character.

Yousha_Adib · December 27, 2022, 11:37pm

Thanks. Yes, I was trying to knit to PDF. I will follow your steps; hopefully no new error will pop up.

technocrat · December 28, 2022, 12:38am

Come back if you need more guesses or, if you figure it out, share with the community.

Yousha_Adib · December 29, 2022, 12:16am

Hi, the error was caused by the interactive dataset import function, file.choose(). I replaced it with the following line and it worked perfectly.

text <- readLines("Dataset.txt")
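
For anyone who wants to keep the file picker for interactive work, here is a rough sketch of one way to make the import safe for knitting. The helper name read_dataset() and the variable job_text are hypothetical, not part of the original code; the temporary file only stands in for Dataset.txt so the example runs on its own.

```r
# Hypothetical helper (not from the original post): read from a fixed
# path, and only fall back to the file picker in an interactive session.
read_dataset <- function(path = "Dataset.txt") {
  if (!file.exists(path)) {
    if (interactive()) {
      path <- file.choose()   # a dialog is only possible interactively
    } else {
      stop("File not found: ", path, call. = FALSE)
    }
  }
  readLines(path, warn = FALSE)
}

# Usage with a temporary file standing in for Dataset.txt
tmp <- tempfile(fileext = ".txt")
writeLines(c("sql python", "excel tableau"), tmp)
job_text <- read_dataset(tmp)
print(job_text)
```

Storing the result under a name other than text (job_text here) also avoids clashing with the graphics function of the same name.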
