MLP model with Keras and text2vec package

#1

Hello

I have built an MLP model using keras to classify text. To predict the class of a new text, I repeated the steps below, which are the same ones I used for the training set when building the model:

  • q is some random question in text format
  • ls is a defined function to convert text into lower case and do stemming
token=itoken(q,preprocess_function=ls,tokenizer=word_tokenizer)
vtxt=create_vocabulary(token,stopwords=stemDocument(stopwords('english')),ngram=c(1,1))
vectorizer=vocab_vectorizer(vtxt)
# Document term matrix
dtmq=create_dtm(token,vectorizer)

To get the TF-IDF matrix, which is passed to the MLP, I used the steps below:

model1=TfIdf$new(smooth_idf = TRUE,norm="l2")
dtm=model1$fit_transform(dtm)

Now the dimension of 'dtmq' is (1, 35), but my model requires an input_shape of 1462. How can I convert the given text so that my MLP model accepts it?


#2

Building the vectorizer from the same pruned vocabulary that was used for training gives the new document term matrix the same number of columns as the training matrix.

#Preparation
tokenq=itoken(q,preprocess_function=ls,tokenizer=word_tokenizer)
vectorizer=vocab_vectorizer(pruned_vocab)

#Document term matrix
dtmq=create_dtm(tokenq,vectorizer)
dim(dtmq)
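
For anyone landing here later, the whole flow can be sketched end to end as below. This is only an illustration under assumptions: the training texts, the `prep` function (a lower-casing stand-in for `ls`, without stemming), and the pruning threshold are made up, and the `preprocessor` argument name is the one used by recent text2vec releases. The point it tries to show is that the vocabulary and the TF-IDF model are fitted once on the training data and then only applied to the new text, so the column count stays the same.

```r
library(text2vec)

# Hypothetical toy training texts standing in for the real training set.
train_txt <- c("how do I reset my password",
               "where can I download the invoice",
               "reset the password for my account",
               "download my latest invoice as pdf")

# Stand-in for the thread's ls(): lower-casing only, stemming omitted.
prep <- tolower

# Training side: vocabulary, vectorizer and TF-IDF are fitted here, once.
it_train <- itoken(train_txt, preprocessor = prep,
                   tokenizer = word_tokenizer, progressbar = FALSE)
vocab <- create_vocabulary(it_train, ngram = c(1L, 1L))
pruned_vocab <- prune_vocabulary(vocab, term_count_min = 2L)  # assumed threshold
vectorizer <- vocab_vectorizer(pruned_vocab)
dtm_train <- create_dtm(it_train, vectorizer)

tfidf <- TfIdf$new(smooth_idf = TRUE, norm = "l2")
dtm_train_tfidf <- tfidf$fit_transform(dtm_train)

# Prediction side: reuse the SAME vectorizer and the SAME fitted TF-IDF model.
q <- "How can I reset my password?"
it_q <- itoken(q, preprocessor = prep,
               tokenizer = word_tokenizer, progressbar = FALSE)
dtm_q <- create_dtm(it_q, vectorizer)   # columns follow pruned_vocab
dtm_q_tfidf <- tfidf$transform(dtm_q)   # transform only, no refitting

# Both matrices now have the same number of columns.
dim(dtm_train_tfidf)
dim(dtm_q_tfidf)
```

Words in `q` that are not in `pruned_vocab` are simply ignored, so the result always has one column per training term. Keras generally expects a dense input, so something like `as.matrix(dtm_q_tfidf)` would probably be needed before calling `predict()` on the model.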

#3

If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems.


closed #4

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
