Term document matrix from tm package, selecting columns

Using the tm package, I built a term-document matrix from a corpus, then used this code to turn it into a data frame:

dtm <- TermDocumentMatrix(tdocs)
m <- as.matrix(dtm)
v <- sort(rowSums(m),decreasing=TRUE)
d <- data.frame(word=names(v), freq=v)

d25<- d[1:25,]

The data frame looks like this:

head(d25)
word freq
get get 1699
just just 1656
good good 1437
like like 1257
know know 1186
day day 1174
names(d25)
[1] "word" "freq"

What are those words on the far left? I want a data frame with only the word and freq columns. How do I get rid of the extra words? I tried d <- select(d, word, freq) from dplyr, but the extra column comes back:

head(d25)
word freq
get get 1699
just just 1656
good good 1437
like like 1257
know know 1186
day day 1174

That is not an extra column; those are just row names. You can get rid of them with:

rownames(d25) <- NULL
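
To show where the row names come from, here is a minimal sketch using a small made-up named vector in place of the real sort(rowSums(m), decreasing=TRUE) result. The toy values and the names d1 and d2 are only for illustration. As far as I can tell, data.frame() picks up the names of the freq vector as row names, so you can either reset them afterwards or pass row.names = NULL when building the data frame:

```r
# Toy named vector standing in for sort(rowSums(m), decreasing = TRUE)
v <- c(get = 1699, just = 1656, good = 1437)

# The names of v are reused as the row names of the data frame
d <- data.frame(word = names(v), freq = v)
rownames(d)   # the words themselves
ncol(d)       # still only two real columns: word and freq

# Option 1: reset the row names after the fact
d1 <- d
rownames(d1) <- NULL

# Option 2: ask data.frame() not to take row names from v
d2 <- data.frame(word = names(v), freq = v, row.names = NULL)

rownames(d1)  # automatic row numbers
rownames(d2)  # automatic row numbers
```

Either way the data frame keeps exactly the word and freq columns; only the labels printed on the left change.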

Yes it worked! Thank you! I didn't know that rows could have names.
